Trace Process Issues with strace

This week and last, we have been discussing ways to monitor server processes, applications, and network activity. Ideally, these tools either confirm that your system is running fine or point you to the software causing the problem. If the culprit is software you are running intentionally, you may need to troubleshoot it and run diagnostics to find out why it is giving your dedicated server a hard time.

One very useful Linux tool for software diagnostics and debugging is strace. With it, you can gather detailed information about the system calls a program makes and the signals it receives. This is particularly useful for a program that crashes but does not provide useful error information in its output.

The typical command string for strace will look like:

$ strace -o trace-info.txt /bin/command

In this example, “trace-info.txt” is the name of the file that will store the record of system calls made by the program, and you would replace “/bin/command” with the path to a real command.
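As a minimal sketch of the command above with a few commonly used options added, the following traces a harmless command (ls /) only if strace is installed. The output path and the choice of ls are assumptions made for illustration, not part of the original example.

```shell
#!/bin/sh
# Hypothetical output location; any writable path works.
trace_file="${TMPDIR:-/tmp}/trace-info.txt"

if command -v strace >/dev/null 2>&1; then
    # -f   also trace child processes created by the command
    # -tt  prefix each line with a timestamp (microsecond precision)
    # -T   append the time spent inside each system call
    # -o   write the trace to a file instead of standard error
    strace -f -tt -T -o "$trace_file" ls / >/dev/null \
        || echo "strace could not trace the command" >&2
    status="attempted"
else
    echo "strace is not installed" >&2
    status="skipped"
fi
```

If the trace succeeds, the file named by trace_file holds one line per system call, which you can page through with less or search with grep.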

You can also attach strace to an individual process that is already running by giving its process ID, 12345 in this example (useful for Apache web server child processes), using a command like the following:

$ strace -p 12345 -o /home/admin/apache-debug
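Because Apache usually runs several child processes, it can help to attach to more than one at once. strace accepts multiple -p options, and with -ff each traced process gets its own output file named after the -o path plus the process ID. The helper below is a hypothetical sketch (the function name build_attach_cmd and the second PID are made up for illustration); it only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: build one strace command that attaches to
# several PIDs. Usage: build_attach_cmd OUTPUT_PREFIX PID [PID...]
build_attach_cmd() {
    out=$1
    shift
    # -ff with -o writes each process's trace to OUTPUT_PREFIX.PID
    cmd="strace -ff -o $out"
    attached=0
    for pid in "$@"; do
        case $pid in
            ''|*[!0-9]*)
                echo "skipping non-numeric PID: $pid" >&2
                continue
                ;;
        esac
        cmd="$cmd -p $pid"
        attached=$((attached + 1))
    done
    if [ "$attached" -eq 0 ]; then
        echo "no valid PIDs given" >&2
        return 1
    fi
    echo "$cmd"
}

# Dry run: print the command rather than executing it.
build_attach_cmd /home/admin/apache-debug 12345 12346
```

Attaching to a running process generally requires root or ownership of that process, and the trace continues until you press Ctrl-C.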

strace output typically appears in the following format:

fstat64(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
getdents64(3, /* 18 entries */, 4096) = 496
getdents64(3, /* 0 entries */, 4096) = 0

Each line in the strace output has the system call name, followed by its arguments in parentheses and then, after the equals sign, its return value. By default strace prints to the screen (on standard error), but since the output can often be long, it is easier to write it to a file. As the example shows, the output may not mean much to you, but with free and open source software, you can often pass that information on to the developer, who can then fix the possible bug or give you a solution.
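Before sending a long trace to a developer, a quick tally of which system calls appear most often can help you spot where the program spends its effort. The sketch below counts calls by name using the four sample lines shown earlier; the function name summarize_calls is hypothetical, and it assumes plain trace lines with no PID or timestamp prefixes.

```shell
#!/bin/sh
# Sample lines copied from the strace output shown above.
trace_sample='fstat64(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
getdents64(3, /* 18 entries */, 4096) = 496
getdents64(3, /* 0 entries */, 4096) = 0'

# Hypothetical helper: read trace lines on standard input and print
# "COUNT NAME" for each system call, most frequent first.
summarize_calls() {
    awk '
        /^[a-z_0-9]+\(/ {
            name = $0
            sub(/\(.*/, "", name)   # keep only the text before "("
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' | sort -k1,1nr -k2
}

summary=$(printf '%s\n' "$trace_sample" | summarize_calls)
printf '%s\n' "$summary"
```

To run it against a real trace, replace the sample with the file, for example summarize_calls < trace-info.txt. strace can also produce a similar summary itself with its -c option.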