The Jprof Profiler


Introduction

Jprof is a profiling tool. I am writing it because I need to find out where mozilla is spending its time, and there do not seem to be any profilers for Linux that can handle threads and/or shared libraries. This code is based heavily on Kipp Hickman's leaky.

Operation

Jprof operates by installing a timer which periodically interrupts mozilla. When this timer goes off, the jprof code inside mozilla walks the function call stack to determine which code was executing. By collecting a large number of these call stacks, it is possible to deduce where mozilla is spending its time.
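
The sampling idea described above can be illustrated with a short Python sketch. This is not jprof's code (jprof is C++ that runs inside mozilla and walks the native stack); it only assumes a Unix system where the profiling timer delivers SIGPROF. The names samples, on_timer, start_sampling, stop_sampling and busy_work are made up for this illustration.

```python
import signal
import time

samples = []  # one call stack (outermost caller first) per timer tick


def on_timer(signum, frame):
    """Walk the interrupted call stack and record it as a list of function names."""
    stack = []
    while frame is not None:
        stack.append(frame.f_code.co_name)
        frame = frame.f_back
    stack.reverse()
    samples.append(stack)


def start_sampling(period=0.01):
    """Install the handler and arm a repeating profiling timer (period in seconds)."""
    signal.signal(signal.SIGPROF, on_timer)
    signal.setitimer(signal.ITIMER_PROF, period, period)


def stop_sampling():
    """Disarm the timer; stacks collected so far stay in `samples`."""
    signal.setitimer(signal.ITIMER_PROF, 0, 0)


def busy_work(cpu_seconds):
    """Burn CPU time so that the profiling timer has something to interrupt."""
    deadline = time.process_time() + cpu_seconds
    total = 0
    while time.process_time() < deadline:
        total += 1
    return total


if __name__ == "__main__":
    start_sampling()
    busy_work(0.5)
    stop_sampling()
    print(len(samples), "stacks collected")
```

Each entry in samples is one observed call stack. The more stacks a function appears in, the more time the program was probably spending there.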

Use

First, check out the jprof source code, since it is not part of the default SeaMonkeyAll CVS tag:
cvs co mozilla/tools/jprof

Next, enable jprof and rebuild mozilla:

configure --enable-jprof
make
Now you can run jprof. To do this the JPROF_FLAGS environment variable must be set. As a simple example:
cd dist/bin
env JPROF_FLAGS=JP_START LD_LIBRARY_PATH=. ./mozilla-bin

The JPROF_FLAGS environment variable can be composed of several substrings which have the following meanings:

For example you could say:
setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025"

When mozilla runs with jprof enabled, it writes to a file called jprof-log on each timer signal and writes jprof-map when it exits. Sending SIGUSR1 to mozilla (kill -USR1) stops the timer signals and causes jprof-map to be written, but does not close jprof-log. Combining SIGUSR1 with the JP_DEFER option allows profiling of one sequence of actions. After a SIGUSR1, sending another timer signal (SIGPROF or SIGALRM) could be used to continue writing data to the same output, although this has not been tested.

The jprof executable is used to turn these files into readable output. To do this jprof needs the name of the mozilla binary and the log file. It deduces the name of the map file:

./jprof mozilla-bin ./jprof-log > tmp.html
This will generate the file tmp.html which you should view in a web browser.

Interpretation

The Jprof output is split into a flat portion and a hierarchical portion. There are links to each section at the top of the page. It is typically easier to analyze the profile by starting with the flat output and following the links contained in the flat output up to the hierarchical output.

Flat output

The flat portion of the profile indicates which functions were executing when the timer went off. It is displayed as a list with function names on the right and, on the left, the number of times each function was interrupted. The list is sorted by decreasing interrupt count. For example:
Count   Function Name
 36     _init
 30     SelectorMatches(nsIPresContext *, nsCSSSelector *, nsIContent *, int)
 22     _init
 22     _dl_lookup_symbol
In general, the functions with the highest count are the functions which are taking the most time.
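
As a rough illustration of how such a flat listing can be derived from collected call stacks, here is a minimal Python sketch. It is not jprof's implementation. The function flat_profile and the sample stacks are hypothetical, and each stack is assumed to be a list of function names with the outermost caller first.

```python
from collections import Counter


def flat_profile(stacks):
    """Count how often each function was the one running (last entry of a stack)."""
    counts = Counter(stack[-1] for stack in stacks if stack)
    # most_common() sorts by decreasing count, like the flat listing.
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical samples: each list is one call stack, outermost caller first.
    stacks = (
        [["main", "Reflow", "SelectorMatches"]] * 30
        + [["main", "_dl_lookup_symbol"]] * 22
        + [["main", "Reflow"]] * 5
    )
    print("Count   Function Name")
    for name, count in flat_profile(stacks):
        print(f"{count:3d}     {name}")
```

Only the innermost frame of each stack contributes here, which is why a function that merely calls expensive code can show a low flat count.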

The function names are linked to the entry for that function in the hierarchical profile, which is described in the next section.

Hierarchical output

The hierarchical output is divided up into sections, with each section corresponding to one function. A typical section looks something like this:
                639 nsWidget::OnMotionNotifySignal(_GdkEventMotion *)
                 11 nsWidget::OnButtonReleaseSignal(_GdkEventButton *)
                  2 nsWidget::OnButtonPressSignal(_GdkEventButton *)
 76787   0      652 nsWidget::DispatchMouseEvent(nsMouseEvent &)
                652 nsWidget::DispatchWindowEvent(nsGUIEvent *)
This block corresponds to the function nsWidget::DispatchMouseEvent. You can determine this because it is preceded by three numbers while the other function names are preceded by only one number. These three numbers have the following meaning. The number on the left (76787) is the index number, and is not important. The center number (0) is the number of times this function was interrupted by the timer. The last number (652) is the number of times this function was in the call stack when the timer went off. For our example we can see that our function was in the call stack for 652 interrupt ticks, but we were never the function that was running when the interrupt arrived.

The functions listed above the line for nsWidget::DispatchMouseEvent are the functions which were calling this function. The numbers to the left of these function names are the number of times these functions were in the call stack as callers of nsWidget::DispatchMouseEvent. In our example, nsWidget::OnButtonReleaseSignal was our caller on 11 of the timer ticks.

The functions listed below the line for nsWidget::DispatchMouseEvent are the functions this function was calling when the timer went off. The numbers to the left of these function names are the number of times each was in the call stack as a callee of our function when the timer went off. In our example, each of the 652 times the timer went off while nsWidget::DispatchMouseEvent was in the call stack, it was calling nsWidget::DispatchWindowEvent.
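
The numbers in such a section can be reproduced from raw call stacks with a small Python sketch. This is an assumption about how the counts relate to the samples, not jprof's actual code. The function hierarchical_entry is hypothetical, and the stacks are simplified so that they contain only the functions from the example above.

```python
from collections import Counter


def hierarchical_entry(stacks, name):
    """Summarise one function: self count, total count, caller and callee counts."""
    self_count = 0   # ticks where `name` was the function actually running
    total_count = 0  # ticks where `name` was anywhere in the call stack
    callers = Counter()
    callees = Counter()
    for stack in stacks:
        positions = [i for i, fn in enumerate(stack) if fn == name]
        if not positions:
            continue
        total_count += 1
        if stack[-1] == name:
            self_count += 1
        # Sets keep a recursive function from being counted twice for one tick.
        callers.update({stack[i - 1] for i in positions if i > 0})
        callees.update({stack[i + 1] for i in positions if i + 1 < len(stack)})
    return self_count, total_count, callers, callees


if __name__ == "__main__":
    target = "nsWidget::DispatchMouseEvent"
    callee = "nsWidget::DispatchWindowEvent"
    # Simplified, hypothetical stacks (outermost caller first).
    stacks = (
        [["nsWidget::OnMotionNotifySignal", target, callee]] * 639
        + [["nsWidget::OnButtonReleaseSignal", target, callee]] * 11
        + [["nsWidget::OnButtonPressSignal", target, callee]] * 2
    )
    self_count, total_count, callers, callees = hierarchical_entry(stacks, target)
    for fn, n in callers.most_common():
        print(f"{n:>19} {fn}")
    print(f"{self_count:>8} {total_count:>10} {target}")
    for fn, n in callees.most_common():
        print(f"{n:>19} {fn}")
```

In this sketch the caller counts add up to the total count because every sample reaches the target function through exactly one caller.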

Bugs

Jprof has only been tested under Red Hat Linux 6.0, 6.1, and 6.2. It does not work under 6.0, though it is possible to hack up the source code and make it work there. The way I determine the stack trace from inside the signal handler is tightly bound to the version of glibc that is running. If you know of a more portable way to get this information, please let me know.

Update

Ben Bucksch reports that installing the Red Hat 6.1 glibc rpms on a Red Hat 6.0 system allows jprof to work, and does not seem to break anything except gdm (the Gnome login program), and that can be fixed by installing the RH 6.1 gdm rpm.

Update

David Baron reports that jprof works under Red Hat 6.0 if one uncomments the #define JPROF_PTHREAD_HACK near the beginning of libmalloc.cpp.