Profiling Tools

The following open source packages either perform sampled profiling on Linux or are related to it.


gprof

This is the standard Linux profiler. It can generate approximate call-graph profiles. It doesn’t appear to interact well with threads or dynamic libraries. It requires relinking for a flat profile and recompilation for a call-graph profile.


An analogous but separate facility for displaying shared library profiles.


 A thread-aware profiler for Linux based on gcc-based code instrumentation. A while ago we found it nontrivial to get running on many Linux platforms, but its maintenance status has recently improved.


Oprofile

A system-wide profiling tool. It requires a kernel module.


Another system-wide profiler. Based on the Oprofile kernel module.

Perfmon and pfmon tool

A library and command to access hardware profile counters on Itanium. We rely on this for hardware event support. By itself, it can be used to count hardware events in a program region, etc.


This is a set of profiling utilities, currently targeting only Linux. It includes a simple command-line profiling tool with the following characteristics:

  • It is intended to be easy to install and use. No kernel modules or changes are required for basic use. It can be installed and used without root access.
  • It supports profiling of dynamically linked code and includes information on time spent in dynamic libraries.
  • It supports profiling of multithreaded applications.
  • It generates profiles for all subprocesses started from a shell. Thus it can easily be used to profile applications with multiple processes.
  • It tries to generate symbolic output. This is usually successful for the main program if it has debug information, i.e. was compiled with -g. If not, you may need a debugger to fully interpret the results. However, the raw output will often give you a rough idea of where processor time is spent.
  • It currently generates “flat” profiles. The output tells you roughly how much time was spent in a given instruction, line, or function f. By default this does not include time spent in functions called by f, but on platforms supported by libunwind a possible alternative is to include callees in profile counts, thus recovering some gprof-like functionality.
  • Linux kernel functions are not profiled separately. By default, time spent in the kernel is credited to the library function which made the kernel call.
  • On Itanium, it can be used to generate hardware-event-based profiles. For example, it can tell you where most of the cache misses occur.
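
The difference between a "flat" profile and one that includes callees can be illustrated with a small sampling profiler. The following is only a minimal sketch in Python, not how the tool described above is implemented: it assumes a Unix system that provides SIGPROF and ITIMER_PROF, it profiles only the main thread of a Python program, and all names in it (profile, leaf, caller, self_counts, inclusive_counts) are made up for illustration. On each timer tick it credits the function that is currently executing (the flat count) and, separately, every function on the call stack (the callee-inclusive count).

```python
import collections
import signal

# Flat profile: samples in which the function itself was executing.
self_counts = collections.Counter()
# Callee-inclusive profile: samples in which the function was anywhere on the stack.
inclusive_counts = collections.Counter()


def _on_sample(signum, frame):
    """SIGPROF handler: record one sample for the interrupted call stack."""
    if frame is None:
        return
    self_counts[frame.f_code.co_name] += 1
    on_stack = set()  # count each function once per sample, even if recursive
    while frame is not None:
        on_stack.add(frame.f_code.co_name)
        frame = frame.f_back
    inclusive_counts.update(on_stack)


def profile(func, interval=0.001):
    """Run func() while sampling every `interval` seconds of CPU time."""
    self_counts.clear()
    inclusive_counts.clear()
    # The handler is deliberately left installed afterwards, so that a late
    # SIGPROF cannot hit the default action (which terminates the process).
    signal.signal(signal.SIGPROF, _on_sample)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0, 0)  # stop the timer


def leaf(n):
    """Hypothetical workload: does all the real computation."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def caller():
    """Hypothetical workload: spends almost all of its time inside leaf()."""
    total = 0
    for _ in range(100):
        total += leaf(50000)
    return total


if __name__ == "__main__":
    profile(caller)
    print("function      self  inclusive")
    for name, hits in inclusive_counts.most_common(5):
        print(f"{name:12} {self_counts[name]:5} {hits:10}")
```

In a run of this sketch, leaf should collect most of the flat samples, while caller has few flat samples but a large inclusive count, which is the extra information a callee-inclusive profile provides. The exact numbers vary from run to run because the profile is statistical.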

