Example 3 Example 5 Back to GPTL home page

Example 4: Auto-instrumentation of a C++ code

This example is a C++ code compiled with auto-instrumentation enabled. Note that the constructor and destructor for X are inside the class definition, and outside the class definition for Y.

prof.cc:

#include <gptl.h> #include <myclasses.h> int main () { X *x; Y *y; int ret; ret = GPTLinitialize (); ret = GPTLstart ("total"); x = new (X); x->func (1.2); x->func (1); delete (x); y = new (Y); y->func (1.2); y->func (1); delete (y); ret = GPTLstop ("total"); ret = GPTLpr (0); }
myclasses.h:
class Base
{
 public:
  Base ();
  ~Base ();
};

Base::Base ()
{
}

Base::~Base ()
{
}

class X: Base
{
 public:
  X () 
  {
  }
  ~X() 
  {
  }
  void func (int x)
  {
  }
  void func (double x)
  {
  }
};

class Y: Base
{
 public:
  Y ();
  ~Y();
  void func (int);
  void func (double);
};

Y::Y ()
{
}

Y::~Y()
{
}

void Y::func (int x)
{
}

void Y::func (double x)
{
}
Compile prof.cc with auto-instrumentation, then link and run. We need to link with -lunwind because GPTL was built to use libunwind to automatically unwind the call stack. Configuring with --disable-libunwind can also work and requires no extra "-l" arguments. GPTL will then use the system function backtrace() to unwind the call stack. But libunwind is more modern and tends to produce a more accurate backtrace. Adding the link flag -rdynamic in either case helps produce a more accurate backtrace.
% g++ -finstrument-functions prof.cc -o prof -I${GPTL}/include -L${GPTL}/lib -lgptl -lunwind -rdynamic
% ./prof
Output file timing.0 looks like this:
Stats for thread 0: Called Recurse Wall max min selfOH parentOH total 1 - 3.46e-04 3.46e-04 3.46e-04 0.000 0.000 _ZN1XC1Ev 1 - 3.00e-05 3.00e-05 3.00e-05 0.000 0.000 * _ZN4BaseC1Ev 2 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 _ZN1X4funcEd 1 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 _ZN1X4funcEi 1 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 _ZN1XD1Ev 1 - 2.50e-05 2.50e-05 2.50e-05 0.000 0.000 * _ZN4BaseD1Ev 2 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 _ZN1YC1Ev 1 - 1.00e-06 1.00e-06 1.00e-06 0.000 0.000 _ZN1Y4funcEd 1 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 _ZN1Y4funcEi 1 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 _ZN1YD1Ev 1 - 0.00e+00 0.00e+00 0.00e+00 0.000 0.000 Overhead sum = 9.75e-07 wallclock seconds Total calls = 13 thread 0 long name translations (empty when no auto-instrumentation): Multiple parent info for thread 0: Columns are count and name for the listed child Rows are each parent, with their common child being the last entry, which is indented. Count next to each parent is the number of times it called the child. Count next to child is total number of times it was called by the listed parents. 1 _ZN1XC1Ev 1 _ZN1YC1Ev 2 _ZN4BaseC1Ev 1 _ZN1XD1Ev 1 _ZN1YD1Ev 2 _ZN4BaseD1Ev

Explanation of the above output

Note that the entries are presented in name-mangled form. It is possible to use an API to demangle the names, but it was decided to leave the entries in their mangled form because often the demangled names can be extremely long. Printing the fully demangled form can therefore make reading the output difficult due to wrapped lines. We recommend that instead users employ the ubiquitous tool c++filt to demangle the names.

Otherwise the output is similar to other examples presented here. Parent/child relationships are preserved via indentation. A "*" in column 1 means means that the region (routine in this case) has multiple parents. The multiple parent information is presented below the main output.


Example 3 Example 5 Back to GPTL home page