Example 1 Example 3 Back to GPTL home page

Example 2: Auto-instrumentation

This example is a C code compiled with auto-instrumentation enabled. It uses PAPI to count total instructions, and instructions per cycle. Note that function B has multiple parents, and GPTL reports the multiple parent information in the output produced by the call to GPTLpr_file().

main.c:

#include <gptl.h> #include <papi.h> int main () { void do_work (void); int i, ret; ret = GPTLsetoption (GPTL_IPC, 1); // Count instructions per cycle ret = GPTLsetoption (PAPI_TOT_INS, 1); // Print total instructions ret = GPTLsetoption (GPTLoverhead, 0); // Don't print overhead estimate ret = GPTLinitialize (); // Initialize GPTL ret = GPTLstart ("main"); // Start a manual timer do_work (); // Do some work ret = GPTLstop ("main"); // Stop the manual timer ret = GPTLpr_file ("outfile"); // Write output to "outfile" }
subs.c:
#include <unistd.h>

extern void A(void);
extern void AA(void);
extern void B(void);

void do_work ()
{
  A ();
  AA ();
  B ();
}

void A ()
{
  B ();
}

void AA ()
{
}

void B ()
{
  sleep (1);
}
Compile all but main.c with auto-instrumentation, then link and run. Useful auto-instrumentation of the main program is not possible, because the call to GPTLinitialize() must be done manually and needs to preceed all calls to GPTLstart and GPTLstop. Note that we need to link to the PAPI library due to use of the PAPI interface. The flag -rdynamic is added to link flags to assist in resolving auto-profiled names.
% gcc -c -g main.c -I${GPTL}/include
% gcc -g -finstrument-functions subs.c main.o -I${GPTL}/include -L${GPTL}/lib -lgptl -lpapi -rdynamic
% ./a.out
Important parts of the output file outfile look like this:
PAPI event multiplexing was OFF Description of printed events (PAPI and derived): GPTL_IPC: Instructions per cycle Instr completed PAPI events enabled (including those required for derived events): PAPI_TOT_INS PAPI_TOT_CYC Underlying timing routine was gettimeofday. Per-call utr overhead est: 1.65e-06 sec. Per-call PAPI overhead est: 3.4e-07 sec. If overhead stats are printed, roughly half the estimated number is embedded in the wallclock stats for each timer. Print method was most_frequent. If a '%_of' field is present, it is w.r.t. the first timer for thread 0. If a 'e6_per_sec' field is present, it is in millions of PAPI counts per sec. A '*' in column 1 below means the timer had multiple parents, though the values printed are for all calls. Further down the listing is more detailed information about multiple parents. Look for 'Multiple parent info' Stats for thread 0: Called Recurse Wallclock max min IPC TOT_INS e6_/_sec main 1 - 2.000 2.000 2.000 2.81e-01 18060 0.01 do_work 1 - 2.000 2.000 2.000 2.61e-01 12547 0.01 A 1 - 1.000 1.000 1.000 3.01e-01 4958 0.00 * B 2 - 2.000 1.000 1.000 1.09e-01 2812 0.00 AA 1 - 0.000 0.000 0.000 7.77e-01 488 244.00 Total calls = 6 Total recursive calls = 0 Multiple parent info (if any) for thread 0: Columns are count and name for the listed child Rows are each parent, with their common child being the last entry, which is indented Count next to each parent is the number of times it called the child Count next to child is total number of times it was called by the listed parents 1 A 1 do_work 2 B

Explanation of the above output

PAPI event "Total instructions executed" (PAPI_TOT_INS) and derived event "Instructions per cycle" (GPTL_IPC) were enabled. To compute instructions per cycle, GPTL made the PAPI library call to count total cycles (PAPI_TOT_CYC) in addition to the already-enabled event PAPI_TOT_INS. When GPTLpr_file() was called, it computed:
      GPTL_IPC = PAPI_TOT_INS / PAPI_TOT_CYC;

Note the asterisk in front of region "B". This indicates that region "B" had multiple parents. It is presented as a child of region "A" because that is the first region that invoked it. Information about other parents is presented after the main call tree. It shows that region "B" had two parents, "A", and "do_work". Each parent invoked "B" once, for a total of 2 calls.


Example 1 Example 3 Back to GPTL home page