Next: B. Appendix: PCL Counters
Up: ADAPTOR Profiling Guide
Previous: Bibliography
  Contents
The configuration file for performance monitoring (e.g. pm_config)
can be selected for profiling either through the environment variable PM
or through the command-line option -pm:
export PM=pm_config
<executable>
or
<executable> -pm=pm_config
For a parallel run started with mpirun:
export PM=pm_config
mpirun -np 4 <executable>
or
mpirun -np 4 <executable> -pm=profile
The configuration file specifies which
performance events are counted and how their values
are printed in the summary file pm.out.
Table 15 lists the possible entries
in the configuration file.
Table 15:
Possible entries in the configuration file.
# ...                                 | comment line
CN = ...                              | definition of a new counter
CN = C1 + C2                          | add counter values
CN = C1 - C2                          | subtract counter values
CN = C1 * C2                          | multiply counter values
CN = C1 / C2                          | divide counter values
CN = C1 # C2                          | CN = C1 / (C1 + C2)
CN = C1 ~ C2                          | CN = (C2 - C1) / C2
CN = C1 '                             | CN = C1 / WALL_TIME
CN = C1 "                             | CN = C1 / CPU_TIME
<x>CN <kind> <width> <precision>      | enabling counting
SAMPLING <file> <line> <rate> <size>  | enables data profiling
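The arithmetic behind the operator entries of Table 15 can be sketched in a few lines of Python. This is only an illustration of the formulas listed in the table, not the ADAPTOR implementation; the names BINARY_OPS, per_wall_time and per_cpu_time and the sample values are hypothetical.

```python
# Binary operators from Table 15, applied to two counter values c1 and c2.
# Illustration only; no guard against division by zero is included.
BINARY_OPS = {
    "+": lambda c1, c2: c1 + c2,
    "-": lambda c1, c2: c1 - c2,
    "*": lambda c1, c2: c1 * c2,
    "/": lambda c1, c2: c1 / c2,
    "#": lambda c1, c2: c1 / (c1 + c2),   # CN = C1 / (C1 + C2)
    "~": lambda c1, c2: (c2 - c1) / c2,   # CN = (C2 - C1) / C2
}


def per_wall_time(c1, wall_time):
    """CN = C1 '  : counter value divided by WALL_TIME."""
    return c1 / wall_time


def per_cpu_time(c1, cpu_time):
    """CN = C1 "  : counter value divided by CPU_TIME."""
    return c1 / cpu_time


# Hypothetical sample values, chosen only for illustration.
hits, misses = 900.0, 100.0
hit_share = BINARY_OPS["#"](hits, misses)           # 900 / (900 + 100)
non_miss = BINARY_OPS["~"](misses, hits + misses)   # (1000 - 100) / 1000
```

With these sample values both hit_share and non_miss come out at nine tenths, which shows that # yields the share of C1 in the sum of both counters, while ~ yields the share of C2 that is not covered by C1.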
- BRUTTO specifies that the counted values
are taken for the whole subprogram, regardless of whether
other subprograms are called from it (inclusive counting).
- NETTO specifies that the counting is done
exclusively for the subprogram itself, without the
subprograms it calls (exclusive counting).
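The relation between the two kinds can be illustrated with a small Python sketch. It assumes that a BRUTTO value equals the subprogram's own NETTO value plus the BRUTTO values of everything it calls; the call tree, the counts and the function name brutto are hypothetical and do not describe how ADAPTOR measures internally.

```python
# Hypothetical call tree: each subprogram maps to the subprograms it calls.
calls = {
    "main": ["solve", "output"],
    "solve": ["kernel"],
    "kernel": [],
    "output": [],
}

# Hypothetical NETTO (exclusive) counts per subprogram.
netto = {"main": 10, "solve": 40, "kernel": 200, "output": 25}


def brutto(name):
    """Inclusive count: the subprogram's own count plus all of its callees."""
    return netto[name] + sum(brutto(callee) for callee in calls[name])


brutto_main = brutto("main")      # 10 + (40 + 200) + 25
brutto_solve = brutto("solve")    # 40 + 200
brutto_kernel = brutto("kernel")  # leaf: same as its NETTO value
```

For a leaf subprogram such as kernel both kinds give the same value; the difference only shows up for subprograms that call others.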
Table 16:
Rating of counter values
COUNTER   |         | COUNTER
mCOUNTER  | milli   | COUNTER * 1000.0
kCOUNTER  | kilo    | COUNTER / 1024.0
MCOUNTER  | Mega    | COUNTER / (1024.0 * 1024.0)
%COUNTER  | percent | COUNTER * 100.0
- A counter prefixed with m is rated for milli.
- A counter prefixed with k is rated for kilo.
- A counter prefixed with M is rated for Mega.
- A counter prefixed with % is rated for percent.
- field-width and precision specify how the rated value
is printed (they correspond to the width and precision of a
C printf format, i.e. %width.precision).
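The rating and printing step can be sketched as follows. The factors are taken from Table 16; the function name rate_and_format and the choice of a fixed-point format are assumptions made for this illustration, since the text only states that width and precision correspond to their C counterparts.

```python
# Rating factors from Table 16, keyed by the prefix character
# (the empty string stands for an unrated counter).
RATING = {
    "": lambda v: v,
    "m": lambda v: v * 1000.0,
    "k": lambda v: v / 1024.0,
    "M": lambda v: v / (1024.0 * 1024.0),
    "%": lambda v: v * 100.0,
}


def rate_and_format(prefix, value, width, precision):
    """Apply the rating for the prefix, then print with width and precision.

    Assumption: fixed-point output, like C's printf("%*.*f", ...).
    """
    rated = RATING[prefix](value)
    return f"{rated:{width}.{precision}f}"


# Hypothetical values, chosen only for illustration.
kilo_text = rate_and_format("k", 2048.0, 8, 2)     # 2048 / 1024
percent_text = rate_and_format("%", 0.125, 6, 1)   # 0.125 * 100
mega_text = rate_and_format("M", 1048576.0, 5, 1)  # 1048576 / (1024 * 1024)
```

Note that kilo and Mega use powers of 1024 rather than 1000, so kCOUNTER and MCOUNTER are best suited to byte counts such as SEND_BYTES.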
Table 17 lists the
default events that are supported by ADAPTOR. For these events
the ADAPTOR runtime system can do the counting on its own.
Counters for sending and receiving are only available for
HPF programs compiled by ADAPTOR for the distributed memory model.
Table 17:
Default events supported by ADAPTOR.
CALLS       | call of regions, subprograms
WALL_TIME   | walltime ticks
SEND_CALLS  | number of sends
RECV_CALLS  | number of receives
SEND_BYTES  | number of sent bytes
RECV_BYTES  | number of received bytes
Table 18 lists the
hardware performance counters that are defined by default.
These performance counters should always be available,
regardless of whether PAPI, PCL or perfctr is used
as the PM library.
Table 18:
General performance counters of ADAPTOR.
TOT_CYC   | total cycles
TOT_INS   | total instructions
LD_INS    | load instructions
SR_INS    | store instructions
FP_INS    | floating-point instructions
L1_MISS   | Level-1 data cache misses
L2_MISS   | Level-2 data cache misses
TLB_MISS  | TLB misses
Table 19 summarizes the derived
counters that are defined by default.
Table 19:
General derived counters of ADAPTOR.
CPU_TIME  | derived from TOT_CYC
FLOPS     | derived from FP_INS
Thomas Brandes
2004-03-19