Intel® Trace Collector 9.1 Update 2 User and Reference Guide

Collecting Lightweight Statistics

Intel® Trace Collector can gather and store statistics about the function calls and their communication. These statistics are gathered even if no trace data is collected, so it is a good starting point for trying to understand an unknown application that might produce an unmanageable trace.

Usage Instructions

To collect lightweight statistics for your application, set the following environment variables before running it:

$ export VT_STATISTICS=ON
$ export VT_PROCESS=OFF
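
With these variables set, run the application under your MPI launcher as usual and the statistics are gathered automatically. A minimal sketch, assuming Intel MPI and a hypothetical application binary ./your_app:

$ mpirun -n 4 ./your_app

Depending on how your launcher handles the environment, you may need to forward the variables to the MPI ranks explicitly, for example with the -genv option of the Intel MPI mpirun.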

Alternatively, create a configuration file containing the following lines and set the VT_CONFIG environment variable to point to it:

# Enable statistics gathering
STATISTICS ON
# Do not gather trace data
PROCESS 0:N OFF
$ export VT_CONFIG=<configuration_file_path>/config.conf
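
For example, the configuration file can be created directly from the shell before launching the application; the file name stats_config.conf below is only a placeholder:

$ cat > stats_config.conf << EOF
# Enable statistics gathering
STATISTICS ON
# Do not gather trace data
PROCESS 0:N OFF
EOF
$ export VT_CONFIG=$PWD/stats_config.conf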

The statistics are written into the *.stf file. Use the stftool utility with the --print-statistics option to convert the data to ASCII text. For example:

$ stftool tracefile.stf --print-statistics

TIP

The resulting output has an easy-to-process format, so you can post-process it for better readability with text processing tools and scripts such as awk*, perl*, or Microsoft Excel*. A perl script convert-stats with this capability is provided in the examples folder.
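
For example, assuming the statistics are printed to standard output as shown above, a simple one-liner can turn the colon-separated fields into tab-separated values for import into a spreadsheet or further processing with awk*:

$ stftool tracefile.stf --print-statistics | tr ':' '\t' > stats.tsv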

Output Format

Each line lists the thread that executed the function, the name of the function, the receiver of the message, and the message size, followed by the gathered statistics, such as the number of involved processes and the total execution time including and excluding callee times.

Within each line the fields are separated by colons.

The receiver is set to 0xffffffff for file operations and to 0xfffffffe for collective operations. If the message size equals 0xffffffff, the only defined receiver value is 0xfffffffe, which marks the entry as a collective operation.

The message size is the number of bytes sent or received per single message. For collective operations, the following values (message size buckets) are used for individual instances. The last column indicates whether the bucket value is identical on all processes involved in the operation; for the v-variants it can differ because each rank may contribute a different number of bytes.

Collective operation    Process-local bucket       Is the same value on all processes?
MPI_Barrier             0                          Yes
MPI_Bcast               Broadcast bytes            Yes
MPI_Gather              Bytes sent                 Yes
MPI_Gatherv             Bytes sent                 No
MPI_Scatter             Bytes received             Yes
MPI_Scatterv            Bytes received             No
MPI_Allgather           Bytes sent + received      Yes
MPI_Allgatherv          Bytes sent + received      No
MPI_Alltoall            Bytes sent + received      Yes
MPI_Alltoallv           Bytes sent + received      No
MPI_Reduce              Bytes sent                 Yes
MPI_Allreduce           Bytes sent + received      Yes
MPI_Reduce_scatter      Bytes sent + received      Yes
MPI_Scan                Bytes sent + received      Yes

The message size is set to 0xffffffff if no message was sent, for example for non-MPI functions or for functions like MPI_Comm_rank.

If more than one communication event (message or collective operation) occurs in the same function call (for example, in MPI_Waitall, MPI_Waitany, MPI_Testsome, or MPI_Sendrecv), the time spent in that function is distributed evenly over all communications and counted once for each message or collective operation. For example, if a single MPI_Waitall call that completes four messages takes 8 µs, each message is charged 2 µs. Therefore, it is impossible to compute a correct traditional function profile from the data for such function instances (those involved in more than one message per actual function call). Only the Total execution time including callee times and the Total execution time excluding callee times can be interpreted similarly to a traditional function profile in all cases.

The number of involved processes is negative for received messages; it is -2 if the messages were received from a different process or thread.

Statistics are gathered on the thread level for all MPI functions, and for all functions instrumented through the API or compiler instrumentation.

See Also

Tracing User Defined Events
Using stftool
Intel® Trace Collector API