Intel® Inspector Help

MPI Analysis Workflow

To analyze the performance and correctness of an MPI application at the inter-process level, use the Intel® Trace Analyzer and Collector tool (located in the <installdir>/itac directory after installation). The Intel Trace Analyzer and Collector attaches to the application through linkage (static, dynamic, through LD_PRELOAD, or via the Intel Compiler -tcollect and -tcollect-filter options), or through the itcpin tool. The tool collects information about MPI-level events between processes, which lets you analyze the performance and correctness of MPI calls and detect deadlocks, data layout errors, and risky or incorrect MPI constructs. The Intel Trace Analyzer and Collector data is correlated and aggregated across all processes and all nodes that participated in the run.
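As a rough illustration of two of the attachment methods named above, the following dry-run sketch prints the commands rather than executing them. Only LD_PRELOAD and -tcollect come from the text. The mpiicc wrapper, the libVT.so library name, the -genv launcher option, and the my_app names are assumptions for illustration, so check them against your installation.

```shell
#!/bin/sh
# Dry-run sketch: prints example commands instead of running them.
# APP and SRC are hypothetical placeholders, not names from this help page.
APP=./my_app
SRC=my_app.c

# Attach at build time with the Intel Compiler -tcollect option.
# Assumption: the mpiicc compiler wrapper is on PATH.
build_with_tcollect() {
    echo "mpiicc -tcollect $SRC -o $APP"
}

# Attach at run time to an unmodified binary through LD_PRELOAD.
# Assumptions: the collector library is named libVT.so and the launcher
# accepts -genv to set an environment variable for every rank.
run_with_preload() {
    echo "mpirun -genv LD_PRELOAD libVT.so -n $1 $APP"
}

build_with_tcollect
run_with_preload 4
```

The build-time route requires recompiling, while the preload route leaves the binary untouched, which is usually the deciding factor between them.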

Beyond the inter-process level of MPI parallelism, the processes that make up an application on a modern cluster often also use fork-join threading through OpenMP and Intel TBB. At this level, use the VTune Amplifier to analyze the performance of an MPI application and the Intel Inspector to analyze its correctness.

At a high level, the analysis workflow consists of three steps:

  1. Use the amplxe-cl and inspxe-cl command-line tools to collect data about an application. By default, all processes are analyzed, but you can filter the data collection to limit it to a subset of processes (for VTune Amplifier this is sometimes required because of certain collection technology limitations). A separate result directory is created for each spawned MPI application process that was analyzed, with the MPI process rank value captured.

  2. Post-process the result, which is also called finalization or symbol resolution. This is done automatically for each result directory once the collection has finished.

  3. Open the content of each result directory in the standalone GUI viewer to analyze the data for the specific process. The GUI viewers are independent: VTune Amplifier and Intel Inspector each have their own user interface.
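The steps above can be sketched for Intel Inspector as a dry-run script that prints the commands rather than launching them. The process count, the result name my_result, the application name my_app, and the mi1 analysis type are placeholders chosen for illustration. The sketch also assumes that the rank is appended to the result name as a suffix (my_result.0, my_result.1, and so on), which this page does not state explicitly.

```shell
#!/bin/sh
# Dry-run sketch: prints the launch commands instead of executing them.
# NPROCS, RESULT, and APP are hypothetical placeholders.
NPROCS=4
RESULT=my_result
APP=./my_app

# Step 1: run every rank under Intel Inspector (mi1 = memory leak detection).
collect_all() {
    echo "mpirun -n $NPROCS inspxe-cl -r $RESULT -collect mi1 -- $APP"
}

# Step 1, filtered: only the last rank runs under the tool; the other
# ranks start the application directly (MPMD-style launch with ':').
collect_one() {
    echo "mpirun -n $((NPROCS - 1)) $APP : -n 1 inspxe-cl -r $RESULT -collect mi1 -- $APP"
}

# Step 3: open the result of a single rank in the standalone GUI.
# Assumption: per-rank results are named <result>.<rank>.
open_rank() {
    echo "inspxe-gui $RESULT.$1"
}

collect_all
collect_one
open_rank 0
```

Step 2 has no command of its own here because finalization runs automatically when the collection finishes. For VTune Amplifier, the same launch pattern would apply with amplxe-cl in place of inspxe-cl and a performance analysis type instead of mi1.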

Note

MPI Analysis Limitations

There are certain limitations in the current MPI profiling support provided by the VTune Amplifier / Intel Inspector:

See Also