Tutorial: Where to Add Parallelism with Intel® Advisor 2015 and a C/C++ Sample

Predict Maximum Parallel Performance Speedup

To predict the maximum parallel performance speedup of your target based on the added Intel Advisor annotations:

  • Collect Suitability data.

  • View the Suitability Report.

  • View the Summary window.

Each step is described more fully below.

Collect Suitability Data

Under 3. Check Suitability in the Advisor Workflow, click the Collect Suitability Data button to collect Suitability data while the target executes.

During the Suitability analysis, the Intel Advisor displays a window similar to the following.
[Screenshot: Collect Suitability data window]

Note

You can ignore any warnings about missing debugging symbols during this tutorial.

View Suitability Report

After the Suitability tool finalizes the data, the Intel Advisor displays a window similar to the following.
[Screenshot: Suitability Report window]

1

The Maximum Program Gain For All Sites value shows the predicted maximum speedup of our target based on Intel Advisor annotations and currently selected modeling parameters. A predicted speedup of more than 6x is a good result.
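As rough intuition for why the whole-program gain is bounded, Amdahl's law relates program speedup to the share of run time spent in parallel sites. The sketch below is a simplified illustration only, not the model the Suitability tool uses (which also accounts for overhead, contention, and load imbalance). The function name `amdahlSpeedup` and its parameters are hypothetical.

```cpp
#include <cassert>

// Simplified Amdahl's-law estimate (an illustration, not Intel Advisor's model).
// parallelFraction: share of serial run time spent inside parallel sites (0..1).
// cpus: number of CPUs assumed to work on those sites.
double amdahlSpeedup(double parallelFraction, int cpus) {
    assert(parallelFraction >= 0.0 && parallelFraction <= 1.0);
    assert(cpus > 0);
    const double serialPart = 1.0 - parallelFraction;    // never speeds up
    const double parallelPart = parallelFraction / cpus; // ideally divided across CPUs
    return 1.0 / (serialPart + parallelPart);
}
```

Under these assumptions, a target that spends 95% of its time in parallel sites gets roughly a 5.9x gain on 8 CPUs, and no number of CPUs can push it past 20x.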

2

This grid shows various metrics for each parallel site based on currently selected modeling parameters, including the site's Impact to Program Gain. Our target has a single parallel site - the solve parallel site, as identified in the Site Label column.

3

Use these modeling parameter drop-downs to experiment with different hardware configurations and parallel frameworks.

Target System drop-down. Set to one of the following:

  • CPU to model predicted maximum speedup when executing all parallel sites on host CPUs

  • Intel Xeon Phi to model predicted maximum speedup when executing all parallel sites on Intel® Xeon Phi™ coprocessors

  • Offload to Intel Xeon Phi to model predicted maximum speedup when executing:

    • Serial parts of our target on host CPUs

    • Parallel sites, on a site-by-site basis, on host CPUs or Intel Xeon Phi coprocessors

Threading Model drop-down. Set to Intel TBB, Intel Cilk Plus, OpenMP, Microsoft TPL, or Other to model predicted maximum speedup using that parallel framework.

The CPU Count and Coprocessor Threads drop-downs depend on the Target System drop-down and on the Offload to Intel Xeon Phi checkbox in 2:

  • If Target System = CPU (Offload to Intel Xeon Phi checkbox is hidden), then:

    • CPU Count = A modeling number of CPUs that will work in parallel for all parallel sites in your target

    • Coprocessor Threads = Hidden

  • If Target System = Intel Xeon Phi (Offload to Intel Xeon Phi checkbox is hidden), then:

    • CPU Count = Hidden

    • Coprocessor Threads = A modeling number of coprocessor threads that will work in parallel for all parallel sites in your target

  • If Target System = Offload to Intel Xeon Phi (Offload to Intel Xeon Phi checkbox is Selected or Deselected), then:

    • CPU Count = A modeling number of CPUs that will work in parallel for this parallel site and all other sites not selected for offload

    • Coprocessor Threads = A modeling number of coprocessor threads that will work on the Intel Xeon Phi coprocessor for this parallel site and all other parallel sites selected for offload
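Each Threading Model choice stands for a framework you would later use to replace the annotations with real parallel code. As a minimal sketch of what the OpenMP choice might look like for a loop of independent iterations, consider the following. The names `taskWork`, `runSerial`, and `runOpenMP` are hypothetical stand-ins, not code from the tutorial sample, and the pragma takes effect only when the compiler's OpenMP option is enabled.

```cpp
#include <cassert>

// Hypothetical stand-in for one task of a parallel site; it sums 1..n.
long taskWork(int n) {
    long sum = 0;
    for (int k = 1; k <= n; ++k) sum += k;
    return sum;
}

// Serial loop over independent iterations: the shape of a parallel site
// before any parallel framework is applied.
long runSerial(int iterations) {
    assert(iterations >= 0);
    long total = 0;
    for (int i = 0; i < iterations; ++i) total += taskWork(i);
    return total;
}

// The same loop expressed with OpenMP. The reduction clause gives each
// thread a private copy of total and combines the copies at the end,
// so the threads never update a shared variable concurrently.
long runOpenMP(int iterations) {
    assert(iterations >= 0);
    long total = 0;
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : total)
#endif
    for (int i = 0; i < iterations; ++i) total += taskWork(i);
    return total;
}
```

Both functions return the same result; only the elapsed time differs when OpenMP is enabled and several CPUs are available.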

4

The Scalability of Maximum Site Gain diagram graphically shows the predicted maximum speedup for the solve parallel site in different scaling scenarios based on currently selected modeling parameters.

A bulls-eye in each area of the diagram means the following:

  • Red: Parallelization is not beneficial - and may even cause performance degradation. Consider removing or modifying annotations, or significantly refactoring the corresponding hotspot if you want to parallelize it at any cost.

  • Yellow: The predicted maximum speedup may not be enough to justify the effort needed to refactor and maintain your application. Consider investigating.

  • Green: Parallel performance - and power efficiency - may improve significantly.

5

Use the Loop Iterations (Task) Modeling sliders and the Apply button to experiment with different iteration counts and instance durations.
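To see why the iteration count matters, consider a deliberately simplified model in which every iteration takes the same time and there is no overhead. The iterations then run in rounds of at most one per CPU. This is an illustrative assumption, not the Suitability tool's algorithm, and `idealSiteGain` is a hypothetical name.

```cpp
#include <cassert>

// Idealized site gain for `tasks` equally long iterations on `cpus` CPUs.
// The iterations complete in ceil(tasks / cpus) rounds, so the gain is the
// serial round count (tasks) divided by the parallel round count.
double idealSiteGain(long tasks, int cpus) {
    assert(tasks > 0 && cpus > 0);
    const long rounds = (tasks + cpus - 1) / cpus; // integer ceiling
    return static_cast<double>(tasks) / static_cast<double>(rounds);
}
```

In this model, 9 iterations on 8 CPUs yield a gain of only 4.5, because seven CPUs sit idle during the second round, while 1024 iterations yield the full 8. A site with few iterations cannot use many CPUs however they are scheduled.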

6

Use the Runtime Modeling checkboxes to experiment with predicted maximum speedup if you plan to use parallel framework code constructs to address parallel overhead, lock contention, or task chunking; or if you plan to tune parallel code after you implement parallelism.
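The effect of task chunking on parallel overhead can be sketched with a similarly simplified model. It assumes that each scheduled unit pays a fixed overhead and that grouping several iterations into one unit spreads that cost. The function `gainWithChunking` and all of its parameters are hypothetical, and the model slightly overestimates run time because it treats a partial last chunk as full.

```cpp
#include <cassert>

// Simplified model of per-task overhead and chunking (illustrative only).
// tasks:    number of loop iterations in the site
// taskTime: duration of one iteration (any consistent time unit)
// overhead: fixed cost to start one scheduled unit (same time unit)
// chunk:    iterations grouped into one scheduled unit
// cpus:     CPUs working on the site
double gainWithChunking(long tasks, double taskTime, double overhead,
                        long chunk, int cpus) {
    assert(tasks > 0 && chunk > 0 && cpus > 0);
    assert(taskTime > 0.0 && overhead >= 0.0);
    const long units = (tasks + chunk - 1) / chunk; // scheduled units
    const long rounds = (units + cpus - 1) / cpus;  // rounds of units
    const double parallelTime = rounds * (chunk * taskTime + overhead);
    const double serialTime = tasks * taskTime;
    return serialTime / parallelTime;
}
```

With 1000 iterations on 8 CPUs and an overhead equal to one iteration's duration, the model gives a gain of 4 for a chunk size of 1 and about 7.7 for a chunk size of 25.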

7

This area shows issues that generally prevent better parallel performance. A green bar is good; it means this issue is not negatively impacting predicted maximum speedup. A yellow or red bar is not good.

8

The Site Details area shows information about the solve parallel site and the setQueen task within that parallel site.
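The solve site and setQueen task names come from annotations in the source. The sketch below shows how such annotations might be placed around a loop. It is a simplified N-Queens counter written for illustration, not the tutorial sample's code: the `setQueen` signature and the helper `placeRest` are hypothetical. The annotation macros come from `advisor-annotate.h` when that header is on the include path; otherwise the sketch defines no-op fallbacks so it still builds.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Use the Intel Advisor annotation macros if the header is available;
// otherwise define no-op fallbacks (an assumption made for portability).
#if defined(__has_include)
#  if __has_include("advisor-annotate.h")
#    include "advisor-annotate.h"
#  endif
#endif
#ifndef ANNOTATE_SITE_BEGIN
#  define ANNOTATE_SITE_BEGIN(name)
#  define ANNOTATE_SITE_END(...)
#  define ANNOTATE_ITERATION_TASK(name)
#endif

// Count completions with queens already placed in rows [0, row).
static int placeRest(std::vector<int>& cols, int row, int size) {
    if (row == size) return 1;
    int count = 0;
    for (int c = 0; c < size; ++c) {
        bool safe = true;
        for (int r = 0; r < row; ++r) {
            // Conflict if same column or same diagonal as an earlier queen.
            if (cols[r] == c || std::abs(cols[r] - c) == row - r) {
                safe = false;
                break;
            }
        }
        if (safe) {
            cols[row] = c;
            count += placeRest(cols, row + 1, size);
        }
    }
    return count;
}

// Hypothetical simplified task: fix the first-row queen, count solutions.
int setQueen(int firstCol, int size) {
    std::vector<int> cols(size, -1);
    cols[0] = firstCol;
    return placeRest(cols, 1, size);
}

int solve(int size) {
    assert(size > 0);
    int total = 0;
    ANNOTATE_SITE_BEGIN(solve);            // start of the parallel site
    for (int i = 0; i < size; ++i) {
        ANNOTATE_ITERATION_TASK(setQueen); // each iteration is one task
        total += setQueen(i, size);        // shared update: a candidate data sharing problem
    }
    ANNOTATE_SITE_END();                   // end of the parallel site
    return total;
}
```

The annotations mark where parallelism is proposed; they do not change how the program runs. For example, `solve(8)` still returns 92 serially.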

Notice how your screen changes if you choose a Target System of Intel Xeon Phi or Offload to Intel Xeon Phi.
[Screenshot: Suitability Report window]

1

The Scalability of Maximum Site Gain diagram graphically shows the predicted performance of the manycore parallel coprocessor and its host CPUs. For many applications, the number of task instances does not scale enough to fully utilize the many cores of the parallel coprocessor. An application that is ready for an Intel Xeon Phi coprocessing system has a bulls-eye in the green part of the diagram. A bulls-eye in the gray part of the diagram indicates an application that is not ready for an Intel Xeon Phi coprocessing system; in such cases, try modeling another type of Target System.

2

Use the Intel Xeon Phi Advanced Modeling checkbox, fields, and the Apply button to model the expected speedup if you plan to modify your parallel code to improve vector parallel execution.

Tip

These modeling parameters are fully explained in Intel Advisor Help.

Try experimenting now to see the impact of various modeling parameters on predicted maximum speedup throughout the Suitability Report.

View Summary Window

Click Summary on the navigation toolbar to re-open the Summary window. Notice the Intel Advisor added more data to this dashboard.
[Screenshot: Summary window]

1

This area summarizes the maximum parallel performance speedup. It also provides easy access to the Suitability Report window and your sources. Try clicking the Maximum Site Gain link now. Then return to the Summary window and try clicking the Parallel Site link.

The question marks for detected Correctness Problems mean you have not yet collected any Correctness data.

2

In addition to the newly acquired information from the Suitability Report, the dashboard still shows data from the Survey Report.

3

You now have collection data from two of the three Intel Advisor analysis tools.

Key Terms

annotations, parallel site, target, task

Next Step

Predict Parallel Data Sharing Problems

