Tutorial: Where to Add Parallelism with Intel® Advisor 2015 and a C/C++ Sample

Mark Best Parallel Opportunities With Annotations

Intel Advisor annotations are either subroutine calls or macros, depending on the programming language. Annotations can be processed by your current compiler but do not change the computations of your application.

Use them to mark places in serial parts of your application that are good candidates for later replacement with parallel framework code that enables parallel execution.
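As a minimal sketch of the point that annotations do not change computations, the two functions below differ only in their annotations and return the same value. The build flag `HAVE_ADVISOR_ANNOTATE`, the empty stand-in macros, and the names `sumOfSquares`, `sumOfSquaresAnnotated`, `sum_site`, and `square_task` are assumptions made for illustration so that the example compiles without Intel Advisor installed; they are not part of the tutorial sample.

```cpp
#include <cassert>

// HAVE_ADVISOR_ANNOTATE is a hypothetical build flag for this sketch, not an
// Intel Advisor setting. Without it, the annotations are empty stand-ins.
#ifdef HAVE_ADVISOR_ANNOTATE
#include <advisor-annotate.h>
#else
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_ITERATION_TASK(name)
#define ANNOTATE_SITE_END()
#endif

// Plain serial loop: sum of i*i for i = 1..n.
long sumOfSquares(int n) {
    assert(n >= 0);
    long total = 0;
    for (int i = 1; i <= n; i++) {
        total += static_cast<long>(i) * i;
    }
    return total;
}

// The same loop, marked as a parallel site with one task per iteration.
// The code still runs serially; the annotations only mark locations.
long sumOfSquaresAnnotated(int n) {
    assert(n >= 0);
    long total = 0;  // Shared by all iterations; real parallel code would
                     // need a reduction or a lock here.
    ANNOTATE_SITE_BEGIN(sum_site);
    for (int i = 1; i <= n; i++) {
        ANNOTATE_ITERATION_TASK(square_task);
        total += static_cast<long>(i) * i;
    }
    ANNOTATE_SITE_END();
    return total;
}
```

Calling `sumOfSquaresAnnotated(10)` and `sumOfSquares(10)` gives the same result in this sketch, which is the property that lets you add annotations to a working serial program without retesting its logic.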

The main types of Intel Advisor annotations mark the location of parallel sites, the tasks within a parallel site, and locking synchronization.

Intel Advisor provides example annotated source code for you (accessible in the Survey Report and Survey Source windows) that you can copy directly into your editor:

Annotation Code Snippet: Purpose

Iteration Loop, Single Task: Create a simple loop structure, where the task code includes the entire loop body. This common task structure is useful when only a single task is needed within a parallel site.

Loop, One or More Tasks: Create loops where the task code does not include all of the loop body, or complex loops or code that requires specific task begin-end boundaries, including multiple task end annotations. This structure is also useful when multiple tasks are needed within a parallel site.

Function, One or More Tasks: Create code that calls multiple tasks within a parallel site.

Pause/Resume Collection: Temporarily pause data collection and later resume it, so you can skip uninteresting parts of target execution to minimize collected data and speed up analysis of large applications. Add these annotations outside a parallel site.

Build Settings: Set build (compiler and linker) settings specific to the language in use.
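The snippet types with explicit task boundaries and the pause/resume snippet can be sketched as follows. This is an illustration, not the text of the snippets that Intel Advisor provides: it assumes the macro names `ANNOTATE_TASK_BEGIN`, `ANNOTATE_TASK_END`, `ANNOTATE_DISABLE_COLLECTION_PUSH`, and `ANNOTATE_DISABLE_COLLECTION_POP` match the `advisor-annotate.h` header shipped with your version, so check that header before relying on them. The build flag `HAVE_ADVISOR_ANNOTATE`, the empty stand-in macros, and all function, site, and task names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// HAVE_ADVISOR_ANNOTATE is a hypothetical build flag for this sketch.
// Without it, the annotations are empty stand-ins so the code compiles alone.
#ifdef HAVE_ADVISOR_ANNOTATE
#include <advisor-annotate.h>
#else
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_SITE_END()
#define ANNOTATE_TASK_BEGIN(name)
#define ANNOTATE_TASK_END()
#define ANNOTATE_DISABLE_COLLECTION_PUSH
#define ANNOTATE_DISABLE_COLLECTION_POP
#endif

// Pause/Resume Collection: skip uninteresting setup, outside any site.
std::vector<int> makeInput(int n) {
    assert(n >= 0);
    ANNOTATE_DISABLE_COLLECTION_PUSH;  // pause data collection
    std::vector<int> v(static_cast<std::size_t>(n));
    for (int i = 0; i < n; i++) v[static_cast<std::size_t>(i)] = i;
    ANNOTATE_DISABLE_COLLECTION_POP;   // resume data collection
    return v;
}

// Loop, One or More Tasks: the task covers only part of the loop body.
void squareNonZero(std::vector<int>& v) {
    ANNOTATE_SITE_BEGIN(square_site);
    for (std::size_t i = 0; i < v.size(); i++) {
        if (v[i] == 0) continue;       // cheap check stays outside the task
        ANNOTATE_TASK_BEGIN(square_task);
        v[i] = v[i] * v[i];
        ANNOTATE_TASK_END();
    }
    ANNOTATE_SITE_END();
}

// Function, One or More Tasks: two independent tasks inside one site.
void scaleAndOffset(std::vector<int>& a, std::vector<int>& b) {
    ANNOTATE_SITE_BEGIN(scale_offset_site);
    ANNOTATE_TASK_BEGIN(scale_task);
    for (std::size_t i = 0; i < a.size(); i++) a[i] *= 2;
    ANNOTATE_TASK_END();
    ANNOTATE_TASK_BEGIN(offset_task);
    for (std::size_t i = 0; i < b.size(); i++) b[i] += 1;
    ANNOTATE_TASK_END();
    ANNOTATE_SITE_END();
}
```

The two tasks in `scaleAndOffset` touch different vectors, which is the kind of independence that makes a region a reasonable candidate for a multi-task site.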

Add Parallel Site and Task Annotations

To keep this tutorial short, the sample code already contains the parallel site and task annotations. All you need to do is uncomment them.

  1. Click Survey Report on the navigation toolbar to re-open the Survey Report.

  2. Right-click the data row with the first hot loop and choose Edit Source to open the nqueens_serial.cpp source file in an editor.

    // [DESCRIPTION]
    // Solve the nqueens problem  - how many positions of queens can fit on a chess
    // board of a given size without attacking each other.
    //
    // [RUN]
    // To set the board size in Visual Studio, right click on the project,
    // select Properties > Configuration Properties > General > Debugging.  Set
    // Command Arguments to the desired value.  14 has been set as the default.
    //
    // [EXPECTED OUTPUT]
    // Depends upon the board size.
    //
    // Board Size   Number of Solutions
    //     4                2
    //     5               10
    //     6                4
    //     7               40
    //     8               92
    //     9              352
    //    10              724
    //    11             2680
    //    12            14200
    //    13            73712
    //    14           365596
    //    15          2279184
    
    #include <iostream>
    #include <cstdlib>
    
    #ifdef _WIN32
    #include <windows.h>
    #include <mmsystem.h>
    #define TimeType        DWORD
    #define GET_TIME(t)     t = timeGetTime()
    #define TIME_IN_MS(t)   (t)
    #else
    #include <sys/time.h>
    #define TimeType        struct timeval
    #define GET_TIME(t)     gettimeofday((&t), NULL)
    #define TIME_IN_MS(t)   (((t).tv_sec * 1000000 + (t).tv_usec) / 1000)
    #endif
    
    #include <cilk/cilk.h>
    #include <cilk/reducer_opadd.h>
    //ADVISOR COMMENT: This is a Cilk version of the nqueens application
    //ADVISOR SUITABILITY EDIT: Uncomment the #include <advisor-annotate.h> line to
    //                          use Advisor annotations.
    //#include <advisor-annotate.h>
    
    using namespace std;
    
    cilk::reducer_opadd<int> nrOfSolutions; // Counts the number of solutions.
    int size = 0;              // The board-size; read from command-line
    
    // The number of correct solutions for each board size.
    const int correctSolution[16] = {     0,     1,      0,       0, //  0 -  3
                                          2,    10,      4,      40, //  4 -  7
                                         92,   352,    724,    2680, //  8 - 11
                                      14200, 73712, 365596, 2279184  // 12 - 15
    };
    
    
    /*
     * Recursive function to find all solutions on a board, represented by the
     * argument "queens", when we place the next queen at location (row, col).
     *
     * On Return: nrOfSolutions has been increased by the number of solutions for
     *            this board.
     */
    void setQueen(int queens[], int row, int col) {
        //ADVISOR COMMENT: The accesses to the "queens" array in this function
        //                 create an incidental sharing correctness issue.
        //ADVISOR COMMENT: Each task should have its own copy of the queens array.
        //ADVISOR COMMENT: Look at the solve() function to see how to fix this.
    
        // Check all previously placed rows for attacks.
        for (int i=0; i < row; i++) {
            // Check vertical attacks.
            if (queens[i] == col) {
                return;
            }
            // Check diagonal attacks.
            if (abs(queens[i] - col) == (row - i) ) {
                return;
            }
        }
    
        // Column is ok, set the queen.
        //ADVISOR COMMENT: See comment at top of function.
        queens[row]=col;
    
        if (row == (size - 1)) {
            //ADVISOR CORRECTNESS EDIT: Uncomment the following two LOCK
            //         annotations to lock the access to nrOfSolutions and
            //         eliminate the race condition.
            //ANNOTATE_LOCK_ACQUIRE(0);
    
            //ADVISOR COMMENT: This is a race condition because multiple tasks may
            //                 try to increment nrOfSolutions at the same time.
            nrOfSolutions++;  // Placed final queen, found a solution!
    
            //ANNOTATE_LOCK_RELEASE(0);
        } else {
            // Try to fill next row.
            for (int i=0; i < size; i++) {
                setQueen(queens, row+1, i);
            }
        }
    }
    
    
    /*
     * Find all solutions for nQueens problem on size x size chessboard.
     *
     * On Return: nrOfSolutions = number of solutions for size x size chessboard.
     */
    void solve() {
    
        //ADVISOR COMMENT: When surveying, this is the top function below main.
        //                 This for() loop is a candidate for parallelization.
    
        //ADVISOR CORRECTNESS EDIT: Comment out the following declaration of the
        //                          queens array.
        //int *queens = new int[size]; // Array of queens on the board.
    
        //ADVISOR SUITABILITY EDIT: Uncomment the three annotations below to model
        //                          parallelizing the body of this for() loop.
        //ANNOTATE_SITE_BEGIN(solve);
        cilk_for (int i=0; i < size; i++) {
            //ANNOTATE_ITERATION_TASK(setQueen);
    
            //ADVISOR CORRECTNESS EDIT: Uncomment the declaration of queens.  This
            //                          creates a separate array for each recursion
            //                          eliminating the incidental sharing.
            int * queens = new int[size]; // Array of queens on the chess board.
    
            //ADVISOR COMMENT: The call below exhibits incidental sharing when all
            //                 of the tasks use the same copy of "queens".
            // Try all positions in first row.
            setQueen(queens, 0, i);
    
            //ADVISOR CORRECTNESS EDIT: Uncomment the deletion of the queens array.
            delete [] queens;
        }
        //ANNOTATE_SITE_END();
    
        //ADVISOR CORRECTNESS EDIT: Comment out the deletion of the queens array.
        //delete [] queens;
    }
  3. Search for ADVISOR SUITABILITY EDIT and follow the directions in the sample code. Make four edits in total: uncomment the #include line near the top and the three annotation lines.

    Tip

    Now is also a good time to simply explore our fully commented sample code.

  4. Save your edits.
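After the four edits in step 3, the annotated loop has roughly the shape sketched below. This is a simplified stand-alone version, not the sample file itself: it uses a plain `for` loop and an `int` counter in place of the sample's `cilk_for` and reducer, returns the count instead of using globals, and introduces the hypothetical names `countSolutions` and `HAVE_ADVISOR_ANNOTATE` plus empty stand-in macros so that it compiles without Intel Advisor.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// HAVE_ADVISOR_ANNOTATE is a hypothetical build flag for this sketch.
#ifdef HAVE_ADVISOR_ANNOTATE
#include <advisor-annotate.h>          // edit 1: the uncommented #include
#else
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_ITERATION_TASK(name)
#define ANNOTATE_SITE_END()
#endif

// Try to place a queen at (row, col); return the number of complete
// solutions reachable from the resulting board.
static int setQueen(std::vector<int>& queens, int row, int col, int size) {
    for (int i = 0; i < row; i++) {
        if (queens[i] == col) return 0;                       // vertical attack
        if (std::abs(queens[i] - col) == row - i) return 0;   // diagonal attack
    }
    queens[row] = col;
    if (row == size - 1) return 1;     // placed the final queen
    int found = 0;
    for (int i = 0; i < size; i++) {
        found += setQueen(queens, row + 1, i, size);
    }
    return found;
}

// Count all solutions on a size x size board.
int countSolutions(int size) {
    assert(size >= 1);
    int total = 0;
    ANNOTATE_SITE_BEGIN(solve);                // edit 2
    for (int i = 0; i < size; i++) {
        ANNOTATE_ITERATION_TASK(setQueen);     // edit 3
        std::vector<int> queens(size);         // one board per iteration
        total += setQueen(queens, 0, i, size);
    }
    ANNOTATE_SITE_END();                       // edit 4
    return total;
}
```

With the stand-in macros, `countSolutions(8)` returns 92, which matches the expected-output table in the sample's header comment.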

Rebuild Target in Release Mode

In the terminal session:

  1. Change to the nqueens_Advisor/ directory (where you extracted the zipped sample files).

  2. Type make 1_nqueens_serial to rebuild the target.

Key Terms

annotations, parallel site, synchronization, target, task

Next Step

Predict Maximum Parallel Performance Speedup

