Tutorial: Where to Add Parallelism with Intel® Advisor 2015 and a Fortran Sample

Mark Best Parallel Opportunities With Annotations

Intel Advisor annotations are either subroutine calls or macros, depending on the programming language. Annotations can be processed by your current compiler but do not change the computations of your application.

Use them to mark places in serial parts of your application that are good candidates for later replacement with parallel framework code that enables parallel execution.
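As a minimal sketch of what "subroutine calls that do not change the computation" means in Fortran, the program below wraps an ordinary loop in the same three annotation calls that this tutorial's sample uses. The program name, the array, and the site and task labels are made up for illustration. It assumes the build can find the advisor_annotate module that ships with Intel Advisor.

```fortran
program annotate_sketch
    use advisor_annotate      ! Module supplied by Intel Advisor (assumed to be on the module path)
    implicit none
    integer :: i
    integer :: squares(8)     ! Illustrative data, not part of the nqueens sample

    call annotate_site_begin("fill_squares")     ! Marks where the candidate parallel region starts
    do i = 1, 8
        call annotate_iteration_task("square")   ! Marks each loop iteration as one task
        squares(i) = i * i                       ! Each iteration writes only its own element
    end do
    call annotate_site_end()                     ! Marks where the candidate parallel region ends

    print *, squares
end program annotate_sketch
```

The loop still runs serially and computes the same values with or without the three calls. The annotations only tell the Advisor tools where to model parallel execution.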

The main types of Intel Advisor annotations mark the location of parallel sites, the tasks within each parallel site, and locking synchronization.

Intel Advisor provides example annotated source code for you (accessible in the Survey Report and Survey Source windows) that you can copy directly into your editor:

| Annotation Code Snippet | Purpose |
|---|---|
| Iteration Loop, Single Task | Create a simple loop structure, where the task code includes the entire loop body. This common task structure is useful when only a single task is needed within a parallel site. |
| Loop, One or More Tasks | Create loops where the task code does not include all of the loop body, or complex loops or code that requires specific task begin-end boundaries, including multiple task end annotations. This structure is also useful when multiple tasks are needed within a parallel site. |
| Function, One or More Tasks | Create code that calls multiple tasks within a parallel site. |
| Pause/Resume Collection | Temporarily pause data collection and later resume it, so you can skip uninteresting parts of target execution to minimize collected data and speed up analysis of large applications. Add these annotations outside a parallel site. |
| Build Settings | Set build (compiler and linker) settings specific to the language in use. |
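The "Loop, One or More Tasks" structure differs from the single-task loop in that the task boundaries are explicit, so part of the loop body can stay outside the task. The sketch below is illustrative only. The routine names annotate_task_begin and annotate_task_end do not appear in this tutorial's sample and are assumed here from the Advisor annotation set, so check them against the snippet that Intel Advisor provides before relying on them. The arrays and labels are hypothetical.

```fortran
program task_sketch
    use advisor_annotate      ! Module supplied by Intel Advisor (assumed to be on the module path)
    implicit none
    integer :: i
    real :: a(100), b(100)    ! Illustrative data

    a = 1.0
    call annotate_site_begin("scale_site")
    do i = 1, 100
        ! Loop control and any cheap setup stay outside the task.
        call annotate_task_begin("scale_task")   ! Assumed name: explicit start of the task
        b(i) = 2.0 * a(i)
        call annotate_task_end()                 ! Assumed name: explicit end of the task
    end do
    call annotate_site_end()

    print *, sum(b)
end program task_sketch
```

When the whole loop body is the task, the single annotate_iteration_task call used in the nqueens sample is the simpler choice.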

Add Parallel Site and Task Annotations

To keep this tutorial short, the parallel site and task annotations are already in the sample code, commented out. All you need to do is uncomment them.

  1. Click Survey Report on the navigation toolbar to re-open the Survey Report.

  2. Right-click the data row with the first hot loop and choose Edit Source to open the nqueens_serial.f90 source file in an editor.

    program NQueens
    
    ! Solve the nqueens problem - How many ways can you put 'n' queens on an 
    !    n-by-n chess board without them being able to attack each other?
    !    Read http://en.wikipedia.org/wiki/Nqueens for background
    !
    ! Original C++ code by Ralf Ratering & Mario Deilmann
    ! Fortran version by Steve Lionel & others
    !
    ! To set command line argument in Visual Studio:
    !   1) Right click on the project name and select 'Properties'.
    !   2) Under 'Debugging', enter the argument (board size) in the 
    !      'Command Arguments' field.
    
    !ADVISOR SUITABILITY EDIT: To use the Advisor Annotations:
    !ADVISOR SUITABILITY EDIT: Uncomment the "use advisor_annotate" line below
    !use advisor_annotate
    
    implicit none
    
    integer :: nrOfSolutions = 0    ! Counts the number of solutions.
    integer :: size = 0             ! The board size; read from the command line.
    
    ! The number of correct solutions for each board size (1-15).
    integer, parameter, dimension(15) :: correct_solution = (/               &
                                                       1,      0,       0,   &
                                                2,    10,      4,      40,   &
                                               92,   352,    724,    2680,   &
                                            14200, 73712, 365596, 2279184 /)
    character(200) :: cmd_name                   ! Command/Program Name
    character(400) :: cmd_line                   ! The full command line
    integer        :: cmd_len                    ! The command-line length
    character(2)   :: cmd_arg                    ! The command arguments
    integer :: stat                              ! Library call status value
    integer :: time_start, time_end, count_rate  ! Timing variables
    integer :: nthreads = 1                      ! Number of threads to use.
    
    100 format(A,A,A)   
    101 format(A,I0,A,I0,A)
        
    ! Get the board size from the command line argument.
    if (command_argument_count() < 1) then
        call get_command_argument(0, cmd_name, cmd_len, status=stat)
        print 100, "Usage: ", cmd_name(1:cmd_len), " boardSize"
        size = 14
        print *, "Using default size of 14"
    else
        call get_command_argument(1, cmd_arg, status=stat)
        read(cmd_arg, *, iostat=stat) size
        ! Limit the board size.  If it is too small, the program may finish before
        ! suitability or other analyses can produce an accurate result.  If the
        ! board is too large, the program will take a long time.
        if ((stat /= 0) .or. (size < 4) .or. (size > 15)) then
            print *, "Error: boardSize must be between 4 and 15; resetting to 14"
            size = 14
        end if
    endif
    
    ! Time how long it takes to find all solution boards.
    print 101, "Starting nqueens solver for size ", size, " with ", nthreads, &
               " thread(s)."
    call system_clock(time_start)
    call solve()
    call system_clock(time_end, count_rate)
    
    ! Evaluate and report the result.
    print 101, "Number of solutions: ", nrOfSolutions
    if (nrOfSolutions == correct_solution(size)) then
        print *, "Correct Result!"
    else
        print *, "Incorrect Result!"
    end if
    print 101, "Calculations took ", (time_end-time_start) / (count_rate/1000), &
               "ms."
    ! End of Main Program
    
    contains
        ! Recursive routine to find all solutions on the board (the array 'queens')
        !   when we place the next queen at location (row, col). 
        ! This increments the global nrOfSolutions with each solution found.
        !
        ! Although the recursive call in this function may appear several times in 
        ! the survey results, the solve() function is a better-performing
        ! parallelization candidate, due to its coarser granularity.
        !
        !ADVISOR CORRECTNESS EDIT: In order to avoid data races and correctness
        !ADVISOR CORRECTNESS EDIT:    issues on the 'queens' array, we have to make
        !ADVISOR CORRECTNESS EDIT:    a private copy of it.  
        !ADVISOR CORRECTNESS EDIT: So rename 'queens' to 'queens_in' in the next two
        !ADVISOR CORRECTNESS EDIT:    lines
        recursive subroutine setQueen(queens, row, col)
            integer, intent(inout) :: queens(:)
            integer, intent(in)    :: row, col
            integer :: i
            integer, volatile :: j
    
            !ADVISOR CORRECTNESS EDIT: Uncomment the declaration of queens, and the
            !ADVISOR CORRECTNESS EDIT:   assignment statement, which creates
            !ADVISOR CORRECTNESS EDIT:   a private copy of queens_in.
            !integer :: queens(ubound(queens_in, dim=1))
            !queens = queens_in
    
            do i = 1, row-1
               ! Check for vertical attacks.
               if (queens(i) == col) return
               ! Check for diagonal attacks.
               if (abs(queens(i)-col) == (row-i)) return
            end do
    
            ! Position is safe; set the queen.
            queens(row) = col
    
            if (row == size) then
               !ADVISOR CORRECTNESS EDIT: Uncomment the following 2 lock annotations
               !ADVISOR CORRECTNESS EDIT:   to avoid a data race on nrOfSolutions.
               !call annotate_lock_acquire(0)
               nrOfSolutions = nrOfSolutions + 1
               !call annotate_lock_release(0)
            else
               ! Try to fill next row.
               do j = 1, size
                  call setQueen(queens, row+1, j)
               end do
            end if
        end subroutine SetQueen
    
    
        ! Find all solutions for the nQueens problem on a size x size chessboard.
        ! On return, nrOfSolutions = number of solutions.
        !
        !ADVISOR COMMENT: When surveying, this is the top CPU-consuming function 
        !ADVISOR COMMENT:   below the main function.  This subroutine's do loop is
        !ADVISOR COMMENT:   an excellent candidate for parallelization.
        subroutine solve()
          integer :: i
          integer, allocatable :: queens(:)   ! Array representing the chess board.
    
          allocate(queens(size))
          queens = 0
    
          !ADVISOR SUITABILITY EDIT: Uncomment the three annotation calls below to
          !ADVISOR SUITABILITY EDIT:   model parallelizing the body of this do loop.
          !call annotate_site_begin("solve")
          do i = 1, size
             !call annotate_iteration_task("setQueen")
             ! Try all positions in first row.
             call SetQueen(queens, 1, i)
          end do
          !call annotate_site_end()
    
          deallocate(queens)
        end subroutine solve
    
    end program
    
  3. Search for ADVISOR SUITABILITY EDIT and follow the directions in the sample code. Make four edits in total: uncomment the !use advisor_annotate line near the top, and uncomment the three annotation calls in the solve() subroutine.

    Tip

    Now is also a good time to simply explore our fully commented sample code.

  4. Save your edits.
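For orientation, the sketch below shows roughly what the four uncommented lines look like in context. It is a condensed stand-in for nqueens_serial.f90, not the sample itself: the program name is made up, the board size is fixed at 8 instead of being read from the command line, and the timing and result checks are left out. It assumes the build can find the advisor_annotate module.

```fortran
program nqueens_annotated
    use advisor_annotate                  ! Edit 1: makes the annotation routines available
    implicit none
    integer, parameter :: size = 8        ! Fixed here; the real sample reads it at run time
    integer :: nrOfSolutions = 0          ! Counts the number of solutions

    call solve()
    print '(A,I0)', "Number of solutions: ", nrOfSolutions

contains

    ! Try to place a queen at (row, col), then recurse into the next row.
    recursive subroutine setQueen(queens, row, col)
        integer, intent(inout) :: queens(:)
        integer, intent(in)    :: row, col
        integer :: i, j

        do i = 1, row-1
            if (queens(i) == col) return                  ! Vertical attack
            if (abs(queens(i)-col) == (row-i)) return     ! Diagonal attack
        end do
        queens(row) = col

        if (row == size) then
            nrOfSolutions = nrOfSolutions + 1
        else
            do j = 1, size
                call setQueen(queens, row+1, j)
            end do
        end if
    end subroutine setQueen

    subroutine solve()
        integer :: i
        integer :: queens(size)

        queens = 0
        call annotate_site_begin("solve")               ! Edit 2: start of the parallel site
        do i = 1, size
            call annotate_iteration_task("setQueen")    ! Edit 3: each iteration is one task
            call setQueen(queens, 1, i)
        end do
        call annotate_site_end()                        ! Edit 4: end of the parallel site
    end subroutine solve

end program nqueens_annotated
```

Run serially, this version reports 92 solutions for the 8-by-8 board, which matches the eighth entry of correct_solution in the sample. The shared queens array and the nrOfSolutions counter are left as they are here, as in the sample at this stage. The ADVISOR CORRECTNESS EDIT comments in the sample describe how those are handled.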

Rebuild Target in Release Mode

In the terminal session:

  1. Change directory to the nqueens/ directory (where you extracted the zipped sample files).

  2. Type make 1_nqueens_serial to rebuild the target.

Key Terms

annotations, parallel site, synchronization, target, task

Next Step

Predict Maximum Parallel Performance Speedup

