Tutorial: Where to Add Parallelism with Intel® Advisor 2015 and a Fortran Sample
Intel Advisor annotations are either subroutine calls or macros, depending on the programming language. Annotations can be processed by your current compiler but do not change the computations of your application.
Use them to mark places in serial parts of your application that are good candidates for later replacement with parallel framework code that enables parallel execution.
The main types of Intel Advisor annotations mark the location of:
A parallel site. A parallel site is a region of code that contains one or more tasks that may execute in parallel. An effective parallel site typically contains a hotspot that consumes application execution time. To distribute these frequently executed instructions to different tasks that can run at the same time, the best parallel site is not usually located at the hotspot, but higher in the call tree.
One or more parallel tasks within a parallel site. A task is a portion of time-consuming code with data that can be executed in one or more parallel threads to distribute work.
Locking synchronization, where mutual exclusion of data access must occur in the parallel application.
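
The three annotation types above can be sketched in a few lines of Fortran. This is a minimal illustration, not part of the tutorial sample. The annotation calls are the same ones that appear in nqueens_serial.f90, while the program name, the site and task names, and the add_square helper are hypothetical. It assumes the Intel Advisor annotation module is on the compiler's module path.

```fortran
program annotation_sketch
  ! Assumption: the advisor_annotate module shipped with Intel Advisor
  ! is visible to the compiler (see the Build Settings snippet).
  use advisor_annotate
  implicit none

  integer, parameter :: n = 8   ! hypothetical amount of work
  integer :: total = 0          ! shared data updated by every task
  integer :: i

  ! Parallel site: the region whose tasks may later run in parallel.
  call annotate_site_begin("sum_site")
  do i = 1, n
    ! Task: each loop iteration is modeled as one unit of parallel work.
    call annotate_iteration_task("sum_task")
    call add_square(i)
  end do
  call annotate_site_end()

  print *, "total = ", total

contains

  ! Hypothetical stand-in for the time-consuming code inside a task.
  subroutine add_square(k)
    integer, intent(in) :: k
    integer :: sq               ! local to each call, so private to each task

    sq = k * k

    ! Lock: marks the shared update that needs mutual exclusion.
    call annotate_lock_acquire(0)
    total = total + sq
    call annotate_lock_release(0)
  end subroutine add_square

end program annotation_sketch
```

The annotations only describe a parallel design to the Intel Advisor tools. The program still runs serially and computes the same result with or without them.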
Intel Advisor provides example annotated source code for you (accessible in the Survey Report and Survey Source windows) that you can copy directly into your editor:
Annotation Code Snippet | Purpose |
---|---|
Iteration Loop, Single Task | Create a simple loop structure, where the task code includes the entire loop body. This common task structure is useful when only a single task is needed within a parallel site. |
Loop, One or More Tasks | Create loops where the task code does not include all of the loop body, or complex loops or code that requires specific task begin-end boundaries, including multiple task end annotations. This structure is also useful when multiple tasks are needed within a parallel site. |
Function, One or More Tasks | Create code that calls multiple tasks within a parallel site. |
Pause/Resume Collection | Temporarily pause data collection and later resume it, so you can skip uninteresting parts of target execution to minimize collected data and speed up analysis of large applications. Add these annotations outside a parallel site. |
Build Settings | Set build (compiler and linker) settings specific to the language in use. |
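
As a rough illustration of the "Loop, One or More Tasks" structure, the sketch below marks two tasks with explicit begin-end boundaries inside one site. The annotate_task_begin and annotate_task_end names are an assumption here, since they do not appear in the tutorial sample. Copy the actual snippet from the Survey windows or check Intel Advisor Help before relying on them. The program, site, and task names are hypothetical.

```fortran
program task_boundaries_sketch
  ! Assumption: the advisor_annotate module is on the module path.
  use advisor_annotate
  implicit none

  integer, parameter :: n = 4   ! hypothetical array size
  real :: a(n), b(n)
  integer :: i

  a = 0.0
  b = 0.0

  call annotate_site_begin("fill_site")
  do i = 1, n
    ! First task: covers only part of the loop body.
    call annotate_task_begin("fill_a")   ! assumed annotation name
    a(i) = real(i) * 2.0
    call annotate_task_end()             ! assumed annotation name

    ! Second task: an independent piece of work in the same iteration.
    call annotate_task_begin("fill_b")
    b(i) = real(i) + 1.0
    call annotate_task_end()
  end do
  call annotate_site_end()

  print *, a
  print *, b
end program task_boundaries_sketch
```

When the task is the whole loop body, the single annotate_iteration_task call used in the tutorial sample is the simpler choice.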
Annotations are fully explained in Intel Advisor Help.
When adding annotations to your own application, remember to include the annotation definitions, such as a use advisor_annotate statement for Fortran programs.
In your own application, choosing where to add task annotations may require some experimentation. If your parallel site has nested loops and the computation time used by the innermost loop is small, consider adding task annotations around the next outermost loop.
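
The placement advice for nested loops might look like the following sketch, where the task annotation sits on the outer loop because each inner iteration does very little work. The array, loop bounds, and the site and task names are hypothetical. The annotation calls are the ones used in the tutorial sample.

```fortran
program nested_loop_sketch
  ! Assumption: the advisor_annotate module is on the module path.
  use advisor_annotate
  implicit none

  integer, parameter :: rows = 100, cols = 100   ! hypothetical sizes
  real :: grid(cols, rows)
  integer :: r, c

  call annotate_site_begin("grid_site")
  do r = 1, rows
    ! One task per outer iteration: each task processes a whole row,
    ! so the modeled task overhead is spread over cols inner iterations.
    call annotate_iteration_task("row_task")
    do c = 1, cols
      ! A task annotation here would model rows*cols tiny tasks.
      ! Note: c would need to be private to each task in real parallel code.
      grid(c, r) = real(r) * real(c)
    end do
  end do
  call annotate_site_end()

  print *, "checksum = ", sum(grid)
end program nested_loop_sketch
```

If the Suitability analysis reports that task overhead dominates, moving the task annotation outward like this is usually the first thing to try.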
To keep this tutorial short, the sample code already contains the parallel site and task annotations. All you need to do is uncomment them.
Click Survey Report on the navigation toolbar to re-open the Survey Report.
Right-click the data row with the first hot loop and choose Edit Source to open the nqueens_serial.f90 source file in an editor.

    program NQueens

    ! Solve the nqueens problem - How many ways can you put 'n' queens on an
    ! n-by-n chess board without them being able to attack each other?
    ! Read http://en.wikipedia.org/wiki/Nqueens for background
    !
    ! Original C++ code by Ralf Ratering & Mario Deilmann
    ! Fortran version by Steve Lionel & others
    !
    ! To set command line argument in Visual Studio:
    ! 1) Right click on the project name and select 'Properties'.
    ! 2) Under 'Debugging', enter the argument (board size) in the
    ! 'Command Arguments' field.

    !ADVISOR SUITABILITY EDIT: To use the Advisor Annotations:
    !ADVISOR SUITABILITY EDIT: Uncomment the "use advisor_annotate" line below
    !use advisor_annotate

    implicit none

    integer :: nrOfSolutions = 0 ! Counts the number of solutions.
    integer :: size = 0 ! The board size; read from the command line.

    ! The number of correct solutions for each board size (1-15).
    integer, parameter, dimension(15) :: correct_solution = (/ &
        1, 0, 0, &
        2, 10, 4, 40, &
        92, 352, 724, 2680, &
        14200, 73712, 365596, 2279184 /)

    character(200) :: cmd_name ! Command/Program Name
    character(400) :: cmd_line ! The full command line
    integer :: cmd_len ! The command-line length
    character(2) :: cmd_arg ! The command arguments
    integer :: stat ! Library call status value
    integer :: time_start, time_end, count_rate ! Timing variables
    integer :: nthreads = 1 ! Number of threads to use.

    100 format(A,A,A)
    101 format(A,I0,A,I0,A)

    ! Get the board size from the command line argument.
    if (command_argument_count() < 1) then
      call get_command_argument(0, cmd_name, cmd_len, status=stat)
      print 100, "Usage: ", cmd_name(1:cmd_len), " boardSize"
      size = 14
      print *, "Using default size of 14"
    else
      call get_command_argument(1, cmd_arg, status=stat)
      read(cmd_arg, *, iostat=stat) size
      ! Limit the board size. If it is too small, the program may finish before
      ! suitability or other analyses can produce an accurate result. If the
      ! board is too large, the program will take a long time.
      if ((stat /= 0) .or. (size < 4) .or. (size > 15)) then
        print *, "Error: boardSize must be between 4 and 15; resetting to 14"
        size = 14
      end if
    endif

    ! Time how long it takes to find all solution boards.
    print 101, "Starting nqueens solver for size ", size, " with ", nthreads, &
        " thread(s)."
    call system_clock(time_start)
    call solve()
    call system_clock(time_end, count_rate)

    ! Evaluate and report the result.
    print 101, "Number of solutions: ", nrOfSolutions
    if (nrOfSolutions == correct_solution(size)) then
      print *, "Correct Result!"
    else
      print *, "Incorrect Result!"
    end if
    print 101, "Calculations took ", (time_end-time_start) / (count_rate/1000), &
        "ms."

    ! End of Main Program

    contains

    ! Recursive routine to find all solutions on the board (the array 'queens')
    ! when we place the next queen at location (row, col).
    ! This increments the global nrOfSolutions with each solution found.
    !
    ! Although the recusive call in this function may appear several times in
    ! the survey results, the solve() function is a better-performing
    ! parallelization candidate, due to its coarser granularity.
    !
    !ADVISOR CORRECTNESS EDIT: In order to avoid data races and correctness
    !ADVISOR CORRECTNESS EDIT: issues on the 'queens' array, we have to make
    !ADVISOR CORRECTNESS EDIT: a private copy of it.
    !ADVISOR CORRECTNESS EDIT: So rename 'queens' to 'queens_in' in the next two
    !ADVISOR CORRECTNESS EDIT: lines
    recursive subroutine setQueen(queens, row, col)
      integer, intent(inout) :: queens(:)
      integer, intent(in) :: row, col
      integer :: i
      integer, volatile :: j

      !ADVISOR CORRECTNESS EDIT: Uncomment the declaration of queens, and the
      !ADVISOR CORRECTNESS EDIT: assignment statement, which will creates
      !ADVISOR CORRECTNESS EDIT: a private copy of in_queens.
      !integer :: queens(ubound(queens_in, dim=1))
      !queens = queens_in

      do i = 1, row-1
        ! Check for vertical attacks.
        if (queens(i) == col) return
        ! Check for diagonal attacks.
        if (abs(queens(i)-col) == (row-i)) return
      end do

      ! Position is safe; set the queen.
      queens(row) = col

      if (row == size) then
        !ADVISOR CORRECTNESS EDIT: Uncomment the following 2 lock annotations
        !ADVISOR CORRECTNESS EDIT: to avoid a datarace on nrOfSolutions.
        !call annotate_lock_acquire(0)
        nrOfSolutions = nrOfSolutions + 1
        !call annotate_lock_release(0)
      else
        ! Try to fill next row.
        do j = 1, size
          call setQueen(queens, row+1, j)
        end do
      end if
    end subroutine SetQueen

    ! Find all solutions for the nQueens problem on a size x size chessboard.
    ! On return, nrOfSolutions = number of solutions.
    !
    !ADVISOR COMMENT: When surveying, this is the top CPU-consuming function
    !ADVISOR COMMENT: below the main function. This subroutine's do loop is
    !ADVISOR CONTENT: an excellent candidate for parallelization.
    subroutine solve()
      integer :: i
      integer, allocatable :: queens(:) ! Array representing the chess board.

      allocate(queens(size))
      queens = 0

      !ADVISOR SUITABILITY EDIT: Uncomment the three annotation calls below to
      !ADVISOR SUITABILITY EDIT: model parallelizing the body of this do loop.
      !call annotate_site_begin("solve")
      do i = 1, size
        !call annotate_iteration_task("setQueen")
        ! Try all positions in first row.
        call SetQueen(queens, 1, i)
      end do
      !call annotate_site_end()

      deallocate(queens)
    end subroutine solve

    end program

Search for ADVISOR SUITABILITY EDIT and follow the directions in the sample code. Make four edits in total: uncomment the !use advisor_annotate line near the top and the three annotation calls in the solve subroutine.
Now is also a good time to simply explore our fully commented sample code.
Save your edits.
In the terminal session:
Change directory to the nqueens/ directory (where you extracted the zipped sample files).
Type make 1_nqueens_serial to rebuild the target.