Next: Bibliography Up: ADAPTOR OpenMP Programmers Guide Previous: 6 Work Sharing Contents

7 The ADAPTOR OpenMP Compiler

This section describes in more detail how ADAPTOR compiles OpenMP programs.

7.1 Translation of Parallel Regions

The following example shows a parallel region.

!$OMP parallel private (INODE)
      INODE = omp_get_thread_num()
      write (6,'('' hello world from '',I3)') INODE
!$OMP end parallel

ADAPTOR translates a parallel region by moving its body into a new subroutine that is called by every thread of the team. The team itself is created by the runtime function DALIB_pthreads, and the call to omp_get_thread_num is replaced by a call to the runtime function DALIB_get_thread_num.

!     runtime call to create threads executing HELLO1
      call DALIB_pthreads (HELLO1, ...)

      subroutine HELLO1 ()
      integer INODE
      external DALIB_get_thread_num
      integer DALIB_get_thread_num
      INODE = DALIB_get_thread_num()
      write (6,'('' hello world from '',I3)') INODE
      end subroutine HELLO1

Hint: ADAPTOR supports nested parallelism.

7.2 Data Environment

C$OMP PARALLEL DO PRIVATE (v), SHARED(w)
C$OMP+REDUCTION (+:gsum)
      DO i = 1, n
        v = (i - 0.5d0 ) * w
        v = 4.0d0 / (1.0d0 + v * v)
        gsum = gsum + v
      END DO

      call DALIB_pthreads (CALC_PI1,DALIB_0,DALIB_0,6,GSUM,W,N,DALIB_0,
     &                     DALIB_0,DALIB_0)

      subroutine CALC_PI1 (GSUM, W, N, ...) 
      double precision GSUM_TMP, V, W, GSUM
      integer I, N
      integer IK_STOP1
      integer IK_START1
      GSUM_TMP = 0.0
      call DALIB_do_static_bsched (1,N,1,IK_START1,IK_STOP1)
      do I=IK_START1,IK_STOP1
         V = (REAL(I,8)-0.5d0)*W
         V = 4.0d0/(1.0d0+V*V)
         GSUM_TMP = GSUM_TMP+V
      end do
      call DALIB_enter_critical ()
      GSUM = GSUM+GSUM_TMP
      call DALIB_leave_critical ()

Private variables become local variables of the generated subroutine, so each thread allocates its own copy on its stack.
Shared variables are passed by reference as dummy arguments.
Reduction variables are treated as shared variables but get an additional private copy (GSUM_TMP above). Each thread accumulates into its private copy and adds it to the shared variable inside a critical section after the loop.
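
The reduction handling described above can be reproduced by hand with standard OpenMP directives. The following is a minimal sketch, not ADAPTOR output: the names pi_reduction_sketch and calc_pi, the choice w = 1/n, and the final scaling by w are assumptions made here so that the loop of the example yields a midpoint-rule estimate of pi.

```fortran
      module pi_reduction_sketch
      implicit none
      contains

!     Midpoint-rule estimate of pi with a hand-written reduction.
      function calc_pi (n) result (pi_est)
      integer, intent(in) :: n
      double precision :: pi_est
      double precision :: gsum, gsum_tmp, v, w
      integer :: i

!     interval width (an assumption, not given in the guide)
      w = 1.0d0 / n
!     shared accumulator
      gsum = 0.0d0
!$omp parallel private (i, v, gsum_tmp) shared (gsum, w, n)
!     private copy of the reduction variable, one per thread
      gsum_tmp = 0.0d0
!$omp do
      do i = 1, n
         v = (i - 0.5d0) * w
         v = 4.0d0 / (1.0d0 + v * v)
         gsum_tmp = gsum_tmp + v
      end do
!$omp end do
!     combine the partial sums, one thread at a time
!$omp critical
      gsum = gsum + gsum_tmp
!$omp end critical
!$omp end parallel
      pi_est = w * gsum
      end function calc_pi

      end module pi_reduction_sketch
```

Compiled without OpenMP support the directives are plain comments and the function runs serially with the same result, up to floating-point summation order.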

7.3 Work Sharing

!$omp do schedule (STATIC)
      do I = 1, N, IS
         A(I) = IT + 1
      end do

      call DALIB_do_static_bsched (1,N,IS,IK_START1,IK_STOP1)
      do I=IK_START1,IK_STOP1,IS
         A(A_ZERO+I) = IT+1
      end do
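
To make the block schedule concrete, the sketch below computes loop bounds in the way a routine like DALIB_do_static_bsched might. It is an illustration under assumptions: the real runtime routine obtains the thread number and team size itself, whereas the hypothetical block_bounds takes them as the arguments tid and nt, and the exact partitioning used by ADAPTOR may differ.

```fortran
      module static_block_sketch
      implicit none
      contains

!     Bounds of the block of iterations owned by thread TID
!     (0-based) out of NT threads for "do I = LB, UB, STEP",
!     STEP > 0.  An empty block is returned as FIRST > LAST.
      subroutine block_bounds (lb, ub, step, tid, nt, first, last)
      integer, intent(in) :: lb, ub, step, tid, nt
      integer, intent(out) :: first, last
      integer :: niter, bsize, i0, i1

!     number of iterations of the whole loop
      niter = max (0, (ub - lb + step) / step)
!     block size, rounded up so that NT blocks cover the loop
      bsize = (niter + nt - 1) / nt
!     0-based iteration numbers of this thread's block
      i0 = tid * bsize
      i1 = min (i0 + bsize, niter) - 1
      first = lb + i0 * step
      last = lb + i1 * step
      end subroutine block_bounds

      end module static_block_sketch
```

For the loop "do I = 1, 10" and four threads this gives the blocks 1..3, 4..6, 7..9 and 10..10. The stride is kept in the returned bounds, so the translated loop can still run "do I = FIRST, LAST, STEP" as in the generated code above.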

!$omp do schedule (dynamic, ICHUNK), lastprivate (PLAST)
      do I = 1, N, IS
         A(I) = IT + 1
         PLAST = I
      end do

      call DALIB_do_dynamic_sched_init (1,N,IS,ICHUNK)
      do while (DALIB_do_dynamic_sched_next(IK_START1,IK_STOP1))
         do I=IK_START1,IK_STOP1,IS
            A(A_ZERO+I) = IT+1
            PLAST_TMP = I
         end do
      end do
      if (DALIB_is_mp_last()) then
         PLAST = PLAST_TMP
      end if
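
The init/next pair in the dynamic translation behaves like a shared chunk dispenser. The sketch below shows one possible shape of such a dispenser. The names sched_init and sched_next and the module state are hypothetical and are not the DALIB implementation; a named critical section stands in for whatever locking the runtime actually uses.

```fortran
      module dynamic_sched_sketch
      implicit none
      private
      public :: sched_init, sched_next
!     state of the loop being scheduled, shared by all threads
      integer, save :: s_lb = 1, s_step = 1, s_chunk = 1
      integer, save :: s_niter = 0, s_next = 0
      contains

!     Record the loop "do I = LB, UB, STEP" and the chunk size.
      subroutine sched_init (lb, ub, step, chunk)
      integer, intent(in) :: lb, ub, step, chunk
      s_lb = lb
      s_step = step
      s_chunk = max (1, chunk)
      s_niter = max (0, (ub - lb + step) / step)
      s_next = 0
      end subroutine sched_init

!     Hand out the next chunk; .false. once the loop is exhausted.
      logical function sched_next (first, last)
      integer, intent(out) :: first, last
      integer :: n
      first = 0
      last = -1
      sched_next = .false.
!$omp critical (sched_lock)
      if (s_next < s_niter) then
         n = min (s_chunk, s_niter - s_next)
         first = s_lb + s_next * s_step
         last = s_lb + (s_next + n - 1) * s_step
         s_next = s_next + n
         sched_next = .true.
      end if
!$omp end critical (sched_lock)
      end function sched_next

      end module dynamic_sched_sketch
```

A thread would call sched_next in a "do while" loop exactly as in the generated code above, so faster threads simply come back for more chunks. In this sketch sched_init must be called once, by a single thread, before any thread asks for work.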

7.4 SECTIONS

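The sections of a SECTIONS construct are distributed among the threads dynamically: each section is executed exactly once, by whichever thread picks it up next. Internally, ADAPTOR translates the construct into a dynamically scheduled loop over the section numbers: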

!$omp parallel private (IT)
      IT = OMP_GET_THREAD_NUM ()
!$omp sections
!$omp section
      call XAXIS (AXIS)
!$omp section
      call YAXIS (AXIS)
!$omp section
      call ZAXIS (AXIS)
!$omp end sections
!$omp end parallel

!$omp do schedule (dynamic)
      do I0 = 1, 3
         if (I0 .eq. 1) then
            call XAXIS (AXIS,AXIS_DSP)
         else if (I0 .eq. 2) then
            call YAXIS (AXIS,AXIS_DSP)
         else if (I0 .eq. 3) then
            call ZAXIS (AXIS,AXIS_DSP)
         end if
      end do

7.5 SINGLE Directive

The code within a SINGLE region will be executed by only one thread.

!$omp single
      X = X + 10
      Y = Y + 100
!$omp end single

      call DALIB_mp_barrier ()
      if (DALIB_is_mp_single()) then
         X = X+10
         Y = Y+100
      end if
      call DALIB_mp_barrier ()

Only one thread does the work, but it is unspecified which one. The other threads wait at the second barrier until it has finished.

7.6 Synchronization

7.7 CRITICAL Sections

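!$omp critical
         X(INDEX(I)) = X(INDEX(I)) + XLOCAL
!$omp end critical

A critical section is enclosed with lock primitives, namely calls to DALIB_enter_critical and DALIB_leave_critical as in the reduction code of Section 7.2.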

7.8 ATOMIC Update

!$omp atomic
         X(INDEX(I)) = X(INDEX(I)) + XLOCAL

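The general strategy is similar to that for the critical section: the assignment is enclosed with lock primitives. These are less restrictive than the critical-section locks; as the generated code shows, they take the updated memory location as an argument.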

         call DALIB_atomic_lock (X(INDEX(I)))
         X(INDEX(I)) = X(INDEX(I))+XLOCAL
         call DALIB_atomic_unlock (X(INDEX(I)))
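
The atomic update matters whenever the index array contains repeated values, because two threads may then update the same element at the same time. The following self-contained sketch shows that situation with standard OpenMP. The names atomic_update_sketch and scatter_add are hypothetical, and idx replaces INDEX from the example to avoid shadowing the Fortran intrinsic of that name.

```fortran
      module atomic_update_sketch
      implicit none
      contains

!     Add XLOCAL to X(IDX(I)) for I = 1..N; IDX may repeat values.
      subroutine scatter_add (x, idx, n, xlocal)
      integer, intent(in) :: n
      integer, intent(in) :: idx(n)
      double precision, intent(inout) :: x(:)
      double precision, intent(in) :: xlocal
      integer :: i
!$omp parallel do private (i) shared (x, idx, xlocal)
      do i = 1, n
!        protects only this single update of X(IDX(I))
!$omp atomic
         x(idx(i)) = x(idx(i)) + xlocal
      end do
!$omp end parallel do
      end subroutine scatter_add

      end module atomic_update_sketch
```

With idx = (/ 1, 2, 1 /) and xlocal = 2, element 1 receives two updates and element 2 receives one. Without the atomic directive one of the two updates to element 1 could be lost.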

7.9 ORDERED Clause

!$omp do ordered
      do I = 2, N
         call WORK (I, X, N)
      end do

      call DALIB_init_ordered (2,1)
      call DALIB_do_static_bsched (2,N,1,IK_START1,IK_STOP1)
      do I=IK_START1,IK_STOP1
         call DALIB_set_loop_id (I)
         call WORK (I,X(X_ZERO+1),N,DALIB_0,X_DSP,DALIB_0)
      end do

      subroutine WORK (I, X, N)
      integer I
      integer N
      integer X (N)
!$omp ordered
      X (I) = X(I-1) + I
!$omp end ordered
      end

      call DALIB_enter_ordered ()
      X(X_ZERO+I) = X(X_ZERO+(I-1))+I
      call DALIB_leave_ordered ()
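
The ordered example above is split across a loop and a called subroutine. For experimenting, the same recurrence can be written as one self-contained unit. This is a sketch with assumed names (ordered_sketch, prefix_sums); it places the ordered region directly in the loop body instead of in a separate subroutine WORK.

```fortran
      module ordered_sketch
      implicit none
      contains

!     X(I) = X(I-1) + I for I = 2..N, evaluated in loop order.
      subroutine prefix_sums (x, n)
      integer, intent(in) :: n
      integer, intent(inout) :: x(n)
      integer :: i
!$omp parallel do ordered private (i) shared (x, n)
      do i = 2, n
!        iterations enter this region strictly in the order of I
!$omp ordered
         x(i) = x(i-1) + i
!$omp end ordered
      end do
!$omp end parallel do
      end subroutine prefix_sums

      end module ordered_sketch
```

Because each iteration reads the value written by the previous one, the result is only correct if the ordered region is entered in iteration order. This is the ordering that the DALIB_enter_ordered and DALIB_leave_ordered calls provide in the generated code.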

Thomas Brandes 2004-03-18