In contrast to structured communication, unstructured communication requires additional communication to set up the communication schedule. This extra communication is needed to request the required data from other processors or to inform a processor about the data that will be sent to it.
Gathering is the unstructured reading of values from a distributed array. The following code shows a typical example of gathering data from a distributed array that is well supported by ADAPTOR.
    real, dimension (M1,M2,...,Ml) :: B                ! array with indirect access
    real, dimension (N1,N2,...,Nk) :: A                ! values to be gathered
    integer, dimension (N1,N2,...,Nk) :: P1, ..., Pl   ! indirection arrays
    logical, dimension (N1,N2,...,Nk) :: MASK          ! mask
    !hpf$ distribute A (...)
    !hpf$ distribute B (...)
    !hpf$ align with A :: P1, ..., Pl, MASK            ! very important
    forall (J1=LB1:UB1, ..., Jk = LBk:UBk, MASK(J1,...,Jk)) &
       A(J1,...,Jk) = B(P1(J1,...,Jk), ..., Pl(J1,...,Jk))
ADAPTOR will generate rather efficient parallel code if the following points are observed:
The indirect addressing of a distributed array requires complex runtime support. A communication pattern must be computed to access the needed data. In the first step, every processor asks the other processors for the data it needs. In the second step, the owning processors send the requested values to the processors that asked for them. This functionality must be available in the runtime system and usually has a very high overhead.
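The two steps described above can be illustrated with a small serial simulation in standard Fortran. This is only a sketch under stated assumptions, not ADAPTOR's runtime system: the number of processors NP, the block distribution, and all names (requests, BS, and so on) are hypothetical, and the "messages" are modeled as counters and a plain copy.

```fortran
program schedule_sketch
   implicit none
   integer, parameter :: NP = 4          ! assumed number of processors
   integer, parameter :: N  = 8          ! size of A, B and P
   integer, parameter :: BS = N / NP     ! block size of the assumed block distribution
   real    :: A(N), B(N)
   integer :: P(N)
   integer :: requests(NP, NP)           ! requests(r, o): items processor r asks from owner o
   integer :: i, r, o

   B = (/ (10.0 * i, i = 1, N) /)
   P = (/ 8, 1, 3, 3, 6, 2, 7, 5 /)      ! indirection array, aligned with A

   ! Step 1: every processor determines the owner of each element it needs
   ! and counts the requests it would send to that owner.
   requests = 0
   do i = 1, N
      r = (i - 1) / BS + 1               ! processor that owns A(i) and P(i)
      o = (P(i) - 1) / BS + 1            ! processor that owns B(P(i))
      requests(r, o) = requests(r, o) + 1
   end do

   ! Step 2: the owners deliver the requested values (modeled as a copy).
   do i = 1, N
      A(i) = B(P(i))
   end do

   do r = 1, NP
      print '(a,i0,a,4(1x,i0))', 'processor ', r, ' requests per owner:', requests(r, :)
   end do
   if (any(A /= B(P))) error stop 'gather mismatch'
end program schedule_sketch
```

The request table depends only on P and on the distributions, not on the values in B. This is why a schedule computed once can in principle serve several gathers that use the same indirection array.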
In the case of a shared array, this runtime support is not necessary. In particular, the owner evaluation is no longer needed, as all global addresses remain unchanged.
The gathering of data is also possible with array statements.
    real, dimension (M) :: B
    real, dimension (N) :: A
    integer, dimension (N) :: P
    !hpf$ distribute (...) :: A
    !hpf$ distribute (...) :: B
    !hpf$ align with A :: P
    A = B(P)                     ! gathering via indirection array
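As a concrete instance of the array statement above, the following minimal sketch uses small, made-up values and omits the distribution directives, so it compiles with any standard Fortran compiler. The array sizes and contents are assumptions chosen only for illustration.

```fortran
program gather_sketch
   implicit none
   real    :: B(5) = (/ 10.0, 20.0, 30.0, 40.0, 50.0 /)   ! array with indirect access
   real    :: A(4)                                         ! values to be gathered
   integer :: P(4) = (/ 5, 1, 1, 3 /)                      ! indirection array

   A = B(P)                    ! vector subscript: A(i) = B(P(i)) for every i
   print '(4f6.1)', A
   if (any(A /= (/ 50.0, 10.0, 10.0, 30.0 /))) error stop 'unexpected gather result'
end program gather_sketch
```

Note that the same element of B may be read more than once (here B(1) is read twice), which is harmless for gathering.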
Scattering is the unstructured writing into a distributed array. Like gathering, it is well supported by ADAPTOR in the following form:
    <type>, dimension (M1,M2,...,Ml) :: B              ! array with indirect access
    <type>, dimension (N1,N2,...,Nk) :: A              ! values to be scattered
    integer, dimension (N1,N2,...,Nk) :: P1, ..., Pl   ! indirection arrays
    logical, dimension (N1,N2,...,Nk) :: MASK          ! mask
    !hpf$ distribute A (...)
    !hpf$ distribute B (...)
    !hpf$ align with A :: P1, ..., Pl, MASK            ! very important
    forall (J1=LB1:UB1, ..., Jk = LBk:UBk, MASK(J1,...,Jk)) &
       B(P1(J1,...,Jk), ..., Pl(J1,...,Jk)) = A(J1,...,Jk)
In this case, the FORALL statement can be replaced with a call to the COPY_SCATTER function of the HPF Library. ADAPTOR generates the same code.
    B = copy_scatter (A, B, P1, ..., Pl, MASK)
The following HPF Library functions can be used to scatter data from an array A to an array B:
    B = xxx_SCATTER (A, B, P1, ..., Pl, MASK)
The MASK argument is optional. The prefix xxx names the reduction function; the allowed values are ALL, ANY, COUNT, IALL, IANY, IPARITY, SUM, PRODUCT, PARITY, MINVAL and MAXVAL.
The following has to be observed:
If k is the rank of the arrays A, P1, ..., Pl and MASK, and (LB1:UB1,...,LBk:UBk) is their shape, the semantics of the scatter operation can be described by the following loop nest, where red_f denotes the reduction function:
    do J1 = LB1, UB1
       do J2 = LB2, UB2
          ...
             do Jk = LBk, UBk
                if (MASK(J1,...,Jk)) then
                   B(P1(J1,...,Jk), ..., Pl(J1,...,Jk)) = &
                      red_f (B(P1(J1,...,Jk), ..., Pl(J1,...,Jk)), A(J1,...,Jk))
                end if
             end do
          ...
       end do
    end do
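The loop nest above can be made concrete for the simplest case, k = 1 and l = 1, with addition as the reduction function (the case that corresponds to the SUM prefix). The following sketch is plain serial Fortran with made-up sizes and values; it only illustrates the semantics and says nothing about the parallel code ADAPTOR generates.

```fortran
program sum_scatter_sketch
   implicit none
   integer, parameter :: N = 6, M = 3
   real    :: A(N) = (/ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 /)      ! values to be scattered
   real    :: B(M) = (/ 100.0, 200.0, 300.0 /)               ! array with indirect access
   integer :: P(N) = (/ 1, 3, 1, 2, 3, 1 /)                  ! indirection array
   logical :: MASK(N) = (/ .true., .true., .false., .true., .true., .true. /)
   integer :: J

   do J = 1, N
      if (MASK(J)) then
         B(P(J)) = B(P(J)) + A(J)     ! red_f is addition in this sketch
      end if
   end do

   print '(3f7.1)', B
   if (any(B /= (/ 107.0, 204.0, 307.0 /))) error stop 'unexpected scatter result'
end program sum_scatter_sketch
```

Several elements of A are combined into the same element of B here (A(1) and A(6) both go to B(1), while A(3) is masked out). The reduction function is what makes such collisions well defined.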
Tracing of an array is requested with the user directive !ADP$ TRACE. This directive has the semantics of a Fortran attribute and follows the Fortran 90 scoping rules. By default, arrays are not traced.
An array must be flagged dirty whenever it is changed. The syntactic constructs that can modify an array are limited and can be analyzed at compile time: assignments with the array on the left-hand side, redistributions, allocations and deallocations. Thus, the TRACE attribute is compatible with the DYNAMIC and ALLOCATABLE attributes, but not with the POINTER or TARGET attributes, because the compiler cannot record modifications of such arrays.
    integer, dimension (N) :: P
    !adp$ trace :: P
Communication schedules can be reused if the indirection array has not changed. In the following example, the communication schedule computed for the integer array P is reused, because P is traced and is not modified between the calls of the subroutine TIMING.
    subroutine INDIRECT
       real, dimension (N) :: A, B
       integer, dimension (N) :: P, Q
    !hpf$ distribute (block) :: A
    !hpf$ align with A :: B, P, Q
    !adp$ trace P
       call INIT (P)
       call INIT (Q)
       do J = 1, M                  ! serial loop
          call TIMING (A, B, P)
       end do
       do J = 1, M                  ! serial loop
          call TIMING (A, B, Q)
       end do
    end subroutine INDIRECT

    subroutine TIMING (A, B, L)
       ...
    !adp$ trace L
       forall (I=1:N) A(I) = B(L(I))
    end subroutine TIMING
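The idea behind schedule reuse can be sketched in serial Fortran with an explicit dirty flag. This is a hypothetical model, not the code ADAPTOR generates: the module trace_sketch, the flag dirty, the counter builds and the owner array standing in for a communication schedule are all invented for illustration, and the flag is set by hand where a compiler would set it automatically.

```fortran
module trace_sketch
   implicit none
   integer, parameter :: N = 4
   logical :: dirty = .true.        ! hypothetical dirty flag of the traced array
   integer :: owner(N)              ! stand-in for a cached communication schedule
   integer :: builds = 0            ! how often the schedule was computed
contains
   subroutine gather(A, B, L)
      real,    intent(out) :: A(N)
      real,    intent(in)  :: B(N)
      integer, intent(in)  :: L(N)
      if (dirty) then                ! recompute only after L was modified
         owner = (L - 1) / 2 + 1     ! assumed block distribution, block size 2
         builds = builds + 1
         dirty = .false.
      end if
      A = B(L)
   end subroutine gather
end module trace_sketch

program reuse_sketch
   use trace_sketch
   implicit none
   real    :: A(N), B(N) = (/ 1.0, 2.0, 3.0, 4.0 /)
   integer :: P(N) = (/ 4, 3, 2, 1 /)
   integer :: J

   do J = 1, 3
      call gather(A, B, P)           ! schedule built once, then reused twice
   end do
   P(1) = 2
   dirty = .true.                    ! a compiler would set this at the assignment to P
   call gather(A, B, P)              ! schedule rebuilt because P changed
   print '(a,i0)', 'schedule builds: ', builds
   if (builds /= 2) error stop 'unexpected number of schedule builds'
end program reuse_sketch
```

Four gathers trigger only two schedule computations in this sketch. The saving grows with the number of calls made between modifications of the indirection array.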