The HPF halo library allows the explicit use of halos for distributed arrays that are indirectly accessed. It provides the highest flexibility because the user does not have to rely on the compiler's ability to use halos automatically.
use HPF_HALO_LIBRARY
Halos can only be defined for halo arrays that are distributed along one dimension. This is not a serious restriction, since it holds for all unstructured applications with indirect addressing considered so far.
The halo structure is completely hidden and is identified only by an integer value. A halo is not directly tied to the array for which it has been defined, so it can also be used for any other array that has the same shape and the same mapping in the distributed dimension.
From the user's point of view, the following topics are very important for successful use of the halo library:
The following subroutine sets up internal data structures for a halo where ARRAY is the indirectly accessed array and INDEXES is an array of global indexes used for the distributed dimension. The halo data structures are not explicitly available but are referred to by an integer value.
subroutine HALO_DEFINE (HALO_ID, ARRAY, INDEXES [,MASK] [,LB] [,UB])
The HALO_DEFINE routine has no side effects on the ARRAY and INDEXES arguments. The ARRAY argument only tells the routine how to determine which of the indexes are local and which are non-local. Shadow edges need not be available at this time. If the array is distributed but not yet allocated, the future dimensions can be specified. The distribution information of ARRAY and the boundary values are used to establish an internal template reflecting the halo arrays.
The halo can later be used for all arrays that have the same shape and the same mapping in the distributed dimension. However, updates and reductions on the halo require that the corresponding arrays have a sufficient shadow area.
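To make the idea of a halo definition concrete, the following is a minimal conceptual sketch in Python, not the library's implementation. It assumes a block distribution, 0-based indexes (Fortran arrays are normally 1-based), and that each processor inspects only its own part of INDEXES. The names `block_owner` and `define_halo` are hypothetical.

```python
def block_owner(g, n, nprocs):
    """Owner of global index g (0-based) under an assumed block distribution."""
    block = -(-n // nprocs)  # ceiling division: elements per processor
    return g // block


def define_halo(n, nprocs, indexes_per_proc):
    """For each processor, collect the distinct non-local global indexes it references."""
    halo = []
    for p, idx in enumerate(indexes_per_proc):
        remote = sorted({g for g in idx if block_owner(g, n, nprocs) != p})
        halo.append(remote)
    return halo


# 8 array elements on 2 processors: processor 0 owns 0..3, processor 1 owns 4..7.
indexes_per_proc = [[0, 1, 4, 5, 4], [6, 7, 3, 0]]
halo = define_halo(8, 2, indexes_per_proc)
print(halo)  # [[4, 5], [0, 3]]
```

In this model the array values are never touched; only the distribution and the index values matter, which matches the statement that the definition has no side effects on its arguments.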
In all of the following operations, HALO_ID is an INTENT(IN) argument of type default integer. Its value must refer to a defined halo structure.
The internal data structure for the halo can be freed with the HALO_FREE routine.
subroutine HALO_FREE (HALO_ID)
A halo should be freed if it is no longer used or if the integer array with the halo indexes has been modified. If the indirection array used for the definition of the halo has the TRACE attribute, the internal structures built for the halo are only freed when the indirection array changes. This allows halos to be reused in an automatic approach.
The following subroutine returns the maximum shadow size needed on any one processor when the halo is used for a distributed array. Like the whole halo structure, this value depends on the number of active processors. Once the size of the shadow is known, the user can define new arrays with the appropriate shadow.
subroutine HALO_MAX_SIZE (HALO_ID, SIZE)
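The dependence of the shadow size on the number of processors can be illustrated with a small Python model. This is a sketch under assumptions, not the library's algorithm: both the halo array and the indirection array are taken to be block distributed, indexes are 0-based, and `halo_max_size` is a hypothetical helper.

```python
def block_size(n, nprocs):
    """Elements per processor under an assumed block distribution."""
    return -(-n // nprocs)  # ceiling division


def halo_max_size(n, indexes, nprocs):
    """Largest number of distinct non-local elements referenced by any one processor."""
    array_block = block_size(n, nprocs)
    index_block = block_size(len(indexes), nprocs)
    needed = [set() for _ in range(nprocs)]
    for pos, g in enumerate(indexes):
        p = pos // index_block        # processor holding this entry of INDEXES
        if g // array_block != p:     # referenced element is owned elsewhere
            needed[p].add(g)
    return max(len(s) for s in needed)


indexes = [0, 1, 2, 4, 6, 7, 3, 5]
print(halo_max_size(8, indexes, 1))  # 0
print(halo_max_size(8, indexes, 2))  # 1
print(halo_max_size(8, indexes, 4))  # 2
```

The same indirection array needs no shadow on one processor, one shadow element on two processors, and two on four, so a shadow sized for one processor count may be too small for another.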
The following subroutine can be used to verify if there is sufficient shadow available for the ARRAY. This routine might be useful when the user has defined a shadow via the SHADOW directive of HPF.
subroutine HALO_VERIFY_SHADOW (HALO_ID, ARRAY, OKAY)
If ARRAY is dynamic and does not have sufficient shadow, the following routine reallocates the array with a sufficient shadow size. The subroutine returns immediately if sufficient shadow is already available for ARRAY.
subroutine HALO_CREATE_SHADOW (HALO_ID, ARRAY)
The following routines transform global indexes to local indexes pointing to the shadow area and vice versa.
subroutine HALO_LOCAL_INDEXES (HALO_ID, INDEXES)
subroutine HALO_GLOBAL_INDEXES (HALO_ID, INDEXES)
Attention: It is entirely up to the user to know whether the indirection array contains global indexes or shadow indexes. If indexes are localized twice, shadow indexes might be treated as global indexes, resulting in completely undefined indexes.
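The hazard of localizing twice can be shown with a toy Python model of one processor's view. The layout is an assumption made only for illustration: owned elements get local indexes 0..nloc-1 and shadow slot k gets local index nloc + k. The encoding actually used by the library is not described here, and `HaloModel` is a hypothetical name.

```python
class HaloModel:
    """Toy model of one processor's view of a halo (all names hypothetical)."""

    def __init__(self, lo, nloc, halo):
        self.lo = lo              # first global index owned by this processor
        self.nloc = nloc          # number of owned elements
        self.halo = list(halo)    # non-local global indexes, one per shadow slot
        # Assumption: shadow slot k is addressed as local index nloc + k.
        self.slot = {g: nloc + k for k, g in enumerate(self.halo)}

    def to_local(self, g):
        if self.lo <= g < self.lo + self.nloc:
            return g - self.lo    # owned element
        return self.slot[g]       # shadow element; KeyError if g is not in the halo

    def to_global(self, l):
        if l < self.nloc:
            return self.lo + l
        return self.halo[l - self.nloc]


# Processor owning global indexes 4..7, with shadow copies of elements 0 and 3.
h = HaloModel(lo=4, nloc=4, halo=[0, 3])
g_idx = [7, 3, 0]
l_idx = [h.to_local(g) for g in g_idx]
print(l_idx)                              # [3, 5, 4]
print([h.to_global(l) for l in l_idx])    # [7, 3, 0]  (round trip is exact)
print([h.to_local(l) for l in l_idx])     # [5, 1, 0]  (localized twice: wrong)
```

The second localization raises no error in this model; it silently yields indexes that refer to different elements, which is why the user has to keep track of the state of the indirection array.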
Instead of localizing indexes in place, it is also possible to use a new temporary array with localized indexes. This can avoid confusion between global and shadow indexes, and it becomes necessary if the global indirection array is also used for arrays other than the halo array. The routine HALO_NEW_LOCAL_INDEXES translates all global indexes into a new array of local indexes pointing to the shadow area.
subroutine HALO_NEW_LOCAL_INDEXES (HALO_ID, G_INDEXES, L_INDEXES)
subroutine HALO_FREE_LOCAL_INDEXES (HALO_ID, L_INDEXES)
The temporary array will be allocated within the routine. Therefore the argument L_INDEXES must have the pointer attribute. The temporary array with the local indexes is part of the halo structure and is also reused when the halo is reused.
The routine HALO_ALL_DEFINE can be used as a shorthand for the usual sequence of defining a halo structure, creating the shadow area for the halo array, and translating the global indexes to local indexes.
subroutine HALO_ALL_DEFINE (HALO_ID, ARRAY, INDEXES [, MASK] [, HEAP_FLAG])
! implements
call HALO_DEFINE (HALO_ID, ARRAY, INDEXES [, MASK])
call HALO_CREATE_SHADOW (HALO_ID, ARRAY [, HEAP_FLAG])
call HALO_LOCAL_INDEXES (HALO_ID, INDEXES)
The routine HALO_AUTO_DEFINE is nearly the same but uses a temporary array for the localized indexes.
subroutine HALO_AUTO_DEFINE (HALO_ID, ARRAY, INDEXES, TMP_INDEXES [, MASK])
! implements
call HALO_DEFINE (HALO_ID, ARRAY, INDEXES [, MASK])
call HALO_CREATE_SHADOW (HALO_ID, ARRAY)
call HALO_NEW_LOCAL_INDEXES (HALO_ID, INDEXES, TMP_INDEXES)
The routine HALO_AUTO_FREE frees not only the halo data structure but also the temporary indexes.
subroutine HALO_AUTO_FREE (HALO_ID, TMP_INDEXES)
! implements
call HALO_FREE_LOCAL_INDEXES (HALO_ID, TMP_INDEXES)
call HALO_FREE (HALO_ID)
The update routine makes sure that all non-local copies in the shadows on other processors are updated with the current value from the owning processor.
subroutine HALO_UPDATE (HALO_ID, ARRAY)
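The effect of an update can be sketched with a sequential Python model. It is only an illustration under assumptions: every processor stores its owned block followed by its shadow slots, the distribution is a block distribution with equal block size `nloc`, indexes are 0-based, and `halo_update` is a hypothetical stand-in that replaces real communication with plain copies.

```python
def halo_update(local, halo, nloc):
    """Copy each owner's current value into every shadow copy of that element."""
    for p, shadow_of in enumerate(halo):
        for k, g in enumerate(shadow_of):
            owner = g // nloc                       # processor owning global index g
            local[p][nloc + k] = local[owner][g % nloc]


nloc = 4
# halo[p][k] is the global index mirrored in shadow slot k of processor p.
halo = [[4, 5], [0, 3]]
# Owned values followed by two shadow slots that have not been filled yet.
local = [[10, 11, 12, 13, None, None],
         [14, 15, 16, 17, None, None]]

halo_update(local, halo, nloc)
print(local[0])  # [10, 11, 12, 13, 14, 15]
print(local[1])  # [14, 15, 16, 17, 10, 13]
```

After the update, a loop that uses localized indexes can read non-local elements from the shadow slots without further communication.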
A reduction for a halo reduces all non-local copies in the shadows on other processors to a single value on the owning processor. Analogous to the XXX_SCATTER operations of the HPF library, the halo library provides a corresponding HALO_REDUCE_XXX operation for each reduction operation.
subroutine HALO_REDUCE_COPY (HALO_ID, ARRAY)
subroutine HALO_REDUCE_SUM (HALO_ID, ARRAY)
subroutine HALO_REDUCE_PRODUCT (HALO_ID, ARRAY)
subroutine HALO_REDUCE_MINVAL (HALO_ID, ARRAY)
subroutine HALO_REDUCE_MAXVAL (HALO_ID, ARRAY)
subroutine HALO_REDUCE_ALL (HALO_ID, ARRAY)
subroutine HALO_REDUCE_ANY (HALO_ID, ARRAY)
subroutine HALO_REDUCE_PARITY (HALO_ID, ARRAY)
subroutine HALO_REDUCE_IALL (HALO_ID, ARRAY)
subroutine HALO_REDUCE_IANY (HALO_ID, ARRAY)
subroutine HALO_REDUCE_IPARITY (HALO_ID, ARRAY)
Reductions require the reduction variable to be initialized with the neutral element of the reduction operation (for example, 0 for a sum). An unstructured reduction also requires the elements in the shadow area to be initialized. This can be done by the following routine, which involves no communication:
subroutine HALO_INIT (HALO_ID, ARRAY, VALUE)
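The interplay of initialization and reduction can be sketched with the following sequential Python model of a sum reduction. It is an illustration under assumptions, not the library's implementation: owned elements are stored first and shadow slots after them, blocks have equal size `nloc`, indexes are 0-based, and the model's `halo_init` is assumed to set only the shadow slots. Both function names are hypothetical stand-ins.

```python
def halo_init(local, nloc, value):
    """Set every shadow slot to value; purely local, no communication."""
    for part in local:
        for k in range(nloc, len(part)):
            part[k] = value


def halo_reduce_sum(local, halo, nloc):
    """Add each shadow contribution to the element on its owning processor."""
    for p, shadow_of in enumerate(halo):
        for k, g in enumerate(shadow_of):
            local[g // nloc][g % nloc] += local[p][nloc + k]


nloc = 4
halo = [[4, 5], [0, 3]]                 # global indexes mirrored in each shadow
local = [[0.0] * 6, [0.0] * 6]          # owned part starts at 0, the neutral element of +
halo_init(local, nloc, 0.0)

# Processor 0 contributes 1.0 to global 0 (owned) and 2.0 to global 4 (shadow slot 0).
local[0][0] += 1.0
local[0][nloc + 0] += 2.0
# Processor 1 contributes 3.0 to global 4 (owned) and 5.0 to global 0 (shadow slot 0).
local[1][0] += 3.0
local[1][nloc + 0] += 5.0

halo_reduce_sum(local, halo, nloc)
print(local[0][0])  # 6.0  (global element 0: 1.0 + 5.0)
print(local[1][0])  # 5.0  (global element 4: 3.0 + 2.0)
```

If the shadow slots were not initialized, stale values left over from an earlier update would be added to the owner's result, which is why the initialization step is needed before an unstructured reduction.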