Process Grid and Scoped Operations

The processes of a parallel machine with P processes are often presented to the user as a linear array of process IDs, labeled 0 through (P - 1). For reasons described below, it is often more convenient to map this 1-D array of P processes into a logical two-dimensional process mesh, or grid. This grid has R process rows and C process columns, where R * C = G <= P. A process can then be referenced by its coordinates within the grid (written {i, j}, where 0 <= i < R and 0 <= j < C) rather than by a single number. An example of such a mapping is shown below.
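
As a minimal sketch of such a mapping, the Python below converts between a linear process ID and grid coordinates {i, j}. It assumes a row-major ordering, which is only one possible choice. The function names (rank_to_coords, coords_to_rank, grid_layout) and the 8-process, 2 x 4 example are hypothetical illustrations, not BLACS routines.

```python
def rank_to_coords(rank, n_rows, n_cols):
    """Map a linear process ID to grid coordinates (i, j).

    Assumes row-major ordering: IDs fill process row 0 first, then row 1, etc.
    """
    if not 0 <= rank < n_rows * n_cols:
        # IDs from G = R * C up to P - 1 are not part of the grid.
        raise ValueError("process ID is outside the grid")
    return divmod(rank, n_cols)


def coords_to_rank(i, j, n_rows, n_cols):
    """Inverse mapping: grid coordinates (i, j) back to a linear process ID."""
    if not (0 <= i < n_rows and 0 <= j < n_cols):
        raise ValueError("coordinates are outside the grid")
    return i * n_cols + j


def grid_layout(n_procs, n_rows, n_cols):
    """Return the grid as a list of process rows, each a list of process IDs."""
    if n_rows * n_cols > n_procs:
        raise ValueError("grid needs R * C <= P processes")
    return [[coords_to_rank(i, j, n_rows, n_cols) for j in range(n_cols)]
            for i in range(n_rows)]


if __name__ == "__main__":
    # P = 8 processes arranged as R = 2 rows by C = 4 columns.
    for process_row in grid_layout(8, 2, 4):
        print(process_row)
```

With P = 8, R = 2 and C = 4, process 5 lands at {1, 1} under this ordering. If G < P (for example P = 7 with a 2 x 3 grid), the leftover process IDs simply have no grid coordinates.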
    

An operation which involves more than just a sender and a receiver is called a scoped operation. All processes that participate in a scoped operation are said to be within the operation's scope.

On a system using a linear array of processes, the only natural scope is all processes. Using a 2-D grid, we have three natural scopes, as shown in the following table.

SCOPE    MEANING
------   ----------------------------------------------
Row      All processes in a process row participate.
Column   All processes in a process column participate.
All      All processes in the process grid participate.

These groupings of processes are of particular interest to the linear algebra programmer, since distributed data decompositions of a 2-D array (a linear algebra matrix) tend to follow this process mapping. For instance, all of a distributed matrix row can be found on a single process row, and likewise all of a distributed matrix column on a single process column.
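
The three scopes, and their relationship to a distributed matrix, can be sketched as follows. The helper names scope_members and owner are hypothetical, and the 2-D block-cyclic distribution used in owner is an assumed decomposition chosen for illustration; the text above does not prescribe a particular one.

```python
def scope_members(scope, i, j, n_rows, n_cols):
    """List the grid coordinates of all processes sharing `scope` with {i, j}."""
    if scope == "Row":
        return [(i, c) for c in range(n_cols)]
    if scope == "Column":
        return [(r, j) for r in range(n_rows)]
    if scope == "All":
        return [(r, c) for r in range(n_rows) for c in range(n_cols)]
    raise ValueError("scope must be 'Row', 'Column' or 'All'")


def owner(row, col, block, n_rows, n_cols):
    """Grid coordinates of the process holding matrix entry (row, col).

    Assumes a 2-D block-cyclic distribution with square blocks of size
    `block`, dealt out cyclically over the process rows and columns.
    """
    return ((row // block) % n_rows, (col // block) % n_cols)


if __name__ == "__main__":
    R, C, BLOCK, N = 2, 3, 2, 8  # 2 x 3 grid, 2 x 2 blocks, 8 x 8 matrix

    # Every entry of matrix row 5 lives somewhere on one process row.
    owners = {owner(5, col, BLOCK, R, C) for col in range(N)}
    process_row = {i for (i, _) in owners}
    print("process rows holding matrix row 5:", sorted(process_row))

    # Those owners are exactly the Row scope of any process in that row.
    i = process_row.pop()
    print(owners == set(scope_members("Row", i, 0, R, C)))
```

Under these assumptions, an operation on one matrix row only needs the Row scope of the process row that holds it, rather than every process in the grid.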

Viewing the rows and columns of the process grid as essentially autonomous subsystems provides the programmer with additional levels of parallelism. Of course, how independent these rows and columns actually are depends upon the underlying machine. For instance, if the grid's processors are connected via a shared Ethernet, the only gain will be in ease of programming. Speed is unlikely to increase, since while one processor is communicating, no others can. In that case, process rows or columns cannot perform different distributed tasks at the same time. Fortunately, most modern supercomputer interconnection networks are at least as rich as a 2-D grid, so these additional levels of parallelism can be exploited.

Contexts are closely related to process grids in the BLACS.