Multipartitioning is a strategy for partitioning multi-dimensional arrays among a collection of processors so that line-sweep computations can be performed efficiently. The principal property of a multipartitioned array is that for a line sweep along any array dimension, all processors have the same number of tiles to compute at each step in the sweep. This property results in full, balanced parallelism. A secondary benefit of multipartitionings is that they induce only coarse-grain communication.
Previously, computing a d-dimensional multipartitioning required that p^(1/d-1) be integral, where p is the number of processors. Here, we describe an algorithm to compute a d-dimensional multipartitioning of an array of rho dimensions for an arbitrary number of processors, for any d, 2 <= d <= rho. When using a multipartitioning to parallelize a line sweep computation, the best partitioning is the one that exploits all of the processors and has the smallest communication volume. To compute the best multipartitioning of a rho-dimensional array, we describe a cost model for selecting d, the dimensionality of the best partitioning, and the number of cuts along each partitioned dimension. In practice, our technique will choose a 3-dimensional multipartitioning for a 3-dimensional line-sweep computation, except when p is a prime; previously, a 3-dimensional multipartitioning could be applied only when sqrt(p) is integral.
We describe an implementation of multipartitioning in the Rice dHPF compiler and performance results obtained to parallelize a line sweep computation on a range of different numbers of processors.