Newsgroups: comp.parallel.mpi
From: STB <barnard@nas.nasa.gov>
Subject: Re: MPI_Alltoall() on Cray T3D
Organization: NAS/NASA Ames Research Center
Date: Thu, 14 Nov 1996 09:20:04 -0800
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <328B54C4.2781@nas.nasa.gov>

N Sundar wrote:
> 
> >MPI_Alltoall() call has many restrictions.
> >For example, send and receive buffers have to be contiguous and
> >the receive message size has to be specified. It's very bad
> >if the receiver doesn't know the message size of the sender.
> 
> True. The only application I know of where the receiver does not
> have this knowledge is parallel sample sort. I'd like to hear from
> people if any other application shares this characteristic. In
> general, if an application requires all-to-all scatter with variable
> block sizes, the receiver would not know what is the size of the
> incoming message, unless the message sizes were previously distributed
> with another all-to-all exchange.
> 

I've implemented a parallel version of the SPAI (Sparse Approximate
Inverse) preconditioner that has this characteristic.  In fact, not only
does the receiver not know the length of the incoming message, the
sender doesn't know ahead of time that the receiver wants some data!

Here's the situation.  SPAI is inherently parallel because it computes
the rows of a sparse preconditioning matrix independently.
Unfortunately, to compute a row it needs to refer to other rows of the
"original" matrix in a completely unstructured, unpredictable way.  In
other words, a processor will need data that may reside on another
processor, and the other processor doesn't know ahead of time that the
data is needed.

I solved this by writing a "communications server" that emulates shared
memory.  Processors send requests for data, which the communications
servers on the other processors detect and service.  The scheme relies
heavily on MPI_Iprobe and MPI_Get_count.

There are some other tough problems to solve to make this efficient.  For
example, work must be redistributed dynamically for load balance, and
remote references must be cached to eliminate redundant communication.
It is a very complicated code, but the SPAI preconditioner is
ridiculously effective for many important problems so it's worth it.
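
The caching layer amounts to consulting a local table before going to
the network.  Stripped to its essentials (a toy version with a fixed
table; the real code's data structures are more elaborate):

```c
#include <stddef.h>

#define MAX_ROWS 1024   /* toy bound on global row indices */

/* One slot per possibly-remote row; NULL means not yet fetched. */
static double *cache_data[MAX_ROWS];
static int     cache_len[MAX_ROWS];

/* Return a cached copy, or NULL if this row was never fetched. */
double *cache_lookup(int row, int *len)
{
    if (cache_data[row] == NULL)
        return NULL;
    *len = cache_len[row];
    return cache_data[row];
}

/* Remember freshly fetched data so later references stay local. */
void cache_insert(int row, double *data, int len)
{
    cache_data[row] = data;
    cache_len[row] = len;
}
```

A remote reference then becomes: look the row up, and only on a miss
call the communication routine and insert the result.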

	Steve Barnard

