Newsgroups: comp.parallel.mpi
From: ohnielse@oersted (Ole Holm Nielsen)
Reply-To: ohnielse@fysik.dtu.dk
Subject: MPI_ALLREDUCE: How to implement it efficiently
Organization: Physics Department, Techn. Univ. of Denmark
Date: 28 Aug 1995 13:35:52 GMT
Message-ID: <41sgno$mda@news.uni-c.dk>

Dear MPI experts,

I need to code an ALL-REDUCE operation similar to that of MPI_ALLREDUCE()
by message passing, but we don't yet have an MPI library.
The MPI WWW home page states that MPI_ALLREDUCE() can be done as a reduce
plus a broadcast, but that a direct implementation can lead to better
performance.
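To make the reduce-plus-broadcast scheme concrete, here is how I currently picture it: a binomial-tree reduce to rank 0 followed by a binomial-tree broadcast, about 2*ceil(log2 N) message rounds in total. The sketch below simulates the message pattern in Python on a single machine (the sum reduction and the particular tree schedule are my assumptions for illustration, not anything prescribed by the MPI page):

```python
def allreduce_reduce_bcast(values):
    """Simulate all-reduce over len(values) 'ranks' as a
    binomial-tree reduce to rank 0, then a binomial-tree
    broadcast back out.  buf[r] plays the role of rank r's
    local buffer; each += or = models one point-to-point message."""
    p = len(values)
    buf = list(values)
    # Phase 1: reduce.  In the round with distance d, rank r+d
    # sends its partial sum to rank r (for r a multiple of 2d).
    d = 1
    while d < p:
        for r in range(0, p, 2 * d):
            if r + d < p:
                buf[r] += buf[r + d]      # message: r+d -> r
        d *= 2
    # Phase 2: broadcast the result from rank 0, reversing the tree.
    d //= 2
    while d >= 1:
        for r in range(0, p, 2 * d):
            if r + d < p:
                buf[r + d] = buf[r]       # message: r -> r+d
        d //= 2
    return buf
```

This works for any N, not just powers of 2, but every rank is idle during roughly half of the rounds, which is presumably why a direct implementation can do better.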

I would like to find information about the point-to-point communication
pattern that is needed to implement an ALL-REDUCE operation efficiently
for N = 2, 3, 4, 5, ... processors, up to some small number
such as 32.  Is there an algorithm for generating the required
communication pattern?  For N a power of 2, a butterfly-like
pattern seems to be optimal, but what about the general case?
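For the power-of-2 case, the butterfly I have in mind is recursive doubling: in round k, rank r exchanges partial sums with rank r XOR 2^k, so everyone holds the full result after log2 N rounds. The sketch below simulates this in Python; the fold-in/fold-out treatment of the leftover ranks when N is not a power of 2 is only my guess at one plausible approach, not a scheme I know to be optimal:

```python
def allreduce_butterfly(values):
    """Simulate a recursive-doubling (butterfly) all-reduce.
    When p is not a power of two, the p - p2 'extra' ranks first
    fold their data into a partner, sit out the butterfly, and
    receive the final result afterwards (my assumed general-case
    handling)."""
    p = len(values)
    buf = list(values)
    p2 = 1                     # largest power of 2 <= p
    while p2 * 2 <= p:
        p2 *= 2
    extra = p - p2
    # Pre-step: rank p2+r sends its value to rank r.
    for r in range(extra):
        buf[r] += buf[p2 + r]
    # Butterfly among ranks 0..p2-1: log2(p2) exchange rounds.
    d = 1
    while d < p2:
        new = list(buf)
        for r in range(p2):
            partner = r ^ d    # partner differs in one bit
            new[r] = buf[r] + buf[partner]
        buf[:p2] = new[:p2]
        d *= 2
    # Post-step: each partner returns the result to its extra rank.
    for r in range(extra):
        buf[p2 + r] = buf[r]
    return buf
```

If this is right, the butterfly needs only log2 N rounds for N a power of 2 (versus roughly twice that for reduce-plus-broadcast), at the cost of two extra rounds and some idle ranks in the general case.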

(Please don't offer pointers to MPI libraries; I want to understand
and implement the ALL-REDUCE operation using other libraries and
simple point-to-point communication.)

With best regards,

Ole H. Nielsen
Department of Physics, Building 307
Technical University of Denmark, DK-2800 Lyngby, Denmark
E-mail: Ole.H.Nielsen@fysik.dtu.dk
WWW URL: http://www.fysik.dtu.dk/ohnielse.html
Telephone: (+45) 45 25 31 87
Telefax:   (+45) 45 93 23 99

