Newsgroups: comp.parallel.mpi
From: Robert van de Geijn <rvdg@cs.utexas.edu>
Subject: Re: MPI vs native-CRAY
Organization: University of Texas at Austin
Date: 28 Aug 1996 18:42:00 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <5023to$rbu@news.cs.utexas.edu>

shane@erc.msstate.edu (Shane Hebert) wrote:
>On 22 Aug 1996 12:47:34 GMT, an alien in the guise of
>I.J.Bush@dl.ac.uk (I.J. Bush) wrote:
>
>Don't forget about the Cray T3D device that is delivered with
>MPICH.
>
>MPICH:	106 Mbytes/s bandwidth, 39 microseconds latency
>
>
>>|Round figures ( MPI is from EPCC ):
>>|
>>|MPI      :  40 Mbytes/s bandwidth, 30 microseconds latency
>>|SHMEM_PUT: 120 Mbytes/s          ,  6 microseconds
>>|SHMEM_GET:  60 Mbytes/s          ,  6 microseconds
>>|
>>|( Note that this latency is what you see after mucking
>>|about ensuring cache etc. coherency, not just for the
>>|PUT/GET call )
>>|
>
>============================================================
>Shane Hebert
>shane@erc.msstate.edu
>Disclaimer:  Any opinions stated above are mine and not
>those of my employer.
>Disclaimer:  If I say something and you do it, it isn't
>my fault.

I always find it interesting that so much emphasis is put on the latency
of a message-passing system.  Yes, lower latency will improve communication,
but often it is equally or more important to concentrate on better
algorithms in the first place.  For example, from what I understand, the
shmem library achieves very respectable performance when broadcasting short
messages.  However, the algorithm it uses is optimized to generate the fewest
messages possible, which keeps the latency term small.  This is not
necessarily optimal for long messages, where bandwidth dominates, as is shown
very nicely by Ho and Johnsson's EDST algorithm for hypercubes.  While
some vendors have invested in high-performance communication algorithms
for their native and/or MPI implementations, Cray has yet to join those
ranks.
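
To make the short-versus-long message point concrete, here is a minimal
sketch in Python.  It is not the shmem, MPICH, or InterCom implementation,
and it is not the EDST algorithm.  It only compares two textbook cost
estimates under an assumed linear model in which sending n bytes costs
alpha + n*beta.  The function names are hypothetical, and alpha and beta
are simply the round MPI figures quoted above (30 microseconds latency,
40 Mbytes/s bandwidth).

```python
import math

# Assumed linear cost model: time to send n bytes = ALPHA + n * BETA.
ALPHA = 30e-6        # startup cost in seconds (quoted MPI latency)
BETA = 1.0 / 40e6    # seconds per byte (quoted MPI bandwidth, 40 Mbytes/s)


def tree_broadcast_time(p, n, alpha=ALPHA, beta=BETA):
    """Estimate for a binomial-tree broadcast of n bytes to p nodes.

    Every one of the ceil(log2 p) steps forwards the whole message,
    so the number of startups is small but the data is sent log2 p times.
    """
    steps = math.ceil(math.log2(p))
    return steps * (alpha + n * beta)


def scatter_allgather_broadcast_time(p, n, alpha=ALPHA, beta=BETA):
    """Estimate for a broadcast done as a scatter followed by an allgather.

    Assumes a tree scatter and a ring allgather.  Many more startups,
    but each node moves only about 2n bytes in total.
    The estimate is exact in this model only when p is a power of two.
    """
    steps = math.ceil(math.log2(p))
    fraction = (p - 1) / p
    scatter = steps * alpha + fraction * n * beta
    allgather = (p - 1) * alpha + fraction * n * beta
    return scatter + allgather


if __name__ == "__main__":
    p = 64
    for n in (8, 1024, 65536, 1_000_000):
        t_tree = tree_broadcast_time(p, n)
        t_sa = scatter_allgather_broadcast_time(p, n)
        print(f"n={n:>8} bytes  tree={t_tree:.6f}s  scatter/allgather={t_sa:.6f}s")
```

With these assumed numbers on 64 nodes, the tree estimate is lower for an
8-byte message and the scatter/allgather estimate is lower for a 1 Mbyte
message.  Lowering alpha alone does nothing to close the gap for long messages.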

We have written a number of papers on the implementation of various
collective communication algorithms on mesh architectures.  This has
yielded the InterCom library, which is now used underneath the native 
NX and MPI implementations on the Intel Paragon.

For more information, see

 0. Prasenjit Mitra, David Payne, Lance Shuler, Robert van de Geijn, and
    Jerrell Watts, "Fast Collective Communication Libraries, Please,"
    Proceedings of the Intel Supercomputing Users' Group Meeting, 1995.

 1. M. Barnett, D. Payne, R. van de Geijn, and J. Watts, "Broadcasting on
    Meshes with Wormhole Routing," Journal of Parallel and Distributed
    Computing, Vol. 35, No. 2, pp. 111-122, 1996.

 2. Jerrell Watts and Robert van de Geijn, "A Pipelined Broadcast for
    Multidimensional Meshes," Parallel Processing Letters, Vol. 5, No. 2,
    pp. 281-292, 1995.

 3. M. Barnett, R. Littlefield, D. Payne, and R. van de Geijn, "On the
    Efficiency of Global Combine Algorithms for 2-D Meshes With Wormhole
    Routing," Journal of Parallel and Distributed Computing, Vol. 24,
    pp. 191-201, 1995.

 4. Robert van de Geijn, "On Global Combine Operations," Journal of Parallel
    and Distributed Computing, Vol. 22, pp. 324-328, 1994.

PostScript versions of the technical reports associated with these papers
can be found through my web page
        http://www.cs.utexas.edu/users/rvdg
or the InterCom web page:
        http://www.cs.utexas.edu/users/rvdg/intercom

A large body of work by Howard Ho at IBM is also highly recommended
(perhaps he can post a URL for his work to comp.parallel.mpi).

Finally, we have a partial manuscript on the subject, "A Street Guide to
Collective Communication on Parallel Computers" by Payne, Shuler, van de
Geijn, and Watts, which we would be happy to share with people teaching
courses on the subject this fall (sign the guest book on the InterCom web
page, and we will be in touch).

Robert van de Geijn

P.S. We use MPI on all systems, since the ease of use and portability
far outweigh the benefits of any optimized library.

=========================================================================

Robert A. van de Geijn                  rvdg@cs.utexas.edu  
Associate Professor                     http://www.cs.utexas.edu/users/rvdg
Department of Computer Sciences         (Work)  (512) 471-9720
The University of Texas                 (Home)  (512) 251-8301 
Austin, TX 78712                        (FAX)   (512) 471-8885


