Newsgroups: comp.parallel
From: rvdg@cs.utexas.edu (Robert van de Geijn)
Subject: Collective Communication: iCC vs MPI vs BLACS vs NX
Organization: CS Dept, University of Texas at Austin
Date: Mon, 26 Jun 1995 14:17:07 GMT
Message-ID: <3smfh3$euq@daffy.cs.utexas.edu>


We would like to announce the following paper, which may be of
interest to the comp.parallel community.

Prasenjit Mitra, David Payne, Lance Shuler, Robert van de Geijn, and
Jerrell Watts, 
     
     "Fast Collective Communication Libraries, Please,"

to appear in the Proceedings of the Intel Supercomputing Users' Group
Meeting 1995.  Also: Department of Computer Sciences, The University of
Texas, TR-95-??, June 1995.

 Abstract 

It has been recognized that many parallel numerical algorithms can be
effectively implemented by formulating the required communication as
collective communications.  Nonetheless, the efficiency of such
communications has been suboptimal in many communication library
implementations.  In this paper, we give a brief overview of
techniques that can be used to implement a high performance collective
communication library, the iCC library, developed for the Intel family
of parallel supercomputers as part of the InterCom project at the
University of Texas at Austin.  We compare the performance achieved on
the Intel Paragon with that of three widely available libraries:
Intel's NX collective communication library, the MPICH Message Passing
Interface (MPI) implementation developed at Argonne and Mississippi
State University, and a Basic Linear Algebra Communication Subprograms
(BLACS) implementation developed at the University of Tennessee.

For further information, see 

    http://www.cs.utexas.edu/users/rvdg/icc_vs_other.html

Most notable in this paper is a comparison of iCC vs. MPI vs. BLACS
vs. NX.  The following table indicates how much room there is to
improve the reference implementations of MPI and BLACS in order to
attain high performance:


      Comparison of the various library implementations on
      a 16x32 mesh Paragon (time in seconds)


                   BROADCAST

      bytes     iCC     NX  BLACS   MPI     
        16   0.001  0.001  0.001  0.001  
      1024   0.001  0.001  0.001  0.002  
     65536   0.006  0.034  0.018  0.017  
   1048576   0.044  0.498  0.271  0.332  

                   SUM-TO-ALL

      bytes     iCC     NX  BLACS   MPI     
        16   0.001  0.001  0.002  0.003  
      1024   0.002  0.002  0.002  0.004  
     65536   0.014  0.297  0.057  0.097  
   1048576   0.135  4.655  0.797  1.593  
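
One rough way to read the table is to divide each library's time by
the iCC time for the same message size.  The sketch below does that
with the numbers copied from the table.  The names BROADCAST,
SUM_TO_ALL, and ratios_to_icc are made up for this illustration and
are not part of iCC or any other library.  Since the timings are
reported to only three decimal places, the ratios for the short
messages should not be taken seriously.

```python
# Timings in seconds, copied from the table above (16x32 mesh Paragon).
# Outer keys are message sizes in bytes; inner keys are library names.
# All names here are illustrative only, not part of any library.
BROADCAST = {
    16:      {"iCC": 0.001, "NX": 0.001, "BLACS": 0.001, "MPI": 0.001},
    1024:    {"iCC": 0.001, "NX": 0.001, "BLACS": 0.001, "MPI": 0.002},
    65536:   {"iCC": 0.006, "NX": 0.034, "BLACS": 0.018, "MPI": 0.017},
    1048576: {"iCC": 0.044, "NX": 0.498, "BLACS": 0.271, "MPI": 0.332},
}

SUM_TO_ALL = {
    16:      {"iCC": 0.001, "NX": 0.001, "BLACS": 0.002, "MPI": 0.003},
    1024:    {"iCC": 0.002, "NX": 0.002, "BLACS": 0.002, "MPI": 0.004},
    65536:   {"iCC": 0.014, "NX": 0.297, "BLACS": 0.057, "MPI": 0.097},
    1048576: {"iCC": 0.135, "NX": 4.655, "BLACS": 0.797, "MPI": 1.593},
}


def ratios_to_icc(row):
    """Return each library's time divided by the iCC time for one row."""
    base = row["iCC"]
    return {lib: t / base for lib, t in row.items() if lib != "iCC"}


if __name__ == "__main__":
    for name, table in (("BROADCAST", BROADCAST), ("SUM-TO-ALL", SUM_TO_ALL)):
        print(name)
        for nbytes in sorted(table):
            r = ratios_to_icc(table[nbytes])
            print("%8d  NX %5.1f  BLACS %5.1f  MPI %5.1f"
                  % (nbytes, r["NX"], r["BLACS"], r["MPI"]))
```

For the 1048576-byte rows this works out to roughly 11.3 (NX), 6.2
(BLACS), and 7.5 (MPI) times the iCC time for broadcast, and roughly
34.5, 5.9, and 11.8 for sum-to-all.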

  Robert A. van de Geijn                  rvdg@cs.utexas.edu  
  Associate Professor                     http://www.cs.utexas.edu/users/rvdg
  Department of Computer Sciences         (Work)  (512) 471-9720
  The University of Texas                 (Home)  (512) 251-8301 
  Austin, TX 78712                        (FAX)   (512) 471-8885 


