Newsgroups: comp.parallel.mpi
From: llewins@msmail4.hac.com (Lloyd J Lewins)
Subject: Re: MPI collective comm question
Organization: Hughes Aerospace Electronics Co.
Date: Tue, 09 May 1995 08:17:53 -0800
Message-ID: <llewins-0905950817530001@x-147-16-95-58.es.hac.com>

In article <3obhig$e6o@monalisa.usc.edu>, zxu@monalisa.usc.edu (Zhiwei Xu)
wrote:

> I would appreciate any answers or pointers to the following question:
> 
> What happens when a member process of a group forgets to participate in a
> collective communication?
> 
> Consider process 0 bcasting to all processes in MPI_COMM_WORLD. The user
> made a mistake and wrote the following code instead:
> 
> ...
> if (my_rank!=1) {
> ...
> MPI_Bcast( ... )
> ...
> }
> ...
> 
> The MPI spec seems to say that the above code is allowed, in that all 
> processes may continue, not bothered by process 1 not joining the bcast.
> The user does not know that process 1 is not receiving the bcasted value,
> and assumes the incorrect final result to be correct.

While collective operations are not guaranteed to synchronize, the
semantics of some operations clearly force a degree of synchronization.
For example, a leaf node in a broadcast must block until the root has
participated and provided the necessary data.

> Is this more dangerous than deadlocking the bcast operation, i.e., if
> any party does not join, the code deadlocks. This way, the user knows
> that something is wrong.

If the implied data synchronization were violated, I would agree with you,
for example if a leaf node continued with bogus data because the root
failed to make a broadcast call. However, this is not the case: in your
example, any process that returns from MPI_Bcast holds the root's data.
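
To make the distinction concrete, here is a toy model in plain C (no MPI
calls), offered as a sketch rather than a description of any real
implementation. It assumes a binomial-tree broadcast rooted at rank 0 and
assumes sends are buffered, so a sender never waits for its receiver. MPI
does not mandate either choice. The names parent_of and bcast_completes
are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: binomial tree rooted at rank 0. The parent of a non-root
 * rank is obtained by clearing its lowest set bit (3 -> 2, 6 -> 4). */
static int parent_of(int rank)
{
    return rank & (rank - 1);
}

/* Hypothetical model, not an MPI API.
 * calls[r] is true if rank r actually calls the broadcast.
 * Returns true if `rank` returns from the broadcast holding the root's
 * data. Returns false if `rank` never made the call, or if it made the
 * call but some ancestor up to the root did not, in which case it blocks
 * forever instead of continuing with bogus data.
 * ASSUMPTION: sends are buffered, so a missing child never blocks its
 * parent. */
bool bcast_completes(int rank, const bool calls[], int nprocs)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    if (!calls[rank])
        return false;              /* never entered the collective */

    while (rank != 0) {            /* walk up toward the root */
        rank = parent_of(rank);
        if (!calls[rank])
            return false;          /* data can never reach this subtree */
    }
    return true;
}
```

With 8 processes and only rank 1 skipping the call, this model reports
that every other rank completes, since rank 1 is a leaf of the assumed
tree. If rank 0 skips instead, no rank completes; the callers simply
block. Under a different tree shape or an unbuffered send the set of
blocked ranks would change, but in no case does a rank return with data
the root did not supply.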

Your example above is a simple programming error, and is thus likely to
have bad results. It doesn't seem reasonable to expect MPI to detect this
type of error; many similar errors that MPI could not detect could equally
cause your program to fail. MPI strives for high performance, and forcing
every collective operation to synchronize would hurt performance while
benefiting only erroneous programs.

--------------------------------------------------------------------------
Lloyd J Lewins                                  Mail Stop: RE/R1/B507
Hughes Aerospace and Electronics Co.            P.O. Box 92426
                                                Los Angeles, CA 90009-2426
Email: llewins@msmail4.hac.com                  USA
Tel: 1 (310) 334-1145
Any opinions are not necessarily mine, let alone my employer's!!

