Newsgroups: comp.parallel.mpi
From: rdaoud@magnus.acs.ohio-state.edu (Raja B Daoud)
Subject: Re: Aborted send question
Organization: Ohio Supercomputer Center
Date: 7 Jan 1995 04:01:07 GMT
Message-ID: <3el3m3$5m2@charm.magnus.acs.ohio-state.edu>

Joel Clark writes:
>	Should the partially completed receive finish with an error such
>	as "sender aborted"?
>	Should the partially completed receive be quietly cleaned up and 
>	reposted?

Call those (1) and (2).  Or:

3) the receive never finishes and the application hangs some time later
4) the message is fully transmitted to the receiver with the OS handling
   the buffering (hold the process and its memory until all pending sends
   are finished).  :-)

I think (4) is too much code to deal with the case of a program bug.
For the sender to exit before the Isend() finishes, one of these has to
happen:

- it calls Finalize(): that's a no-no (bug)
- it calls Abort(): which should try to kill all procs in the comm or
                    the world
- it gets a "bad" signal: SEGV/FPE/... the sender dies and the application
                          hangs sooner or later

(am I forgetting some case here?)

So any of 1, 2, or 3 is OK.  I prefer 1: I want to know about the error
pronto and not have to fish for why a message disappeared (2) or why the
receiver is hanging (3) even though I "sent" the message.  If the time
between the Isend() and the exit is non-deterministic, that makes for
a very nasty bug to track down.  Please give the descriptive error message.
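
To make the difference concrete, here is a rough sketch in plain C.  It
is a toy model, not code from any MPI implementation: every name in it
(policy, recv_req, on_sender_death, ...) is made up for illustration.
It only shows what the receiving side could observe under options 1-3
once the library notices the sender is gone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: a toy model, not an MPI library. */
enum policy { REPORT_ERROR = 1, SILENT_REPOST = 2, HANG = 3 };
enum recv_state { RECV_COMPLETE, RECV_ERROR, RECV_PENDING };

struct recv_req {
    size_t expected;  /* bytes the receive was posted for */
    size_t received;  /* bytes that arrived before the sender died */
};

/* What the application sees after the sender exits mid-message. */
enum recv_state on_sender_death(struct recv_req *r, enum policy p)
{
    assert(r != NULL && r->received <= r->expected);

    if (r->received == r->expected)
        return RECV_COMPLETE;   /* whole message arrived; nothing to decide */

    switch (p) {
    case REPORT_ERROR:          /* option 1: fail the receive loudly */
        return RECV_ERROR;
    case SILENT_REPOST:         /* option 2: drop partial data, repost */
        r->received = 0;
        return RECV_PENDING;
    case HANG:                  /* option 3: leave it waiting forever */
    default:
        return RECV_PENDING;
    }
}

/* Text the application could print for each outcome. */
const char *describe(enum recv_state s)
{
    switch (s) {
    case RECV_COMPLETE: return "receive complete";
    case RECV_ERROR:    return "sender aborted";
    default:            return "still waiting";
    }
}
```

In this model options 2 and 3 both leave the application looking at a
pending receive, which is exactly the "fishing" problem: from the
receiver's side it looks the same as a message that was never sent.
Only option 1 hands back something it can act on.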

--Raja

-=-
Raja Daoud				raja@osc.edu
Ohio Supercomputer Center		http://www.osc.edu/lam.html

