Newsgroups: comp.parallel.mpi
From: lederman@hume.super.org (Steve Huss-Lederman)
Subject: Re: Truncated message...
Organization: Supercomputing Research Center, Bowie MD, USA
Date: Thu, 17 Aug 1995 18:29:52 GMT
Message-ID: <LEDERMAN.95Aug17142952@hume.super.org>

My guess is that this message came from MPICH.  I too have had this
happen.  It happens when the amount of data sent is greater than the
size of the receive buffer posted.  It can happen for many reasons.
Common ones are that you receive an unexpected message (sent to the
wrong process), or you use wildcards and, as you run different size
problems, the order gets reversed, ...  When converting from NX it can
easily happen if you forget to scale your message size by the size of
the datatype.  This happens because NX takes a length in bytes but MPI
takes a count of elements.
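
To make the bytes-versus-elements point concrete, here is a small
sketch in plain C.  It does not call MPI or NX at all; the two helper
names (bytes_to_count, would_truncate) are made up for illustration
and are not part of either library.  It assumes the old NX call took
a length in bytes and the new MPI call takes an element count plus a
datatype such as MPI_DOUBLE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI or NX routine): convert an NX-style
 * length in bytes into an MPI-style element count.  Returns -1 if the
 * byte length is not a whole number of elements. */
long bytes_to_count(size_t nbytes, size_t elem_size)
{
    assert(elem_size > 0);              /* a zero-size element is a caller bug */
    if (nbytes % elem_size != 0)
        return -1;
    return (long)(nbytes / elem_size);
}

/* Hypothetical helper: returns 1 if a message of send_count elements
 * would not fit in a receive buffer posted for recv_count elements of
 * the same datatype, which is the condition that produces the
 * truncation error; returns 0 otherwise. */
int would_truncate(long send_count, long recv_count)
{
    return send_count > recv_count;
}
```

For a buffer of 100 doubles, the NX length was 100 * sizeof(double)
bytes, and bytes_to_count() turns that back into the count of 100 that
belongs in the MPI call.  If the sender passes the unscaled byte
length as the count while the receiver posts 100 elements,
would_truncate() reports the mismatch.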

This type of error message has been discussed in the MPI Forum.  The
problem is that the error occurs inside MPI during a non-blocking
operation.  What should the MPI implementation do?  The user is likely
to be off doing other things.  Should it delay the error until the
Wait or Test occurs?  This can fail because the user never has to do
these operations if they know the operation finishes for other
reasons.  Also, letting the program continue after such an error may
be dangerous.  It is unfortunate that it is very hard to know which
message is associated with the error message.  Suggestions to fix
this would be nice.

Steve

