Newsgroups: comp.parallel.mpi
From: "Marcus Marr" <marr@dcs.ed.ac.uk>
Subject: Re: Problem using LAM5.2
Organization: Computer Systems Group, The University of Edinburgh
Date: Mon, 4 Dec 1995 17:46:51 GMT
Message-ID: <MARR.95Dec4174651@jura.dcs.ed.ac.uk>


> Hi, Currently, i have a piece of MPI code and it runs
> successfully using MPICH.  However, with the same piece of
> code, it hangs half way through the execution when using
> LAM5.2.  May i know what could be the possible reasons?  I'm
> running the processes on a SP2.  Thanx in advance!

One possibility is that you are running out of buffer space.  Other
implementations of MPI will allow a communication to succeed without
using any buffer space as long as the matching receive has already
been posted, but this is not the case with LAM-MPI.

The most common cause of this sort of behaviour is one process
flooding another with messages (e.g. in a pipeline) in the hope that
each MPI_Send will block until either buffer space becomes available
*or* a matching MPI_Recv has been posted.  Doubling the available
buffer space will not normally help here, since the sender can simply
fill the larger buffer as well.  If this is the case, the simplest
solution is to use synchronous sends, i.e. MPI_Ssend(), which does
not complete until the matching receive has started.

Good luck,

Marcus

