Newsgroups: comp.parallel.pvm
From: chettri@ncgl.gsfc.nasa.gov (Dey Melvinned Me)
Subject: Re: PVM hangups
Keywords: hang
Organization: NASA Goddard Space Flight Center -- Greenbelt, Maryland USA
Date: 6 Jan 1995 02:47:52 GMT
Message-ID: <3eib0o$3at@post.gsfc.nasa.gov>

In article <3ehp09$mvb@airgun.wg.waii.com> denham@wg.waii.com (Scott Denham) writes:
>
>of messages, each with a unique message tag.  The slave then executes
>
>      DO I=1,N
>         CALL PVMFINITSEND(PvmDataInPlace)
>         CALL PVMFSEND(...MSGTAG(I)...)
>      ENDDO
>
>while the master simultaneously executes
>
>      DO I=1,N
>         CALL PVMFRECV(...MSGTAG(I)...)
>      ENDDO
>
>When all the data from one slave has been received, the master waits
>for a message from the next slave whose work is complete, which may
>take hours.
>
>     In some cases, when running on dedicated or lightly loaded nodes,
>the application completes successfully.  When running in a normal
>production job mix, however, the application "hangs" during execution
>of the loops shown above when data is being sent back from one of the
>slave tasks.  In a typical case, all the data from one slave task is
>sent and received successfully, then the hang occurs during the data
>transmission for the next slave task.  The slave appears hung in 
>send J (for example, J=65), while the master is hung in receive K
>where K.LT.J (for example, K=32).  There are no error indications
>from any of the PVM calls preceding the ones that hang.
>

I, too, have had similar problems, and I never quite managed to work
around them. Do try the latest version of PVM and see whether that helps.

Also, I hope you will post a summary to the net.

Sincerely,

Samir


