Newsgroups: comp.parallel.pvm
From: jacobs@netcom.com (Allan Jacobs)
Subject: Re: PVM hangups
Keywords: hang
Organization: Applied Parallel Research (Topanga)
Date: Fri, 6 Jan 1995 04:35:28 GMT
Message-ID: <jacobsD1yvF5.4t2@netcom.com>

chettri@ncgl.gsfc.nasa.gov (Dey Melvinned Me) writes:

>In article <3ehp09$mvb@airgun.wg.waii.com> denham@wg.waii.com (Scott Denham) writes:
>>
>>of messages, each with a unique message tag.  The slave then executes
>>
>>      DO I=1,N
>>         CALL PVMFINITSEND(PvmDataInPlace)
>>         CALL PVMFSEND(...MSGTAG(I)...)
>>      ENDDO
>>
>>while the master simultaneously executes
>>
>>      DO I=1,N
>>         CALL PVMFRECV(...MSGTAG(I)...)
>>      ENDDO
>>
>>When all the data from one slave has been received, the master waits
>>for a message from the next slave whose work is complete, which may
>>take hours.
>>
>>     In some cases, when running on dedicated or lightly loaded nodes,
>>the application completes successfully.  When running in a normal
>>production job mix, however, the application "hangs" during execution
>>of the loops shown above when data is being sent back from one of the
>>slave tasks.  In a typical case, all the data from one slave task is
>>sent and received successfully, then the hang occurs during the data
>>transmission for the next slave task.  The slave appears hung in
>>send J (for example, J=65), while the master is hung in receive K
>>where K.LT.J (for example, K=32).  There are no error indications
>>from any of the PVM calls preceding the ones that hang.
>>

>I too have had similar problems, which I couldn't quite get around.
>Do try to get the latest version of PVM and see if that helps.

>Also, I hope you will post a summary to the net.

>Sincerely,

>Samir

I've had hangs like that too.  Does your program call pvm_setopt
(pvmfsetopt from Fortran) anywhere?
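
In case it helps to check, here is a rough sketch of what such a call
looks like in the Fortran interface.  I am assuming the option of
interest is message routing (PVMROUTE); that is only a guess about
where to look, not a diagnosis.  The program name CHKOPT is made up.
The sketch asks PVM to refuse direct task-to-task links, so that
messages go through the pvmd daemons, and reports the previous setting.

```fortran
      PROGRAM CHKOPT
      INCLUDE 'fpvm3.h'
      INTEGER MYTID, OLDVAL, INFO
C     Enroll in PVM; a negative task id means enrollment failed.
      CALL PVMFMYTID(MYTID)
      IF (MYTID .LT. 0) STOP 'cannot enroll in PVM'
C     Assumption: routing is the option worth looking at.
C     PVMDONTROUTE refuses direct task-to-task links, so traffic
C     goes through the pvmd daemons.  OLDVAL returns the previous
C     setting; a negative value indicates an error.
      CALL PVMFSETOPT(PVMROUTE, PVMDONTROUTE, OLDVAL)
      IF (OLDVAL .LT. 0) THEN
         PRINT *, 'pvmfsetopt failed, code ', OLDVAL
      ELSE IF (OLDVAL .EQ. PVMROUTEDIRECT) THEN
         PRINT *, 'direct routing had been requested'
      ELSE
         PRINT *, 'previous routing setting was ', OLDVAL
      ENDIF
      CALL PVMFEXIT(INFO)
      END
```

If the master or the slaves already make a call like this with
PVMROUTEDIRECT, it would be worth saying so in a follow-up.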

Allan Jacobs
Applied Parallel Research

-- 
Allan Jacobs		jacobs@netcom.com	(310) 455-4111

