Newsgroups: comp.parallel.pvm
From: Chet Vora <chet@brc.uconn.edu>
Subject: Messages get lost
Organization: University of Connecticut
Date: Fri, 12 Apr 1996 20:09:59 -0400
Message-ID: <316EF0D7.41C67EA6@brc.uconn.edu>

Hi everyone,

I have an MD simulation application in which each slave task exchanges
information with its 8 neighbours (in 2 dimensions). I am using PVM 3.3.10
on a cluster of SPARC LX machines running Solaris 2.4. In the program, I
have a setup of this kind:
[CONFIG 1]:
--------------------------------------------------
for (i = 1; i <= no_neighbours; i++)
	{
	pvm_initsend(PvmDataDefault);
	pvm_pkint(&data, size, 1);	/* size is the same fixed count for every send */
	pvm_send(neighbour_taskid[i], MSG);
	}
for (i = 1; i <= no_neighbours; i++)
	{
	pvm_recv(neighbour_taskid[i], MSG);
	pvm_upkint(recvbuf, size, 1);
	print...
	}
---------------------------------------------------
...which works. But if I change this setup to

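[CONFIG 2]:
--------------------------------------------------
for (i = 1; i <= no_neighbours; i++)
	{
	pvm_initsend(PvmDataDefault);
	pvm_pkint(&data, size, 1);
	pvm_send(neighbour_taskid[i], MSG);
	pvm_recv(neighbour_taskid[i], MSG);	/* wait for neighbour i before sending to i+1 */
	pvm_upkint(recvbuf, size, 1);
	print...
	}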
---------------------------------------------------
...all of a sudden messages seem to get lost and all tasks hang
(it seems as if each one is waiting in a recv for a message that never
arrives).

Is this normal?
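
One way to test whether the ordering alone could produce such a hang is a
small model with no PVM calls in it. The sketch below is only that. It
assumes that a send is buffered and returns immediately, that a recv blocks
until a message from the named task has arrived, that the tasks sit on a
periodic ring (2 neighbours instead of 8, to keep it short), and that every
task lists its neighbours in the same relative order. The function name and
the ring layout are made up for illustration and are not taken from the
real program.

```c
#include <assert.h>
#include <string.h>

#define MAX_TASKS 16
#define NBRS 2   /* ring: east and west; the real code has 8 neighbours */

/* Model only, no PVM calls. Assumptions: a send is buffered and never
 * blocks; a receive blocks until a message from that exact source is
 * queued. Returns 1 if all tasks finish, 0 if they end up stuck. */
int ring_exchange_completes(int ntasks, int interleaved)
{
    int nbr[MAX_TASKS][NBRS];
    int queued[MAX_TASKS][MAX_TASKS]; /* queued[dst][src] = pending messages */
    int pc[MAX_TASKS];                /* index of each task's next operation */
    int t, progress;

    assert(ntasks >= 3 && ntasks <= MAX_TASKS);
    memset(queued, 0, sizeof queued);
    memset(pc, 0, sizeof pc);
    for (t = 0; t < ntasks; t++) {
        nbr[t][0] = (t + 1) % ntasks;           /* east first */
        nbr[t][1] = (t + ntasks - 1) % ntasks;  /* then west */
    }

    do {
        progress = 0;
        for (t = 0; t < ntasks; t++) {
            while (pc[t] < 2 * NBRS) {
                int p = pc[t], is_send, peer;
                if (interleaved) {        /* CONFIG 2: send i, then recv i */
                    is_send = (p % 2 == 0);
                    peer = nbr[t][p / 2];
                } else {                  /* CONFIG 1: all sends, then all recvs */
                    is_send = (p < NBRS);
                    peer = nbr[t][p % NBRS];
                }
                if (is_send) {
                    queued[peer][t]++;
                } else if (queued[t][peer] > 0) {
                    queued[t][peer]--;
                } else {
                    break;                /* blocked in receive */
                }
                pc[t]++;
                progress = 1;
            }
        }
    } while (progress);

    for (t = 0; t < ntasks; t++)
        if (pc[t] < 2 * NBRS)
            return 0;
    return 1;
}
```

Under those assumptions ring_exchange_completes(4, 1) models the CONFIG 2
ordering and ring_exchange_completes(4, 0) models CONFIG 1. In this model
the interleaved ordering never finishes, because every task blocks on its
first neighbour before it has sent anything to its other neighbours, while
the batched ordering completes. Whether the real neighbour lists have the
same property is something that would still need checking.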

Also, CONFIG 1 only seems to work when the amount of data is fixed and
the same for every send. For instance, if I malloc a differently sized
buffer for each task and then do the send/recv, the tasks hang again.
[CONFIG 3]:
--------------------------------------------------
for (i = 1; i <= no_neighbours; i++)
	{
	pvm_initsend(PvmDataDefault);
	data = ...malloc some arbitrary size and fill it..
	pvm_pkint(&data, size, 1);	/* size is different for each send */
	pvm_send(neighbour_taskid[i], MSG);
	}
for (i = 1; i <= no_neighbours; i++)
	{
	pvm_recv(neighbour_taskid[i], MSG);
	...malloc a recvbuf of appropriate size...
	pvm_upkint(recvbuf, size, 1);
	print...
	}
--------------------------------------------------

The symptoms for CONFIG 3 are that some tasks seem to receive the
correct data at first, but after a while they get junk data (or all 0s)
and hang.
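
A common way to handle sizes that differ per message is to pack the element
count ahead of the payload, so the receiver can unpack the count, allocate,
and only then unpack the data. In PVM terms I believe this would correspond
to pvm_pkint(&count, 1, 1) followed by pvm_pkint(data, count, 1) on the
sending side, and pvm_upkint(&count, 1, 1), a malloc, then
pvm_upkint(recvbuf, count, 1) on the receiving side. The sketch below
deliberately uses a plain in-memory buffer instead, so that it runs without
PVM. The msgbuf type and every function name in it are hypothetical
stand-ins, not PVM API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one message buffer; not a PVM type. */
typedef struct {
    int   *words;  /* packed ints */
    size_t len;    /* number of ints packed so far */
    size_t pos;    /* index of the next int to unpack */
} msgbuf;

/* Append n ints to the buffer (plays the role of a pack call). */
static void pack_ints(msgbuf *m, const int *src, size_t n)
{
    int *grown = realloc(m->words, (m->len + n + 1) * sizeof *grown);
    if (grown == NULL)
        abort();
    m->words = grown;
    if (n > 0)
        memcpy(m->words + m->len, src, n * sizeof *src);
    m->len += n;
}

/* Copy the next n ints out of the buffer (plays the role of an unpack call). */
static void unpack_ints(msgbuf *m, int *dst, size_t n)
{
    assert(m->pos + n <= m->len);  /* reading past the end would yield junk */
    if (n > 0)
        memcpy(dst, m->words + m->pos, n * sizeof *dst);
    m->pos += n;
}

/* Sender side: element count first, then the payload itself. */
void pack_sized(msgbuf *m, const int *data, int count)
{
    pack_ints(m, &count, 1);
    pack_ints(m, data, (size_t)count);
}

/* Receiver side: read the count, allocate, read the payload.
 * The caller frees the returned array. */
int *unpack_sized(msgbuf *m, int *count_out)
{
    int count, *out;
    unpack_ints(m, &count, 1);
    assert(count >= 0);
    out = malloc(((size_t)count + 1) * sizeof *out);
    if (out == NULL)
        abort();
    unpack_ints(m, out, (size_t)count);
    *count_out = count;
    return out;
}

/* Round trip of `count` ints; returns 1 if the output matches the input. */
int roundtrip_ok(int count)
{
    msgbuf m = { NULL, 0, 0 };
    int *data = malloc(((size_t)count + 1) * sizeof *data);
    int *got, got_count, i, ok;
    if (data == NULL)
        abort();
    for (i = 0; i < count; i++)
        data[i] = 3 * i + 1;
    pack_sized(&m, data, count);
    got = unpack_sized(&m, &got_count);
    ok = (got_count == count) && (m.pos == m.len);
    for (i = 0; ok && i < count; i++)
        ok = (got[i] == data[i]);
    free(got);
    free(data);
    free(m.words);
    return ok;
}
```

Note that pack_sized is handed the pointer to the first element. For a
malloc'd int pointer the equivalent PVM call would take data itself, not
&data, since the address of the pointer variable is not the address of the
allocated ints.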

Any thoughts or suggestions? I would greatly appreciate any ideas or
help.

Thanks in advance,
Chet
-- 
************************************************************************
Chet Vora		www: http://www.eng2.uconn.edu/~chet
The scientific theory I like the most is that Saturn's rings are
composed  entirely of lost airline baggage.  -Mark Russel
************************************************************************


