Newsgroups: comp.parallel.pvm
From: robert@zaphod.inet.dkfz-heidelberg.de (Robert Niebsch)
Subject: problems with PvmRouteDirect on SUNMP
Organization: Department MBI, DKFZ-Heidelberg
Date: 30 Aug 1995 15:57:04 GMT
Message-ID: <4221og$gep@sun0.urz.uni-heidelberg.de>

hi all,

we have the following problem:

We have several slave processes that communicate with each
other. We detected a strange deadlock when two slaves try to
send a message to each other via direct TCP (PvmRouteDirect)
at the same time: both slaves block in the pvm_send routine,
resulting in a deadlock.

The pvml.uid log file contains the following error messages:


slave1 - tid = t40394,
slave2 - tid = t40396.

*********************************************************
libpvm [t40394]: pvmmctl() connect: Connection refused
libpvm [t40396]: pvmmctl() connect: Connection refused
*********************************************************
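
In case it helps when reading the log: below is a minimal sketch that
splits a tid into its host and local parts. It assumes the usual
PVM 3 tid layout (12-bit host index in bits 18..29, 18-bit local part
in bits 0..17); the helper names tid_host, tid_local and tid_describe
are made up for this illustration and are not part of libpvm.

```c
#include <stdio.h>

/* Assumed PVM 3 tid layout (not taken from the log itself):
 * bits 18..29 = host index, bits 0..17 = local part. */
#define TID_LOCAL_BITS 18
#define TID_HOST_MASK  0xfffu
#define TID_LOCAL_MASK 0x3ffffu

/* Host index of a tid, i.e. which pvmd the task belongs to. */
static unsigned tid_host(unsigned tid)
{
    return (tid >> TID_LOCAL_BITS) & TID_HOST_MASK;
}

/* Local part of a tid, i.e. the task number within that host. */
static unsigned tid_local(unsigned tid)
{
    return tid & TID_LOCAL_MASK;
}

/* Print a tid in the "t<hex>" form used in the log, plus its fields. */
static void tid_describe(unsigned tid)
{
    printf("t%x: host %u, local 0x%x\n", tid, tid_host(tid), tid_local(tid));
}
```

Under that assumption, tid_describe(0x40394) and tid_describe(0x40396)
both report host 1, which fits with all slaves running on the same
machine.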

This problem doesn't occur when communicating via the pvmd
(option PvmDontRoute).

The number of slave processes is 32;
the number of file descriptors is 1024.

We work on a two-processor Sparc20 machine running
Solaris 2.4.

thanks for your help, 

robert



