Newsgroups: comp.parallel.pvm
From: papadopo@cs.utk.edu (Philip Papadopoulos)
Subject: Re: PVM install problem...
Organization: CS Department, University of Tennessee, Knoxville
Date: 24 Feb 1995 10:55:32 -0500
Message-ID: <3ikvhkINNst0@duncan.cs.utk.edu>

In article <3ik27p$j0q@serra.unipi.it> Marcello-Gianluca <meta@calpar.cnuce.cnr.it> writes:
>dendris@leon.cti.gr wrote:
>>
>> hi everyone,
>>         I just installed PVM 3.3.0 to a SPARCStation 5. I have the following
>> problem: My programs, as well as the distribution examples, do not work or
>> hang when I try to spawn more than 10 - 12 processes (i.e spmd.c when
>> compiled with NPROCS = 10, starts but hangs...)
1) Check /tmp/pvml.<uid> to see if there are any tell-tale messages that
   might point to the problem.
2) Check the tid list returned when instance 0 spawns the other processes
   to see whether the initial spawn itself is failing.
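
A minimal sketch of check 2, assuming the usual PVM 3 convention that
pvm_spawn() returns the number of tasks actually started and leaves
negative error codes in the tids slots it could not fill. The helper name
report_spawn_failures is made up for illustration; it is plain C, so it
can be tried without PVM installed.

```c
#include <assert.h>
#include <stdio.h>

/* Scan a tid array filled in by pvm_spawn() and report every slot that
   holds a negative value, i.e. a PVM error code instead of a task id.
   Returns the number of failed slots. (Hypothetical helper, not part of
   libpvm.) */
int report_spawn_failures(const int *tids, int ntask)
{
    int failed = 0;

    assert(tids != NULL && ntask >= 0);
    for (int i = 0; i < ntask; i++) {
        if (tids[i] < 0) {
            fprintf(stderr, "spawn slot %d failed, PVM error code %d\n",
                    i, tids[i]);
            failed++;
        }
    }
    return failed;
}

/* Intended use inside the spawning task (needs pvm3.h and libpvm3, so it
   is left as a comment here; n is the number of tasks requested):

       int numt = pvm_spawn("spmd", (char **)0, PvmTaskDefault, "", n, tids);
       if (numt < n)
           report_spawn_failures(tids, n);
*/
```

If every slot holds a valid tid and the program still hangs, the spawn
itself is probably fine and the problem lies later, in the communication.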
>> 
>>         Is there anything to do? I tried to increase the available file
>> descriptors to 256, (so that the pvmd will be able to create the necessary
>> sockets) but to no avail... Every idea would be more than welcome.
>> 
>Perhaps, the problem is due to the limit on the number of TCP 
>connections. By default, PVM tries to establish direct TCP connections
>between communicating processes. If PVM reaches the limit, the program
>hangs. We had a similar problem on an IBM SP1.
The daemon makes TCP (or unix domain, if available) socket connections
only to #local# processes.  The default routing in PVM is
process 1 -- pvmd 1 -- pvmd 2 -- process 2. Traffic between
pvmd 1 and pvmd 2 goes over UDP with message retry.
For speed you can turn on direct routing as an option, so that
messages from process 1 to process 2 bypass the intermediate daemons.
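
To put rough numbers on the two routing modes, here is a back-of-the-envelope
sketch in C. It rests on assumptions that are mine, not measurements: the
pvmd holds one socket per local task, and a task using direct routing holds
its pvmd link plus one TCP link per peer it has talked to. Descriptors the
daemon uses for its own purposes are ignored, and both function names are
made up for illustration.

```c
#include <assert.h>

/* Sockets the local pvmd holds toward its tasks under default routing:
   one per local task (assumption: daemon-internal descriptors ignored). */
int pvmd_task_sockets(int local_tasks)
{
    assert(local_tasks >= 0);
    return local_tasks;
}

/* Worst-case sockets held by one task once direct routing is on and it
   has exchanged messages with every other task: the link to its pvmd
   plus one TCP link per peer. */
int task_sockets_direct(int total_tasks)
{
    assert(total_tasks >= 1);
    return 1 + (total_tasks - 1);
}

/* Direct routing is requested from inside a task with
       pvm_setopt(PvmRoute, PvmRouteDirect);
   which needs pvm3.h and libpvm3, so it is left as a comment here. */
```

Under these assumptions, 12 tasks on one host cost the pvmd about 12
sockets, and a task about 12 with direct routing on, both far below a
256-descriptor limit.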

I believe that in IBM's version of PVM there is a single pvmd
for the whole SP1 (an SP1 is a collection of IBM workstations with
a high-performance switch for communications; each "node" on the SP1 runs a
complete version of AIX).  The public version will run over the switch
if you run TCP/IP over the switch and put a pvmd on every node.

-Phil Papadopoulos


