Newsgroups: comp.parallel.pvm
From: sam@kalessin.jpl.nasa.gov (Sam Sirlin)
Subject: Re: SUNMP bug in pvm3.3.8?
Organization: Jet Propulsion Laboratory, Pasadena, CA
Date: 16 Aug 1995 21:18:00 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-ID: <40tna8$eut@netline-fddi.jpl.nasa.gov>

In article <40nj5u$2f6k@unixfe.rl.ac.uk>, rff@inf.rl.ac.uk (Ronald Fowler) writes:
|> Having gotten 3.3.8 to try and fix problems with SUNMP hanging,
|> I now find the communication between SUNMP nodes works ok.
|> However, for sending messages to tasks on other hosts (SGI5 and SUN4)
|> some data is lost. It seems that for any message over about 1000
|> bytes, a few values at the end of the message are lost, and an
|> end of buffer warning is generated from pvm_unpack. This is for a
|> simple ping-pong benchmark in Fortran (pvmfsend/pvmfrecv).
|> All is ok if I compile as SUN4SOL2, though this kills the
|> SUNMP->SUNMP speed.
|> 
|> Is this a bug or have I forgotten to set up something on the SUNMP?

We seem to be seeing the same thing on a machine running Solaris 2.5
and pvm3.3.8, with a master/slaves Fortran application. In fact, even
running on the SUNMP machine alone doesn't seem to work. The log files
show messages like:

[t80040000] da_new() Warning: shared buffer full, using malloc

or

[t80040000] da_new() len = 4080: frag must fit in a page

or

Messages too long for shared buffer, deadlocked

Perhaps the shared memory needs to be reset? I've seen messages here
regarding Alphas. Is there something similar for SUNMP? I should add
that I don't have direct access to the SUNMP machine myself and have
been running pvm 3.3.7 on SunOS 4.1.3. Is there an FM to be RT'ed
regarding the MP port?

-- 
Sam Sirlin                Jet Propulsion Laboratory         
Email: sam@kalessin.jpl.nasa.gov
WWW: http://grover.jpl.nasa.gov/~sam/index.html

