Newsgroups: comp.parallel.pvm
From: edemaine@ug.cs.dal.ca (Erik Demaine)
Subject: Re: PVM out of memory
Organization: Math, Stats & CS, Dalhousie University, Halifax, NS, Canada
Date: Thu, 17 Aug 1995 12:31:32 GMT
Message-ID: <DDGG4L.IB0@cs.dal.ca>

Jonathan King (jking@cv.HP.COM) wrote:
: I'm currently debugging a large parallel application ported to PVM, and am
: running across 4 nodes.  The program is very long running (esp. on
: workstations), and eventually (could be a couple of days into a run) PVM
: dies with the error:
: libpvm [t4000d]: fr_new() can't get memory
: Segmentation Fault - core dumped
: Now, each node has AT LEAST 128 Meg of ram on board, + 1 gig of swap, and the
: program, in any one iteration should not be coming anywhere CLOSE to that
: number. ...

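Perhaps you are not receiving some of the messages that you send (e.g. they
are sent with tags that no receive ever asks for).  Messages that are never
received stay buffered on the receiving side, so over a run of several days
they could slowly eat up memory.  To check this, you might increment one
counter for every message sent and a different counter for every message
received.  Do a global sum of each counter across all tasks and compare the
two totals; if more were sent than received, some messages are being left
behind.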
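
Here is a rough sketch in C of the bookkeeping I have in mind.  It is not
tied to PVM itself: MsgCounters, count_send(), count_recv(), accumulate() and
report_unreceived() are names I made up for illustration, and MAX_TAGS assumes
your message tags are small non-negative integers.  Keeping the counts per tag
rather than as one pair of totals is my own addition; it should point at the
offending tag directly.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_TAGS 64  /* assumption: tags fall in 0..63; others share one slot */

/* Per-task tallies, indexed by message tag (hypothetical helper type). */
typedef struct {
    long sent[MAX_TAGS + 1];
    long received[MAX_TAGS + 1];
} MsgCounters;

/* Map a tag to a slot; out-of-range tags go to the overflow slot. */
static int tag_slot(int tag)
{
    return (tag >= 0 && tag < MAX_TAGS) ? tag : MAX_TAGS;
}

/* Call once per message sent, e.g. right after a successful pvm_send(). */
void count_send(MsgCounters *c, int tag)
{
    assert(c != NULL);
    c->sent[tag_slot(tag)]++;
}

/* Call once per message received, e.g. right after pvm_recv() returns. */
void count_recv(MsgCounters *c, int tag)
{
    assert(c != NULL);
    c->received[tag_slot(tag)]++;
}

/* The "global sum": add one task's tallies into the running total. */
void accumulate(MsgCounters *total, const MsgCounters *task)
{
    int i;
    assert(total != NULL && task != NULL);
    for (i = 0; i <= MAX_TAGS; i++) {
        total->sent[i] += task->sent[i];
        total->received[i] += task->received[i];
    }
}

/* Print every slot whose totals disagree; return how many messages
   were sent but never received, summed over all slots. */
long report_unreceived(const MsgCounters *total, FILE *out)
{
    long pending = 0;
    int i;
    assert(total != NULL && out != NULL);
    for (i = 0; i <= MAX_TAGS; i++) {
        long diff = total->sent[i] - total->received[i];
        if (diff != 0)
            fprintf(out, "tag slot %d: sent %ld, received %ld\n",
                    i, total->sent[i], total->received[i]);
        if (diff > 0)
            pending += diff;
    }
    return pending;
}
```

Each task would keep its own MsgCounters, and every so often (or at the end
of an iteration) ship it to one task that calls accumulate() on each copy and
then report_unreceived().  Messages still in flight at that moment will show
up as a small difference, so it is a count that keeps growing from one check
to the next that would be suspicious.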

Erik
--
Erik Demaine        || edemaine@ug.cs.dal.ca  || edemaine@fx2800.dal.ca
edemaine@cs.dal.ca  || 01ERIK@ac.dal.ca       || edemaine@is.dal.ca
URL: http://ug.cs.dal.ca:3400/~edemaine/edemaine.html
*** The letter "V" is just a "for every" symbol without the middle line.

