Newsgroups: comp.parallel.pvm
From: "Ralf - G. Au" <au_r@informatik.fh-hamburg.de>
Subject: Spawned processes (slaves) terminate for no apparent reason
Organization: Fachhochschule Hamburg, FB Informatik
Date: Wed, 04 Sep 1996 14:11:40 +0200
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <322D71FC.41C6@informatik.fh-hamburg.de>

Hello PVM programmers all over the world!

A fellow student and I are porting a raytracer to PVM, and we're
experiencing some trouble.

We are using PVM version 3.3.11 on a cluster of DECstation 5000/200s
(PMAX architecture according to pvmgetarch) running DEC Ultrix
V4.4 (Rev. 69).

We have implemented a farmer/worker model with the workers doing the
actual raytracing and the farmer coordinating them and doing all the
messy I/O stuff.

When we run the program (the workers' output is redirected via
pvm_catchout()), the workers exit after a while for no apparent
reason. They have already reached their main loop by then, and they
do not produce an error message. The farmer then receives and
displays a "[tid] EOF" from each worker, nothing more. The
/tmp/pvml.[uid] file does not contain any error messages either.

Our questions are, of course:
Has anyone experienced the same kind of problem, and have you found
a solution? If the problem originates from the DEC/PMAX architecture,
we could perhaps port the whole thing to DEC Alpha stations (but
there are currently only two of them) or to Sun SPARCstations.
Also, does a spawned process return any kind of exit code, and if so,
how can we access it?
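
To make clear what kind of information we are after, here is a
minimal sketch in plain POSIX C, outside of PVM. It is only an
illustration: run_child() and decode_status() are hypothetical helper
names, not part of our raytracer or of PVM, and it assumes fork() and
waitpid() are available. It shows how a parent can tell a normal exit
(with its exit code) from a death by signal, which is what we would
like to learn about our workers.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, or, if
 * `sig` is non-zero, terminates itself with that signal instead.
 * Returns the raw wait status, or -1 if fork/waitpid failed. */
int run_child(int code, int sig)
{
    pid_t pid;
    int status = 0;

    fflush(NULL);              /* avoid duplicated stdio buffers */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {            /* child */
        if (sig != 0)
            raise(sig);        /* e.g. SIGKILL: child dies here */
        _exit(code);           /* normal exit with the given code */
    }
    if (waitpid(pid, &status, 0) != pid)   /* parent waits for child */
        return -1;
    return status;
}

/* Hypothetical helper: turn a raw wait status into one number.
 * Normal exit  -> the exit code (0..255)
 * Killed       -> minus the signal number
 * Anything else -> -1000 */
int decode_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}
```

For example, decode_status(run_child(3, 0)) yields 3, while a child
killed by SIGKILL yields -SIGKILL. What we are looking for is a way
to get this kind of status for a task started with pvm_spawn().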


ANY kind of help would be greatly appreciated, as we are trying to
finish this program a.s.a.p., i.e. by the end of the month, if
possible.

Greetings from Germany

Ralf Au
-- 
Ralf Au <au_r@informatik.fh-hamburg.de>
Fachhochschule Hamburg, Germany
Fachbereich Informatik / Informatics Faculty

