Newsgroups: comp.parallel.pvm
From: norman@cs.toronto.edu (Norman Wilson)
Subject: Re: Problem with PVMFKILL - or how to use
Organization: University of Toronto
Date: 5 Aug 94 20:40:32 GMT
Message-ID: <1994Aug5.164032.3585@jarvis.cs.toronto.edu>

Jan Pederson describes a problem in which calling pvm_kill on a
FORTRAN task kills the pvmd on that system too.  He doesn't say
what kind of system it is, but I bet it's an SGI.

The SGI IRIX FORTRAN runtime library contains a charming feature:
when a FORTRAN program receives a kill (SIGTERM) signal, it sends
an interrupt (SIGINT) signal to its parent before exiting.  Allegedly
this has to do with tracking down and killing all the pieces of a
multi-threaded FORTRAN program (i.e. one running in parallel using
SGI parallel mechanisms).  The parent process of a task started with
pvm_spawn is pvmd, and pvmd neither catches nor ignores interrupt
signals, so the default action (termination) applies; hence pvm_kill
(which sends SIGTERM to the subject task) kills the daemon too.
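
The chain of events can be imitated with two small shell stand-ins.
This is only a sketch: TASK and DAEMON are hypothetical names, not
anything from IRIX or PVM, and the task's trap merely mimics what the
FORTRAN runtime is described as doing.

```shell
#!/bin/sh
# Hypothetical stand-in for the FORTRAN task: on SIGTERM it sends SIGINT
# to its parent and then exits (an imitation, not the real IRIX runtime).
TASK='trap "kill -INT $PPID; exit 0" TERM; while :; do sleep 1; done'
export TASK

# Hypothetical stand-in for pvmd: it does nothing about SIGINT.
DAEMON='
    sh -c "$TASK" &     # "spawn" the task
    sleep 1             # let it install its handler
    kill -TERM $!       # what pvm_kill does to the task
    wait                # SIGINT from the task arrives about here
    exit 0              # reached only if the daemon survived
'

# Run the stand-in daemon and record how it ended.
sh -c "$DAEMON" >/dev/null 2>&1 && status=0 || status=$?

if [ "$status" -eq 0 ]; then
    echo "stand-in daemon survived"
else
    echo "stand-in daemon was killed (exit status $status)"
fi
```

Unless SIGINT was already being ignored when the script started, the
stand-in daemon should be reported as killed, even though the only
signal sent deliberately was the SIGTERM aimed at the task.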

I believe SGI refuse to admit this is a bug.  They're wrong, and I
encourage SGI customers to file bug reports so stating.  Sending
signals to arbitrary processes, which may not be prepared for them,
is just silly.  SGI should either use another mechanism--what do they
do if the FORTRAN program gets SIGKILL, which it can't trap?--or at
least find some way for the parent to expressly request such signals,
and send them only if wanted.

Diatribes aside, it's easy to change PVM to work around the problem;
in $PVM_ROOT/lib/pvmd (which is a shell script), add the line
	trap '' 2
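just before the `exec $PVM_ROOT/lib/$PVM_ARCH/pvmd3' line (which is
nearly the last line in the file).  Signal 2 is SIGINT, and a signal
ignored by the shell stays ignored across the exec, so pvmd3 starts
life ignoring interrupts.  You need only do this on SGI systems; or
if you want, you can add fancier code like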
	case "$PVM_ARCH" in
	SGI|SGIMP)
		trap '' 2
		;;
	esac
so it is done only on SGI systems.  (I just let the trap happen on
all my systems; it's simpler, and it's harmless.)
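
The workaround depends on an ignored signal staying ignored in whatever
the shell execs.  Here is a minimal sketch of that behaviour, with a
second shell standing in for pvmd3 (an assumption for illustration; the
real daemon is not involved, and `result' is just a name chosen here):

```shell
#!/bin/sh
# Ignore SIGINT, then exec a fresh shell that sends SIGINT to itself.
# If the ignored disposition were lost across exec, the new shell would
# die before reaching the echo and result would be empty.
result=$(sh -c 'trap "" 2; exec sh -c "kill -INT \$\$; echo still-running"')

echo "after exec with SIGINT ignored: $result"
```

The exec'ed shell should shrug off the interrupt and print
still-running, which is the same protection the one-line trap gives
pvmd3.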

I've reported the problem and my simple fix to the PVM folks; I suspect
it will be in a future release.  (I hope I haven't just put words in
someone else's mouth.)

Norman Wilson
University of Toronto
norman@utirc.utoronto.ca

