Newsgroups: comp.parallel.mpi
From: "Michael D. Beynon" <beynon@cs.umd.edu>
Subject: LAM 6.0 lamboot problem
Organization: Computer Science Department, University of Maryland
Date: Fri, 28 Jun 1996 13:58:05 -0400
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <31D41D2D.2781E494@cs.umd.edu>

This is specific to LAM 6.0.

Since lamboot must be run on one of the nodes specified in the
bhost file, I need to rsh the command from a front end machine
to one of the nodes.  The problem is that rsh hangs after lamboot
finishes.

For example (the front end is named tsunami, and the nodes are alf01
to alf08):

	tsunami%  cat ~/bhost.def
	alf01
	alf02

	tsunami%  rsh alf01 lamboot -v ~/bhost.def

	LAM 6.0 - Ohio Supercomputer Center

	hboot n0 (alf01)...
	hboot n1 (alf02)...
	topology done      

At this point rsh hangs and never returns to the prompt.  To show that
lamboot itself has completed, I appended an echo to the remote command:

	tsunami%  rsh alf01 'lamboot -v ~/bhost.def ; echo "Hello"'

	LAM 6.0 - Ohio Supercomputer Center

	hboot n0 (alf01)...
	hboot n1 (alf02)...
	topology done      
	Hello

It still hangs here, after "Hello" has been printed.  To confirm that
the hang is caused by something lamboot does, I ran the same command on
a node that is not in the bhost file:

	tsunami%  rsh alf08 'lamboot -v ~/bhost.def ; echo "Hello"'

	LAM 6.0 - Ohio Supercomputer Center

	lamboot: local host not present
	Hello

	tsunami% 

This does not hang, so rsh fails to return only when lamboot runs
successfully.  I also tried the "-n" option to rsh (which redirects
stdin from /dev/null), even though I did not believe stdin was the
problem, and it did not help.
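
One possible explanation, offered as an assumption and not confirmed
for LAM 6.0: rsh returns only after every remote process that inherited
its stdout/stderr has closed them, so a daemon left running by a
successful lamboot could keep the session open.  The sketch below
reproduces that effect locally, without rsh or LAM.  Here `cat` stands
in for rsh, `sleep 3` stands in for a long-lived daemon, and `elapsed`
is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/sh
# Local stand-in for the rsh case: the reader (`cat`) keeps waiting until
# every process that holds the write end of the pipe has closed it.

# elapsed CMD: run CMD through a pipe, print how many seconds the reader waited.
elapsed() {
    start=$(date +%s)
    sh -c "$1" | cat >/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Background child inherits stdout, so the reader waits for it (about 3 s).
inherited=$(elapsed 'sleep 3 & echo "topology done"')

# Background child has its descriptors redirected, so the reader returns
# as soon as the foreground command finishes.
detached=$(elapsed 'sleep 3 </dev/null >/dev/null 2>&1 & echo "topology done"')

echo "inherited stdout: ${inherited}s, detached: ${detached}s"
```

If the first time is about 3 seconds and the second is about 0, an
untested variation on the real command would be to redirect lamboot's
descriptors on the remote side, for example
`rsh alf01 'lamboot -v ~/bhost.def < /dev/null >& /dev/null'` when the
remote login shell is csh or tcsh (`>/dev/null 2>&1` for sh), at the
cost of losing lamboot's output.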

Note that with the same setup, a wipe run through rsh *does* return ...

	tsunami%  rsh alf01 wipe -v ~/bhost.def
	tkill n0 (alf01)...
	tkill n1 (alf02)...

	tsunami% 

Anyone have a suggestion as to what is happening?

Thanks,
Mike.
-- 
===================================================================
Michael D. Beynon                    Department of Computer Science
A.V. Williams Bldg, University of Maryland, College Park, MD  20742
Email: beynon@cs.umd.edu        WWW: http://www.cs.umd.edu/~beynon/

