Newsgroups: comp.parallel.mpi
From: gdburns@osc.edu (Greg Burns)
Subject: Re: Problem running LAM 6.0 on RS6000
Organization: Ohio Supercomputer Center
Date: 20 Sep 1996 15:17:50 -0400
Message-ID: <51uqkv$b58@tbag.osc.edu>

In article <3242BAC2.41C6@informatik.uni-tuebingen.de> Juergen Wakunda <wakunda@informatik.uni-tuebingen.de> writes:
>
>"recon" produces this output:
>
>raibm04:[lam60] >recon -adv myconf
>recon: boot schema file: /home/wakunda/lam60/boot/myconf
>recon: found 2 host node(s)
>recon: origin node is 0
>recon: testing n0 (raibm04)
>recon: testing n1 (raibm02)
>Where are you?
>recon: "raibm02" cannot be booted.
>recon: Error 0
>raibm04:[lam60] >

The "Where are you?" is being printed by your shell on the remote
machine.  That shell must not print anything to stderr when invoked
non-interactively, or else recon(1) and lamboot(1) interpret the
output as a failure message of some kind.  We'd like to simply check
the exit status, but rsh is not that sophisticated.
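
To see whether a remote shell is quiet before running recon again, a
check along the following lines may help.  This is only a sketch: the
function name stderr_is_quiet is made up for illustration and is not
part of LAM, and it assumes a Bourne-compatible shell on the origin
node.  It runs a command, discards stdout, and succeeds only if
nothing at all was written to stderr.

```shell
# Hypothetical helper, not part of LAM: succeed (status 0) only if
# the given command writes nothing to stderr.
stderr_is_quiet() {
    # Inside the substitution, "2>&1" first points stderr at the
    # capture pipe, then ">/dev/null" discards stdout, so only
    # stderr text ends up in $err.
    err=$("$@" 2>&1 >/dev/null)
    [ -z "$err" ]
}

# Example use from the origin node (host name taken from the post):
#   stderr_is_quiet rsh raibm02 true || echo "remote shell is noisy"
```

If the check fails, look in the remote shell's startup files for
commands that assume a terminal and restrict them to interactive
sessions.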

>When i try to start hboot by hand, there is this error message:
>
>raibm02:[lamb60] >hboot -vc myconf -I "-n0 -o0 raibm04 1"
>hboot: cannot find executable raibm04: No such file or directory

The -c option to hboot takes a LAM process schema (the list of
programs that constitute LAM on a node), but you are passing the LAM
boot schema (the list of nodes that constitute the virtual machine).
Use the factory-supplied file conf.lam instead.

-=-
Greg Burns				gdburns@osc.edu
Ohio Supercomputer Center		http://www.osc.edu/lam.html

