Newsgroups: comp.parallel.mpi
From: Jaideep Ray <jaray@nubis.rutgers.edu>
Subject: Re: Help needed to run MPICH on a cluster of SGI workstations
Organization: Rutgers Univ.
Date: 17 Mar 1996 04:37:40 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <4ig4uk$sq3@dziuxsolim.rutgers.edu>

ZJ <zjw@cfdrc.com> wrote:
>Hi, there
>
>I have a cluster of SGI workstations and had no problems in configuring every  machine.
>I ran cpi successfully on each machine with several processes.  However, I could not run
>a program with different machines.  I successfully set up the .rhosts file so that I can
>use rsh to login in the workstations in the cluster.
>
>When I run tstmachines, I got
>
>
>Errors while trying to run ls /usr/people/p4815/mpich/bin/machines/foo
>Unexpected response from glasgow.cfdrc.com:
>--> UX:ls: ERROR: Cannot access /usr/people/p4815/mpich/bin/machines/foo: No such file
>or directory
>Unexpected response from glasgow.cfdrc.com:
>--> UX:ls: ERROR: Cannot access /usr/people/p4815/mpich/bin/machines/foo: No such file
>or directory
> 
>2 errors were encountered while testing the machines list for sgi
>Only these machines seem to be available
>    china.cfdrc.com
>    china.cfdrc.com
>    china.cfdrc.com
>

	Looks strange.

	1) On which workstation did the successful runs take place?
	   china.cfdrc.com?

	2) If so, did you also launch the multi-machine run from
	   china.cfdrc.com?

	3) Log into glasgow and check whether
	   /usr/people/p4815/mpich/bin/machines/ exists there.

	4) What does your mpich/util/machines/machines.sgi look like?
	   It should look like this:

		china.cfdrc.com
		glasgow.cfdrc.com
		<----other machines in the cluster, one per line --->
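
	Checks 3) and 4) can be scripted. What follows is only a rough
	sketch in Python, not part of MPICH: parse_machines and
	rsh_check_command are hypothetical helper names, and skipping
	lines that start with '#' is an assumption of the sketch. It
	reads a machines list with one host per line and builds (but
	does not run) an rsh command that lists the directory on each
	host.

```python
def parse_machines(text):
    """Return the host names found in machines-file text, one per line.

    Blank lines are ignored. Lines starting with '#' are also skipped;
    that comment rule is an assumption of this sketch.
    """
    hosts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        hosts.append(line)
    return hosts


def rsh_check_command(host, path):
    """Build (without executing) an rsh command that lists `path` on `host`."""
    return ["rsh", host, "ls", "-d", path]


# Sample contents, taken from the two hosts named in this thread.
SAMPLE = """\
china.cfdrc.com
glasgow.cfdrc.com
"""

if __name__ == "__main__":
    directory = "/usr/people/p4815/mpich/bin/machines/"
    for host in parse_machines(SAMPLE):
        # Print the command so it can be inspected or pasted into a shell.
        print(" ".join(rsh_check_command(host, directory)))
```

	Running each printed command by hand from china shows right
	away whether a given host can see the directory.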

	Keep me posted.

	Ray


