Newsgroups: comp.parallel.mpi
From: dar@greatwall.cs.wayne.edu (David A. Reimann)
Subject: Re: SGI MIPCH Question
Organization: Wayne State University
Date: 12 Sep 1995 02:29:14 GMT
Message-ID: <432r9q$s3i@wsu-cs.cs.wayne.edu>

In article <4321ov$d61@epx.cis.umn.edu> Steven VanderWiel <svw> writes:
>Hi,
>
>We are running version 1.0.10 of mpich on a cluster of 4 SGI Challenges
>each of which has 4 processors.  When we compile with DEVICE, COMM = ch_shmem
>everything works fine on a single machine.  When we compile with
>DEVICE, COMM = ch_p4 everything works fine if we only use one processor on 
>each machine.  However, when we set DEVICE, COMM = ch_p4 and try to run on 
>all 16 processors of the cluster we have problems.  Specifically, the problem
>manifests itself as follows:
>
>     > mpirun -p4pg pgfile gauss_mpi
>     p0_9146:  p4_error: more slaves than msg queues
>     : 3
>     P4 procgroup file is pgfile.
>     >
>
>where the contents of pgfile is:
>
>    polar   3   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
>    grizzly 4   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
>    panda   4   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
>    kodiak  4   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
>
>Any ideas/advice?  Thanks.

Try listing only 1 process per line, repeating the line 3 or 4 times for
each machine:

    polar   1   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
    polar   1   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
    polar   1   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
    grizzly 1   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
    grizzly 1   /export/home/kittpeak/lilja/svw/bin/gauss_mpi
    etc.

I had a similar problem running on a DEC Alpha with 2 CPUs and the
ch_p4 device.  I thought you should be able to specify the processes
with the nice compact notation, but perhaps the implementation requires
inefficient startup for inefficient communication :-).
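
If you don't want to retype the file by hand, something like the
following could do the expansion.  This is only a rough sketch: it
assumes every line has the "host count program" form shown above, and
the function name expand_procgroup is made up for illustration.  Lines
with a count of 0 or 1, and anything it does not recognize, are passed
through unchanged.  Whether the expanded file actually avoids the
p4_error on your setup is something you would have to try.

```python
def expand_procgroup(text):
    """Rewrite 'host N program' lines as N lines of 'host 1 program'.

    Hypothetical helper, not part of mpich or p4.
    """
    out = []
    for line in text.splitlines():
        fields = line.split()
        # Leave blank lines, comments, and unrecognized lines alone.
        if (len(fields) < 3 or fields[0].startswith("#")
                or not fields[1].isdigit()):
            out.append(line)
            continue
        count = int(fields[1])
        # Nothing to expand when the line already names 0 or 1 process.
        if count <= 1:
            out.append(line)
            continue
        host = fields[0]
        rest = " ".join(fields[2:])  # program path plus any trailing fields
        for _ in range(count):
            out.append(f"{host} 1 {rest}")
    return "\n".join(out)
```

To use it, read the old file and write the result to a new one, e.g.
print(expand_procgroup(open("pgfile").read())) redirected to a new
file, then point mpirun -p4pg at that file.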

Dave.
----------------------------------------
David A. Reimann
Doctoral Student
Department of Computer Science
Wayne State University
Detroit, MI  48202
EMAIL: reimann@cs.wayne.edu
WWW URL: http://www.cs.wayne.edu/~dar/index.html

