Newsgroups: comp.parallel.mpi
From: hioki@sci.hiroshima-u.ac.jp (Shinji Hioki)
Subject: more than 12 processes on Cray J90
Organization: Faculty of Science, Hiroshima University, Japan
Date: 23 Apr 1996 09:31:04 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Message-ID: <4li80o$91p@aso.sci.hiroshima-u.ac.jp>

Hi, this is Shinji Hioki.

I am using mpich on a Cray J90 with 24 processors.
When I run the sample program mpich/examples/basic/cpi,
it runs fine with up to 12 processes, for example:
--------------
J90% mpirun -np 12 cpi
pi is approximately 3.1416009869230600, Error is 0.0000083333332697
wall clock time = 0.111000
--------------
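
For reference, cpi estimates pi by applying the midpoint rule to the
integral of 4/(1+x^2) over [0,1], with the intervals shared among the
MPI processes. The serial Python sketch below is not the mpich source:
the function name midpoint_pi is hypothetical, and n = 100 is an
assumption, inferred because the midpoint rule's leading error term
1/(12 n^2) is about 8.33e-6 for n = 100, which matches the error
reported above.

```python
import math


def midpoint_pi(n):
    """Approximate pi as the midpoint-rule integral of 4/(1+x^2) on [0, 1]."""
    h = 1.0 / n                      # width of each interval
    total = 0.0
    for i in range(1, n + 1):
        x = h * (i - 0.5)            # midpoint of the i-th interval
        total += 4.0 / (1.0 + x * x)
    return h * total


if __name__ == "__main__":
    n = 100  # assumed interval count, not taken from the post
    approx = midpoint_pi(n)
    print("pi is approximately %.16f, Error is %.16f"
          % (approx, abs(approx - math.pi)))
```

Since the result depends only on n and not on the number of processes,
a correct run with -np 16 should report essentially the same value as
the -np 12 run.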
But if I specify more than 12 processes, the program fails, for example:
--------------
J90% mpirun -np 16 cpi
p0_26027:  p4_error: Timeout in making connection to remote process on J90: 0
bm_list_26028:  p4_error: interrupt SIGINT: 2
rm_l_0_26139:  p4_error: interrupt SIGINT: 2
rm_l_0_26150:  p4_error: interrupt SIGINT: 2
.........
rm_26038:  p4_error: net_recv recv:  EOF on socket: 1457809
--------------

Does anyone know how to run more than 12 processes on the Cray J90?
I have checked that I can start more than 20 rsh and/or rlogin sessions
at a time.
Thanks in advance.
Shinji Hioki, Hiroshima Univ. JAPAN

