Newsgroups: comp.parallel.mpi
From: w.purvis@dl.ac.uk (Bill Purvis, ext 3357)
Subject: Problem with model implementation for Intel NX/2
Organization: Daresbury Laboratory, UK
Date: 22 Jul 1994 13:16:08 GMT
Message-ID: <30ogqo$ro0@mserv1.dl.ac.uk>

I have just hit a problem using MPI_Barrier in a trivial program.
My code is:

#include <mpi.h>
 
main(argc, argv)
int argc;
char **argv;
{
  MPI_Init(&argc, &argv);
  MPI_Barrier(MPI_COMM_WORLD);
  printf("node %d: barrier complete\n", mynode());
}

I am running this with the model implementation from ANL, dated April 19th,
1994, which I have built for the ch_nx device.
When I run the compiled program on 4 nodes of our iPSC/860, I get the
following output:

node 0: barrier complete
node 1: barrier complete
(node 2, pid 0): Segmentation violation, data address FFFFFFF0 at 00015FC0
(node 3, pid 0): Segmentation violation, data address FFFFFFF0 at 00015FC0

I have tried debugging this by inserting debug prints in the MPI_Barrier
routine, but when I do, the program runs OK. It seems to me there are
two possible explanations: either the compiler has miscompiled MPI_Barrier
and the print statements are enough to change the generated code, or there
is a timing problem and the printfs slow things down enough to avoid it.
Has anyone else seen anything like this, and can anyone suggest a fix or
work-around?
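
For comparison, here is a sketch of the same test written more strictly to
the MPI calling conventions. It includes <stdio.h> for printf, takes the
rank from MPI_Comm_rank rather than the NX call mynode(), and calls
MPI_Finalize before exiting, so that no node leaves while the others may
still be inside the barrier. Whether any of these differences has a bearing
on the segmentation violation is an assumption only; this variant has not
been run on the iPSC/860.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int rank;

  MPI_Init(&argc, &argv);

  /* Ask MPI for this process's rank instead of using the NX mynode() call. */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  MPI_Barrier(MPI_COMM_WORLD);
  printf("node %d: barrier complete\n", rank);
  fflush(stdout);

  /* Shut MPI down cleanly before the process exits. */
  MPI_Finalize();
  return 0;
}
```

If this version completes on all four nodes, that would point towards
start-up/shut-down ordering rather than a miscompiled MPI_Barrier, but it
would not prove it.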

Bill Purvis




