Newsgroups: comp.parallel.mpi
From: recdmj@arhsgi1 (Dan M. Jensen)
Subject: Re: Performance problems using mpi on the SGI Power Challenge
Organization: News Server at UNI-C, Danish Computing Centre for Research and Education.
Date: 26 Sep 1996 11:49:21 GMT
Message-ID: <52dqk1$b9u@news.uni-c.dk>


This is a follow up to Thomas Fiig's original post concerning the degradation
of MPI on a heavily loaded SGI Power Challenge, using SGI's own MPI
implementation.

The trouble was identified as being due to a combination of scheduling and
synchronisation (processes spinning while waiting for MPI communications
to finish).

The responses mostly recommended keeping #processes < #processors, i.e., that
the system should not be under a high load. This requires strict queueing
policies and is not always desirable.

However, the problem can be made to disappear _completely_ by gang scheduling
the MPI program. This is a very simple procedure on the SGI Power Challenge -
simply insert the code below after calling MPI_Init and MPI_Comm_rank (to get
the rank variable).

> #include <stdio.h>      /* for fprintf and perror */
> #include <limits.h>
> #include <sys/types.h>
> #include <sys/prctl.h>
> #include <sys/schedctl.h>
> 
>     /* Impose gang scheduling: the processes in the share group are
>        then scheduled onto processors simultaneously. */
>     if (rank == 0) {
>         if (schedctl(SCHEDMODE, SGS_GANG) < 0) {
>             fprintf(stderr, "%s: Error in schedctl call\n", argv[0]);
>             perror("schedctl");
>         }
>     }
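For anyone unsure where the snippet belongs, here is a minimal sketch of a
complete program with the call in place. It is only an illustration of the
placement (after MPI_Init and MPI_Comm_rank, before any real communication);
it assumes IRIX with SGI's MPI, so it will not build elsewhere.

```c
/* Sketch: gang-scheduled MPI program on the SGI Power Challenge.
   Assumes IRIX headers and SGI's MPI implementation. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/prctl.h>
#include <sys/schedctl.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Impose gang scheduling before the program starts communicating;
       one call from rank 0 covers the whole share group. */
    if (rank == 0) {
        if (schedctl(SCHEDMODE, SGS_GANG) < 0) {
            fprintf(stderr, "%s: Error in schedctl call\n", argv[0]);
            perror("schedctl");
        }
    }

    /* ... the actual MPI computation and communication ... */

    MPI_Finalize();
    return 0;
}
```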

Thomas and I redid his timings with this piece of code inserted and the
performance degradation disappeared altogether.

I'm not sure how well this will work on a code that is not as well load
balanced as Thomas'.

I hope others may benefit from this as well.

	-Dan

______________________________________________________________________________
Dan Moenster Jensen                           E-mail: Dan.M.Jensen@uni-c.dk
UNI-C, Research Division                       Phone: (+45) 8937 6621
Olof Palmes Alle 38                              Fax: (+45) 8937 6677
DK-8000 Aarhus C, Denmark
______________________________________________________________________________


