Newsgroups: comp.parallel.mpi
From: doss@ERC.MsState.Edu (Nathan E. Doss)
Subject: Proposal for MPI Spawn
Summary: Proposal for dynamic process creation/destruction in MPI
Organization: Mississippi State University
Date: 26 Feb 1995 23:23:30 GMT
Message-ID: <3ir2hi$kmt@NNTP.MsState.Edu>

Included below is the plain-text version of a proposal
for dynamic process creation and destruction in MPI. A
PostScript version of this proposal may be retrieved from
<URL:ftp://ftp.erc.msstate.edu/pub/mpi/docs/spawn_proposal.ps.Z>
or <URL:ftp://ftp.erc.msstate.edu/pub/mpi/docs/spawn_proposal.ps>.

--
Nathan Doss          doss@ERC.MsState.Edu
Anthony Skjellum     tony@CS.MsState.Edu



========-------=========-------=========-------=========-------========
A Proposal for Dynamic Process Creation and Destruction in MPI
Spawn Proposal #1
(MPI Forum 1.5 Proposal)
Anthony Skjellum
Nathan Doss
Mississippi State University
February 26, 1995


1. Process creation

Traditionally, in systems such as PVM that have supported
a dynamic process model, one process spawns one or more
additional processes as needed. In Using MPI [1], the authors
present a model whereby a group of processes collectively
spawns processes. Their approach encompasses the traditional
approach (the spawning group can be of size one) and
broadens it (the spawning group can contain multiple
processes). We propose a similar function:

MPI_COMM_SPAWN(string, root, comm, intercomm)

   IN          string     

               Process specification string that describes "what" is to be
               created and "where" it is to be created. The exact format of
               this argument is not specified by the standard, but is
               implementation dependent.  For example, it might be a
               filename which contains a list of processes to create or
               simply a number which indicates how many processes to
               create.

   IN          root 

               Process in comm that contains a valid string.  The
               string argument in all other processes is ignored.

   IN          comm 

               Original communicator whose group synchronizes on the
               spawning of the children processes.

   OUT         intercomm

               Resulting intercommunicator. The local group contains the
               processes found in comm; the remote group contains the group
               of spawned processes.


int  MPI_Comm_spawn(char  *string,  int  root,
                    MPI_Comm  comm,  MPI_Comm  *intercomm)

MPI_COMM_SPAWN(STRING,  ROOT,  COMM,  INTERCOMM,
                    IERROR)
      CHARACTER*(*)  STRING
      INTEGER  ROOT,  COMM,  INTERCOMM,  IERROR

The return value of this call is MPI_SUCCESS if the processes
were spawned.  If the processes could not be spawned or if
spawning is not available in an implementation, MPI_ERR_SPAWN
is returned.
      The following code fragment illustrates how this call
might be used. The spawn call is collective over MPI_COMM_WORLD,
with all processes in MPI_COMM_WORLD receiving an intercommunicator
as the result of the spawn call. MPI_Intercomm_merge
is then used to merge the two separate worlds (represented
by the two sides of the intercommunicator) into one
intracommunicator (galaxy).


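      #include <mpi.h>

      int main (int argc, char **argv)
      {
          int       rank, root = 0;
          char     *string = (char *)0;
          MPI_Comm  intercomm, galaxy;

          MPI_Init (&argc, &argv);
          MPI_Comm_rank (MPI_COMM_WORLD, &rank);
          if (rank == root)
             string = "process specification";
          /* collective over MPI_COMM_WORLD; only root's string is used */
          MPI_Comm_spawn(string, root, MPI_COMM_WORLD, &intercomm);
          /* parents pass high = 1 here; the children pass 0 */
          MPI_Intercomm_merge(intercomm, 1, &galaxy);
          /*  ...  */
          MPI_Finalize ();
          return 0;
      }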
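
The string passed by the root in the fragment above is left to the
implementation. As a rough sketch of the two forms mentioned earlier
(a number of processes, or a filename), an implementation might
classify the string as follows. The names spec_kind and classify_spec
and the rules themselves are our assumptions for illustration only;
they are not part of the proposal or of any MPI implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical classification of a process specification string.
   Neither the names nor the rules come from the proposal. */
typedef enum { SPEC_INVALID, SPEC_COUNT, SPEC_FILENAME } spec_kind;

/* Returns the kind of specification.  For SPEC_COUNT, *count receives
   the number of processes requested; otherwise *count is set to 0. */
spec_kind classify_spec(const char *spec, long *count)
{
    const char *p;

    assert(count != NULL);
    *count = 0;
    if (spec == NULL || *spec == '\0')
        return SPEC_INVALID;      /* a spawn call could report MPI_ERR_SPAWN */

    for (p = spec; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return SPEC_FILENAME; /* anything non-numeric is taken as a filename */

    *count = strtol(spec, NULL, 10);
    return (*count > 0) ? SPEC_COUNT : SPEC_INVALID;
}
```

With these rules, "4" would be read as a request for four processes,
while any string containing a non-digit would be treated as the name of
a file listing the processes to create.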

The spawned processes come into existence with an MPI_COMM_WORLD
that contains only the newly spawned processes. The parents
and children have different MPI_COMM_WORLD's. The following
MPI_Comm_parent function allows these newly spawned
processes to access the intercommunicator that contains the
"parent" group of processes:


MPI_COMM_PARENT(intercomm)

   OUT         intercomm

               Intercommunicator which spans both the spawned processes and 
               the processes that spawned them.


int  MPI_Comm_parent(MPI_Comm  *intercomm)

MPI_COMM_PARENT(INTERCOMM,  IERROR)
      INTEGER  INTERCOMM,  IERROR


If intercomm is not a valid communicator (MPI_COMM_NULL in the
fragment below), then the processes were not spawned with
MPI_Comm_spawn. The following illustrates code that might be
used by the spawned processes.

      #include <mpi.h>

      int main (int argc, char **argv)
      {
         MPI_Comm  intercomm, galaxy;

         MPI_Init (&argc, &argv);
         MPI_Comm_parent (&intercomm);
         /* MPI_COMM_NULL means this process was not spawned */
         if (intercomm != MPI_COMM_NULL)
            MPI_Intercomm_merge(intercomm, 0, &galaxy);
         /*  ...  */
         MPI_Finalize ();
         return 0;
      }

The host/node process model can be considered as a special
case where the spawning communicator consists of only one
process (e.g., MPI_COMM_SELF). We note that the call to
MPI_Intercomm_merge in the two code fragments is not necessary,
since the intercommunicator returned by the spawn can itself be
used for communication. It is shown to demonstrate how to create
a new intracommunicator that contains the complete set of existing
processes.
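
In the two fragments the parents pass 1 and the children pass 0 as the
second argument of MPI_Intercomm_merge, so the children's group is
ordered first in galaxy. The sketch below computes the rank a process
would then receive. It assumes that the two groups pass different
values and that each group keeps its internal rank order, which this
proposal does not state; merged_rank is an illustrative helper, not an
MPI function.

```c
#include <assert.h>

/* Illustrative helper, not part of MPI: rank of a process in the merged
   intracommunicator, assuming the "low" group is numbered first and
   each group keeps its internal order. */
int merged_rank(int local_rank, int high, int remote_size)
{
    assert(local_rank >= 0 && remote_size >= 0);
    /* high != 0: this group is placed after the remote group */
    return high ? remote_size + local_rank : local_rank;
}
```

For example, with four parents and three children, the children would
occupy ranks 0 to 2 of galaxy and the parents ranks 3 to 6.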


2. Process destruction

Processes exit collectively just as they are created collectively.
The call to MPI_Finalize is tantamount to MPI_Comm_free(MPI_COMM_WORLD),
at which point the processes depart from the MPI universe of known
processes. This does, however, leave the problem of what to do with
"dangling" communicators. We see two alternatives:

   1.  All communicators to which a process belongs should
       be freed before MPI_Finalize is called.

   2.  Communicators to which a process belongs do not have
       to be freed before MPI_Finalize is called.

Both choices imply that all processes in a relationship with
the exiting group know that the group is indeed exiting. The
first choice implies this since MPI_Comm_free must be called
by all processes participating in a communicator. The second
choice implies this since remaining processes must know to
avoid using "invalid" communicators.
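
The bookkeeping implied by the two alternatives can be sketched as
follows, supposing each process counts the communicators it still
holds. Under the first alternative, finalization is refused while that
count is nonzero; under the second it always proceeds, and whatever
remains becomes "dangling". The type comm_tracker and the tracker_*
functions are hypothetical names of ours; nothing here is an MPI call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process bookkeeping; none of this is an MPI call. */
typedef struct {
    int live_comms;     /* communicators created and not yet freed */
    int dangling_comms; /* communicators left behind at finalize */
} comm_tracker;

void tracker_add(comm_tracker *t)
{
    assert(t != NULL);
    t->live_comms++;
}

void tracker_free(comm_tracker *t)
{
    assert(t != NULL && t->live_comms > 0);
    t->live_comms--;
}

/* Returns 1 if finalize may proceed, 0 if it must be refused.
   require_free != 0 models alternative 1; require_free == 0 models
   alternative 2. */
int tracker_finalize(comm_tracker *t, int require_free)
{
    assert(t != NULL);
    if (require_free && t->live_comms > 0)
        return 0;                      /* alternative 1: free everything first */
    t->dangling_comms = t->live_comms; /* alternative 2: peers must avoid these */
    t->live_comms = 0;
    return 1;
}
```

Either way, the processes that remain need to learn that the group has
exited: in the first case through the collective MPI_Comm_free calls,
in the second by some means of knowing which communicators are no
longer usable.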


3. Alternatives

One possible alternative is for the spawned group of processes
to be merged into the original MPI_COMM_WORLD. This implies
that all processes in the original MPI_COMM_WORLD must
participate in any spawn call. This would also cause further
problems with process destruction.
      Another alternative is for an intracommunicator instead
of an intercommunicator to be returned by MPI_Comm_spawn.
Our proposal allows the creation of an intracommunicator (as
shown in the example code) if needed, but does not require it.
It is relatively simple (a single call to MPI_Intercomm_merge)
to create the spanning intracommunicator, but quite a bit
more complicated to create an intercommunicator from an
intracommunicator.


4. Summary

We make the following observations:
    o  the parent spawners get an intercommunicator back,
    o  spawn is a collective operation for spawning processes,
    o  implementations are not required to implement the spawn
       function,
    o  the children's world is the remote group of the parent's
       intercommunicator,
    o  parents and children have a collective relationship
       immediately,
    o  this bootstraps to bigger "comm_worlds" with well-characterized
       semantics, and no race conditions,
    o  processes are destroyed collectively.



Bibliography
[1] William Gropp, Ewing Lusk, and Anthony Skjellum. Using
    MPI: Portable Parallel Programming with the Message
    Passing Interface. MIT Press, 1994.

