Newsgroups: comp.parallel.mpi
From: Jean Marc Adamo <adamo>
Subject: Re: MPI + threads ?
Organization: IPL
Date: 7 Feb 1996 19:22:07 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <4fau4v$8u8@tempo.univ-lyon1.fr>

I have written a C++/MPI-based library, ARCH, that provides threads and 
many other features: local/remote synchronous communication channels, 
global pointers, global read/write functions, spread arrays, etc.

The availability announcement is currently being processed by the Cornell 
Theory Center. The package can be accessed on the WWW at:

/afs/theory.cornell.edu/archive/ftp/pub/ARCH/ARCH.v.2/

Here is an abstract of what is offered in the library:


ABSTRACT

ARCH is a C++/MPI-based library for asynchronous and loosely synchronous 
system programming. The current version offers a set of programming 
constructs that are outlined below:


Threads.

The construct is presented as a class from which the user can derive his own
classes. The class encapsulates a small set of status variables and offers a
set of functions for declaration, initialization, scheduling, priority
setting, yielding and stopping.
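
For illustration, here is a minimal sketch of how such a thread class might be
used. The class and member names below (Thread, body, set_priority, schedule,
yield, stop) are assumptions made for the example, not the documented ARCH
interface:

    // Hypothetical sketch: all names are assumed, not the actual ARCH API.
    class Worker : public Thread {
    public:
      Worker(int id) : id(id) {}
      void body() {              // assumed entry point run by the scheduler
        // ... first phase of work ...
        yield();                 // hand control back to the scheduler
        // ... second phase of work ...
        stop();                  // terminate this thread
      }
    private:
      int id;
    };

    // Usage: declare, set priority, and make runnable.
    //   Worker w(0);
    //   w.set_priority(1);
    //   w.schedule();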


Processes.

A process is a more regular and structured programming construct whose
scheduling and termination obey additional synchronization rules. Together
with the synchronous point-to-point communication system offered in the
library (see below), processes favor a parallel programming style similar to
OCCAM's (actually, an extension of it that removes most static features and
allows processes to share data). The semantics of this model are well
understood and should greatly facilitate the development of large, correct
asynchronous codes. The library has been designed so that the C++ compiler
can check the static semantics of programs (complete type checking, correct
send-recv matching, etc.).
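
To make this style concrete, here is a hypothetical sketch of two processes
communicating over a typed channel. All names (Process, Channel, body, send,
recv) are assumptions for the example; the point is that a channel carrying a
C++ type lets the compiler reject mismatched send-recv pairs:

    // Hypothetical sketch: names and signatures are assumed.
    Channel<int> chan;                // synchronous point-to-point channel

    class Producer : public Process {
      void body() {
        int x = 42;
        chan.send(x);                 // blocks until the receiver is ready
      }
    };

    class Consumer : public Process {
      void body() {
        int x;
        chan.recv(x);                 // blocks until the sender is ready
        // passing a double here would be rejected at compile time
      }
    };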


Synchronous communication.

Threads and processes synchronize and communicate via communication channels.
There are four types of communication channels for local or remote 
synchronization or synchronous point-to-point communication. Inter-processor
channels are essentially tools for building virtual topologies. The channel
classes offer functions to send to or receive from a channel and get the size
of the latest received message. More specialized synchronization-communication
tools can be derived from channels.
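
As an illustration of the channel interface, here is a hypothetical fragment;
the class name, constructor argument, and member names are assumptions:

    // Hypothetical sketch: an inter-processor channel linking this node to
    // processor 1 in a virtual topology. Names and signatures are assumed.
    RemoteChannel ch(1);

    double buf[64];
    // (send on one processor, recv on its peer)
    ch.send(buf, sizeof(buf));       // synchronous: completes only when the
                                     // matching recv has been posted
    ch.recv(buf, sizeof(buf));       // on the peer processor
    int n = ch.size();               // size of the latest received message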


Global data and pointers.

Besides threads, the library offers basic tools for developing distributed data
abstractions. Global data are data that can be defined at given locations in
the distributed memory but are visible from all processors. Global pointers are
a generalization of C++ pointers that allow for addressing global data at any
place over the distributed memory. Like ordinary pointers, global pointers
support arithmetic and logical manipulation (incrementation, dereferencing,
indexing, comparison, ...). The library provides basic operators for global data
and pointer definition.
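
The following hypothetical fragment illustrates these manipulations. The
class name GlobalPtr and the allocation operator global_new are assumed for
the example:

    // Hypothetical sketch: names are assumed, not the actual ARCH API.
    GlobalPtr<double> gp = global_new<double>(100, 2);  // 100 doubles,
                                                        // homed on processor 2
    GlobalPtr<double> end = gp + 100;  // pointer arithmetic
    double v = gp[10];                 // indexing a global datum
    while (gp != end) ++gp;            // comparison and incrementation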


Global read/write functions.

Global pointer expressions provide global references over the distributed
memory that can subsequently be used as arguments to global read/write
functions. These functions allow the processors to get access to all global
data regardless of their locations over the distributed memory. In their most
complete form, the read/write functions operate as remote procedure calls. At
the programmer's level, global read/write functions appear as "one-sided":
a read/write operation is executed on the processor that needs to read/write
global data but need not be explicitly handled by the processor associated with
the memory holding the data.
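
A hypothetical use of such one-sided functions, with assumed names and
signatures (gp stands for any global pointer expression):

    // Hypothetical sketch: global_read/global_write are assumed names.
    // Only the calling processor executes these operations; the processor
    // owning the memory does not issue any matching call (one-sided).
    double local[64];
    global_read(local, gp, 64);      // fetch 64 doubles at global reference gp
    for (int i = 0; i < 64; ++i)
        local[i] *= 2.0;
    global_write(gp, local, 64);     // store the updated values back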


Spread and remote Arrays.

Two basic distributed data structures have been built in the library. Spread
arrays are arrays that have some of their dimensions spread over the
distributed memory according to a given policy. Remote arrays are arrays that
are defined at a given place in the distributed memory but can be accessed
from any other. The spread and remote array classes (SpreadArray and
RemoteArray) provide functions for global reference calculation. Global
references can subsequently be used as arguments to global read/write
functions. One can specialize global pointers to operate on spread or remote
arrays. The global pointer class (Star class) offers distinct arithmetic
and logic operator sets for unassigned, spread, and remote global pointers.
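
A hypothetical fragment using the SpreadArray, RemoteArray and Star classes
named above; only those three class names come from this announcement, and
the constructors and member functions shown are assumptions for the example:

    // Hypothetical sketch: signatures are assumed, not the actual ARCH API.
    SpreadArray<float> a(1024);      // one dimension spread over the processors
    Star<float> p = a.ref(0);        // global reference into the spread array
    float x;
    global_read(&x, p + 512, 1);     // fetch an element wherever it lives

    RemoteArray<float> r(0, 256);    // 256 floats, all homed on processor 0
    global_write(r.ref(3), &x, 1);   // update a remote element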


The library encourages parallel code writing in a style that relies on the
object-oriented approach: first, build the abstractions that the application
at hand relies on; next, make an efficient implementation of the abstractions;
and finally, develop the application on top of them. The abstractions can be
distributed data types derived from those built into the library (spread and
remote arrays: see the code of the segmentation algorithm provided with the
library), new distributed types built in the same way, or types reused from
other applications. This approach should favor the production of parallel code
with many desirable properties such as efficiency, portability, and
reusability.


The library uses MPI as a communication interface. The current implementation
runs on the IBM SP2. Two versions of the library have been released so far:
the first is based on the IBM C++ compiler and MPI library; the second uses
the GNU g++ compiler and the MPICH public-domain implementation of MPI.
Porting the latter to any parallel machine supporting these two software
systems should be straightforward.


