Newsgroups: comp.parallel.mpi
From: Jean-Marc Adamo <adamo>
Subject: CTC Releases ARCH Library of Parallel Programming Tools
Organization: Cornell Theory Center
Date: 13 Feb 1996 00:06:25 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <4fokm1$1990@theory.tc.cornell.edu>

For Immediate Release
January 29, 1996


New Tools for Parallel Software Development Released by Cornell
Theory Center

Ithaca, NY -- The Cornell Theory Center (CTC) announced today the
availability of ARCH, an object-oriented library of tools for
parallel programming on machines using the MPI (message passing
interface) communication library. Based on C++, ARCH was
developed by Jean-Marc Adamo, CTC visitor and professor in the
Ecole Superieure de Chimie Physique et Electronique and the
Universite Claude-Bernard, Lyon, France.

ARCH offers researchers a set of flexible programming constructs
for asynchronous and loosely synchronous parallel software
development, creating the illusion of shared memory on
distributed-memory machines.

Jean-Marc Adamo visited CTC in 1995 in order to extend ARCH's
capabilities, implement the library on the Center's IBM RS/6000
POWERparallel System (SP), and convert the tools to be compatible
with MPI. Over a six-month period, he focused on applying ARCH to
an image segmentation application in remote sensing that involved
two particularly complex data structures: quadtrees and globally
distributed connectivity graphs. The work was supported by CTC
and the French government, as well as his home institution.

"The library provides an alternative to the use of fixed languages
for parallel programming and encourages a parallel programming
style that relies on the object-oriented approach," said Jean-Marc
Adamo, who began his work on ARCH while visiting Berkeley on
sabbatical. He noted that the library is well-suited to irregular
and dynamic problems.

"ARCH allows distribution not just of arrays, but also of general,
user-defined data structures, including pointers to remote data,"
said CTC parallel programming specialist David Schneider. "In
addition, because it allows essentially arbitrary mapping of data
objects to processors and provides facilities for dynamic load
balancing via lightweight threads, ARCH should be of particular
interest to researchers tackling complicated, irregular, dynamic
problems."

Detailed information on ARCH is available in a Cornell Theory
Center Technical Report (CTC95TR228), which can be accessed via
the World Wide Web at:

http://www.tc.cornell.edu/Research/tech.rep.html

The library is available via ftp at:

ftp.tc.cornell.edu in /pub/ARCH/

CTC, one of four high performance computing and communications
centers supported by the National Science Foundation, operates
a 512-processor IBM SP system. Activities of the Center are also
funded by New York State, the Advanced Research Projects Agency,
the National Center for Research Resources at the National
Institutes of Health, IBM, and other members of CTC's Corporate
Partnership Program.

For more information, contact Linda Callahan, Director of External
Relations, Cornell Theory Center:

e-mail: cal@tc.cornell.edu
phone: 607-254-8610
fax: 607-254-8888
http://www.tc.cornell.edu/

                        ---------------------------

For technical information, contact Jean-Marc Adamo:

e-mail: adamo@tc.cornell.edu



Abstract.
--------

ARCH is a C++-based library for asynchronous and loosely synchronous system
programming. The current version offers a set of programming constructs that
are outlined below:

Threads.

The construct is presented as a class from which the user can derive their own
classes. The class encapsulates a small set of status variables and offers a
set of functions for declaration, initialization, scheduling, priority
setting, yielding and stopping.
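
As a rough illustration of this usage, a user-derived thread class might
look like the sketch below. The abstract does not give ARCH's actual
signatures, so the base-class and member names (Thread, body, schedule,
yield, stop) are assumptions, not the library's documented interface.

    // Hedged sketch: all names below are assumed, not ARCH's documented API.
    class Thread {                        // assumed ARCH base class
    public:
        virtual ~Thread() {}
        virtual void body() = 0;          // code run when the thread is scheduled
        void schedule(int priority);      // assumed: make the thread runnable
        void yield();                     // assumed: hand control back to the scheduler
        void stop();                      // assumed: halt the thread
    };

    class Smoother : public Thread {      // user class derived from Thread
        double* cell;                     // data this thread works on
    public:
        explicit Smoother(double* c) : cell(c) {}
        void body() {                     // the thread's executable body
            *cell *= 0.5;                 // placeholder computation
            yield();                      // cooperate with the other threads
        }
    };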

Processes.

A process is a programming construct that is more regular and structured
than a thread: its scheduling and termination obey additional
synchronization rules. Together
with the synchronous point-to-point communication system offered in the
library (see below), processes favor a parallel programming style similar to
OCCAM's (actually, an extension of it that removes most static features and
allows processes to share data). The semantics of this model are well
understood, which should facilitate the development of large, correct
asynchronous codes. The library has been designed so that the C++
compiler can check the static semantics of programs (complete type
checking, correct send-recv matching, etc.).
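
For concreteness, the rendezvous behavior of this synchronous model can
be pictured with plain MPI (this is ordinary MPI-1 code, not ARCH code):
a synchronous-mode send completes only once the matching receive has
started.

    #include <mpi.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        int value = 42;
        if (rank == 0) {
            // MPI_Ssend blocks until process 1 has begun the matching
            // receive: OCCAM-style rendezvous semantics.
            MPI_Ssend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Status status;
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        }
        MPI_Finalize();
        return 0;
    }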

Synchronous communication.

Threads and processes synchronize and communicate via communication channels.
There are four types of communication channels, for local or remote
synchronization or synchronous point-to-point communication. Inter-processor
channels are essentially tools for building virtual topologies. The channel
classes offer functions to send to or receive from a channel and get the size
of the latest received message. More specialized synchronization-communication
tools can be derived from channels.
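
A minimal sketch of such a channel interface, assuming hypothetical
names (Channel, send, recv, length) since the abstract does not list the
actual signatures:

    // Hedged sketch: names and signatures are assumptions.
    template<class T> class Channel {
    public:
        void send(const T* buf, int n);   // synchronous send of n items
        void recv(T* buf, int max);       // blocks until a matching send occurs
        int  length() const;              // size of the latest received message
    };

    // Receive up to 1024 values, then echo back exactly what arrived.
    void echo(Channel<double>& ch, double* buf) {
        ch.recv(buf, 1024);               // rendezvous with the remote sender
        ch.send(buf, ch.length());        // type-checked against Channel<double>
    }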

Global data and pointers.

Besides threads, the library offers basic tools for developing distributed
data abstractions. Global data are data that can be defined at given
locations in the distributed memory but are visible from all processors.
Global pointers are a generalization of C++ pointers that allow global
data to be addressed anywhere in the distributed memory. Like ordinary
pointers, global pointers support arithmetic and logical operations
(incrementing, dereferencing, indexing, comparison, etc.). The library
provides basic operators for defining global data and pointers.
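
A hypothetical sketch of what such a generalized pointer might look like
in use; the name GlobalPtr and every member shown are assumptions made
for illustration:

    // Hedged sketch: names and signatures are assumptions.
    template<class T> class GlobalPtr {
    public:
        GlobalPtr(int proc, T* addr);     // global address = (processor, local address)
        GlobalPtr operator+(long n) const;          // arithmetic over the global space
        bool operator==(const GlobalPtr& o) const;  // comparison
        T read() const;                   // dereference: fetch the global datum
        void write(const T& v) const;     // dereference: store into the global datum
    };

    // Index into globally distributed data and update one element,
    // wherever it happens to live.
    void scale(GlobalPtr<double> base, long i) {
        GlobalPtr<double> p = base + i;
        p.write(2.0 * p.read());
    }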

Global read/write functions.

Global pointer expressions provide global references over the distributed
memory that can subsequently be used as arguments to global read/write
functions. These functions allow any processor to access any global datum
regardless of its location in the distributed memory. In their most
complete form, the read/write functions operate as remote procedure calls.
At the programmer's level, global read/write functions appear as
"one-sided": a read/write operation is executed by the processor that
needs to read or write the global data and need not be explicitly handled
by the processor associated with the memory holding them.
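
Continuing the hypothetical GlobalPtr sketch above, the one-sided style
might be pictured as follows; gread and gwrite are likewise assumed
names, not ARCH's documented interface:

    // Hedged sketch: global read/write functions taking global references.
    template<class T> T    gread(const GlobalPtr<T>& src);
    template<class T> void gwrite(const GlobalPtr<T>& dst, const T& v);

    // Only the calling processor appears in the code; the processor that
    // owns the memory posts no matching receive, which is what "one-sided"
    // means at the programmer's level.
    void bump(GlobalPtr<double> cell, double delta) {
        gwrite(cell, gread(cell) + delta);
    }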

Spread and remote arrays.

Two basic distributed data structures are built into the library. Spread
arrays are arrays that have some of their dimensions spread over the
distributed memory according to a given policy. Remote arrays are arrays
that are defined at a given place in the distributed memory but can be
accessed from any other. The spread and remote array classes (SpreadArray
and RemoteArray) provide functions for global reference calculation.
Global references can subsequently be used as arguments to global
read/write functions. Global pointers can be specialized to operate on
spread or remote arrays. The global pointer class (the Star class) offers
distinct arithmetic and logic operator sets for unassigned, spread, and
remote global pointers.
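
For illustration only: SpreadArray and Star are the class names given
above, but the constructor and member functions in this fragment are
assumptions.

    // Hedged sketch: a 1024 x 1024 array with its second dimension spread
    // over the processors, accessed through a specialized global pointer.
    SpreadArray<double> grid(1024, 1024);
    Star<double> row = grid.ref(512, 0);  // global reference into the array

    void clear_row() {
        for (int j = 0; j < 1024; ++j)
            (row + j).write(0.0);         // write each element, local or remote
    }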


The library encourages writing parallel code in a style that relies on the
object-oriented approach: first, build the abstractions that the
application at hand relies on; next, implement those abstractions
efficiently; finally, develop the application on top of them. The
abstractions can be distributed data types derived from those built into
the library (spread and remote arrays; see the code of the segmentation
algorithm provided with the library), new distributed types built in the
same way, or types reused from other applications. This approach should
favor the production of parallel code with many desirable properties,
such as efficiency, portability, and reusability.
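
A hedged sketch of this three-step style, built on the assumed interfaces
from the previous fragments (none of the member names are taken from the
library's actual documentation):

    // Steps 1-2: a small distributed abstraction and its implementation
    // on top of the assumed SpreadArray interface.
    class DistributedVector {
        SpreadArray<double> data;         // implementation detail: spread storage
    public:
        explicit DistributedVector(long n) : data(n) {}
        double get(long i)           { return (data.ref(i)).read(); }
        void   set(long i, double v) { (data.ref(i)).write(v); }
    };

    // Step 3: application code written against the abstraction, never
    // seeing how the data are distributed.
    void normalize(DistributedVector& v, long n, double s) {
        for (long i = 0; i < n; ++i)
            v.set(i, v.get(i) / s);
    }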


The library uses MPI as its communication interface. The current
implementation runs on the IBM SP2. Two versions of the library have
been released.
The first one is based on the IBM C++ compiler and MPI library. The second one
makes use of the GNU g++ compiler and the MPICH public domain version of MPI.
Porting the latter to any parallel machine supporting these two software
systems should be straightforward.


