Newsgroups: comp.parallel.pvm
From: crispin@csd.uwo.ca (Crispin Cowan)
Subject: Re: multiuser PVM?
Organization: Department of Computer Science, University of Western Ontario, London, Ontario, Canada
Date: 7 Jul 1994 20:40:08 GMT
Message-ID: <2vhp78$30b@falcon.ccs.uwo.ca>

In article <CsL199.L0t@midge.bath.ac.uk>,
D J Batey <mapdjb@midge.bath.ac.uk> wrote:
>Is anybody else out there interested in the idea of a multiuser PVM?
...[how to let more than one user share a single PVM machine]
>Anybody else got any thoughts on/experiences of multiuser PVM? At some
>point in the future I'm going to need a more explicit, secure,
>multiuser connection mechanism than the hack outlined above. If at

In other words, you're going to need a distributed system, instead of a
parallel system.  PVM is many cool things (small, portable, efficient,
easy to use, easy to understand, etc.) partly because one of the things
that it is NOT is a distributed system.  It is completely lacking the
things you describe:  a secure, explicit, etc. mechanism for connecting
different users' tasks.

There are many similarities between parallel and distributed computing,
and the similarities are especially strong between the message passing
model PVM uses and distributed computing.  But there are also
differences.  Parallel computing is primarily concerned with
low-latency, high-bandwidth communications, because the principal
motive for using more than one processor is nothing more than getting
the job done sooner.  Distributed computing is concerned with numerous
issues involving security, authentication, directory services,
persistence, heterogeneity, and autonomy, because the primary motive is
to get existing, separately administered nodes to co-operate, and
getting the job done sooner is only a secondary benefit.  Parallel
computing is one user who owns many nodes & wants to use them all to go
faster.  Distributed computing is numerous users who own their own
nodes and would like to co-operate, but don't want to surrender
ownership and control.

PVM is unusual, in that it is a parallel programming system
(principally) intended to run on top of distributed systems.  Unifying
the parallel and distributed programming worlds is a fascinating topic,
and one of my personal long-term goals.  But it is not a simple task.
In this case, the very first stumbling block is that the way PVM
task-IDs are constructed is inadequate to a secure, distributed,
general-purpose multi-user environment (naturally, because those
weren't goals of the PVM design).  TID space would need to be expanded
so that each user has enough space for their own tasks on each
node, and the TIDs remain globally unique.  Given that an IP address
alone is 32 bits, a 32-bit TID doesn't seem large enough to name a
node, a user, and a task all at once.
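
To make the arithmetic concrete, here is a rough sketch (in Python for
brevity, not PVM's C).  The field widths are assumptions picked only
for illustration, and make_tid/split_tid are hypothetical names; none
of this is PVM's actual TID layout.

```python
# Hypothetical field widths, assumed for illustration only.
HOST_BITS = 32   # one IPv4 address per node
USER_BITS = 16   # assumed: a 16-bit Unix uid
TASK_BITS = 18   # assumed: per-user, per-node task counter

TOTAL_BITS = HOST_BITS + USER_BITS + TASK_BITS  # 66, well over 32


def make_tid(host, uid, task):
    """Pack (host, uid, task) into one globally unique integer."""
    for value, bits in ((host, HOST_BITS), (uid, USER_BITS), (task, TASK_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    return (host << (USER_BITS + TASK_BITS)) | (uid << TASK_BITS) | task


def split_tid(tid):
    """Recover (host, uid, task) from a TID built by make_tid."""
    task = tid & ((1 << TASK_BITS) - 1)
    uid = (tid >> TASK_BITS) & ((1 << USER_BITS) - 1)
    host = tid >> (USER_BITS + TASK_BITS)
    return host, uid, task
```

Under these assumed widths the host field alone uses up all 32 bits,
so anything that also names a user and a task has to be wider.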

Beyond that, PVM would need at least the following services:
	-a resource broker--some way of discovering the TIDs of tasks
	 that provide desired services, but are not parents, children,
	 or siblings.
	-a security mechanism.  TIDs are easy to forge now, and a
	 forged TID could be used to gain unauthorized access to
	 services.
	-a more sophisticated notion of stdio.  Just printing
	 everything to log.<uid> is adequate for parallel programming,
	 but not for distributed applications.
	-windowing support.  Multiple, distributed tasks are pretty
	 difficult to interact with if it isn't convenient for the
	 tasks to create a window when they want some user-IO.
	-a permissions system, to make it possible to configure who may
	 or may not start tasks on a given node, which is a separate
	 question from who may or may not communicate with a given
	 node.
	-a more convenient data marshalling system.  Scientific
	 applications (the principal consumers of parallel processing
	 machines) exchange mostly homogeneous data, so one can easily
	 say "pack this 10,000 element array of floats".  Distributed
	 applications tend to have more heterogeneous data
	 communication needs, so they need to say things like "send
	 this complicated fubar struct".
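
The marshalling contrast in the last item can be sketched roughly as
follows.  This uses Python's struct module rather than PVM's own pack
routines, and the "fubar" record layout (name, id, active, samples) is
invented purely for illustration.

```python
import struct


def pack_floats(values):
    """Homogeneous case: a count, then that many 32-bit floats."""
    return struct.pack(f"!I{len(values)}f", len(values), *values)


def unpack_floats(buf):
    (count,) = struct.unpack_from("!I", buf, 0)
    return list(struct.unpack_from(f"!{count}f", buf, 4))


def pack_fubar(rec):
    """Heterogeneous case: each field needs its own encoding rule."""
    name = rec["name"].encode("utf-8")
    return (
        struct.pack("!H", len(name)) + name                # length-prefixed string
        + struct.pack("!i?", rec["id"], rec["active"])     # int32 plus a bool
        + pack_floats(rec["samples"])                      # nested float array
    )


def unpack_fubar(buf):
    (name_len,) = struct.unpack_from("!H", buf, 0)
    offset = 2
    name = buf[offset:offset + name_len].decode("utf-8")
    offset += name_len
    ident, active = struct.unpack_from("!i?", buf, offset)
    offset += struct.calcsize("!i?")
    samples = unpack_floats(buf[offset:])
    return {"name": name, "id": ident, "active": active, "samples": samples}
```

The float array is one line each way; the struct needs hand-written
code per field on both ends, which is the part a more convenient
marshalling system would take over.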

Crispin
-----
Crispin Cowan, CS PhD student, searching for a research position
University of Western Ontario
Phyz-mail:  Middlesex College, MC28-C, London, Ontario, N6A 5B7
E-mail:     crispin@csd.uwo.ca          Voice:  519-661-3342
"A distributed system is one in which I cannot get something done
because a machine I've never heard of is down"   --Leslie Lamport

