Newsgroups: comp.parallel.mpi
From: Jay Le Beau <lebeau>
Subject: Re: Dynamic load balancing between workstations
Organization: NASA/Johnson Space Center
Date: 10 Oct 1996 21:17:42 GMT
Message-ID: <53jp5m$krs@cisu2.jsc.nasa.gov>

Yon Han Chong <Y.H.Chong@cranfield.ac.uk> wrote:
>Currently I am using LAM6.0 to create a virtual parallel machine with
>AlphaStations.
>
>These machines can be logoned by other users. What would be ideal is to
>be able to dynamically balance the load between the workstations. Since
>mpi daemon doesn't take much memory or CPU time it can be pre-started on
>many machines. Out of all the pre-started machines only some of the
>machines are loaded initially. If someone logon to a loaded machine the
>load can be move to another machine which is not doing much work. In
>this way I will be happy since the parallel efficiency will be very good
>and others will be happy since they won't be effected by me. This can be
>also true for a big serial code.
>
>Have somebody made this possible or am I just dreaming the impossible?
>
>----------------------------------------------------------------------

In my dynamic load balancing implementation I simply start the job on every
host I am interested in running on and let all of them participate in the
computation.  Each task tracks the wall-clock time it needs to complete its
portion of the work and compares it to the average time across all tasks.
Once a certain imbalance is reached (as may happen when other users run jobs
on the same machines), the problem is dynamically repartitioned and all of
the nodes take part in a data shift to achieve the new partition and,
hopefully, balance the load.
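
As a rough illustration of the timing check and repartitioning step described
above, here is a minimal sketch in plain Python with no message passing.  It
is not the actual code: the max/mean imbalance measure, the 1.25 threshold,
the rule of sizing each partition in proportion to observed throughput, and
the names imbalance() and repartition() are all assumptions made for the
example.  It also assumes every task currently owns some work and reports a
positive time.

```python
def imbalance(times):
    """Return the slowest task's time divided by the mean (1.0 = balanced)."""
    mean = sum(times) / len(times)
    return max(times) / mean


def repartition(counts, times, threshold=1.25):
    """Return new per-task work counts, or a copy of the old ones if balanced.

    counts[i] : work items task i currently owns (assumed > 0)
    times[i]  : wall-clock seconds task i needed for its last step (assumed > 0)
    threshold : hypothetical trigger; repartition when max/mean exceeds it
    """
    if imbalance(times) <= threshold:
        return list(counts)

    # Observed throughput of each host in items per second. This already
    # reflects any slowdown caused by other users' jobs on that host.
    rates = [c / t for c, t in zip(counts, times)]
    total = sum(counts)
    total_rate = sum(rates)

    # Give each task a share of the work proportional to its throughput.
    new_counts = [int(total * r / total_rate) for r in rates]

    # Rounding down may leave a few items unassigned; hand them out one
    # each, fastest hosts first, so the total is preserved.
    leftover = total - sum(new_counts)
    fastest_first = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    for i in fastest_first[:leftover]:
        new_counts[i] += 1
    return new_counts


if __name__ == "__main__":
    # Four hosts with equal work; the last one is slowed by another user.
    print(repartition([100, 100, 100, 100], [1.0, 1.0, 1.0, 2.0]))
```

In a real run each task would learn the other tasks' times through a
collective exchange, and the difference between the old and new counts would
tell each node how much data to send or receive during the data shift.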

Unfortunately I am not that familiar with LAM, so I can't really comment on
any issues there.  Although my program uses MPI-style communication (it was
developed on an IBM SP2), I've written my own library that translates the MPI
calls into PVM communication, since I'm more comfortable using PVM on a
cluster of workstations (just an experience thing).  I would guess, though,
that the dynamic load balancing would be just as straightforward with LAM,
since I'm not using anything that is implementation-specific.

Hope this helps - Jay Le Beau


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Gerald J. Le Beau                  E-mail : g.j.lebeau@jsc.nasa.gov
 Aeroscience and Flight Mechanics   Phone  : (713) 483-5208
 NASA-Johnson Space Center : EG3    Fax    : (713) 244-5256
 Houston, TX 77058                  http   : One of these days ...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


