Newsgroups: comp.parallel.pvm
From: peery@envy.isc.tamu.edu (Alan Peery)
Subject: Performance Simulation of PVM/batch workstation cluster
Organization: Texas A&M University, College Station, TX
Date: 22 Jul 1994 22:17:21 GMT
Message-ID: <PEERY.94Jul22171721@envy.isc.tamu.edu>


I am hoping to find a model for simulation of system performance and
load on a cluster of workstations operating both as a batch computing
cluster and as a PVM farm.  

Here is the configuration to be modelled:
                            outside feed (telnet, ftp, X output)
                            |                                    
                            V           v--------------possible workstation
   --------------------------------------------------- only rings...
   |                                     FDDI RING   |           
   |                                                 |           
   |                                                 <---| ether to 
   |                                                 |   | FDDI  
   |                                                 |   |       
   -----File--------------File---------------File-----   |
       Server            Server             Server       |        
      |      |                                           |-ws    
      | FDDI |                                           |       
   ws-| ring |            ditto              ditto       |-ws  
   ws-|      |                                           |       
      --------                                           |-ws    
      |  |  |                                            |       
      ws ws ws                                           |-ws   
                                                         |       
                                                         |       
                                                                 

There is a main FDDI ring running between the file servers, and
hanging off each file server is a ring of 32 workstations, each with
a single-attach FDDI card.  In each ring the machines are
binary-compatible, but they may differ in speed.  The workstations are:

left ring        32 Sun 10/40s  420MB local disk, single processor,
                                64MB RAM

middle ring      16 HP 735s     1GB local disk, 80MB RAM
                 16 HP 730s     500MB local disk, 64MB RAM

right ring       4 SGI 4D/360s  six processors each, 4GB local disk,
                                256MB RAM

rightmost "ring" 32 Sun 2s      32MB or similar, ethernet interface
                                only, since we can...
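
For anyone who wants to feed this into a simulator, the inventory above
could be written down roughly as follows.  This is only a sketch: the
names NodeGroup, CLUSTER, total_machines, total_cpus and
machines_per_ring are made up for illustration, and the one-CPU figure
for the HPs and Sun 2s is my assumption rather than something stated in
the list.

```python
from typing import NamedTuple


class NodeGroup(NamedTuple):
    ring: str        # which ring (or ethernet segment) the group hangs off
    model: str       # machine type as listed in the post
    count: int       # number of machines of this type
    cpus_each: int   # processors per machine
    ram_mb: int      # RAM per machine, in megabytes
    link: str        # "fddi" or "ethernet"


# Figures taken from the list above.  cpus_each=1 for the HPs and Sun 2s
# is an assumption; the Sun 2 RAM figure is "32MB or similar".
CLUSTER = [
    NodeGroup("left", "Sun 10/40", 32, 1, 64, "fddi"),
    NodeGroup("middle", "HP 735", 16, 1, 80, "fddi"),
    NodeGroup("middle", "HP 730", 16, 1, 64, "fddi"),
    NodeGroup("right", "SGI 4D/360", 4, 6, 256, "fddi"),
    NodeGroup("rightmost", "Sun 2", 32, 1, 32, "ethernet"),
]


def total_machines(groups):
    """Number of workstations across all groups."""
    return sum(g.count for g in groups)


def total_cpus(groups):
    """Number of processors across all groups."""
    return sum(g.count * g.cpus_each for g in groups)


def machines_per_ring(groups):
    """Map each ring name to the number of machines attached to it."""
    per_ring = {}
    for g in groups:
        per_ring[g.ring] = per_ring.get(g.ring, 0) + g.count
    return per_ring
```

Calling total_cpus(CLUSTER) or machines_per_ring(CLUSTER) gives the
per-ring client counts that any load model of the file servers would
need as input.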
      

The workstations in the clusters may be accessing files on any server,
so the file servers may end up doing some routing.

Currently the system is operated almost entirely in batch mode: the
file servers run the batch scheduling system (NQS), which load-balances
jobs onto the workstations across the cluster, with the typical
long/short execution-time queues.  The communication pattern
is almost entirely from workstation to file server, with very little
communication between the workstations.  A few workstations in each
ring may be reserved for interactive (compilation) work.
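
To make the batch side concrete, here is a minimal sketch of the kind of
dispatch loop a simulation might start from.  It is NOT how NQS actually
balances load; it simply hands each job to whichever workstation becomes
free first.  The function name simulate_batch and the speeds parameter
(a relative speed factor per workstation) are hypothetical.

```python
import heapq


def simulate_batch(job_runtimes, n_workstations, speeds=None):
    """Greedy dispatch: each job goes to the workstation that frees up first.

    job_runtimes   -- run time of each job on a speed-1.0 machine, in
                      submission order
    n_workstations -- number of workstations available for batch work
    speeds         -- optional relative speed per workstation (assumed
                      parameter; defaults to 1.0 for every machine)

    Returns (makespan, list of per-job finish times).
    """
    if speeds is None:
        speeds = [1.0] * n_workstations
    # Min-heap of (time at which the workstation is next free, index).
    free_at = [(0.0, i) for i in range(n_workstations)]
    heapq.heapify(free_at)

    finish_times = []
    for runtime in job_runtimes:
        start, ws = heapq.heappop(free_at)
        done = start + runtime / speeds[ws]
        finish_times.append(done)
        heapq.heappush(free_at, (done, ws))
    return max(finish_times, default=0.0), finish_times
```

Sorting job_runtimes in ascending order before the call would roughly
mimic draining a short queue ahead of a long one.  File-server traffic
and PVM communication are deliberately left out of this sketch.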

The jobs tend not to be disk intensive, so we don't (yet) seem to be
saturating the FDDI links to the file servers.
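
As a back-of-the-envelope check on that, the ring load can be estimated
from the nominal 100 Mbit/s FDDI rate.  The sketch below is only
arithmetic; the function names and any per-workstation traffic figure
passed in are hypothetical, and protocol overhead is ignored.

```python
FDDI_MBIT = 100.0  # nominal FDDI ring rate in Mbit/s


def ring_utilization(n_workstations, mbit_per_workstation, ring_mbit=FDDI_MBIT):
    """Fraction of the ring's nominal rate consumed if every workstation
    sustains mbit_per_workstation (an assumed, measured-elsewhere figure)."""
    return n_workstations * mbit_per_workstation / ring_mbit


def fair_share_mbit(n_workstations, ring_mbit=FDDI_MBIT):
    """Bandwidth each workstation gets if all of them transmit at once
    and the ring is shared evenly."""
    return ring_mbit / n_workstations
```

For a 32-workstation ring, fair_share_mbit(32) comes to 3.125 Mbit/s per
machine, which is the ceiling a job would see if every workstation on
the ring hit its file server simultaneously.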

We want to find:

1) How far can we push the file servers by adding workstation-only
rings that attach directly to the main ring?

2) If we were to allow PVM jobs to execute on top of the batch jobs,
what performance can we expect?  What will happen to batch
performance?  (Obviously highly dependent on the communication needs
between the processors...)

3) What would happen to throughput, particularly for PVM, if we were
to replace the FDDI rings to the workstations with an ATM switch?
This would replace the ring with a communication medium of greater
bandwidth...
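
For question 1, the crudest model I can think of treats one file server
as a single M/M/1 queue fed by all of its client workstations.  The
sketch below uses the standard M/M/1 formulas; the function names and
every rate you would plug in are assumptions, and a real server with
disks, caches and routing duties will not behave this simply.

```python
import math


def server_utilization(n_clients, req_per_sec_each, service_rate):
    """Offered load rho = (n * lambda) / mu for one file server."""
    return n_clients * req_per_sec_each / service_rate


def mean_response_time(n_clients, req_per_sec_each, service_rate):
    """Mean time in system for an M/M/1 queue, 1 / (mu - n * lambda).

    Returns math.inf once the server is saturated (rho >= 1).
    """
    arrival_rate = n_clients * req_per_sec_each
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def max_clients(req_per_sec_each, service_rate, target_utilization=0.8):
    """Largest client count that keeps rho at or below target_utilization."""
    return int(math.floor(target_utilization * service_rate / req_per_sec_each))
```

With hypothetical figures of 1 request/s per workstation and a server
that can handle 100 requests/s, max_clients(1.0, 100.0, 0.5) says 50
clients keep the server at half load, and mean_response_time shows how
quickly latency climbs as more rings are added beyond that.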

Many thanks.  Replies by email or post are welcome.

Alan

peery@isc.tamu.edu
--

Alan Peery
Institute for Scientific Computation
Texas A&M University
peery@isc.tamu.edu

