Newsgroups: comp.parallel
From: P.H.Welch@ukc.ac.uk
Subject: Workshop on HPC Efficiency, London
Organization: University of Kent at Canterbury, UK.
Date: 5 Sep 1995 14:42:34 GMT
Message-ID: <42hnkq$7hb@usenet.srv.cis.pitt.edu>

Dear All,

This is a second call for participants in a workshop on efficiency
problems in High Performance Computing (HPC).  This contains an
updated timetable for the day, which is now almost complete.

The case for the workshop is that, on the surface, efficiencies
obtained from current HPC (software and hardware) architectures look
very low.  That case needs some public debate and that is what
the workshop will try to provoke.

Some people are put off by the `Crisis' in the title.  The tone of the
announcement is deliberately aggressive ... but this was intended to
make a dent in the excessively upbeat tone of most other announcements
relating to HPC.  The aims of the workshop are to take a sober look at
what is being achieved, find out if there are real problems and, if so,
start talking about them openly and start finding out how to overcome
them.  On the other hand, if there are no real problems and HPC is
delivering value-for-money, we need reassurance and an explanation as
to why 17% efficiency from some of the best groups is no cause for
concern.

Private debate (e.g. in review committees) may have considered such
things and concluded that all is well.  However, the public face
remains - "HPC, yes wonderful, no problems" - with little explanation
as to why this is so.  This workshop will enable a much wider community
of users and potential users to hear and take part in these arguments.

Many thanks,

Peter Welch.


(cut here)
-----------------------------------------------------------------------------


          Crisis in High Performance Computing - A Workshop
          -------------------------------------------------


Place:
------

Lecture room G22 (also known as the Pearson Lecture Theatre)
Pearson Building
University College London
Gower Street
London WC1E 6BT.


Date:
-----

Monday, 11th. September, 1995.


Background:
-----------

State-of-the-art high performance computers are turning in what some
observers consider worryingly low performance figures for many user
applications.  How widespread are such feelings, how justified are they
and, if they prove to be justified, what implications do they hold for
the future of High Performance Computing (HPC)?

Efficiency levels for `real' HPC applications are reported (e.g.
by the NAS parallel benchmarks) as ranging from around 20-30% (for some
16-node systems) down to 10-20% (for 1024-node massively parallel
super-computers).
Are low efficiencies the result of bad engineering at the application
level (which can be remedied by education) or bad engineering at the
architecture level (which can be remedied by <what>)?  Maybe
these efficiency levels are acceptable to users ... after all, 20% of
16 nodes (rated at 160 MFLOPS per node) is still around 500 MFLOPS and
10% of 1024 nodes is 16 GFLOPS?  But they may be disappointing to those
who thought they were going to be able to turn round jobs at over 100
Gflops!  Are there other ways of obtaining the current levels of
performance that are more cost-effective?
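The arithmetic behind those figures can be checked with a trivial
sketch (the 160 MFLOPS peak rating and the efficiency percentages are
the illustrative numbers quoted above, not measurements):

```python
def delivered_mflops(nodes, peak_per_node_mflops, efficiency):
    """Delivered throughput = node count * per-node peak * efficiency."""
    return nodes * peak_per_node_mflops * efficiency

# 20% of a 16-node system rated at 160 MFLOPS per node:
print(delivered_mflops(16, 160.0, 0.20))    # 512.0 MFLOPS, i.e. around 500
# 10% of a 1024-node system at the same per-node rating:
print(delivered_mflops(1024, 160.0, 0.10))  # 16384.0 MFLOPS, i.e. ~16 GFLOPS
```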

A further cause of concern is the dwindling number of suppliers of
HPC technology that are still in the market ...

This workshop will focus on the technical and educational problems
that underlie this growing crisis.  Political matters will not be
considered ... unless they can be shown to have a direct bearing.


Participants:
-------------

  o potential users of HPC facilities (`what problems am I going
    to face ... will it be worth my while?');

  o current users of HPC facilities (`what performance am I getting
    ... how hard has it been to achieve this ... am I getting value
    for the time I have invested?');

  o non-users of HPC facilities (`what effect has the funding of
    large scale super-computers had on my ability to obtain
    smaller scale facilities locally - preferably on my desk?');

  o architects of HPC facilities (`how can decent efficiency levels
    be achieved and how can the application design-implement-tune-
    test-and-maintain cycle be made simple?').


Organisers:
-----------

The London and South-East consortium for education and training in
High-Performance Computing (SEL-HPC).  SEL-HPC comprises ULCC, QMW
(and the other London Parallel Application Centre colleges - UCL,
Imperial College and the City University), the University of Greenwich
and the University of Kent.


Timetable:
----------

  09:30  Registration

  09:50  Introduction to the Day
  10:00  High performance compute + interconnect is not enough
         (Professor David May, University of Bristol)
  10:40  Experiences with the Cray T3D/PowerGC/...
         (Chris Jones, British Aerospace, Warton)
         (Ian Turton, Centre for Computational Geography, University of Leeds)

  11:20  Coffee

  11:40  Experiences with the Meiko-CS2/...
         (Chris Booth, Parallel Processing Section, DRA Malvern)
  12:00  Experiences with SIMD architectures, ...
         (Stewart Reddaway, Cambridge Parallel Processing Ltd.)
  12:20  Problems of Parallelisation - why the pain?
         (Steve Johnson, University of Greenwich)

  13:00  Working Lunch (provided) [Separate discussion groups]

  14:30  HPF and MPI - tomorrow's standards ... yesterday's solutions?
         (<to be announced>)
  15:10  Parallel software and parallel hardware - bridging the gap
         (Professor Peter Welch, University of Kent)

  15:50  Work sessions and Tea [Separate discussion groups]

  16:30  Plenary discussion session
  16:55  Summary

  17:00  Close


Registration Details:
---------------------

You may edit this form electronically, or print it out and fill it in
by hand.  Please return it by email, fax or post:

  Judith Broom
  Computing Laboratory
  The University
  Canterbury
  Kent -- CT2 7NF
  ENGLAND

  (tel: +44 1227 827695)
  (fax: +44 1227 762811)
  (email: J.Broom@ukc.ac.uk)

--------------------------<CUT HERE>------------------------------

            Registration for Crisis in HPC workshop
                 University College London
               Monday, 11th. September, 1995


Name:        ____________________________________________________

Institution: ____________________________________________________

Address:     ____________________________________________________

             ____________________________________________________

             ____________________________________________________

Email:       ____________________________________________________

Telephone:   ______________________ FAX: ________________________


Position and brief job/research description (optional):






Position statement for workshop (optional):






--------------------------<CUT HERE>------------------------------

For further workshop details, please contact Judith Broom.  Electronic
registration can also be found at:

  <URL:http://www.hensa.ac.uk/parallel/groups/selhpc/crisis/>

  <URL:ftp://unix.hensa.ac.uk/pub/parallel/groups/selhpc/crisis/>

where full details of this workshop (e.g. names of speakers, abstracts
of talks and final timetable) will be updated.

All types of participant are welcome -- see above.  Position statements
are also welcome, but not compulsory, from all attending this workshop.
They will be reproduced for all who attend and will help us define the
scope of each discussion group.



               ------------------------------------



Extended Abstract:
------------------

Efficiency levels on massively parallel super-computers have been reported
(e.g. in the NAS Parallel Benchmarks Results 3-95, Technical Report
NAS-95-011, NASA Ames Research Center, April 1995) ranging from 50% for
the `embarrassingly parallel benchmarks', through 20% for tuned
`real' applications, past 10% for typical `irregular' applications and
down to 3% when using a portable software environment.  Low efficiencies
apply not only to the larger system configurations (256 or 1024 nodes),
but also to the smaller ones (e.g. 16 nodes).  Seven years ago, we
would have been disappointed with efficiency levels below 70% for any
style of application on the then state-of-the-art parallel super-computers.
What has caused this regression and can it be remedied?
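For clarity, the efficiency figure used throughout is the usual one:
speedup divided by node count.  A minimal sketch (the timings below are
hypothetical, chosen only to illustrate the definition):

```python
def parallel_efficiency(t_serial, t_parallel, nodes):
    """Efficiency = speedup / nodes = T1 / (P * T_P)."""
    return t_serial / (nodes * t_parallel)

# A job taking 1000s on one node and 125s on 16 nodes:
print(parallel_efficiency(1000.0, 125.0, 16))  # 0.5, i.e. 50% efficiency
```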

It seems to be proving difficult to build efficient high-performance
computer systems simply by taking very fast processors and joining them
together with very high bandwidth interconnect.  Apart from the need to
keep the computational and communication power in balance, it may also
be essential to reduce communication start-up costs (in line with
increasing bandwidth) and to reduce process context-switch time (in
line with increasing computational power).  Failure in either of these
regards leads to coarse-grained parallelism, which may result in
insufficient parallel slackness to allow efficient use of individual
processing nodes, potentially serious cache-coherency problems for
super-computing applications and unnecessarily large worst-case latency
guarantees for real-time applications.
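The start-up cost point can be made concrete with the standard linear
communication model (the start-up and bandwidth figures below are
hypothetical, chosen only for illustration):

```python
def message_time(n_bytes, startup_s, bandwidth_bytes_per_s):
    """Linear model: fixed start-up (latency) cost plus transmission time."""
    return startup_s + n_bytes / bandwidth_bytes_per_s

def half_performance_length(startup_s, bandwidth_bytes_per_s):
    """Message size at which start-up cost equals transmission time;
    messages smaller than this realise under half the nominal bandwidth."""
    return startup_s * bandwidth_bytes_per_s

# With a 50 microsecond start-up and 100 MB/s links:
print(half_performance_length(50e-6, 100e6))  # 5000.0 bytes
```

Raising bandwidth without cutting start-up cost pushes this break-even
message size up, which is exactly the pressure towards coarse-grained
parallelism described above.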

               ------------------------------------

A further cause of concern is the dwindling number of suppliers of
HPC technology that are still in the market.  Will there be a next
generation of super-computers from the traditional sources?  Or will
HPC users have to rely on products from the commercial marketplace,
in particular the PC Industry and Games/Consumer-Products Industries?
If the latter, how will this change the way we approach the design
of HPC facilities and applications?

               ------------------------------------

At the other end of the spectrum, clusters of workstations are reported
as offering, potentially, good value for money, but only for certain
types of application (e.g. those with very high compute/communicate
ratios).  What are those threshold ratios, and how do we tell whether
our application lies above them?  And what do we do if it does not?
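Under the simplest no-overlap assumption, those threshold ratios fall
out directly: if r is the ratio of compute time to communicate time,
efficiency is roughly r / (r + 1).  A sketch, under that assumption:

```python
def cluster_efficiency(ratio):
    """No-overlap model: efficiency ~= r / (r + 1), where r is the
    compute/communicate time ratio of the application."""
    return ratio / (ratio + 1.0)

def ratio_needed(target_efficiency):
    """Invert the model: r >= e / (1 - e) to achieve efficiency e."""
    return target_efficiency / (1.0 - target_efficiency)

print(ratio_needed(0.9))        # 9.0 -- nine times more compute than comms
print(cluster_efficiency(9.0))  # 0.9
```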

               ------------------------------------

Blame is often laid at the lack of software tools to support and
develop applications for high performance architectures.  New standards
have been introduced for parallel computing - in particular, High
Performance Fortran (HPF) and the Message Passing Interface (MPI).  Old
standards stick around - e.g. the Parallel Virtual Machine (PVM).

These standards raise two problems: depressed levels of efficiency (this
*may* be a temporary reflection of early implementations) and a low-level
hardware-oriented programming model (HPF expects the world to be an
array and processing architectures to be a regular grid, MPI allows
a free-wheeling view of message-passing that is non-deterministic by
default).  Neither standard allows the application developer to design
and implement systems in terms dictated by the application; bridging
the gap between the application and these hardware-oriented tools remains
a serious problem.

New pretenders, based upon solid mathematical theory and analysis, are
knocking on the door - such as Bulk Synchronous Parallelism (BSP).  Old
pretenders, also based upon solid mathematical theory and analysis and
with a decade of industrial application, lie largely unused and
under-developed for large-scale HPC - such as occam.  Might either of
these offer some pointers to the future?

               ------------------------------------

The above paragraphs raise several issues.  The aim of this workshop is
to exercise and debate them thoroughly, see what people's real experiences
have been and consider in what ways HPC needs to mature in order to
become viable.  A major goal of the workshop is to begin identifying
standards of `good behaviour' for software on parallel or
distributed systems that will:

  o enable HPC hardware architectures to operate with much greater
    efficiency levels;

  o enable HPC applications to be developed in their own terms without
    regard for the underlying hardware.

Or maybe the workshop will decide that:

  o HPC architectures (hardware and software) do not have fundamental
    problems;

  o there are no lessons from the past that need re-discovery and
    re-application;

  o everything can be sorted out by better education and tools for
    existing HPC standards.

Please come along and make this workshop work.

