Newsgroups: comp.parallel
From: Rick Schlichting <rick@cs.arizona.edu>
Subject: Kahaner Report: Parallel Alg (PAS'95), U-Aizu & multimedia activities
Organization: University of Arizona CS Department, Tucson AZ
Date: 1 Apr 1995 13:03:40 -0700
Message-ID: <3lrkg0$am2@usenet.srv.cis.pitt.edu>

  [Dr. David Kahaner is a numerical analyst currently heading the Tokyo
   office of the Asian Technology Information Program (ATIP). The
   following is the professional opinion of David Kahaner and in no 
   way has the blessing of the US Government or any agency of it.  All 
   information is dated and of limited life time.  This disclaimer should 
   be noted on ANY attribution.]

  [Copies of previous reports written by Kahaner can be obtained using
   anonymous FTP from host ftp.cs.arizona.edu, directory japan/kahaner.reports
   or on the World Wide Web (WWW) at URL

          http://www.cs.arizona.edu/japan/www/kahaner_reports.html
  ]

To: Distribution
From: D.K.Kahaner, ATIP-Tokyo [kahaner@cs.titech.ac.jp]
Re: Parallel Algorithms (PAS'95), U-Aizu & related multimedia activities
04/01/95 [MM/DD/YY]
This is file name "pas.95"

Dr. David K. Kahaner
Asian Technology Information Program (ATIP)
Harks Roppongi Building 1F
6-15-21 Roppongi
Minato-ku, Tokyo 106
 Tel: +81 3 5411-6670; Fax: +81 3 5411-6671

ATIP: A collaboration between
   US National Institute of Standards and Technology (NIST)
   University of New Mexico (UNM)
------------------------------------------------------------------------

ABSTRACT. The First Aizu International Symposium on Parallel Algorithms /
Architecture Synthesis (PAS'95) is summarized. This was held at the
University of Aizu, in Aizu-Wakamatsu Japan 15-17 March 1995. PAS'95 has a
strong focus on Japanese university activities associated with parallel
computing, including JUMPP, GRAPE,  Adena, and related projects. Of special
interest will be discussions of the Aizu Supercomputer and the graphics
computer VC-1, both from Aizu University, as well as the university's plans
for a newly opened multimedia center.

The University of Aizu, opened in April 1993, specializes in computer
science and engineering, and is claimed to be the first of its kind in the
world. It was established as the first Japanese research university
with positions open to world-class faculty from any country. There are
currently faculty from 16 countries, including a very large number of
Russian scientists. I have written several reports about U-Aizu. See for
example aizu-u.94, Jan 6 1994. A 190 page (English) Annual Review of the
University's activities is available by writing to the following address

        Alexander R. Taubin
        Chair, Public Relations & Publications Committee
        University of Aizu
        Tsuruga, Ikki-machi
        Aizu-Wakamatsu
        Fukushima 965-80, Japan
         Tel: +81 242-37-2521; Fax: +81 242-37-2531
         Email: taubin@u-aizu.ac.jp

The pAs'95 symposium included original and peer reviewed research papers
in the field of parallel computation. About 100 participants from 17
countries were represented, although relatively few from the US compared to
the Europeans. In the case of several Japanese projects, the presentations
were not necessarily of new results, but prepared in order to give foreign
participants an overview of the current state of the projects. Several of
these projects have already been described in my reports, see for example
the summary reports (j-pp.94a, j-pp.94b, and j-pp.94c) Nov 22, 1994. 
(Some interesting and well known parallel processing projects, such as the Real
World Computing Project, were not discussed at this conference. Thus PAS'95
provides an important, partial, snapshot of Japanese activities in this area.) 

In addition to summaries of well known parallel processing projects, PAS'95
also gave U-Aizu faculty the opportunity to present several much less
well known projects at their university, including the development of the
VC-1, a parallel computer built for graphics applications and with measured
performance of up to 500K polygons rendered per second with 16 i860 (40MHz)
processors. Also described is the Aizu Supercomputer, a cluster-based
multiprocessor organized in a hierarchical 2D network of 75MFlop R4400 RISC
processors. It is also designed for graphics applications and is to be
installed in April 1995 in a new multimedia center. The latter will contain
3 sets of 260inch large screens covering 120 degrees viewing angle. Each
screen consists of 16 multi-projection cubes which have liquid-crystal
shutters. There are also 14 DSPs connected to the processors. More detail
about this center is given below, mostly from translation by H.Behrens
based on Japanese language material supplied to me by university staff,
although I have made some minor stylistic changes.


  Purpose of Aizu's new multimedia Center

 - Help in furthering multimedia. Offer an opportunity for the region to
have hands-on experience with multimedia.

 - Promote regional support for local enterprises in research and
development for the creation of new industries based on multimedia.

 - Disseminate acquired know-how world-wide.
    Carry out leading-edge research for the creation of synthetic worlds in
cooperation with research partners of the University of Aizu.

  Research topics

  - research on 'CrossoverNet': a next-generation multimedia network
  - research and develop a high-speed processor for use in
                the construction of synthetic worlds.
  - applied research on scientific analysis of movements in sports.
  - sociological research on the application of multimedia and its
                effects on society.
  - research on multimedia groupware and applications on
                language education.
  - research on multi-modal human interfaces (hearing, tasting etc.)

  Outline of facilities

  - Exhibition hall: High Vision (HDTV) 160 inch multi-screen (3 units)
      Offer a facility through which multimedia and related fields can
          be spread through lectures and exhibitions on multimedia
          related research results.

  - Exhibition booth: Virtual Shop & Shopping System
      Use as a base for a virtual shopping system based on multimedia
                technology.
      Display latest technology in cooperation with the local community
                and local enterprises.
      Explore multimedia and its effects/use in society.

  - Artificial worlds zone: 3-D virtual simulation system
      To be used in the creation of virtual reality (VR).
      Using all senses (scent, hearing, vision) allowing interactive VR.
      Centered around a 160" 3-screen multiscreen zone, this unique
        high-speed visualization system is to be used in developing
        next-generation technologies and products.

  - sports movement simulation: load vector measurement/
           Infrared camera system
      To measure human movements in a multifaceted, detailed way.
      Analyse and measure these movements making use of the most advanced
           VR technologies.

  - Artificial world research room: groupware/language education,
          multi-modal interface system
       To put multimedia to effective use in everyday 'mental activities'
          such as culture, arts, industry, education etc.

       To put multimedia to use in designing interfaces for
          handicapped people.

  - Research cooperation room: 4 rooms
       To support research partners.
       To foster local industries through providing of facilities.

  - Research development center: multimedia development support center
          providing all tools necessary for the processing of multimedia
          information.

  - Network center: CrossoverNet system
       To build a test environment for a next generation high
          speed multimedia network using both analog and digital technology.
       Carry out applied research and experiments with NTT's high speed
          network [INS64,1500].



Another specially interesting project at the university is the research
work of N.Mirenkov on the development of a new software model for use in
programming graphics applications, called VIM technology (visualization,
animation and sonification of data processing methods). Other interesting
work at U-Aizu that I have commented on in earlier reports (but not
presented here) is by M.Cohen on multidimensional sound interfaces.

A published proceedings is available.
     PAS'95
     ISBN 0-8186-7038-X
This may be ordered from the following.
     IEEE Computer Society Press
     Customer Service Center
     10662 Los Vaqueros Circle
     PO Box 3014
     Los Alamitos, CA 90720-1264
      Tel: (714) 821-8380; Fax: (714) 821-4641
      Email: CS.BOOKS@COMPUTER.ORG


The PAS'95 General Chair was Professor T. Kunii, President of the
University, and a leading researcher in computer graphics and geometric
modelling. The Program Chair was Prof N. Mirenkov (The University of Aizu).
The report below was written by

        Harry Behrens
        German National Research Center for Computer Science (GMD)
        Tokyo Bureau
        German Cultural Center
        7-5-56, Akasaka, Minato-ku
        Tokyo 107, Japan
         Email: behrens@gmd.co.jp

This report was supported by ATIP, the Asian Technology Information Program.

        -The first Aizu International Symposium on Parallel
        Algorithms/Architecture Synthesis and the Aizu
        Supercomputer-

ABSTRACT (Behrens):  The first Aizu International Symposium on Parallel
Algorithms/Architecture Synthesis (pAs '95) was held from March 14 -17,
1995 at the University of Aizu in Aizu-Wakamatsu, Fukushima prefecture,
about 200 km north of Tokyo.  This report covers the conference in
general, giving the titles of the papers and tutorials presented. Those
papers and tutorials that I [Behrens] attended in person will be described
in a little more detail and commented on.  At the end of this report, the
architecture of the Aizu Supercomputer, a very ambitious supercomputer
project currently in progress at the University of Aizu will be
described in detail.

-Organization of the report-
 The report is divided into the following components:

 1) Overview
    This gives a brief overview and summary of the conference and some
    information about the University of Aizu

 2) Conference Program
    The schedule, titles and speakers of the tutorials
    and papers presented. Those lectures that I (Behrens) personally
    attended are later described and commented on.
    They are marked with a (+) mark in the program.

 3) Comments on papers
    This part contains brief abstracts and comments to
    the papers and tutorials I personally attended.

 4) Aizu Supercomputer Project
    A description of the Aizu Supercomputer Project
------------------------------------------------------------------------

(1) OVERVIEW
        The first Aizu International Symposium on Parallel
Algorithms/Architecture Synthesis (PAS'95) was sponsored and organized
by the University of Aizu. The University of Aizu is a very new
university that was founded in 1993 by the Fukushima prefecture with a
budget of about 50 billion yen (USD 500 million).  It specializes in
computer science and related fields and currently has about 500 students
attending graduate and undergraduate courses. The university is extremely
well equipped and makes a very innovative impression in many aspects. For
instance, more than 60% of the faculty members are non-Japanese with
quite a few researchers from Russia.  The General Chair of the
conference was held by Professor Toshiyasu Kunii, who is the President
of the University of Aizu.  The Program Chair was held by Professor
Nikolay Mirenkov, University of Aizu, who was also responsible for most
of the organization of the conference.  The pAs'95 was divided into one
day of tutorials featuring four researchers from Europe and the USA and
three days in which lectures and papers were presented.  For those
interested, a visit to Fujitsu's semiconductor plant in Aizu was offered
on the last day.  Due to lack of time, some sessions were presented in
parallel. However, all tutorials and invited papers were given in
non-parallel sessions.  The conference featured speakers from seventeen
countries. Especially noticeable was the strong presence of European
speakers including quite a few from Russia.  This has to be seen against
the background of the strong Russian research community at the
University of Aizu.  The total number of attendees was 86. A complete
list of attendees can be obtained by contacting info@gmd.co.jp.

The quality of presented papers in this conference was quite high. The
topics ranged from very theoretic to very practical, giving a very good
and balanced overview of current research in the field of parallel
computing. The number of attendees was small enough to allow for plenty
of discussion.  Adding to this the superb organization and the extreme
helpfulness especially of Professor Mirenkov, the conference was
definitely a success and I hope to be attending the second pAs next
year.

-------------------------------------------------------------

(2)  GENERAL PROGRAM
     A general program follows. Due to visa problems for  some of the
Russian speakers, some of the papers were rescheduled or could not be
held at all. This accounts for differences in this program and the
program schedule of the conference. (+ indicates further comments in
section (3)  by H.Behrens)

        MARCH 14 (Tuesday)

        Tutorials

        9:00 - 10:30 T1 (+)
        C.Jesshope, University of Surrey, UK
        General Purpose Scalable Parallel Computers

        11:00 - 12:30 T2 (+)
        H.Burkhart, University of Basel, Switzerland
        Structured Parallel Programming: Methods - Languages -  Tools

        13:30 - 15:00 T3 (+)
        H.Zima, University of Vienna, Austria
        High Performance Languages

        15:30 - 17:00 T4 (+)
        D.F. Hsu, Fordham University, USA
        Interconnection Networks and Parallel Algorithms

        MARCH 15 (Wednesday)

        9:30 - 10:00 OPENING (+)
        10:00 - 11:00 Invited talk (+)
        T.Kunii, S. Nishimura, University of Aizu, Japan
        Parallel Polygon Rendering on the Graphics Computer VC-1

        11:00 - 12:00 Invited talk (+)
        C.Lengauer, M. Griebl, University of Passau, Germany
        On the parallelization of loop nests containing while loops

        13:00 - 14:00 Invited talk (+)
        H.Tanaka, University of Tokyo
        Massively Parallel Processing Project as a Priority
        Area of Research for the Ministry of Education

        14:00 - 15:00 Invited talk (+)
        B.Chapman, M. Pantano, H. Zima, University of Vienna, Austria
        Supercompilers for Massively Parallel Architectures

        -Stream A: Tools and Technology-

        15:30 - 16:00 (+)
        A.Bode, Technical University of Muenchen, Germany
        Methods and Tools for the Efficient Use of Parallel
        Computer Architectures

        16:00 - 16:30 (+)
        O.Hansen, J. Krammer, Technical University of Muenchen, Germany
        A Scalable Performance Analysis Tool for PowerPC Based
        MPP Systems

        16:30 - 17:00 (+)
        P.Croll, I. Jelly, and I. Gorton, The University of Sheffield, UK
        Software Engineering Techniques and Tools for High
        Performance Parallel Systems

        17:00 - 17:30 (+)
        A.E.Doroshenko and A.B. Godlersky, Ukrainian Academy of Sciences,
        Kiev, Ukraine
        Constructing Parallel Implementation with Algebraic
        Programming Tools

        -Stream B: Algorithms and Techniques-

        15:30 - 16:00
        A.P.Vazhenin, The Russian Academy of Sci., Novosibirsk, Russia
        Parallel Algorithm for Solving Systems of Linear
        Equations with Dynamically Changed Length of Operands

        16:00 - 16:30
        J.Wang, H. Lung, and Y. Katsumata, Fujitsu Systems
        Business of America, Inc., USA
        Implementing a 3D Multigrid Algorithm on Fujitsu's
        Vector Parallel Supercomputer

        16:30 - 17:00
        T.Rauber and G. Runger, Universitat des Saarlandes, Germany
        Aspects of a Distributed Solution of the Brusselator Equation

        17:00 - 17:30
        R.Huang, T. Kunii, University of Aizu, Japan
        Parallel Algorithms for Extracting Ridges and Ravines

        MARCH 16 (Thursday)
        8:45 - 9:45 Invited talk (+)
        D.Sugimoto, J. Makino, M. Taiji, and T. Ebisuzaki,
        University of Tokyo, Japan
        GRAPE Project for a Dedicated Tera-flops Computer

        9:45 - 10:45 Invited talk (+)
        T.Nogi, Kyoto University, Japan
        Promising Data Parallel Environment - ADEPS, ADETRAN,
        and ADENA

        11:00 - 12:00 Invited talk (+)
        C.Jesshope, D.Barsky, A. Bolychevsky, and A.  Shafarenko,
        University of Surrey, UK
        Asynchrony in distributed parallel computing

        -Stream A: Architectures-

        13:00 - 13:30 (+)
        V.Varshavsky, V. Marakhovsky, and T. Chu, University of Aizu, Japan,
        Cirrus Logic Inc., USA
        Logical Timing (Global Synchronization of Asynchronous Arrays)

        13:30 - 14:00 (+)
        M.V.Screenivas and S. Bhalla, Delhi Institute of
        Technology, India, University of Aizu, Japan

        14:00 - 14:30 (+)
        Ce-Kuen Shieh, An-Chow Lai and Jyh-Chang Ueng,
        National Cheng Kung University, Taiwan
        Cohesion: An Efficient Distributed Shared Memory
        System Supporting Multiple Memory Consistency Models

        14:30 - 15:00 (+)
        V.O.Roda and T.T. Lin, University of Sao Paulo, Brazil
        On the Effect of Spare Positioning on the Reconfigurability of
        Two-dimensional Processor Arrays

        -Stream B: Graph Theory and Networks-

        13:00 - 13:30
        Q.P.Gu and S. Peng, University of Aizu, Japan,
        Fault Tolerant Routing in Toroidal Networks

        13:30 - 14:00
        Zhi-Zhong Chen and Xin He, Tokyo Denki University, Japan
        Parallel Algorithms for Maximal Acyclic Sets

        14:00 - 14:30
        Shou-Cheng Hu and Chang-Biau Yang, National Sun Yat-sen University,
        Taiwan
        Fault Tolerance on Star Graphs

        14:30 - 15:00
        Shyh-Chain Chern, Tai-Ching Tuan and Jung-Sing Jwo,
        National Sun Yat-sen University, Taiwan
        Hamiltonicity, Vertex Symmetry, and Broadcasting of
        Uni-directional Hypercubes

        -Stream A: Distributed Systems-

        15:30 - 16:00
        P.K.Reddy and S. Bhalla, Delhi Institute of
        Technology, India, University of Aizu, Japan
        Non-blocking Concurrency Control in Distributed
        Database Systems

        16:00 - 16:30
        Y.Nakano, Fujitsu Lab Ltd., Japan
        Analysis of Communication Data: Compression Network

        16:30 - 17:00
        Tong-Ying Juang, C.P.Chiu and Kun-Ming Yu, Chung-Hua
        Polytechnic Institute, Taiwan
        Concurrent Rollback for Crash Recovery in Extended
        Hypercube Networks

        17:00 - 17:30
        M.V.Screenivas and S. Bhalla, Delhi Institute of
        Technology, India,
        University of Aizu, Japan
        Garbage Collection in Message Passing Distributed
        Systems

        -Stream B: Scheduling and Load Balancing-

        15:30 - 16:00 (+)
        Weiping Zhu and C.F.Steketee, University of South
        Australia, Australia
        An Experimental Study of Load Balancing on Amoeba

        16:00 - 16:30
        B.Benko, M. Ojstersek and V. Zumer, University of Maribor, Slovenia
        Analysis and Evaluation of an Extended Duplication
        Scheduling Heuristic

        16:30 - 17:00
        H.Shen, H. Kitajima, H. Kobayashi and T. Nakamura
        Tohoku University, Japan
        Task Scheduling with Locality Consideration for a
        Clustered Parallel FL Reduction System

        17:00 - 17:30 (+)
        W.Loewe and W. Zimmermann, University of Karlsruhe, Germany
        On Finding Optimal Clusterings of Task Graphs

        -Stream C: Poster Session-

        15:30 - 16:30

        MARCH 17 (Friday)

        8:45 - 9:45 Invited talk (+) (see below for details)
        T.Ikedo, University of Aizu, Japan
        Aizu Supercomputer Project

        9:45 - 10:45 Invited talk (+)
        N.Mirenkov, University of Aizu, Japan
        Visualization and Sonification of Methods

        11:00 - 12:00 Invited talk (+)
        D.F.Hsu, M.D. Grammatikakis, and M. Kraetzl, Fordham University,
        USA
        A Journey into Multicomputer Routing Algorithms

        -Stream A: Formal Methods-

        13:00 - 13:30 (+)
        A.Max Geerling, University of Nijmegen, The Netherlands
        Program Transformations and Skeletons: Formal
        Derivation of Parallel Programs

        13:30 - 14:00 (+)
        T.Heywood and C. Leopold, University of Edinburgh, UK
        Dynamic Randomized Simulation of Hierarchical PRAMs on
        Meshes

        14:00 - 14:30
        N.A.Anisimov and A.A. Kovalenko, The Russian Academy of Sci.,
        Vladivostok, Russia
        Towards Petri Net Calculi based on Synchronization via
        Places

        -Stream B: Algorithms and Tools-

        13:00 - 13:30
        R.Sarnath, St Cloud State University, USA
        Efficient Scalable Mesh Algorithms for Merging,
        Sorting and Selection

        13:30 - 14:00
        B.Herndon, A. Raefsky, R. Goossens, and R. Dutton,
        Stanford University, USA
        Parallelizing a PDE Solver: Experiences with PISCES

        14:00 - 14:30
        W.Cai, Tee L. Pian and S. J. Turner, Nanyang
        Technological University, Singapore
        A Framework for Visual Parallel Programming

        14:30 - 15:00
        Lin Peng Huang and Kam Wing Ng, The Chinese University of Hong Kong,
        Hong Kong
        Implementing Higher-order Gamma on MasPar: A Case Study

        15:00 - 15:30
        A.Sh. Nepomniaschaya and Ya. I. Fet, The Russian Academy of Sci.,
        Novosibirsk, Russia
        Investigation of Some Hardware Accelerators for Relational Algebra
        Operations

--------------------------------------------------------------------

(3) COMMENTS ON PAPERS

        -Tutorials-
        "General  Purpose Scalable Parallel Computers"
        by C.Jesshope, University of Surrey, UK

        Chris Jesshope spoke on the current state of parallel
        computer technology and gave his outlook for future
        developments. He argued that most of present-day
        parallel computers are based on off-the-shelf hardware
        and the main difference between one multiprocessor system
        and the next is the interconnection topology used.
        In his analysis Mr.Jesshope strongly focused on Virtual
        Shared Memory technology as a method to extend the RAM
        paradigm to parallel architectures. He explained the main
        issues concerning coherence and consistency that occur
        in Virtual Shared Memory systems.
        As an aim for the future, which he calls the "holy grail"
        of parallel computing, Mr.Jesshope said that the ideal
        system would allow for uniform-access, scalable, multi-port,
        virtual shared memory.

               Chris Jesshope
               The University of Surrey
               Guildford Surrey GU2 5XH, UK
               email: C.Jesshope@ee.surrey.ac.uk
-------------------------------------------------------

        "Structured Parallel Programming"
        by  H.Burkhart, University of Basel, Switzerland

        Mr.Burkhart started off with a list of pessimistic
        statements concerning the research situation in the
        parallel computing community. Among others, he mentioned
        an overall reduction of funding, a shift to distributed
        computing through the introduction of programming
        libraries such as PVM or MPI and lack of a uniform high-
        level programming system for parallel computers.
        As a solution he proposed four approaches:
             1) Programming languages or paradigms
             2) Programming libraries
             3) Coordination languages
             4) Templates and Skeletons
        At his research laboratory  at the University of Basel, a
        skeleton generating system was developed. It is based on
        the Basel Algorithm Classification Scheme (BACS). It
        constitutes a framework for abstract description of
        parallel algorithms. The coordination language ALWAN,
        which is based on BACS, can be used to generate source
        code skeletons for various languages using the skeleton
        generator TIANA.
        The main aim of this research project is to simplify
        parallel programming and hide the nasty details involved,
        such as synchronisation or machine architecture, from the
        programmer. This approach would allow the
        programmer to generate running systems for various
        platforms by just specifying an abstract algorithm and a
        few topological details.

            Helmar B. Burkhart
            Institut fur Informatik
            University of Basel
            CH-4056 Basel, Switzerland
            email:   burkhart@ifi.unibas.ch

-------------------------------------------------------
        "High Performance Languages"
        by H.Zima, University of Vienna, Austria

        Mr.Zima is the European Mr.Fortran. He has been
        involved in the specification and design of compilers for Fortran,
        Vienna Fortran and High Performance Fortran languages
        (HPF) for more than two decades. He gave a talk on Vienna
        Fortran and HPF and explained the issues involved in
        designing SPMD (Single Program Multiple Data) programs
        for parallel systems. The main problems in this context
        are those of data distribution. Consequently his talk
        focused on the programming constructs Vienna Fortran and
        HPF offer to tackle this problem.
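
        As an illustration of why data distribution matters (this is
        not taken from the tutorial), the sketch below models the two
        basic one-dimensional mappings, BLOCK and CYCLIC, as plain
        Python functions and counts how many neighbor references of a
        simple stencil would cross processor boundaries under each.
        The function names and the stencil are assumptions made for
        the example; they are not Vienna Fortran or HPF syntax.

```python
def block_owner(i, n, p):
    """Owner of element i (0-based) when n elements are cut into p
    contiguous blocks of size ceil(n / p)."""
    block = -(-n // p)  # integer ceiling of n / p
    return i // block


def cyclic_owner(i, p):
    """Owner of element i (0-based) when elements are dealt out
    round-robin over p processors."""
    return i % p


def remote_refs(owner, n):
    """Count references b[i-1] and b[i+1] (for i = 1 .. n-2) that live on a
    different processor than element i. `owner` maps an index to a processor."""
    count = 0
    for i in range(1, n - 1):
        for k in (i - 1, i + 1):
            if owner(k) != owner(i):
                count += 1
    return count


if __name__ == "__main__":
    n, p = 10, 4
    print([block_owner(i, n, p) for i in range(n)])
    print([cyclic_owner(i, p) for i in range(n)])
    print(remote_refs(lambda i: block_owner(i, n, p), n))
    print(remote_refs(lambda i: cyclic_owner(i, p), n))
```

        For a nearest-neighbor stencil the block mapping keeps most
        references local, while the cyclic mapping makes every
        neighbor remote; for other access patterns the balance can
        differ, which is why the choice is left to the programmer.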

               Hans Zima
               University of Vienna
               Brunnerstrasse 72
               A-1210 Vienna, Austria
               email: zima@par.univie.ac.at
-------------------------------------------------------

        "Interconnection Networks and Parallel Algorithms"
        by D.F.Hsu, Fordham University, USA

        Mr.Hsu gave an overview of interconnection schemes used
        in parallel architectures. He explained the main issues
        such as routing problems, scalability  and embedding of
        one system into the other.
        He gave an outline of  the main advantages and
        disadvantages of the most commonly used interconnection
        schemes, such as  bus, mesh, hypercube and crossbar
        topologies.
        He then proceeded to a detailed case study of permutation
        routing on a hypercube. He chose the hypercube as an
        example because of the wide-spread use of hypercube based
        systems.
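
        To make the routing problem concrete, here is a minimal
        sketch of dimension-order ("e-cube") routing, a common
        textbook scheme for hypercubes. It is not necessarily the
        algorithm analyzed in the tutorial, and the helper names
        are invented for this example. Each message corrects the
        differing address bits from the lowest dimension upward;
        the second function routes a whole permutation and reports
        the load on the busiest directed link.

```python
def ecube_route(src, dst, dim):
    """Path from src to dst on a dim-dimensional hypercube, flipping the
    differing address bits in order of increasing dimension."""
    path = [src]
    node = src
    for k in range(dim):
        if (node ^ dst) & (1 << k):
            node ^= 1 << k
            path.append(node)
    return path


def max_link_load(perm, dim):
    """Route node i to perm[i] for every i and return the number of
    messages crossing the most heavily used directed link."""
    load = {}
    for s, d in enumerate(perm):
        p = ecube_route(s, d, dim)
        for u, v in zip(p, p[1:]):
            load[(u, v)] = load.get((u, v), 0) + 1
    return max(load.values(), default=0)


if __name__ == "__main__":
    dim = 4
    # "Transpose" permutation: swap the upper and lower halves of the address.
    transpose = [((i & 0b11) << 2) | (i >> 2) for i in range(1 << dim)]
    print(ecube_route(0b0000, 0b0101, dim))
    print(max_link_load(transpose, dim))
```

        Link load is the quantity of interest in permutation
        routing: a fixed, oblivious rule like this one is simple
        but can force several messages over the same link.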

              D. Frank Hsu
              Department of Computer & Information Science
              Fordham University
              Bronx, NY 10458-5198, USA
              email: hsu@murray.fordham.edu
-------------------------------------------------------

        -Sessions-

        Opening talk by Mr.Kunii
        Mr.Kunii welcomed everybody and gave a brief introduction to the
University of Aizu. He outlined the international aspects of modern
research and the need to unify different approaches such as theoretic
research in which for instance Russia is very strong and manufacturing
technology in which Japan excels.  Mr.Kunii sees his  and the University
of Aizu's responsibility in  combining these various approaches to
define the next generation's needs and research topics.

        Tosiyasu L. Kunii
        University of Aizu
        Tsuruga, Ikki-machi
        Aizu-Wakamatsu
        Fukushima 965-80, Japan
        e-mail: kunii@u-aizu.ac.jp
-------------------------------------------------------

        -Plenary Session I-
        Invited Talk: "Parallel Polygon Rendering on the Graphics
        Computer VC-1"
        by T.Kunii, S.Nishimura, University of Aizu, Japan
        The talk was given by Mr.Nishimura

        The Visual Computer 1, VC-1, is a graphics computer developed
at the University of Aizu during the past four years. It consists of 16
boards each of which contains an Intel i860 CPU running at 40MHz. Each of
the processors has 8MB local memory with an access time of 25ns. The
Processing Elements (PEs) are interconnected with a 2-dimensional torus
topology using point-to-point links with a bandwidth of 2MB/sec.  The
point-to-point communication interface is based on the Inmos T805
processor. In addition to the serial links, a linear bus with a
maximum transmission rate of 30MB/sec is used for broadcasting messages.
Local frame buffers are merged through a pipelined image merging unit
and transferred to a global frame buffer which holds pixel values for
the CRT.  The talk focused on polygon rendering as an application for
the VC-1.  A brief overview of polygon rendering and the different
methods to parallelize it was given. Mr.Nishimura then proceeded to
classify various graphics computers according to the parallelization
scheme (polygon parallel vs. pixel parallel) used. He outlined the
advantages and drawbacks of each of these approaches. He classified the
VC-1 as a hybrid approach and then proceeded to describe the HW
architecture of the system.  He concluded his talk by showing slides of
various polygon rendering results and showed that the system can obtain
a polygon rendering rate of more than 400K polygons/sec.
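
The torus interconnect described above can be sketched in a few lines.
The talk gave 16 PEs on a 2-dimensional torus but, as far as I noted,
not the grid shape, so the 4 x 4 layout and the row-major PE numbering
below are assumptions made only for illustration.

```python
# Assumed layout: 16 PEs as a 4 x 4 torus, numbered row by row.
ROWS, COLS = 4, 4


def torus_neighbors(pe):
    """Return the four PEs directly linked to `pe` (north, south, west, east)."""
    r, c = divmod(pe, COLS)
    return [
        ((r - 1) % ROWS) * COLS + c,  # north, wrapping from top row to bottom
        ((r + 1) % ROWS) * COLS + c,  # south
        r * COLS + (c - 1) % COLS,    # west, wrapping from first column to last
        r * COLS + (c + 1) % COLS,    # east
    ]


def torus_hops(a, b):
    """Minimum number of point-to-point hops between PEs a and b."""
    ra, ca = divmod(a, COLS)
    rb, cb = divmod(b, COLS)
    dr = abs(ra - rb)
    dc = abs(ca - cb)
    # On a ring, going the "short way round" may be quicker.
    return min(dr, ROWS - dr) + min(dc, COLS - dc)


if __name__ == "__main__":
    print(torus_neighbors(0))
    print(max(torus_hops(0, pe) for pe in range(ROWS * COLS)))
```

Under this assumed layout every PE has exactly four neighbors and no
message needs more than four hops, which is the usual argument for a
torus over a plain mesh of the same size.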

        S. Nishimura
        University of Aizu
        Aizu-Wakamatsu
        Fukushima 965-80, Japan
        email: nisim@u-aizu.ac.jp
-------------------------------------------------------

        Invited Talk: "On the parallelization of loop nests
        containing while loops"
        by  C.Lengauer, M. Griebl, University of Passau, Germany
        The talk was given by Mr.Lengauer

        Mr.Lengauer proposes a strict mathematical model for the
parallelization of for- and while-loops. His model can be used to handle
while-loops which present problems not encountered in for-loops. One of
these problems is that loop boundaries are unknown at compile time.
The mathematical model is based on polyhedrons and polytopes which are
specific forms of convex spaces. In case of a for-loop, the loop is
transformed into a polytope which is then transformed into an equivalent
polytope with explicit time-space annotation.  In case of a while-loop
the automatic parallelization becomes more difficult because of unknown
loop boundaries. Mr.Lengauer outlined two approaches, the "conservative"
and the "speculative", to deal with this problem.
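
For the for-loop case, the idea of a time-space annotation can be shown
on a small example. The sketch below is not Mr.Lengauer's method: the
loop nest, the hand-picked schedule t = i + j and the function names are
all assumptions for illustration, whereas the polytope model derives
such mappings systematically. It covers only loops with known bounds,
not the while-loop case.

```python
def wavefront_schedule(n, m):
    """Group the iteration points (i, j), 1 <= i <= n, 1 <= j <= m, by the
    assumed time step t = i + j. Points in one group are independent."""
    steps = {}
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            steps.setdefault(i + j, []).append((i, j))
    return [steps[t] for t in sorted(steps)]


def run_sequential(n, m):
    """Original loop nest: each point depends on (i-1, j) and (i, j-1)."""
    a = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def run_wavefront(n, m):
    """Same computation, executed front by front. Both dependences point to
    the previous front, so each front could run in parallel, for instance
    with point (i, j) placed on processor i."""
    a = [[1] * (m + 1) for _ in range(n + 1)]
    for front in wavefront_schedule(n, m):
        for i, j in front:
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


if __name__ == "__main__":
    print(len(wavefront_schedule(4, 5)))
    print(run_sequential(4, 5) == run_wavefront(4, 5))
```

The sequential nest takes n * m steps; the reordered version needs only
n + m - 1 fronts, provided enough processors are available.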

               Christian Lengauer
               The University of Passau
               D-94030 Passau, Germany
               e-mail: lengauer@fmi.uni-passau.de
-------------------------------------------------------

        Japanese University Projects: "Massively Parallel
        Processing Project as a Priority Area of Research for the
        Ministry of Education"
        by H.Tanaka, University of Tokyo, Japan

        Mr.Tanaka is one of the most important people in the field of
parallel computing in Japan. He is or was, in one way or the other,
involved in the Fifth Generation Computing Systems Project, the Real
World Computing Project (RWCP) and the Japanese University Massively
Parallel Processing Project (JUMPP).  His talk was about the JUMPP
project, which ran from April 1992 through March 1995. A symposium,
showing the main results achieved, will be held at the University of
Tokyo on March 23/24, 1995. A prototype of the Jump-1 parallel computer
is scheduled to be demonstrated by March 1996.  Currently applications
for a 4 year follow-up project  to be funded with roughly 1.1 billion
Yen  (about 12 million USD)  are under way.  Mr.Tanaka gave an overview of
the organizational structure of the project, the universities involved and
the overall manpower available.  He stressed the priciple of "Direct
Mapping" which maps individual objects in problem space on processors with
a one-to-one mapping.This strategy greatly simplifies programming albeit at
a considerable cost of HW.  Mr.Tanaka then briefly introduced the two
parallel programming languages (NCX and V) developped within the project,
the operating system to be used (COS) and the architecture of the Jump-1
parallel computer which is to have an interconnection network based on a
recursive, diagonal torus topology.

              Hidehiko Tanaka
              University of Tokyo
              Department of Electrical Engineering
              7-3-1 Hongo, Bunkyo-ku
              Tokyo 113, Japan
               email: TANAKA@MTL.T.U-TOKYO.AC.JP
-------------------------------------------------------

        "Supercompilers for Massively Parallel Architectures"
        by B.Chapman, M. Pantano, H.Zima, University of Vienna, Austria
        The talk was given by Mr.Zima

        As mentioned in the tutorial section, Mr.Zima is an authority in
the field of Fortran languages.  In this talk he presented a brief
history of his brainchild, Vienna Fortran, and proceeded to outline the
main problems encountered in parallelizing data-parallel programs. He
showed that some constructs cannot be parallelized by the compiler. To
tackle such problems he uses the Inspector/Executor model, which allows
for runtime distribution of data depending on system performance and on
user annotations that show, for instance, data dependencies. The basic
strategies for program transformation were outlined, using the Vienna
Fortran Compilation System as a background.
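
        The Inspector/Executor idea can be sketched roughly as
follows. This is a generic illustration, not the Vienna Fortran
Compilation System: the block distribution, the names inspector,
executor, owner and fetch, and the dictionary standing in for remote
memory are all assumptions.

```python
def inspector(indices, owner, me):
    """Scan the index array once; record which remote elements are needed."""
    needed = {}
    for k in indices:
        p = owner(k)
        if p != me:
            needed.setdefault(p, set()).add(k)
    return {p: sorted(ks) for p, ks in needed.items()}

def executor(indices, local, schedule, fetch):
    """Gather remote values per the schedule, then run the loop locally."""
    ghost = {}
    for p, ks in schedule.items():
        ghost.update(fetch(p, ks))     # one bulk request per remote owner
    return [local[k] if k in local else ghost[k] for k in indices]

# Hypothetical setup: 8 elements block-distributed over 2 processors.
data = {k: 10 * k for k in range(8)}

def owner(k):
    return k // 4

def fetch(p, ks):                      # stands in for a message exchange
    return {k: data[k] for k in ks}

local = {k: v for k, v in data.items() if owner(k) == 0}
idx = [0, 5, 2, 7, 5]                  # indirection known only at run time
sched = inspector(idx, owner, me=0)
result = executor(idx, local, sched, fetch)
```

The inspector runs once per access pattern; the schedule it builds can
be reused for as long as the index array does not change.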

               Hans Zima
               University of Vienna
               Brunnerstrasse 72
               A-1210 Vienna, Austria
               email: zima@par.univie.ac.at
-------------------------------------------------------

        -Stream A: Tools and Technology-

        "Methods and Tools for the Efficient Use of Parallel
        Computer Architectures"
        by A.Bode, Technical University of Munich, Germany

        Mr.Bode took the occasion to present his laboratory and the
German Science Foundation (DFG) project fund (SFB 342) under which the
work presented was funded. Half of the funding is provided by the
German Science Foundation and the remaining half by the federal
government which is in charge of the university. Projects under this
funding scheme are allocated in three-year slots and can be extended to
up to 15 years. At the end of every three years they are reviewed and
evaluated.  Based on the result of this review, further funding is
granted or refused. Private companies or research organizations may
participate and are in fact encouraged to do so; however, they receive no
government funding.  Among the private partners that cooperate with
Mr.Bode are Siemens Corporate Research and the Intel European
Supercomputer Development Center, which was actually founded by members
of Mr.Bode's laboratory.  Mr.Bode then outlined the various project
areas and the HW infrastructure that supports the joint projects. This
includes various Intel supercomputers, a Parsytec PowerPC based MPP, a
workstation cluster connected through a 32x32 ATM switch etc.  After
having presented the more organizational aspects of his laboratory,
Mr.Bode gave a brief overview of monitor technology, classifying
monitors into HW, SW and hybrid monitors. The work being conducted at his
laboratory involves the development of an integrated tool for
post-design analysis of parallel programs. This contains a debugger, a
performance analyzer and a visualizer.  In conclusion Mr.Bode
admitted that the integrated approach seems to be sub-optimal and that
independent development of the tools mentioned might be a more promising
approach.

        Arndt Bode
        Institut fur Informatik
        Lehrstuhl fur Rechnertechnik und Rechnerorganisation
        Technische Universitat Munchen
        D-80290 Munchen, Germany
        email: bode@informatik.tu-muenchen.de
-------------------------------------------------------

        "A Scalable Performance Analysis Tool for PowerPC based
        MPP Systems"
        by A.Bode, O. Hansen, J. Krammer, Technical University
        of Munich, Germany
        The talk was also given by Mr.Bode.

        This talk was actually just a continuation of the previous
talk. Here Mr.Bode focused on the performance analyzer PATOP,
developed by his laboratory for the Parsytec company, which distributes
it under licence. He stipulated "low intrusion" as one of the major
requirements for a good performance analyzer and explained how this goal
is achieved in PATOP.  In order to make full use of the capabilities
of a tool to analyze MPP systems, scalability must be achieved not only
on a performance level but also in visualization.  This means that
significant data should be displayed in a way that makes it independent
of the actual scale of the system monitored. Various display techniques
supporting this paradigm were presented, and slides giving screen dumps
of live sessions were shown to explain the concepts introduced.
-------------------------------------------------------

        "Software Engineering Techniques and Tools for High
        Performance Parallel Systems"
        by P.Croll, I. Jelly, I. Gorton, University of
        Sheffield, UK
        The talk was given by Mr.Croll.

        Mr.Croll first stated that Software Engineering (SE) approaches
to developing quality SW are still very rare in the parallel computing
community. This seems especially noteworthy as the programming effort for
parallel programs is obviously much greater than that for conventional
serial systems.  One of the main requirements for a successful tool
would be to hide this extra complexity from the programmer, thus
allowing him to focus on the essential semantic requirements of his
program. Mr.Croll proposes that this can only be achieved by using formal
methods, such as Petri nets, that allow for mathematical analysis and
correctness proofs. This would have to be augmented by heuristic
strategies and graphical display techniques in order to make it easy and
efficient to use.  The PARSE approach developed at the University of
Sheffield aims at offering all the above-mentioned features. It is based
on an existing commercial CASE tool, Software through Pictures
(StP), which is augmented by a Petri-net based language.  In his
conclusion Mr.Croll stated that in future developments, tool
interoperability and method integrability will have to be provided in
order to make system adaptations easy and straightforward.

        P. Croll
        Dept. of Computer Science
        The University of Sheffield, UK
        email: p.croll@dcs.shef.ac.uk
-------------------------------------------------------

        "Constructing Parallel Implementation with Algebraic
        Programming Tools"
        by A.E.Doroshenko, A.B. Godlevsky, Ukrainian Academy of
        Science, Ukraine
        The talk was given by Mr.Doroshenko

        Mr.Doroshenko proposes a very formal approach to bridging the
HW/SW gap between architectures and languages. It is based on the
technique of algebraic specification, a formal method that makes it
possible to carry out semantics-preserving program transformations. His
institute developed the Algebraic Programming System (APS), which is
based on the notion of Algebras of Algorithms. Each such algebra is a
pair consisting of an Algebra of Operators and an
Algebra of Conditions, this being sufficient to characterize any
algorithm.  The semantics of parallelization are expressed by using the
state of a program as a unit. This state is a triple, consisting of 1)
memory state, 2) a list of already scheduled instructions and 3) a list
of not yet scheduled instructions. During the flow of a program the not
yet scheduled instructions are shuffled over to the already scheduled
instructions, with the program ending when all instructions are
scheduled and executed.  In his conclusion Mr.Doroshenko said that the
system is currently being implemented on a PC, where it is used to
explore promising strategies for picking the rewriting rules to be
used. This is of major importance if the system is ever to go into
production.
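
        The state triple can be pictured with a small sketch. It is
only an illustration under stated assumptions and not the APS itself:
the names State, step and run are hypothetical, and the sketch always
picks the first pending instruction, whereas the real system is said
to choose among rewriting rules.

```python
from typing import Callable, Dict, NamedTuple, Tuple

Instr = Callable[[Dict[str, int]], Dict[str, int]]

class State(NamedTuple):
    memory: Dict[str, int]                   # 1) memory state
    scheduled: Tuple[str, ...]               # 2) already scheduled
    pending: Tuple[Tuple[str, Instr], ...]   # 3) not yet scheduled

def step(s: State) -> State:
    """Move one pending instruction to the scheduled list and execute it."""
    (name, op), rest = s.pending[0], s.pending[1:]
    return State(op(s.memory), s.scheduled + (name,), rest)

def run(s: State) -> State:
    """The program ends when nothing is left to schedule."""
    while s.pending:
        s = step(s)
    return s

program = (
    ("x=2",   lambda m: {**m, "x": 2}),
    ("y=x+3", lambda m: {**m, "y": m["x"] + 3}),
)
final = run(State({}, (), program))
```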

        A.E. Doroshenko
        Glushkov Institute of Cybernetics
        Ukrainian Academy of Sciences
        Glushkov prosp., 40, Kiev-22
        252187 MSP, Kiev, Ukraine
        email: dor@d105.icyb.kiev.ua
-------------------------------------------------------

        -Plenary Session II-
        Japanese University Projects: "GRAPE Project for a
        Dedicated Tera-flops Computer"
        by D.Sugimoto, J. Makino, M. Taiji, T. Ebisuzaki,
        University of Tokyo, Japan
        The talk was given by Mr.Sugimoto

        This talk was in quite a few ways unconventional and
refreshing. The computer introduced was developed not by computer
scientists or electrical engineers but by astronomers who wanted cheap,
high-performance computing power to tackle complex n-body
calculations. The computer is a 100% dedicated solution that makes no
concessions to flexibility when it comes to the performance
vs. flexibility dichotomy. The first prototype system, having a peak
performance of 120 MFlops, was built with HW costs of about 2,000 USD!  The
development of this computer is entirely application-driven:
Mr.Sugimoto's Department of Earth Science and Astronomy faces the need
to simulate binary clusters in the Magellanic Cloud, a galaxy 'close'
to ours. This is essentially an n-body problem. Such a problem has the
advantage that it can be simulated using the "Direct
Mapping" approach, processing every virtual processor in
parallel. Data exchange is very regular and synchronous.  The system is
built upon eight multichip modules, each holding six custom-built HARP
chips with an external clock rate of 15MHz and an internal clock rate of
30MHz.  The particles (stars) to be simulated are divided into 8x6
streams which are processed in parallel by the HARP chips. At the end of
the pipeline, result data are collected and various integrals are
calculated.  Essentially the whole system is hardwired to
realize fourth-order Hermite integration. The system can be used to
tackle any problem that can be expressed through n bodies interacting
through two-body forces, such as molecular dynamics or some aspects of
the three-dimensional structure of proteins.  Mr.Sugimoto concluded by
expressing his disappointment at the lack of interest by computer
scientists in his project and his hope that this situation will change
for the better.
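
        The kind of calculation such a machine hardwires can be
sketched in software as a direct pairwise summation. This is a generic
illustration, not the HARP pipeline: the softening length eps, the unit
system G = 1 and the function name are assumptions, and the time
derivative of the acceleration that a Hermite scheme also needs is left
out for brevity.

```python
import math

def accelerations(pos, mass, eps=1e-3, G=1.0):
    """Direct O(n^2) summation of softened pairwise gravitational forces."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):               # one 'virtual processor' per particle
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + eps * eps
            f = G * mass[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += f * d[k]
    return acc

pos = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
mass = [1.0, 1.0, 2.0]
acc = accelerations(pos, mass)
```

The inner loop over j is identical for every particle i and needs no
data-dependent branching, which is why the interaction maps so
regularly onto parallel pipelines.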

               Daiichiro Sugimoto
               Department of Astrophysics
               College of Arts and Sciences (Komaba)
               University of Tokyo, Japan
               email: sugimoto@chianti.c.u-tokyo.ac.jp
-------------------------------------------------------

        Japanese University Projects:
        "Promising Data Parallel Environment - ADEPS,ADETRAN and
        ADENA"
        by T.Nogi, University of Kyoto, Japan

        Mr.Nogi represents a rather small research group consisting of
himself and one graduate student. He proposes a novel approach to data
parallelization which is based on line segmentation. In this segmentation
method every n-dimensional array is divided into 1-dimensional line
segments where each segment participates in computation as well as data
exchange (communication).  To accommodate this novel approach Mr.Nogi
presented a programming language, ADETRAN, which is a Fortran dialect
augmented by some segmentation directives. In order to process programs
written in this language, a new type of architecture, ADENA, is
needed. This basically revolves around two- to three-dimensional memory
arrays which are connected to the PEs in a row-column fashion. He has
built two prototypes, the ADENA-I in 1985 and the ADENA-II in 1989. The
latter, which was built in cooperation with Matsushita Electric Industry
Co.Ltd, consisted of 256 PEs with 8 MB each and had a peak performance
of 'several GFlops'. For the next step, the ADENA-04, Mr.Nogi proposes
an architecture based on 4x4 processor modules and 4x4x4 memory
modules. A configuration with 256 processor modules (16 processors each)
and 4096 memory modules, each processor being comparable in performance
to a VPP500 vector processor, would result in a 1.6 TFlops machine with
main memory of 1TB.
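
        Line segmentation can be pictured for the two-dimensional
case with a short sketch. This is not ADETRAN: the function names, the
list-of-lists array and the prefix sum used as the per-line operation
are assumptions made for illustration.

```python
def line_segments(a, axis):
    """Split a 2-D list-of-lists into 1-D lines along the given axis."""
    if axis == 0:
        return [list(row) for row in a]        # one segment per row
    return [list(col) for col in zip(*a)]      # one segment per column

def sweep(a, axis, op):
    """Apply op to every line independently, then reassemble the array."""
    lines = [op(seg) for seg in line_segments(a, axis)]
    return lines if axis == 0 else [list(r) for r in zip(*lines)]

def prefix_sum(seg):
    out, total = [], 0
    for x in seg:
        total += x
        out.append(total)
    return out

a = [[1, 2, 3],
     [4, 5, 6]]
rows_done = sweep(a, 0, prefix_sum)          # [[1, 3, 6], [4, 9, 15]]
both_done = sweep(rows_done, 1, prefix_sum)  # [[1, 3, 6], [5, 12, 21]]
```

Each line can be handled by a different PE. Switching from row lines to
column lines is the step at which data would have to be exchanged,
which a row-column memory arrangement is presumably meant to make
cheap.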

              Tatsuo Nogi
              Dept. of Applied Mathematics and Physics
              Faculty of Engineering
              Kyoto University
              Kyoto 606, Japan
              email:  nogi@kuamp.kyoto-u.ac.jp
-------------------------------------------------------

        Invited Talk: "Asynchrony in distributed parallel computing"
        by C.Jesshope, D. Barsky, A. Bolychevsky, A Shafarenko,
        University of Surrey, UK
        The talk was given by Mr.Jesshope.

        Mr.Jesshope talked about asynchronous parallel computing. This
has so far been used mainly in data-flow architectures, which have not yet
proceeded beyond the research or prototype phase. Mr.Jesshope and his
group propose a data-parallel approach in combination with asynchronous
processing. According to him, this can be achieved by making full use of
symmetry in data types.  His aim is to define a basic set of symmetric
data types.  Making use of this typing system, new compiler technology
would have to be developed that makes efficient data-driven asynchronous
computation possible.

               Chris Jesshope
               The University of Surrey
               Guildford Surrey GU2 5XH, UK
               email: C.Jesshope@ee.surrey.ac.uk
-------------------------------------------------------

        -Stream A: Architectures-

        "Logical Timing (Global Synchronization of Asynchronous
        Arrays)"
        by V. Varshavsky, V. Marakhovsky, University of Aizu,
        Japan
        and T.Chu, Cirrus Logic Inc., USA
        The talk was given by Mr.Marakhovsky

        This talk dealt with the synchronization problems encountered
when looking at abstract cellular automata.  This is one of the
theoretical models that can be used to describe parallel
computers. Mr.Marakhovsky began by giving essential definitions about
time: logical time, physical time, synchronization etc.  He then
proceeded to present the quite complex mathematical framework, which is
based on the master/slave model where receiving and sending of
information are separated into two distinct phases. This ensures
'asynchronous synchrony', meaning synchrony in logical time while having
asynchrony in physical time.  One of the main advantages of asynchrony
in physical time lies in the design of VLSI circuits. Clock-driven
circuits consume an order of magnitude more electricity than asynchronous
circuits. Achieving true asynchronous synchrony would allow the design of
cheaper and more efficient VLSI circuits.
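
        Synchrony in logical time despite asynchrony in physical time
can be demonstrated with a toy cellular automaton. The sketch below
uses a common textbook scheme (each cell keeps its current and previous
state and waits for lagging neighbours) as a stand-in; it is not the
formal framework of the talk, and the XOR rule and all names are
assumptions.

```python
import random

def rule(left, me, right):
    return left ^ me ^ right          # any local rule would do

def synchronous(cells, steps):
    """Reference run: all cells update together under a global clock."""
    n = len(cells)
    for _ in range(steps):
        cells = [rule(cells[i - 1], cells[i], cells[(i + 1) % n])
                 for i in range(n)]
    return cells

def asynchronous(cells, steps, seed=0):
    """Cells fire in random physical order but agree in logical time."""
    rng = random.Random(seed)
    n = len(cells)
    t = [0] * n                       # logical time of each cell
    cur, prev = list(cells), list(cells)

    def value_at(j, time):            # neighbour j's state at logical `time`
        return cur[j] if t[j] == time else prev[j]

    while min(t) < steps:
        i = rng.randrange(n)
        l, r = (i - 1) % n, (i + 1) % n
        # A cell may advance only if neither neighbour lags behind it.
        if t[i] < steps and t[l] >= t[i] and t[r] >= t[i]:
            new = rule(value_at(l, t[i]), cur[i], value_at(r, t[i]))
            prev[i], cur[i] = cur[i], new   # keep old state for neighbours
            t[i] += 1
    return cur

init = [0, 1, 1, 0, 1, 0, 0, 1]
```

Whatever the firing order, asynchronous(init, k) yields the same
configuration as synchronous(init, k), because neighbouring cells never
drift more than one logical step apart.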

        V.B. Marakhovsky
        University of Aizu
        Aizu-wakamatsu
        Fukushima 965-80, Japan
        email: marak@u-aizu.ac.jp
-------------------------------------------------------

        "Garbage Collection on Message Passing Distributed
        Systems"
        by M.V.Screenivas, S. Bhalla, Delhi Institute of
        Technology, India
        The talk was given by Mr.Bhalla

        Garbage collection in loosely coupled distributed systems was
discussed. The problem examined was recovery in case of abrupt failures.
There are two issues concerning this: 1) data recovery and 2) process
recovery.  The first is dealt with in standard database technology
using transaction mechanisms. Mr.Bhalla concerned himself more with the
issue of recovering distributed processes without having to restart the
whole system. This is done by defining checkpoints at which process
states are saved and thus committed.  The system is based on
send/recv-dependencies among communicating processes. These dependencies
are expressed in a dependency graph.  Mr.Bhalla then used the concept of
terminal graphs to combine message-passing processes into
independent groups that can recover their state after a certain
checkpoint as defined by their message interaction.
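
        How message interaction partitions processes into independent
recovery groups can be sketched as follows. The report does not define
terminal graphs, so the sketch substitutes plain connected components
of the send/recv dependency graph; that substitution and all names are
assumptions.

```python
def recovery_groups(processes, messages):
    """Group processes linked, directly or transitively, by send/recv pairs."""
    parent = {p: p for p in processes}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]   # path halving
            p = parent[p]
        return p

    for sender, receiver in messages:       # messages since last checkpoint
        parent[find(sender)] = find(receiver)

    groups = {}
    for p in processes:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())

msgs = [("A", "B"), ("B", "C"), ("D", "E")]
groups = recovery_groups("ABCDEF", msgs)
```

In this example a failure of process B would require rolling back only
A, B and C; D, E and F can continue undisturbed.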

        S. Bhalla
        University of Aizu
        Aizu-wakamatsu
        Fukushima 965-80, Japan
        email: bhalla@u-aizu.ac.jp
-------------------------------------------------------

        "Cohesion: An Efficient Distributed Shared Memory System
        Supporting Multiple Memory Consistency Models"
        by Ce-Kuen Shieh, An-Chow Lai, Jyh-Chang Ueng, National
        Cheng Kung University, Taiwan
        The talk was given by Mr.Shieh

        One of the approaches to implementing virtual shared memory
systems on message-passing architectures is through cache
coherency.  Mr.Shieh's team implemented an object-oriented run-time thread
system, based on C++ classes and using the Intel iRMK real-time kernel, to
implement cache coherency at page and at object granularity. Because
a run-time object server manages the system-provided classes,
no compiler or preprocessor is needed. The user just uses the
system-provided base class for his objects to ensure coherency across
the system.

        Ce-Kuen Shieh
        Dept. of Electrical Engineering
        National Cheng Kung University
        Tainan, Taiwan
        email: shieh@eembox.ncku.edu.tw
-------------------------------------------------------

        "On the effect of Spare Positioning on the
        Reconfigurability of Two-dimensional Processor Arrays"
        by V.O.Roda, T.T. Lin, University of Sao Paulo, Brazil
        The talk was given by Mr.Roda

        The probability that a processing element on a given wafer is
defective rises with the level of circuit integration. In order to increase
the yield of WSI or VLSI scale two-dimensional processing arrays, fault
tolerant system design becomes desirable.  Mr.Roda proposes introducing
spare PEs into each two-dimensional processing array. When system
diagnosis shows some of the PEs to be faulty, these spare PEs are used
to replace them. If this can be achieved, the circuit offers full
functionality despite the original defects.  The main problem with
this approach is that of routing.  Because it cannot be guaranteed that
interconnections can be found to replace a defective PE with a spare PE,
different positionings of the spare PEs are examined.  Mr.Roda
experimented with positioning the spare PEs along the right/lower edges
of the processing array and positioning them crosswise in the center of
the processing array. His simulations yielded satisfactory results, with
positioning in the center being slightly superior to positioning along
the edges.

        V.O. Roda
        Inst. de Fisica e Quimica de Sao Carlos
        University of Sao Paulo
        C. Postal 369/13560
        Sao Carlos, Brazil
-------------------------------------------------------

        -Stream B: Scheduling and Load Balancing-

        "An Experimental Study of Load Balancing on Amoeba"
        by Weiping Zhu, C.F.Steketee, University of South
        Australia, Australia
        The talk was given by Mr.Zhu

        Amoeba is a parallel operating system based on the model of a
processor pool. It is available as free SW.  Mr.Zhu and his team adapted
the source code to port it to a network-connected cluster of
workstations.  Two load balancing methods were examined: 1) pre-emptive
load balancing at process start time and 2) dynamic load balancing
involving process migration.  Mr.Zhu and his team implemented a
working load balancer based on a two-level model. The model consists of
a load balancer which determines the policy to be followed and a
migration server which provides the mechanism.  When migrating processes
the team chose to migrate all of the memory space at once, thus
transferring the full process image from source to destination
node. They experimented with distributed and with centralized load
balancing systems and evaluated their respective performances.  Their
results can be summarized as follows:  1) Prevention is better than
correction. The best results are achieved by implementing a sensible
pre-emptive load balancing strategy. Process migration proved quite
expensive and did not yield good results.  2) Centralized management is
better than distributed management. This is understandable when bearing
in mind the overhead involved in managing distributed servers of any
kind.
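
        The combination the study favours, centralized placement at
process start time with policy kept apart from mechanism, can be
sketched as below. This is a hypothetical illustration and not Amoeba
code: the class names, the least-loaded policy and the cost estimate
per process are assumptions.

```python
class LoadBalancer:
    """Policy: decide where a new process should start."""
    def __init__(self, nodes):
        self.load = {n: 0 for n in nodes}

    def choose(self):
        # Least-loaded node; ties are broken by node name.
        return min(self.load, key=lambda n: (self.load[n], n))

class ProcessServer:
    """Mechanism: start the process on whichever node the policy picks."""
    def __init__(self, balancer):
        self.balancer = balancer
        self.placement = {}

    def start(self, pid, cost=1):
        node = self.balancer.choose()
        self.balancer.load[node] += cost
        self.placement[pid] = node
        return node

server = ProcessServer(LoadBalancer(["n0", "n1", "n2"]))
server.start("p1", cost=3)
for pid in ("p2", "p3", "p4"):
    server.start(pid)
```

Because the policy sits behind a single choose() call, a different
strategy can be substituted without touching the code that starts
processes.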

        Weiping Zhu
        School of Computer and Information Science
        University of South Australia
        Adelaide, Australia SA5095
        weiping@machons3.levels.unisa.edu.au
-------------------------------------------------------

        "On finding Optimal Clusterings of Task Graphs"
        by W.Loewe, W. Zimmermann, University of Karlsruhe,
        Germany
        The talk was given by Mr.Loewe

        The PRAM (parallel random access machine) model is one of the
most widely used models to describe distributed shared memory
machines. It makes it possible to analyze parallel algorithms without
having to bother with the nitty-gritty of machine-dependent peculiarities.
One of the main drawbacks of this model is that it assumes uniform
instantaneous memory access. This makes the results obtained based on
this model quite unrealistic. Mr.Loewe bases his paper on the A-DRAM
model (asynchronous distributed random access machine). This model is
based upon message-passing parallel systems which asynchronously pass
messages of finite length among individual PEs.  Based on this model he
gives a definition of the granularity of a communication structure. A
communication structure is a graph where the vertices are the PEs and
the edges depict messages being passed between the two connected
vertices. A communication structure is coarse-grained iff, for every
vertex, the minimum computation time of its predecessors in the
communication graph is always larger than or equal to the maximum
communication time from any predecessor to that vertex.  Mr.Loewe uses
this definition to show that finding the optimal clustering of task
graphs is not NP-complete, as one might expect, but rather
O(T(n) x P(n) + log(idg)), where T(n) is the depth of the communication
structure, P(n) is the number of processors used and idg is the maximum
fan-in of any vertex in the graph.
Mr.Loewe proposes his model as a practical approach to the design of
parallel algorithms. Because the underlying A-DRAM model can be
parametrized to reflect real hardware architectures, algorithms can be
designed for the A-DRAM model, optimized using Mr.Loewe's algorithm and
then compiled to the target architecture.
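
        The coarse-grain condition as stated above can be checked
mechanically. The sketch follows the wording of this summary only; the
dictionary encoding of the graph and the function name are assumptions,
and the clustering algorithm itself is not reproduced.

```python
def is_coarse_grained(comp, comm):
    """comp[v]: computation time of v; comm[(u, v)]: message time on u->v."""
    preds = {}
    for (u, v) in comm:
        preds.setdefault(v, []).append(u)
    # For every vertex: slowest incoming message <= fastest predecessor.
    return all(
        min(comp[u] for u in us) >= max(comm[(u, v)] for u in us)
        for v, us in preds.items()
    )

comp = {"a": 4, "b": 5, "c": 2}
coarse = is_coarse_grained(comp, {("a", "c"): 3, ("b", "c"): 4})
fine = is_coarse_grained(comp, {("a", "c"): 3, ("b", "c"): 5})
```

In the first graph the cheapest predecessor of c computes for 4 time
units and the slowest message takes 4, so the condition holds; raising
one message time to 5 breaks it.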
-------------------------------------------------------

        -Plenary Session III-

        Japanese University Projects: "Aizu Supercomputer Project"
        by T.Ikedo
        see part 2 of this report for a detailed description of
        the Aizu Supercomputer Project.

        T. Ikedo
        University of Aizu
        Aizu-wakamatsu
        Fukushima 965-80, Japan
        email: ikedo@u-aizu.ac.jp

-------------------------------------------------------

        Japanese University Projects: "Visualization and
        Sonification of Methods"
        by N.Mirenkov, University of Aizu, Japan

        Mr.Mirenkov's paper was perhaps the most innovative and courageous
presented at this symposium. He proposes a radically new programming
paradigm based on animation films. Algorithms are stored in a multimedia
format where color, sound and text each describe some aspect of the
algorithm. New programs are created through techniques familiar to amateur
film makers: cutting, editing etc. By combining and merging component films,
new programs can be created.  The main motivation of this approach is to
hide irrelevant syntactical details from the programmer. Mr.Mirenkov
proposes that by using multimedia techniques to display algorithms,
different human talents can be effectively put to productive use. Color,
sound and text together offer a space infinitely more complex than
simple text yet are at the same time easily processed by the human
brain. Mr.Mirenkov will make full use of multimedia techniques in
designing his new programming system. The system will start off with a
library of hand-made algorithmic films to be expanded dynamically by
the users.  Mr.Mirenkov illustrated his new concepts with an animation
show that didn't fail to enthuse the audience.  Mr.Mirenkov's new
programming environment is one of the main applications foreseen for the
Aizu Supercomputer. In fact the multimedia center, due to be inaugurated
on March 24th, will supply the first test environment for his paradigm
with integrated video, audio and computing capabilities.  In
conclusion Mr.Mirenkov said that his approach is not revolutionary but
evolutionary, being closely related to the field of algorithmic templates
and skeletons. In fact he referred to his approach as one of 'multimedia
skeletons'.

        Nikolay N. Mirenkov
        University of Aizu
        Aizu-wakamatsu
        Fukushima 965-80, Japan
        email: nikmir@u-aizu.ac.jp

-------------------------------------------------------

        "A Journey into Multicomputer Routing Algorithms"
        by D.F.Hsu, M.D. Grammatikakis, M. Kraetzl
        The talk was given by Mr.Hsu

        This talk was in essence a re-rendering of the tutorial given by
Mr.Hsu. He listed the main requirements for an efficient interconnection
network as: easy routing, fault tolerance, scalability, symmetry, small
hardware costs etc.  Then he gave a brief overview of the most common
routing methods such as store-and-forward, circuit switching, wormhole
routing etc.  After this introduction he used permutation routing in a
hypercube topology as an example to explain the problems and issues
involved in designing and using efficient interconnection networks.
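
        Permutation routing on a hypercube can be tried out with a
short sketch. It uses dimension-order (e-cube) routing, a standard
scheme that may or may not be the one shown in the talk; the function
names and the two example permutations are assumptions.

```python
def ecube_route(src, dst, dim):
    """Dimension-order route in a dim-cube: fix differing bits low to high."""
    path, node = [src], src
    for bit in range(dim):
        if (node ^ dst) & (1 << bit):
            node ^= 1 << bit
            path.append(node)
    return path

def bit_reversal(node, dim):
    """Destination obtained by reversing the dim-bit address of node."""
    return int(format(node, f"0{dim}b")[::-1], 2)

def max_link_load(perm, dim):
    """Route a whole permutation; count the paths sharing the busiest link."""
    load = {}
    for src in range(1 << dim):
        p = ecube_route(src, perm(src), dim)
        for a, b in zip(p, p[1:]):
            load[(a, b)] = load.get((a, b), 0) + 1
    return max(load.values(), default=0)

route = ecube_route(0b000, 0b101, 3)                   # [0, 1, 5]
complement = max_link_load(lambda s: s ^ 0b111, 3)
reversal = max_link_load(lambda s: bit_reversal(s, 4), 4)
```

Sending every node to its bitwise complement never puts two paths on
the same directed link, whereas the bit-reversal permutation already
forces some links to be shared in a 16-node cube. Contention of this
kind is one of the issues an interconnection network design has to
address.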

              D. Frank Hsu
              Department of Computer & Information Science
              Fordham University
              Bronx, NY 10458-5198, USA
              email: hsu@murray.fordham.edu

-------------------------------------------------------

        -Stream A: Formal Methods-

        "Program Transformations and Skeletons: Formal Derivation
        of Parallel Programs"
        by A.Max Geerling, University of Nijmegen, The Netherlands

        This paper uses semantics-preserving transformations on
algorithmic skeletons to create an environment in which parallel
programs can be designed independently of the underlying machine
topology. The program is specified in a way that allows it to be
translated into program skeletons.  These can then be compiled for a specific
machine topology to create an executable program.  Mr.Geerling's idea is
to apply semantics-preserving transformations not to programs but to
program skeletons.  By formally specifying the transformation from one
machine topology Ma to another machine topology Mb, an equivalent,
parallel transformation can be applied to a program skeleton developed
for Ma to yield a new program skeleton for machine topology Mb.  The
whole system is based on a functional programming language, the reason
being that this programming paradigm is much easier to handle than
procedural languages when applying transformations.  Mr.Geerling
demonstrated his ideas by transforming a matrix-multiplication program
for a two-dimensional mesh topology to an equivalent program for a
two-dimensional torus.

        A. Max Geerling
        Computer Science Institute
        University of Nijmegen
        Toernooiveld 1, NL-6525 ED Nijmegen
        The Netherlands
        email: max@cs.kun.nl
-------------------------------------------------------

        "Dynamic Randomized Simulation of Hierarchical PRAMs on Meshes"
        by T.Heywood, University of Edinburgh, UK
        and C.Leopold, Friedrich-Schiller-Universitaet  Jena, Germany
        The talk was given by Ms.Leopold

        The Hierarchical PRAM is based on the PRAM programming model. The
main difference is that it allows for recursive partitioning, where no
inter-partition communication is allowed. Using the Peano indexing
scheme multiple H-PRAMs are mapped onto a two-dimensional mesh. The
mapping can be done so that for randomized memory access (this allows
uniform memory access time statistically) any p-processor m-memory  sub
H-PRAM can be simulated in O(sqrt(p)) with a probability of 1 -
pow(p,k), with a memory requirement of O(m/p) per mesh node.  The
motivation of this research is the assumption that, for scalability
reasons, mesh architectures are going to become predominant in the near
to medium future. Using the simulation approach proposed in this paper,
algorithms designed for any abstract PRAM machine can then effectively
be simulated (embedded) on a given machine with mesh-topology.

        Claudia Leopold
        Fakultat fur Mathematik und Informatik
        Friedrich Schiller Universitat Jena
        Jena 07740, Germany
        email: claudia@minet.uni-jena.de
-------------------------------------------------------

        -The Aizu Supercomputer Project-

        The Aizu Supercomputer is a very ambitious project to create a
massively parallel system for Virtual Reality (VR) and multimedia
applications.  The designers seem to have followed the simple strategy
to use only the best and fastest components.  The system is a
cluster-based multiprocessor, each cluster consisting of 8 PEs.

        *The Processor Elements*
        Each PE consists of a 64-bit R4400 RISC microprocessor by MIPS,
two memory management processors (MMP), two routers, a cache
coherence controller for distributed shared memory configurations and up
to 64MB local SRAM with 15ns access time.  The R4400 RISC microprocessor
contains 32 general purpose registers, 32 KB embedded primary cache and
4MB secondary cache.  The microprocessor is not directly connected to
the local memory, but rather uses the MMP units for memory access.
Because the MMPs are connected to the local memory as well as to the
routers, remote memory access is transparent and SW-less.

        *Interconnection topology*
        All inter-PE connections are based on serial links.  Within each
cluster all PEs are fully interconnected. Each processor has 1 outgoing
and 7 incoming serial links based on ECL technology with a bandwidth of
150 MB/sec (1.2 GHz).  Similarly, each cluster has 1 outgoing and n-1
incoming serial optical links (n<13 is the number of directly connected
clusters).  The outgoing inter-cluster link is shared by the 8 local PEs
using a round-robin scheduling mechanism.  The inter-cluster links have
a bandwidth of 3 GHz and will be upgraded to 10 GHz by 1996.  In a 10
GHz configuration, inter-cluster PE-to-PE communication would be exactly
as fast as intra-cluster PE-to-PE communication (150 MB/sec).
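
        The bandwidth figures above can be checked with a few lines
of arithmetic. The sketch reads the quoted "GHz" link rates as raw
Gbit/s and ignores any line-coding overhead; both are assumptions,
suggested by the text's own equation of 150 MB/sec with 1.2 GHz.

```python
PES_PER_CLUSTER = 8

def mbytes_per_s(mbit_per_s):
    """Convert a raw serial rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8

def per_pe_share(link_mbit_per_s, pes=PES_PER_CLUSTER):
    """Round-robin sharing gives each PE an equal slice of the link."""
    return mbytes_per_s(link_mbit_per_s) / pes

intra = mbytes_per_s(1200)        # intra-cluster link: 150.0 MB/s
share_now = per_pe_share(3000)    # 3 Gbit/s shared by 8 PEs: 46.875 MB/s
share_1996 = per_pe_share(10000)  # 10 Gbit/s shared by 8 PEs: 156.25 MB/s
```

Under these assumptions a 10 GHz inter-cluster link leaves each PE
about 156 MB/sec when all eight PEs transmit at once, slightly more
than the 150 MB/sec available inside a cluster, whereas the present
3 GHz link leaves each PE under a third of that.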

        *Memory Organization*
        Memory access is managed by the MMP. The processors are not
directly connected to the SRAM modules. All load/store requests are
handled by the MMP, which is responsible for address resolution, local
memory management and communication to and from the router for remote
memory access.  Due to the extremely fast interconnection network that
has practically zero latency due to the hardwired communication scheme,
local and remote memory access is more or less equally fast.  In the
basic configuration there is no explicit shared memory system. However
by simply replacing one PE per cluster with a memory module, shared
memory with uniform access time can be easily implemented.

        *Applications of the Aizu Supercomputer*
        The Aizu Supercomputer is meant to be used in VR or multimedia
applications. The Aizu multimedia center, due to be inaugurated March
24, 1995 will be built around this computer. The architecture and
topology are meant to be especially tuned for graphic computation,
allowing for instance real time video processing. One application slated
to run on the Aizu supercomputer will be Professor Mirenkov's multimedia
programming environment (see Plenary Session III for a discussion of his
paper).

        *Current state*
        A one-cluster prototype is due to be delivered by the hardware
maker just in time for the inauguration of the multimedia center. A full
system having 1365 PEs with a peak performance of more than 100GFlops is
planned to be up and running by next year. (The uneven number of 1365
PEs is the result of some of the PEs being replaced by memory modules).

--------------------------------END OF REPORT---------------------------

