Newsgroups: comp.lang.fortran,comp.parallel,comp.sys.super
From: David Coster TOK <dpc@ipp-garching.mpg.de>
Subject: Re: APR xHPF 2.1 Released; NAS Parallel Benchmark Results
Organization: Max Planck Institute for Plasma Physics, Garching b. Munich, Germany.
Date: Fri, 21 Jul 1995 12:16:52 GMT
Message-ID: <3uo5rk$1jk@s30l5s.aug.ipp-garching.mpg.de>

Recently there was a posting about the apparent success of an HPF
compiler.  Below, I have extracted the tables from that posting and
added columns comparing its results with those quoted by the
manufacturers (``NAS Parallel Benchmark Results 3-95'', Saini and
Bailey).  The original posting did not make it clear whether the Class
A or Class B suites were being used (Class B being the ``bigger''
problems), but the codes at the ftp site quoted in the article seemed
to have the characteristic sizes of the Class A codes.  Thus I have
calculated the ratio of each HPF time to the manufacturer's Class A
time.  If, in fact, the codes used were sized for Class B, the
situation is not quite as bad as I have pictured it.
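
For anyone who wants to redo the arithmetic, here is a minimal Python
sketch of the ratio calculation.  The rows are copied from the SP
table further down; the names ROWS, slowdown and report are my own
and purely illustrative.  It prints the ratio against both the Class A
and the Class B times, so the effect of the class assumption is
visible.  Note that the script rounds, whereas the ratio column in the
tables appears to be truncated to two decimals, so the last digit can
differ.

```python
# Rows copied from the SP table in this post:
# (platform, processors, HPF seconds, vendor Class A seconds,
#  vendor Class B seconds).
ROWS = [
    ("Cray C90", 1, 7634.0, 174.50, 689.60),
    ("IBM SP2-WIDE", 16, 576.0, 83.2, 300.6),
    ("IBM SP2-WIDE", 64, 192.0, 30.1, 91.7),
]


def slowdown(hpf_seconds, vendor_seconds):
    """Return how many times slower the HPF run is than the vendor run."""
    return hpf_seconds / vendor_seconds


def report(rows):
    """Format one line per row with the ratio against Class A and Class B."""
    lines = []
    for platform, procs, hpf, class_a, class_b in rows:
        lines.append(
            "%-13s %3d  vs A: %6.2f  vs B: %6.2f"
            % (platform, procs, slowdown(hpf, class_a), slowdown(hpf, class_b))
        )
    return lines


if __name__ == "__main__":
    for line in report(ROWS):
        print(line)
```

For the 16-processor SP2 row this gives a ratio of roughly 6.9 against
the Class A time but only about 1.9 against the Class B time, which is
why the class assumption matters so much.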

Possible conclusions that I can draw from this are:
  (1) the compiling system still has a long way to go;
  (2) the NAS Parallel Benchmark is a bad benchmark because it allows
too much freedom, so the manufacturers' NAS Parallel Benchmark
numbers do not really reflect hardware performance, but instead
rate the performance of the people rewriting the code;
  (3) some combination of (1) and (2).

#========  NAS Benchmark Results  ========
#
#NOTE: The following times are between 2-10 times slower than the
#timings reported by the various vendors.  The major difference is due
#to the vendors' extensive rewriting of the benchmarks to obtain the
#best possible single node performance.  APR has asked and will continue
#to ask the vendors to supply their optimized single node versions of
#the benchmarks so everyone can start with the same sequential
#programs.  To date, however, all vendors have refused saying their
#versions of the benchmarks are proprietary.
#

The three added columns are:

Manufacturer's Class B Time =====================================+
                                                                 |
Manufacturer's Class A Time ==============================+      |
                                                          |      |
Ratio of HPF time to manufacturer's Class A Time =+       |      |
                                                  |       |      |
#                                                 |       |      |
#Benchmark SP: Simulated CFD Application          |       |      |
#---------------------------------------------------------------------
#Platform         Processors       Time(Sec.) 
#---------------------------------------------------------------------
# Cray C90            1               7634. **  43.74   174.50  689.60
#---------------------------------------------------------------------
# Cray T3D           16               2368.     11.71   202.11  818.07
#                    32               1353.     12.99   104.10  463.62
#                    64                728.     13.90   53.26   233.52
#---------------------------------------------------------------------
# IBM SP2-WIDE       16                576.     6.92    83.2    300.6  
#                    32                320.     6.57    48.7    163.8       
#                    64                192.     6.37    30.1    91.7       
#---------------------------------------------------------------------
# Intel Paragon      16               3435.     
#                    32               2202.    
#                    64               1257.   
#---------------------------------------------------------------------
# Sun SPARCcenter     8               2382.
# 2000 (40 MHz)      16               1617.
#---------------------------------------------------------------------
#
#
#
#Benchmark EP: Embarrassingly Parallel Benchmark
#---------------------------------------------------------------------
#Platform         Processors       Time(Sec.)       
#---------------------------------------------------------------------
# Cray C90            1                694.     18.95   36.62   146.41
#---------------------------------------------------------------------
# Cray T3D           16                100.     4.39    22.74   91.83
#                    32                 50.     4.39    11.37   45.92
#                    64                 25.     4.40    5.68    22.52
#---------------------------------------------------------------------
# IBM SP2-WIDE       16                 79.     7.93    9.95    39.89
#                    32                 40.     8.03    4.98    19.9
#                    64                 23.     9.23    2.49    9.95
#---------------------------------------------------------------------
# Intel Paragon      16                251.  
#                    32                126. 
#                    64                 64.
#---------------------------------------------------------------------
# DEC ALPHA           4                261.  
# 3000/900 (275 MHz)  8                131.
#---------------------------------------------------------------------
#SGI PowerChallenge   4                459.
#MIPS R8000           8                233. 
#                    16                116.
#---------------------------------------------------------------------
#
#
#Benchmark BT: Simulated CFD Application
#---------------------------------------------------------------------
#Platform         Processors       Time(Sec.) 
#---------------------------------------------------------------------
# Cray C90            1              10615.     38.34   276.80  1023.4
#---------------------------------------------------------------------
# Cray T3D           16               1958.     8.49    230.41  918.04
#                    32               1044.     9.03    115.53  476.97
#                    64                551.     9.33    59.01   252.86
#---------------------------------------------------------------------
# IBM SP2-WIDE       16                446.     3.95    112.9   440.86
#                    32                245.     3.96    61.8    226.8
#                    64                164.     4.72    34.7    119.1
#---------------------------------------------------------------------
# Intel Paragon      16               5741.  
#                    32               3091. 
#                    64               1809.
#---------------------------------------------------------------------
# Sun SPARCcenter     8               3393.
# 2000 (40 MHz)      16               1759.
#---------------------------------------------------------------------
#
#
#
#Benchmark FT
#---------------------------------------------------------------------
#Platform         Processors       Time(Sec.)  
#---------------------------------------------------------------------
# Cray C90           Code Requires more memory than available
                                                        8.95    110.60
#---------------------------------------------------------------------
# Intel Paragon      16               1165.
#                    32               249.7
#                    64               247.7
#---------------------------------------------------------------------
# Cray T3D           16               279.4     23.67   11.80   NA
#                    32               192.41    32.61   5.90    NA
#---------------------------------------------------------------------
# IBM SP2-WIDE       16               104.8     14.61   7.17    91.8
#                    32                67.52    17.05   3.96    47.23
#
#
#
#
#Benchmark MG
#---------------------------------------------------------------------
#Platform         Processors       Time(Sec.)   
#---------------------------------------------------------------------
# Cray C90           Code Requires more memory than available
                                                        7.27    33.78
#---------------------------------------------------------------------
# IBM SP2-WIDE       16                 17.48   5.51    3.17    14.58
#                    32                 12.25   7.24    1.69    7.72
#                    64                 12.07   12.70   0.95    4.36
#
#
#Source code for these benchmarks can  be found via anonymous FTP to
#ftp.infomall.org in subdirectory /tenants/apri/Bench or via WWW to the
#URL http://www.infomall.org/apri/
#
#Printed hardcopy of this report can be obtained by contacting:
#
#           ___   _____    _____
#__________/==|__|==__=\__|==__=\     Applied Parallel Research, Inc.
#_________/===|__|=|__\=\_|=|__\=\    1723 Professional Drive
#________/=/|=|__|=|__/=/_|=|__/=/    Sacramento, CA 95825
#_______/=/_|=|__|==___/__|====_/_______________________________________________
#______/=___==|__|=|______|=|\=\________________________________________________
#_____/=/___|=|__|=|______|=|_\=\_______________________________________________
#    /_/    |_|  |_|      |_|  \_\    
#Voice:     (916)481-9891             E-mail:    support@apri.com
#FAX:       (916)481-7924             APR Web Page: http://www.infomall.org/apri
#-------------------------------------------------------------------------------
#

Dave.
-- 
David Coster
dpc@ipp-garching.mpg.de                   http://www.ipp-garching.mpg.de/~dpc/
dcoster@pppl.gov
dcoster@princeton.edu


