Newsgroups: comp.arch,comp.parallel,comp.sys.super,comp.parallel.pvm,comp.parallel.mpi
From: Zhiwei Xu <zxu@monalisa.usc.edu>
Subject: How to solve this simple problem in F90, HPF, PVM, or MPI? (Summary of Responses to "Is SP2 the best supercomputer?")
Organization: University of Southern California, Los Angeles, CA
Date: Thu, 22 Jun 1995 20:45:34 GMT
Message-ID: <3sckpe$c95@monalisa.usc.edu>

Dear Colleagues,

A few weeks ago, I sent a post titled "Is SP2 the best supercomputer?".
I have received many encouraging responses. This is a brief summary.

1.	Some people pointed out that the good things I said about SP2 are
	not unique to SP2. In particular, Cray machines, the CM-5, and the
	SGI Power Challenge have them, too. I have never used the Power
	Challenge; perhaps someone who has can post their experience.

2.	There are two broad classes of benchmarks. Macro benchmarks
	such as SPEC and NAS measure the execution times of a set of
	applications, and thus gauge the performance of a whole system,
	including hardware, OS, and compiler. Rafael Saavedra at the
	USC CS dept. developed a micro benchmark method, which aims at
	measuring primitives such as memory accesses and various overheads.
	Roger Hockney's formula and the COMM codes in Genesis (or PARKBENCH),
	and the STREAM benchmark also relate to microbenchmarking.

	Macro benchmarks are useful for comparing systems; micro benchmarks
	are helpful when using a system. Vendors should provide results of
	both, but currently they provide little microbenchmark information.

3.	christal@savines.imag.fr, at the IMAG institute in Grenoble, France,
	suggested forming

	"... an IBM SPx User's Group. A user's group is much better suited
	to take up users' problems and research solutions, for any kind of
	problem (basic or complex, involving third-party software or
	hardware). There have been many user's groups for various machines.
	Why not for the SPx? As this could lead to more SPx sales, maybe
	IBM will provide funds to sustain the effort."

	I think this is a wonderful suggestion. I hope IBM or
	one of the HPC centers can follow up.

4.	Some people asked why I suggested that support for atomicity and
	eureka should be included in a message passing system. I have the
	following comments:

4.1	Atomicity and eureka are needed by some user applications, and these
	needs must be satisfied one way or another, whether you are using a
	shared-variable, a message-passing, or a data-parallel programming
	model. Examples of such applications include databases, transaction
	processing, and discrete event simulation. A short but concrete
	example is given below.

4.2	Atomicity and eureka ARE supported by some message passing systems.
	E.g., in Express there are exhandle and semaphores.

4.3	The following loop is a simplified version of a real code fragment.
	I feel that there is no simple parallel implementation in Fortran 90,
	HPF, MPI, or PVM, because they lack support for atomicity and eureka.
	I would appreciate any suggestions and will summarize the proposed
	solutions.

----------------------------------------------
COMPLEX 		A[N][M];
Target_Structure	TargetList[10];

k = 0;
for ( i = 0; i < N; i++ )
	for ( j = 0; j < M; j++ )
	{
		if ( IsTarget(A[i][j]) )
		{
			TargetList[k].distance = i;
			TargetList[k].direction = j;
			k++;	/* not "k = k++", which is undefined in C */
			if ( k == 10 ) goto finished;
		}
	}
finished:
-------------------------------------------------

Additional Information from the Programmer (User):

	The code tries to detect the ten closest targets (if any).

	The for (j ...) loop can be parallelized; that is, for each i,
		the data points A[i][*] can be examined in any order.

	You can assume that IsTarget is a side-effect-free, pure routine.



Many thanks.

Zhiwei Xu	zxu@aloha.usc.edu


