Newsgroups: comp.parallel.pvm
From: zxu@monalisa.usc.edu (Zhiwei Xu)
Subject: How to solve these TWO problems in Fortran 90, HPF, PVM, or MPI? (long)
Organization: University of Southern California, Los Angeles, CA
Date: 2 Jul 1995 04:55:11 -0700
Message-ID: <3t61ev$cej@monalisa.usc.edu>

Dear Colleagues,

I have received several very educational responses to my post. Here is
a summary and additional comments:

1.	I had a hidden agenda in my previous post, which I'd like to make clear:

	I believe there are two essential functionalities that are missing in
	current Fortran 90, HPF, PVM and MPI: 

		support for atomicity, and
		support for eureka (exhandle in Express)

	Following the HPF-2 practice, I offer two simple and concrete
	motivating examples to support my claim. 

	I suspect that these two simple problems are difficult to solve
	efficiently using any current data parallel or message passing system.
	I still have this feeling after carefully studying the responses I got.

	More specifically, is there a parallel solution to the target detection
	problem as nice as the sequential code (BTW, I did not write it)?
	It is short, simple, and clear. Anyone can understand it after staring
	at it for two minutes. It is also efficient, not performing extra
	computation. In short, it is *elegant* (I know, I know, there is a
	goto!).

2.	Any parallel computing system can be viewed from three levels:

	Application:		database, molecular dynamics, radar signal processing

	Programming Model:	data parallel, message passing, shared variable

	Architecture Platform:	SMP, MPP, workstation clusters

	Atomicity and eureka are needed at the application level. They should be
	supported by any programming model on any architecture platform. They
	are not inherently shared memory problems.

3.	People are starting to address the atomicity (e.g., in HPF-2) and eureka
	(e.g., one-sided communication in MPI-2) issues. In fact, there is a
	section in the MPI book about how to simulate an atomic counter. However,
	the solution there is at odds with the elegance of the rest of MPI.

	Michael Kumbera of the Maui High Performance Computing Center showed me
	a nice HPF program for target detection, which is quoted below, with
	Mike's permission. Note that Mike's is a hacker's code :), just showing
	how to get the job done. You probably can make it shorter and cleaner.
	Also, Mike's code always evaluates IsTarget N*M times.

4.	I am an admirer of the Fortran 90, HPF, PVM and MPI designs. I am 
	especially enthusiastic about MPI. It has powerful functionalities, yet
	it is NOT complex. Its simplicity is due to a few *orthogonal*,
	clean concepts (e.g., message datatypes, communicator, virtual topology,
	point-to-point and collective operations), which are elegantly integrated
	without adversely affecting one another. This orthogonality creates a
	multiplying effect: a few simple concepts, when used together, can
	create a lot of functionalities, in a clean way.

	And it is designed by a committee! The elegance of MPI design can be
	compared to that of the IEEE standard for floating-point arithmetic.
	I hope atomicity and eureka can be integrated with such elegance.

5.	Some people suggested that the targeting problem can be solved by a
	master-slave approach in a message passing environment. The slaves
	try to find targets. When a target is found by a slave, it sends a
	message to the master, which updates k. When MaxTargets is reached,
	the master broadcasts to tell all slaves to terminate.

	Although this is a nice idea, it is not easy to implement in PVM or
	MPI (or even in a shared memory model). Recall that all communications
	in PVM or MPI are two-sided: a slave sees the termination message only
	when it stops searching to receive or probe for it.

Any response is appreciated.

Zhiwei Xu	zxu@aloha.usc.edu

Appendix 1. The banking problem is to illustrate the need for atomicity.

Assume a company gives an ATM card to each of its 10,000 employees (that'll be
the day :), which can be used to access a checking account and a savings account.
Assume these employees must be allowed to transfer funds between these accounts,
possibly simultaneously, in addition to depositing and withdrawing.
A transfer is an atomic operation:

Atomically do {
	if ( balance(savings) > $100 ) {
		balance(savings) = balance(savings) - $100
		balance(checking) = balance(checking) + $100
	}
}

How can this be done in Fortran 90, HPF, PVM, or MPI?
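
For comparison, here is a minimal sketch of what "Atomically do" could look
like on a single shared-memory node, written in C11. The Accounts type and
the accounts_init and transfer functions are made-up names for illustration,
and the spin lock is only an assumption about one possible mechanism. It says
nothing about how a data parallel or message passing system should provide
the same guarantee.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical account pair; balances are in whole dollars. */
typedef struct {
    atomic_flag lock;   /* guards both balances */
    long savings;
    long checking;
} Accounts;

/* Set the starting balances and release the lock. */
static void accounts_init(Accounts *a, long savings, long checking)
{
    atomic_flag_clear(&a->lock);
    a->savings = savings;
    a->checking = checking;
}

/* Move `amount` from savings to checking if savings exceeds it.
 * The test and both updates happen while the lock is held, so no other
 * thread using this function can observe a half-finished transfer.
 * Returns true if the transfer took place. */
static bool transfer(Accounts *a, long amount)
{
    bool done = false;

    while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
        ;   /* spin until the lock is free */

    if (a->savings > amount) {
        a->savings -= amount;
        a->checking += amount;
        done = true;
    }

    atomic_flag_clear_explicit(&a->lock, memory_order_release);
    return done;
}
```

A call such as transfer(&acct, 100) corresponds to the $100 transfer above.
Deposits and withdrawals would have to take the same lock for the guarantee
to hold.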

Appendix 2. The target detection problem is to illustrate the need for eureka
		and probably atomicity.

----------------------------------------------
#define			MaxTargets	10	/* between 1 and 50 */
#define			N		1000	/* between 16 and 5000 */
#define			M		1000	/* between 16 and 5000 */
COMPLEX                 A[N][M];
Target_Structure        TargetList[MaxTargets] ;

k=0 ;
for ( i=0 ; i<N ; i++ )
        for ( j=0 ; j<M; j++ )
        {
                if ( IsTarget(A[i][j]) )
                {
                        TargetList[k].distance = i ;
                        TargetList[k].direction = j ;
                        k++ ;
                        if ( k==MaxTargets ) goto finished ;	/* list is full */
                }
        }
finished: ;


--------------------------------------
Additional Information from the Programmer (User):

        The code tries to detect the MaxTargets closest targets (if any)

	How the targets are distributed is completely unknown

        The for (j ...) loop can be parallelized. That is, for each i,
                the data points A[i][*] can be examined in any order

        You can assume that IsTarget is a side-effect free, pure routine

	It is desirable that the program terminates when MaxTargets targets
	are found (That's why the sequential code has the goto). IsTarget
	could be expensive, on the order of Mflops.
----------
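
To make the cost of a missing eureka concrete, here is a sequential C
simulation of a row-synchronous scheme: each simulated worker scans its own
block of columns, and the workers learn that the list is full only at the end
of a row. This is not a eureka; it is a sketch of the fallback. The names
detect, is_target, Target and the worker count are hypothetical, and plain
int samples stand in for the COMPLEX data.

```c
#include <stddef.h>

#define MAX_TARGETS 10

typedef struct { int distance, direction; } Target;

/* Hypothetical stand-in for IsTarget: pure, no side effects. */
static int is_target(int sample) { return sample == 0; }

/* Scan an n-by-m grid stored row-major in `a`, using `workers` simulated
 * workers per row. Returns the number of targets recorded in `out`
 * (at most MAX_TARGETS). *evals receives the number of is_target calls. */
static int detect(const int *a, int n, int m, int workers,
                  Target out[MAX_TARGETS], long *evals)
{
    int k = 0;
    *evals = 0;

    for (int i = 0; i < n && k < MAX_TARGETS; i++) {
        /* In a real system the iterations of this loop would run
         * concurrently; each worker owns a contiguous block of columns. */
        for (int w = 0; w < workers; w++) {
            int lo = (int)((long)m * w / workers);
            int hi = (int)((long)m * (w + 1) / workers);
            for (int j = lo; j < hi; j++) {
                (*evals)++;
                /* is_target is always evaluated: the worker cannot know
                 * that the list filled up elsewhere in this row. */
                if (is_target(a[(size_t)i * m + j]) && k < MAX_TARGETS) {
                    out[k].distance = i;
                    out[k].direction = j;
                    k++;    /* would have to be atomic in a parallel run */
                }
            }
        }
        /* Row boundary: the only point where k is checked globally. */
    }
    return k;
}
```

With this scheme the number of IsTarget evaluations is always a multiple of
the row length, whereas the sequential code stops at the exact element that
fills the list.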


Appendix 3. Mike's HPF code for targeting:

      ! IsTarget returns 1 (target) when point < 1, and 0 otherwise
      pure integer function IsTarget(point)
      integer, INTENT(IN) :: point
      if (point >= 1) then
         IsTarget = 0
      else
         IsTarget = 1
      end if
      end function

      program target
      use HPF_LIBRARY

      implicit none

      interface
      integer pure function IsTarget(point)
      integer, INTENT(IN) :: point
      end function
      end interface
      integer M,N, num_results

      parameter(N = 100)
      parameter(M = 200)
      parameter(num_results = 10)

      integer mask(N,M)
      integer row(N,M)
      integer col(N,M)
      integer a(N,M)
      !HPF$ PROCESSORS procs(NUMBER_OF_PROCESSORS())
      !HPF$ DISTRIBUTE (BLOCK,B
      integer base(num_results)
      integer dir(num_results)
      integer dist(num_results)
      integer i,j

      !initialize a to something.
      forall(i=1:N,j=1:M) a(i,j)=modulo(i+j,17)

      !initialize row and col
      row = 1
      col = 1
      forall(i=1:N,j=1:M) row(i,j)=i
      forall(i=1:N,j=1:M) col(i,j)=j

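      ! mask will store the destination of the target data:
      ! first a 0/1 flag per point, then the running count of targets,
      ! zeroed again for non-targets by multiplying with the flag
      forall(i=1:N,j=1:M) mask(i,j)=IsTarget(a(i,j))
      mask = mask * sum_prefix(mask)

      forall(i=1:N,j=1:M, mask(i,j)>=1 .and. mask(i,j)<=num_results)
         dist(mask(i,j)) = row(i,j)
         dir(mask(i,j)) = col(i,j)
      end forall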

      do i=1,num_results
         write(*,*)  dist(i), dir(i)
      end do

      end program target
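
The idea behind the mask and sum_prefix steps can be restated in a few lines
of C: a prefix sum over 0/1 flags gives each flagged element its slot in the
output list, so the final scatter has no dependence between iterations. This
is only a sketch of the general technique, not a translation of Mike's
program; compact_first and its parameters are hypothetical names, and the
data is treated as a flat array.

```c
#include <stddef.h>

/* Write the indices of the first `max_out` nonzero flags into `out`.
 * `slot` is scratch space of length n. Returns the number of indices
 * written, which is at most max_out. */
static size_t compact_first(const int *flag, size_t n,
                            size_t *slot, size_t *out, size_t max_out)
{
    /* Inclusive prefix sum: slot[p] counts the flags set in 0..p. */
    size_t running = 0;
    for (size_t p = 0; p < n; p++) {
        running += (flag[p] != 0);
        slot[p] = running;
    }

    /* Scatter: each flagged element owns a distinct slot, so these
     * iterations are independent and could run in any order. The
     * flag test keeps unflagged elements from writing. */
    for (size_t p = 0; p < n; p++)
        if (flag[p] != 0 && slot[p] <= max_out)
            out[slot[p] - 1] = p;

    return running < max_out ? running : max_out;
}
```

Both loops touch all n elements, which matches the remark above that this
approach evaluates IsTarget N*M times and has no early exit.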

