Newsgroups: comp.parallel.mpi
From: steinhh@amon.uio.no (Stein Vidar Hagfors Haugan)
Subject: Re: How to solve these TWO problems in Fortran 90, HPF, PVM, or MPI? (long)
Organization: Institute for Theoretical Astrophysics
Date: 3 Jul 1995 07:42:27 GMT
Message-ID: <3t8713$693@hermod.uio.no>


In article <3t61dt$cde@monalisa.usc.edu>, zxu@monalisa.usc.edu (Zhiwei Xu) writes:
[..snip..]
|> 
|> 	More specifically, is there a parallel solution to the target detection
|> 	problem as nice as the sequential code (BYW, I did not write it)?
|> 	It is short, simple, and clear. Anyone can understand it after staring
|> 	at it for two minutes. It is also efficient, not performing extra
|> 	computation. In short, it is *elegant* (I know, I know, there is a
|> 	goto!).
|> 

[...snip...snip...]

This is my $.05:

Target detection: Introduce a variable Delta_Hits, which holds
the number of targets found (0 or 1) by each processor in each loop
iteration. This is then MPI_Allreduce'd every iteration, so the sum
ends up on all the working nodes. Now, all the nodes know when the
total number of hits exceeds the desired limit (10). This takes care
of the efficiency concern about IsTarget(). As for the amount of
bandwidth used, I cannot see how to get the desired functionality
with any less .....
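A minimal serial sketch of that pattern (plain Python rather than real
MPI calls, with a made-up is_target() rule and rank count purely for
illustration): each "rank" computes its local Delta_Hits, the counts
are summed across all ranks as MPI_Allreduce with MPI_SUM would do,
and every rank sees the same running total, so all stop together.

```python
LIMIT = 10       # stop once this many targets are found in total
N_RANKS = 4      # number of simulated processes (arbitrary)

def is_target(rank, step):
    """Stand-in for the real IsTarget() test (hypothetical rule)."""
    return (rank * 7 + step * 3) % 11 == 0

def detect(n_steps):
    total_hits = 0
    hits_per_rank = [0] * N_RANKS
    for step in range(n_steps):
        # Each rank computes its local Delta_Hits for this iteration...
        delta = [1 if is_target(r, step) else 0 for r in range(N_RANKS)]
        # ...then the allreduce gives every rank the same sum.
        total_hits += sum(delta)
        for r in range(N_RANKS):
            hits_per_rank[r] += delta[r]
        if total_hits >= LIMIT:
            break    # every rank sees the same total, so all stop here
    return total_hits, hits_per_rank
```

Note that the break decision is made from replicated information, so no
rank ever needs to ask a master whether it should keep searching.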

After that, all the nodes send in their lists of targets through
an MPI_Allgather call, together with the number of targets each
processor has found. This can be implemented with a fixed-length
array (10 targets) per processor. The only remaining task is
to pick out those targets that would have been selected by the
sequential model. (There may be up to N_nodes - 1 hits too many,
but with no extra computational overhead..)
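The trimming step can be sketched like this (again plain Python in
place of the gather itself; the (loop_index, target_id) pairs and the
tie-breaking rule are illustrative assumptions): after the allgather,
every rank holds identical data, so every rank can apply the same
deterministic rule and all agree on the final 10 targets.

```python
LIMIT = 10

def select_targets(local_lists):
    """local_lists: one list per rank of (loop_index, target_id)
    pairs, each at most LIMIT long -- what the MPI_Allgather of the
    fixed-length buffers plus counts would deliver to every rank."""
    gathered = [hit for rank_list in local_lists for hit in rank_list]
    # Keep the earliest hits by loop index, i.e. the ones the
    # sequential code would have found before hitting its limit.
    gathered.sort(key=lambda hit: hit[0])
    return gathered[:LIMIT]
```

Since the rule depends only on the gathered data, no further
communication is needed to reach agreement.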

The general philosophy: Instead of safeguarding "volatile" status
variables on one "master" process, share the information and make
sure all the nodes make the same decisions (after all, they have
the same information).

The same philosophy might be used for the bank transfer problem
(without having studied the case very thoroughly): for each
round of withdrawals, distribute information about the size of
all withdrawals to all the processes, and make the decision
locally. This example is a bit more awkward, though, as you're
really describing "real-time interactive banking" on parallel
processors. This means some synchronization has to occur for
each round of withdrawals. But still....
I suspect that the underlying mechanisms of atomicity may cost
more, in lost parallelization and communication overhead, than
dealing with the problem explicitly in your algorithms would.
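The idea for one round might look like this (a sketch under assumed
rules, not a real banking protocol): the round's withdrawal requests
are gathered everywhere, and every process applies them to its own
replica of the balance in the same fixed order (here, by rank), so
all replicas reach the same balance and the same grant/deny decisions.

```python
def apply_round(balance, requests):
    """requests: (rank, amount) pairs for one round, one per process,
    as an allgather would deliver them. Returns the new balance and
    the list of granted requests."""
    granted = []
    for rank, amount in sorted(requests):  # same order on every process
        if amount <= balance:
            balance -= amount
            granted.append((rank, amount))
    return balance, granted
```

For example, with a balance of 100 and requests of 50, 60, and 30 from
ranks 0, 1, and 2, every replica grants ranks 0 and 2 and denies
rank 1, ending at a balance of 20 -- with no locking needed.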

Stein Vidar H. Haugan
