Newsgroups: comp.parallel.mpi,comp.parallel.pvm
From: jeg@theory.tc.cornell.edu (Jerry Gerner)
Subject: Re: Thinking about parallel I/O
Organization: Cornell Theory Center
Date: 11 Jul 1995 13:57:53 GMT
Message-ID: <3tu011$2tur@theory.tc.cornell.edu>

In article <3ttcgi$uh8@winx03.informatik.uni-wuerzburg.de>,
hinker@cip.informatik.uni-wuerzburg.de (Stefan Hinker) writes:
|> David M. Beazley (dmb@bifrost.Lanl.GOV) wrote:
|> 
|> : Having spent some time working on a message-passing application for the
|> : CM-5, T3D and other machines, one of the biggest problems that has come 
|> : up has been the apparent nonstandardization of I/O on different machines.
|> : My solution to this problem,
|>	[stuff deleted...]
|> : others have solved this problem within the PVM/MPI environment.  If 
|> : you'd like to share your experience, please respond.  If there's interest,
|> : I'll post a summary of responses later.
|> 
|> Sounds interesting.  However, it is not quite clear to me what exactly
|> you mean by parallel I/O.  I can think of several different things right
|> now, some of which I have had problems with myself, mostly collecting
|> debug output from different tasks (which is, of course, somewhat easier
|> with xpvm or the like, but there could be other solutions).  Given some
|> example problems or definitions of "parallel I/O", I would look forward
|> to the discussion here.
|> 
|> Greetings,
|> Stefan
|> 

Parallel I/O is different strokes for different folks, e.g.,

1) your example of "collecting debug-output" is a good one!  Anyone who's
   ever tried to debug a parallel program run across N processors using 
   Brand X's tracing tools probably knows what you're talking about!
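
   One cheap way to make case (1) tractable, sketched here in Python: have
   every task tag its debug lines with its task id, then demultiplex the
   interleaved stream afterwards.  The "[task N]" prefix below is just a
   convention invented for this sketch, not any particular tool's format.

```python
# Sketch: demultiplexing interleaved debug output from N tasks.
# Assumes each task prefixes every line with "[task <id>] " -- a
# convention of this example only, not any real tool's format.

def demux(lines):
    """Group interleaved debug lines by originating task id."""
    by_task = {}
    for line in lines:
        # Expect lines like "[task 3] some message"
        tag, _, msg = line.partition("] ")
        task = int(tag.strip("[").split()[1])
        by_task.setdefault(task, []).append(msg)
    return by_task

interleaved = [
    "[task 0] starting solver",
    "[task 2] boundary exchange done",
    "[task 0] iteration 1 residual=0.5",
]
print(demux(interleaved))
# → {0: ['starting solver', 'iteration 1 residual=0.5'],
#    2: ['boundary exchange done']}
```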

2) John Q. Engineer is modeling unsteady flows around the latest and greatest
   aircraft surface.  His simulation runs across 128 processors on Brand Y's
   scalable parallel computer.  Every "N iterations", each and every
   processor wants to save 18 MB of data that will be used later on to
   generate an animation of these flows over time.  "Every N iterations"
   occurs 550 times during a single run!  John Q. has spent considerable
   time porting, tuning, and tweaking his parallel code in order to get
   "good speedups".  I bet John Q. wants some type of parallel I/O as well,
   in order to deal with 18 MB x 128 processors x 550 occurrences = 1.3 TB
   of "soon to be viz data"!!!  He's probably talking about parallel I/O at
   a fairly fine-grained application I/O level.
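
   For the curious, that 1.3 TB figure checks out.  A quick back-of-the-
   envelope in Python, using decimal units (1 TB = 10^6 MB):

```python
# Aggregate viz data for example (2), in decimal units (1 TB = 1e6 MB).
mb_per_proc = 18          # MB each processor dumps per occurrence
procs = 128
occurrences = 550         # "every N iterations" happens 550 times per run
total_mb = mb_per_proc * procs * occurrences
print(total_mb / 1e6)     # → 1.2672 (TB), i.e. the ~1.3 TB above
```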

3) John Q. also spends thousands of hours of computing time modeling these
   new aircraft surfaces.  Naturally he's interested in "checkpointing" his
   runs occasionally (his computer has been known to go down from time to
   time).  When his model checkpoints, each and every processor wants to
   save 85 MB of "restart data".  Since John Q. wants to make sure this
   stuff is around the next time he might wish to restart his simulation,
   he'd really appreciate having some way to move the 85 MB x 128 processors
   = 10.9 GB of data from Brand Y's filesystem into Brand Z's archival data
   system.  He's probably talking about multiple/parallel pipes (you name
   'em: HiPPI, whatever) to slam that 10.9 GB onto some archival media (you
   name 'em: RAID, DLT, whatever) in some mass storage system, in some
   efficient way... and to be able to get it back if and when he needs it.
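
   Same sanity check for case (3), plus a toy model of why John Q. wants
   those parallel pipes: striping the checkpoint across several links
   divides the transfer time.  The 4 pipes at 40 MB/s below are numbers
   invented purely for illustration; real pipe counts and rates will differ.

```python
# Checkpoint volume for example (3), and an idealized transfer time
# assuming the data is striped perfectly across several parallel pipes.
mb_per_proc = 85
procs = 128
total_gb = mb_per_proc * procs / 1e3    # → 10.88 GB, the ~10.9 GB above

def transfer_seconds(total_gb, pipes, mb_per_s_per_pipe):
    """Idealized time to move total_gb striped over `pipes` parallel links."""
    return total_gb * 1e3 / (pipes * mb_per_s_per_pipe)

print(total_gb)                                                     # → 10.88
# Hypothetical: 4 pipes at 40 MB/s each (made-up numbers).
print(transfer_seconds(total_gb, pipes=4, mb_per_s_per_pipe=40.0))  # → 68.0
```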

Just 3 slightly different examples.  The PTOOLS folks might be concerned 
about the first of these, the MPI-IO folks about the second, and the PIO folks
about the third.  'Course it's all just "parallel I/O"   :-)

BTW,...  be sure to check out Dave Kotz's Parallel I/O Archive web page at:

	http://www.cs.dartmouth.edu/pario.html

Lots of interesting stuff there, and lots of links to lots of other pages too.


Jerry Gerner
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Cornell Theory Center          Email: jeg@tc.cornell.edu
730 Frank H.T. Rhodes Hall     Tel:   +1 607-254-8852
Cornell University             Fax:   +1 607-254-8888
Ithaca, NY 14853-3801 USA      URL:   http://www.tc.cornell.edu/~jeg

