Newsgroups: comp.parallel.mpi
From: salo@mrjones.engr.sgi.com (Eric Salo)
Subject: Re: Performance problems using mpi on the SGI Power Challenge
Organization: Silicon Graphics, Inc.  Mountain View, CA
Date: 24 Sep 1996 07:47:25 GMT
Message-ID: <5283md$h0v@murrow.corp.sgi.com>

> The reason for the performance hit is that the SGI implementation
> of MPI spins when waiting for a message to arrive. The receiving process
> accumulates CPU time and doesn't yield to another process that might
> have useful work to do. Synchronization delays further magnify the
> problem. The tradeoff is that spinning reduces the latency
> (substantially?) when there are fewer processes than processors. 
> There are ways to get the best of both worlds, by spinning for
> a short time and then yielding the processor, but the implementation
> is more difficult. I wouldn't be surprised to see this in 
> a subsequent release. (No inside info here - it's just an obvious
> thing to try). 

Exactly. (Thanks for covering for me while I was out of town, Bill! :-)

As soon as you have more MPI processes than CPUs on which to run them,
performance drops sharply, because a spinning receiver holds its CPU
instead of yielding it to a process that has useful work to do. This
was a deliberate trade-off that we made early on because we believed
that the common case would be not to oversubscribe the CPUs in this way.

We definitely intend to go for the "best of both worlds" in some future
release, but I don't yet have a good feel for when that might be. On
the other hand, if I can come up with a quick-and-dirty backoff algorithm
that appears to do the trick, I'll certainly drop it in...
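
For the curious, here is a rough sketch of the spin-then-yield idea
described above. It is only an illustration, not the SGI MPI code: the
names wait_with_backoff, countdown, and countdown_done are made up for
this example, the spin limit is an arbitrary knob, and it assumes a
POSIX system that provides sched_yield(). The poll callback stands in
for whatever "has my message arrived yet?" check the library would
really use.

```c
#include <sched.h>    /* sched_yield (POSIX) */
#include <stdbool.h>

/*
 * Wait until poll(arg) returns true.
 * Spin for at most spin_limit failed polls (low latency when a CPU is
 * free), then yield the processor after every further failed poll so
 * that other runnable processes can make progress.
 * Returns the number of times the CPU was yielded.
 */
unsigned long wait_with_backoff(bool (*poll)(void *), void *arg,
                                unsigned long spin_limit)
{
    unsigned long spins = 0;
    unsigned long yields = 0;

    while (!poll(arg)) {
        if (spins < spin_limit) {
            ++spins;            /* still in the cheap busy-wait phase */
        } else {
            sched_yield();      /* give up the CPU, then poll again */
            ++yields;
        }
    }
    return yields;
}

/* Hypothetical stand-in for a message check: succeeds on the Nth poll. */
struct countdown {
    int remaining;
};

bool countdown_done(void *arg)
{
    struct countdown *c = arg;
    return --c->remaining <= 0;
}
```

With fewer processes than CPUs the message usually shows up during
the spin phase, so the latency is the same as pure spinning. When the
CPUs are oversubscribed, the waiter stops burning its time slice once
the spin limit is reached. Picking a good limit is the hard part.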

Eric Salo         Silicon Graphics Inc.             "Do you know what the
(415)933-2998     2011 N. Shoreline Blvd, 8U-802     last Xon said, just
salo@sgi.com      Mountain View, CA   94043-1389     before he died?"

