Newsgroups: comp.parallel
From: Paul Havlak <havlak@cs.umd.edu>
Subject: Re: Help - explain superlinear speedup?
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Date: Fri, 13 Jan 1995 17:21:01 GMT
Message-ID: <3f68le$9rv@condor.cs.umd.edu>

In article <3f23ht$ec9@agate.berkeley.edu>,
Steve Slater <slater@nuc.berkeley.edu> wrote:
>I have a program which has superlinear speedup and I
>can't explain it. Does anyone have any ideas. Here is
>the summary.
>...
>There was NO memory swapping occurring during the entire
>execution time. I would periodically check with ps.
>
>Does anyone have any thoughts?

If I follow your argument, you've checked that the single-processor run
never swaps, so reduced paging of virtual memory cannot be the cause of
the superlinear speedup on multiple processors.

But that's only one level of the memory hierarchy.  How about cache
effects?

	* Cache size vs. subarray size
		The more processors you have, the more total data cache
		you have.  Perhaps each processor's subarray now fits
		into its local cache, while the full array did not fit
		into the single processor's cache.

	* Cache line size and traversal order
		If you've made the mistake of using row-major traversal
		over an array stored in column-major order (or vice
		versa), you may get *no* cache reuse for long columns
		(with one processor), because consecutive accesses are
		a whole column apart.  You get some reuse once the
		columns are shorter than a cache line, and it keeps
		improving as the column length approaches one or until
		the subarray fits completely in cache.
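
To make the two effects above concrete, here is a minimal sketch of a
cache-miss counter in Python.  Everything in it is an assumption chosen
for illustration, not taken from your program: the toy cache is fully
associative with LRU replacement (real caches are usually less
forgiving), it holds 32 lines of 8 elements each, the array is 256 x 64
in column-major order, and the names count_misses and
column_major_addresses are hypothetical.

```python
from collections import OrderedDict

LINE_SIZE = 8      # elements per cache line (assumed)
CACHE_LINES = 32   # lines per processor cache (assumed)


def count_misses(addresses, cache_lines=CACHE_LINES, line_size=LINE_SIZE):
    """Count misses in a toy fully associative LRU cache."""
    cache = OrderedDict()  # keys are line numbers, ordered oldest to newest
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used
    return misses


def column_major_addresses(rows, cols, by_rows):
    """Element offsets visited when sweeping a column-major rows x cols array.

    by_rows=True walks along each row (stride = rows, the "wrong" order);
    by_rows=False walks down each column (stride = 1).
    """
    if by_rows:
        return [j * rows + i for i in range(rows) for j in range(cols)]
    return [j * rows + i for j in range(cols) for i in range(rows)]


if __name__ == "__main__":
    procs = 64  # hypothetical processor count; each gets 256/64 = 4 rows

    # Traversal order on one processor: whole 256 x 64 array.
    one_bad = count_misses(column_major_addresses(256, 64, by_rows=True))
    one_good = count_misses(column_major_addresses(256, 64, by_rows=False))
    print("1 proc, row-wise misses:   ", one_bad)
    print("1 proc, column-wise misses:", one_good)

    # Same bad traversal on one processor's 4 x 64 subarray.
    sub_bad = count_misses(column_major_addresses(4, 64, by_rows=True))
    print("per-proc misses with", procs, "procs:", sub_bad)
    print("linear scaling would predict:", one_bad // procs)

    # Two sweeps in the good order: the subarray fits, the full array does not.
    full = column_major_addresses(256, 64, by_rows=False)
    sub = column_major_addresses(4, 64, by_rows=False)
    print("1 proc, two sweeps:  ", count_misses(full * 2))
    print("per-proc, two sweeps:", count_misses(sub * 2))
```

Under these assumptions the row-wise sweep on one processor misses on
every one of its 16384 accesses, while each of 64 processors sweeping
its own 4 x 64 piece the same way takes only 32 misses, well under the
256 that linear scaling would predict.  That kind of gap is what can
show up as superlinear speedup.  A real machine will not behave exactly
like this model, so treat the numbers as a demonstration of the
mechanism only.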
-- 
Paul Havlak                      Dept. of Computer Science, A.V. Williams Bldg
Postdoctoral Research Associate  U. of Maryland, College Park, MD 20742-3255
High-Performance Systems Lab     (301) 405-2697 (fax: -2744) havlak@cs.umd.edu


