Newsgroups: comp.parallel
From: halstead@crl.dec.com (Bert Halstead)
Subject: Re: Help - explain superlinear speedup?
Organization: AOSG
Date: Tue, 31 Jan 1995 14:38:12 GMT
Message-ID: <3gj3cr$4bl@quabbin.crl.dec.com>

In article <mjrD36ADL.2FJ@netcom.com>, mjr@netcom.com (Mark Rosenbaum) writes:
|> In article <3g7slu$m8q@hub.meiko.co.uk>, James Cownie <jim@meiko.co.uk> wrote:

  [stuff deleted]

|> >Another possibility for superlinear speedup can occur in searching problems.
|> >Consider searching an array of size N for a particular value, where by chance that
|> >value occurs half way down the array. If we run on one processor, we will find
|> >the value in time N/2. Give it to two processors with a block split and the 
|> >second processor will find it in time 1. (Since it'll be the first element
|> >it looks at). With a big enough array you can get an arbitrarily large
|> >superlinear speedup !
|> >
|> 
|> I think this may be a bit optimistic. If you subtract 1 for the location,
|> you now have processor 1 finding it at the end of its search. The average
|> should be the same. You may find some advantage with odd distributions
|> mapped to the architecture, but this would not be the general case. ...
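For what it's worth, the averages in the array case are easy to check exactly.
A small sketch (Python, with an arbitrary array size N -- none of this is from
the original posts) enumerates every possible target position and confirms that
the two-way block split buys an average speedup of just under 2, i.e. linear,
not superlinear:

```python
# Exact expected search times for a uniformly random target position.
N = 1000  # array size (hypothetical choice)

# Sequential scan: a target at position p (1..N) is found after p comparisons.
seq = sum(p for p in range(1, N + 1)) / N  # = (N + 1) / 2

# Two processors, block split: proc 0 scans 1..N/2, proc 1 scans N/2+1..N.
# Wall-clock time is the step at which whichever processor holds the
# target reaches it.
par = sum((p if p <= N // 2 else p - N // 2) for p in range(1, N + 1)) / N

print(seq, par, seq / par)  # speedup just under 2: linear, not superlinear
```

The lucky case (target exactly at N/2 + 1) gives a huge speedup, but it is
exactly cancelled on average by the unlucky positions near the end of each
block.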

I'm inclined to agree with you on this case, but in more general search
problems where the search space is tree-like, rather than linear, there are
genuine superlinear speedup possibilities that arise when the distribution
of solutions in the tree is highly uneven.  In this case, even without
pre-arranging that the solution is found at a convenient location (as in
Jim Cownie's example) it is still the case that the expected time to
find the first solution for N search engines that search different parts
of the tree in a depth-first manner is less than 1/N of the expected time
that one sequential depth-first search takes to find the first solution. 
Intuitively, this is because the one sequential search has a high
likelihood of exploring one or more protracted dead ends before finding the
first solution, whereas in the case of N parallel search engines, it is
more likely that one of them will begin by searching a part of the tree
where a solution can be found at a shallow depth.  (I agree that the
intuitive argument is not definitive, but in this case the mathematics
backs up this intuitive argument.)
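One way to see it is with a deliberately uneven toy model (not from the
mathematics referred to above; the parameters K, DEAD, and FIND below are
hypothetical, chosen purely for illustration): the root has K subtrees, one of
which, equally likely to be any of them, hides a solution at shallow depth,
while the rest are protracted dead ends.

```python
K = 10      # top-level subtrees == number of search engines (hypothetical)
DEAD = 100  # cost to exhaust one dead-end subtree
FIND = 1    # cost to reach the shallow solution within its subtree

# Sequential DFS visits the subtrees in some order and must exhaust every
# dead end placed before the solution's subtree. Averaging over the K
# equally likely positions of that subtree:
seq = sum(g * DEAD + FIND for g in range(K)) / K

# K parallel engines, one per subtree: the engine that starts in the
# solution's subtree reaches it after FIND steps, wherever it is.
par = FIND

print(seq, par, seq / par)  # speedup of 451 with K = 10: far beyond K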
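```
The expected parallel time here is a constant while the expected sequential
time grows with the number and depth of the dead ends, so the speedup can be
pushed arbitrarily far past K.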

Of course, instead of performing a single depth-first search, a single
processor could simulate N search engines executing the parallel algorithm
described above, and in that case the superlinear speedup would go away
(except for the possibility that the single processor would suffer from
context-switching overhead that would not be present in the multiprocessor
case).						-Bert Halstead


