Newsgroups: comp.parallel
From: loeb@cs.duke.edu (Michael Alan Loeb)
Subject: FFT's on the CM-5
Organization: Duke University Department of Computer Science, Durham, NC
Date: 18 Apr 1995 00:35:20 GMT
Message-ID: <3n36v7$jsg@usenet.srv.cis.pitt.edu>

I have been developing some pseudospectral codes simulating 3-D
convection for the CM-5, and have been getting poor performance from
the FFTs in the CMSSL library.  Is anybody out there doing anything
similar, and what kind of performance do other people get?

Here's the technical data.

In the problems we're studying, the arrays for our 3-D mesh are on the
order of 256 * 256 * 16.  (The first 2 dimensions will always be much
greater than the 3rd.)  I use the "Layout" directives to decompose the
first dimension across the processors while keeping all of the data in
the last 2 dimensions local to each processor.  This seemed a
reasonable approach to me.  Is it?  The trouble is that the
communication along the non-local axis is killing my performance.
I've tried the approach suggested in the CMSSL manual, where you do
the FFTs on the local axes, transpose a local axis with the non-local
axis using the gen_matrix_transpose routine, and then do the FFT on
the remaining axis (which is now local).  It buys me about a factor of
2 speedup, but the code is still slow compared to similar codes on the
Cray Y-MP.  Is there something I'm missing?  Any help that people
could give would be greatly appreciated.
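
For concreteness, the transpose approach described above can be
sketched as follows.  This is a serial Python illustration only, not
CM Fortran and not the CMSSL API: the helper names (fft, fft_axis1,
fft_axis2, swap01, fft3_transpose, dft3_direct) and the tiny 4 * 8 * 2
test array are made up for the sketch, and axis 0 merely stands in for
the axis that would be distributed across processors.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_axis2(a):
    """FFT along the last axis of a[i][j][k] (local to each plane a[i])."""
    return [[fft(row) for row in plane] for plane in a]

def fft_axis1(a):
    """FFT along the middle axis of a[i][j][k] (also local to a[i])."""
    out = []
    for plane in a:
        cols = [fft(list(col)) for col in zip(*plane)]  # cols[k][j]
        out.append([list(row) for row in zip(*cols)])   # back to [j][k]
    return out

def swap01(a):
    """Exchange axes 0 and 1: b[j][i][k] = a[i][j][k].
    On a distributed machine this would be the only communication step."""
    return [[a[i][j] for i in range(len(a))] for j in range(len(a[0]))]

def fft3_transpose(a):
    """3-D FFT that only ever transforms along 'local' axes 1 and 2."""
    a = fft_axis2(a)   # step 1: local axis
    a = fft_axis1(a)   # step 2: other local axis
    b = swap01(a)      # step 3: old axis 0 becomes local axis 1
    b = fft_axis1(b)   # step 4: FFT along the formerly non-local axis
    return swap01(b)   # step 5: restore the original axis order

def dft3_direct(a):
    """Brute-force 3-D DFT, used only to check the result."""
    n0, n1, n2 = len(a), len(a[0]), len(a[0][0])
    def coeff(u, v, w):
        return sum(a[i][j][k] * cmath.exp(
                       -2j * cmath.pi * (u * i / n0 + v * j / n1 + w * k / n2))
                   for i in range(n0) for j in range(n1) for k in range(n2))
    return [[[coeff(u, v, w) for w in range(n2)]
             for v in range(n1)] for u in range(n0)]

# Small stand-in for the 256 * 256 * 16 mesh (sizes are arbitrary).
N0, N1, N2 = 4, 8, 2
a = [[[complex(i + 2 * j, k - i * j) for k in range(N2)]
      for j in range(N1)] for i in range(N0)]

result = fft3_transpose(a)
expected = dft3_direct(a)
err = max(abs(result[u][v][w] - expected[u][v][w])
          for u in range(N0) for v in range(N1) for w in range(N2))
print("max abs difference:", err)
```

In this sketch the two swap01 calls are the only steps that would move
data between processors; every FFT runs along an axis held locally.
Whether the final swap back is needed depends on what layout the rest
of the code expects, which is an assumption here.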


Thanks in Advance,

Michael Loeb
--
Department of Computer Science, Duke University, Durham, NC 27706
Internet:	loeb@cs.duke.edu
UUCP:		mcnc!duke!USER

