Newsgroups: comp.parallel
From: lim@halcyon.usc.edu (Young Won Lim)
Subject: QR decomposition on a single node SP-2
Organization: University of Southern California, Los Angeles, CA
Date: 9 Apr 1995 12:14:58 -0700
Message-ID: <3mje1u$srt@usenet.srv.cis.pitt.edu>

Dear Sir:

  This is Young Won Lim.  I would appreciate it if you could reply to
my question.

  I ran the QR decomposition routine (sgells in ESSL) on a single node
of an SP-2.  The input matrix is 1024 x 192, and the measured
execution time was approximately 2.5 seconds.  I don't know the exact
operation count of the sgells routine, but sequential Householder QR
decomposition of an m x n matrix takes about n^2(m - n/3) operations.
To my knowledge, the peak performance of POWER2 is 266 MFlops, but
this run achieves only approximately 14 MFlops.
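
  For reference, the arithmetic behind the 14 MFlops figure can be
sketched as follows.  This is only a rough check: the function and
variable names are mine, and it assumes the n^2(m - n/3) count quoted
above.  Many references count Householder QR as 2n^2(m - n/3) flops
(multiplies and adds counted separately), which would double the rate;
sgells also does pivoting and solve work that this estimate ignores.

```python
# Rough estimate of the achieved rate for the 1024 x 192 run.
# Assumption: the operation count is n^2 * (m - n/3), as quoted in the post.
# The names below are illustrative only; they are not part of ESSL.

def householder_qr_ops(m, n):
    """Operation count n^2 * (m - n/3) for an m x n matrix with m >= n."""
    return n * n * (m - n / 3.0)

def achieved_mflops(ops, seconds):
    """Convert an operation count and a wall time into millions of ops/sec."""
    return ops / seconds / 1.0e6

m, n = 1024, 192        # matrix dimensions from the run
elapsed = 2.5           # measured execution time in seconds
peak = 266.0            # quoted POWER2 peak, in MFlops

ops = householder_qr_ops(m, n)
rate = achieved_mflops(ops, elapsed)
fraction = rate / peak

print("operations : %.0f" % ops)
print("rate       : %.2f MFlops" % rate)
print("of peak    : %.1f%%" % (100.0 * fraction))
```

  Under these assumptions the run comes out at roughly 5% of the quoted
peak, or roughly 10% if the 2n^2(m - n/3) count is used instead.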

  I am wondering why this takes so long.  I tried the -qarch and -O3
compiler options, but they didn't help much.  Is there any other way
to optimize this?

Thank you in advance.

Sincerely yours,
Young Won Lim.


