Newsgroups: comp.parallel.mpi
From: Wai Sun Don <wsdon@hydra.cfm.brown.edu>
Subject: Re: Matrix-matrix multiply
Organization: Div. of Applied Math., Brown U.
Date: Fri, 06 Sep 1996 22:44:45 -0400
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <3230E19D.167EB0E7@hydra.cfm.brown.edu>

Moreover, I am curious whether there exists any scalable and efficient
algorithm that computes y = A*x^T, where T denotes the transpose and
the rank of A is not necessarily an integer multiple of the number of
processors.  In other words, I would like to multiply A
with data distributed across processors.
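
To make the "not an integer multiple" case concrete, here is a minimal
sketch of an uneven block distribution. It assumes a plain block layout
in which the first n mod p processes each take one extra index; the
function name block_partition is hypothetical, and the post does not say
which layout is actually in use. The resulting counts and displacements
are the kind of arrays one would hand to MPI_Scatterv or MPI_Alltoallv.

```python
def block_partition(n, p):
    """Split n indices over p processes as evenly as possible.

    Assumption: simple block layout; the first n % p processes
    get one extra index. Returns (counts, displs).
    """
    base, extra = divmod(n, p)
    counts = [base + 1 if r < extra else base for r in range(p)]
    # displs[r] is the global index of the first item owned by process r
    displs = [sum(counts[:r]) for r in range(p)]
    return counts, displs

# 10 rows over 4 processes: no process differs from another by more than one row
counts, displs = block_partition(10, 4)
```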

The obvious way to
do this is to do a global transpose of x so that the right elements
of x are local to each processor, do the matrix multiply, and then
do a global transpose of the result y back to its original form.
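
The data movement in that global transpose can be sketched serially as
follows. This is only a simulation under assumptions the post does not
state: x is taken to be split by column blocks before the exchange and
by row blocks (whole rows local) after it, and the helper names
partition and cols_to_rows are hypothetical. In a real MPI code the
inner gathering step would be an all-to-all exchange such as
MPI_Alltoallv.

```python
def partition(n, p):
    """Return [(start, stop), ...] for an uneven block split of n over p."""
    base, extra = divmod(n, p)
    counts = [base + 1 if r < extra else base for r in range(p)]
    ranges, start = [], 0
    for c in counts:
        ranges.append((start, start + c))
        start += c
    return ranges

def cols_to_rows(col_blocks, m, p):
    """Simulated all-to-all: column blocks of an m-row matrix -> row blocks.

    col_blocks[q][i] holds the piece of row i owned by process q.
    """
    row_blocks = []
    for r0, r1 in partition(m, p):
        # the receiving process concatenates the pieces of each of its rows
        rows = [[v for blk in col_blocks for v in blk[i]] for i in range(r0, r1)]
        row_blocks.append(rows)
    return row_blocks

# 5 x 3 matrix on 2 simulated processes: neither dimension divides evenly
x = [[10 * i + j for j in range(3)] for i in range(5)]
p = 2
col_blocks = [[row[c0:c1] for row in x] for c0, c1 in partition(3, p)]
row_blocks = cols_to_rows(col_blocks, 5, p)
```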

The second way is to compute y = (x*A^T)^T, which equals A*x^T.
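
A small serial check of this identity, written as a sketch. It assumes
the rows of x are split unevenly over two hypothetical processes, that
each process holds whole rows, and that A is replicated everywhere; none
of this is stated in the post. Under those assumptions each process can
form its part of x*A^T from local data alone, and only the final
transpose needs communication.

```python
def matmul(a, b):
    """Dense product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
# 5 rows of x: not a multiple of the 2 simulated processes
x = [[1, 0, 2], [3, 1, 0], [0, 2, 1], [1, 1, 1], [2, 0, 3]]

direct = matmul(A, transpose(x))           # y = A * x^T computed directly

blocks = [x[0:3], x[3:5]]                  # uneven 3 + 2 row split
At = transpose(A)
local = [matmul(blk, At) for blk in blocks]    # each block uses only its own rows
gathered = [row for blk in local for row in blk]
y = transpose(gathered)                    # y = (x * A^T)^T
```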

Thanks for the info.

-- 
Wai Sun Don
Visiting Associate Professor (Research)

