Newsgroups: comp.sys.transputer
From: schedan@eskimo.com (Daniel Scheurell)
Subject: Re: T805 + C004 = Help needed.
Organization: Eskimo North (206) For-Ever
Date: 16 Apr 94 03:08:09 GMT
Message-ID: <CoC0pL.Ar8@eskimo.com>

Efthimios Tambouris (Efthimios.Tambouris@brunel.ac.uk) wrote:
: Hi,
: following the discussion on the performance of transputers I
: would like to pose some questions:
: 1) suppose an interconnection of T805 via C004 switches. What
: is the time required for two transputers to communicate? Is 
: there a start-up cost and a per word cost? Is there any measurement
: for the values of these costs? or any benchmark to measure it?

As has been posted previously (by andyr, I believe), there's about a
300 Kbyte per second loss in data rate through a C004 switch.  This results
in a sustained rate of about 1.4 Mbytes per second between two transputers
connected via a C004.  I suppose the task switching time could be seen as a
start-up cost for channel communication.  I've seen about 4-5 microseconds
per transaction.  "ispy"/"check" will give a measure of the data rate
between transputers on a link (in the "Mb" column).
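
For a benchmark, a simple ping-pong pair is one way to separate the start-up
cost from the per-byte cost.  The sketch below is only that: the names ping,
pong, MAX.SIZE and REPS are made up for illustration, the values are
arbitrary, and the configuration that places the two processes on separate
transputers (with the channels routed through the C004) is left out.

```occam
-- Hypothetical ping-pong timing sketch; all names and values are assumptions.
VAL INT MAX.SIZE IS 4096:  -- largest message tried, in bytes
VAL INT REPS IS 1000:      -- round trips per measurement

-- Sends "size" bytes and waits for them to come back, REPS times over.
-- "size" must not exceed MAX.SIZE.  "ticks" returns the elapsed timer ticks.
PROC ping (CHAN OF INT::[]BYTE out, in, VAL INT size, INT ticks)
  [MAX.SIZE]BYTE buffer:
  TIMER clock:
  INT start, stop, len:
  SEQ
    SEQ i = 0 FOR MAX.SIZE
      buffer[i] := 0 (BYTE)
    clock ? start
    SEQ i = 0 FOR REPS
      SEQ
        out ! size::buffer
        in ? len::buffer
    clock ? stop
    ticks := stop MINUS start
:

-- Echoes every message straight back to the sender.
PROC pong (CHAN OF INT::[]BYTE in, out)
  [MAX.SIZE]BYTE buffer:
  INT len:
  SEQ i = 0 FOR REPS
    SEQ
      in ? len::buffer
      out ! len::buffer
:
```

The one-way time per message is roughly ticks times the tick period, divided
by (2 * REPS); the tick is 1 microsecond at high priority and 64 microseconds
at low priority.  Running it for a small and a large "size" gives two points
on a line: the slope is the per-byte cost and the intercept is the start-up
cost.  As a rough cross-check, reading the sustained rate quoted above as
about 1.4 Mbytes per second (an assumption), a 1024-byte message should take
somewhere around 0.7 milliseconds one way.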

: 2) suppose that the T805 are connected via C004 in a logical 2D 
: mesh. Is there a performance measurement available for the one
: to all personalised communication problem (one node sends to all 
: other a different message)? Is there a code in occam for optimally
: performing this? (using all links concurrently) Furthermore, is it
: possible for some nodes to work while forwarding incoming messages?

I don't know for sure, but it seems that using virtual channels would be the
easiest way to go from one to all.  There's no real performance penalty
to speak of using the virtual channel software.  Then the virtual channel
routers & buffers take care of forwarding messages while other processes work
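on each node.  I guess a PAR construct would be the optimal occam for this:

VAL INT N IS 4:    -- number of destinations (example value)
VAL INT M IS 256:  -- maximum message length in bytes (example value)
PROTOCOL COUNTED.ARRAY.PROTOCOL IS INT::[]BYTE:

PROC p ( [N]CHAN OF COUNTED.ARRAY.PROTOCOL to.output )
  [N][M]BYTE message:
  [N]INT msg.size:
  SEQ
    -- ... INITIALIZE MESSAGES
    -- one output per destination, all running concurrently
    PAR i = 0 FOR N
      to.output[i] ! msg.size[i]::message[i]
: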

: 3) Finally do you know the time needed for T805 to perform the 
: following:
: - addition, multiplication, division of integer or floating points

About 2.4 MFLOPS, single precision, on what I believe was a 25 MHz T800
(that was the last time I looked).  I think I measured this with a dot
product function provided in a hand-coded library.  A T805-30 should do
somewhat better.
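
If you want to time the arithmetic yourself, something along these lines
would do as a starting point.  It is a sketch, not the library routine I
used: time.dot and LEN are made-up names, the vector length is arbitrary,
and plain occam like this will probably not reach the hand-coded figure.

```occam
-- Hypothetical single-precision dot product timing; names and values assumed.
VAL INT LEN IS 1000:  -- vector length

PROC time.dot (INT ticks, REAL32 result)
  [LEN]REAL32 a, b:
  TIMER clock:
  INT start, stop:
  REAL32 sum:
  SEQ
    -- fill the vectors with known values
    SEQ i = 0 FOR LEN
      SEQ
        a[i] := 1.0 (REAL32)
        b[i] := 2.0 (REAL32)
    sum := 0.0 (REAL32)
    clock ? start
    -- one multiply and one add per element
    SEQ i = 0 FOR LEN
      sum := sum + (a[i] * b[i])
    clock ? stop
    ticks := stop MINUS start
    result := sum
:
```

The loop does 2 * LEN floating point operations, so the rate is that count
divided by the elapsed time (ticks times the tick period).  Run it at high
priority to get the 1 microsecond tick, or repeat the loop many times; at
low priority the 64 microsecond tick is too coarse for a vector this short.
The timing also includes the loop and array indexing overhead.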

: - swap of the location of integers or floating points
: Is there any code to measure these?

Whatever you have handy.  Are there any standard benchmarks out there?

: Thanks and sorry for the length of this posting.

You're welcome, and no need to be sorry, considering the length of this reply.

: Efthimios
: --
: Efthimios.Tambouris@brunel.ac.uk

Kali Nichtasas :)

Dan Scheurell
schedan@eskimo.com
schdj700@ccmail.iasl.ca.boeing.com

