Newsgroups: comp.parallel.pvm
From: papadopo@cs.utk.edu (Philip Papadopoulos)
Subject: Re: PvmDataInPlace takes longer than PvmDataRaw???
Organization: CS Department, University of Tennessee, Knoxville
Date: 7 Aug 1995 12:02:40 -0400
Message-ID: <405df0INNnrj@duncan.cs.utk.edu>

In article <404f70$bua@ds2.acs.ucalgary.ca> adilger@enel.ucalgary.ca (Andreas Dilger) writes:
>In article <3vtqmn$qu7@news.uncc.edu>,
>Stuart D Blackburn <sdblackb@uncc.edu> wrote:
>[snip]
>>PVM Version 3.2 output  
>>	* 20x10x10 matrix with 0.01 tolerance
>>	* Using PvmDataInPlace encoding
>>
>>Slave 1 finished 76 iterations in 5.3855 seconds
>>==============================================================
>>PVM Version 3.1 output  
>>	* 20x10x10 matrix with 0.01 tolerance
>>	* Using PvmDataRaw encoding
>>
>>Slave 1 finished 76 iterations in 2.5458 seconds
>[snip]
Given the small size of your test cases, you are seeing latency
effects only.  In-place packing sends a complete fragment for each
array packed in the message. What this means is that if you do
          pvm_initsend(PvmDataInPlace);
          pvm_pkint(iarray,10,1);
          pvm_pkfloat(farray,10,1);
#TWO# network packets are sent, one for iarray and one for farray.  For
data this small, PvmDataDefault would first copy iarray and farray
into a single buffer and then send that one buffer.
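
To make the counting concrete, here is a toy model in plain C. It does
not call PVM at all; the names enc_kind and fragments_for are made up
for illustration, and it assumes at least one array is packed and that
everything is small enough to fit in a single fragment, as in the
example above.

```c
/* Hypothetical names for illustration only; not part of the PVM API. */
enum enc_kind { ENC_DEFAULT, ENC_RAW, ENC_INPLACE };

/*
 * Number of fragments sent for one message that packs `narrays` small
 * arrays (narrays >= 1, all data assumed to fit in one fragment).
 */
int fragments_for(enum enc_kind enc, int narrays)
{
    /* In-place: nothing is copied, so each packed array goes out by itself. */
    if (enc == ENC_INPLACE)
        return narrays;
    /* Default and raw: the arrays are copied into one buffer, sent once. */
    return 1;
}
```

For the two-array message above this gives 2 for ENC_INPLACE and 1 for
the other two encodings.
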
So how can you see this for yourself?  At the end of this post is a
little test program, called test1.  Start the pvmd with debugging
turned on and run the program once with each encoding:

              % pvmd -d1 &                 # debug flag for pvmd
              % test1 0                    # run the program with default pack
              % test1 1                    # run with raw pack 
              % test1 2                    # run with inplace pack

Now look at the pvmd log file, replacing <nnnn> with your userid:

              % grep "src t4" /tmp/pvml.<nnnn> | grep locloutput 

I get lines that look like the following:

[t80040000] locloutput() src t40001 dst t40002 f SOM,EOM len 112
[t80040000] locloutput() src t40001 dst t40002 wrote 112

[t80040000] locloutput() src t40003 dst t40004 f SOM,EOM len 112
[t80040000] locloutput() src t40003 dst t40004 wrote 112

[t80040000] locloutput() src t40005 dst t40006 f SOM len 72
[t80040000] locloutput() src t40005 dst t40006 wrote 72
[t80040000] locloutput() src t40005 dst t40006 f EOM len 56
[t80040000] locloutput() src t40005 dst t40006 wrote 56

The first pair is for the default packing, the second is for raw,
and the last four lines are for in-place. Notice that two packets are
sent (one per array) in the last case.
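
The lengths in the log also add up if one assumes a fixed 16-byte
header on every fragment, a further 16-byte message header on the first
fragment only, and 4 bytes per int or float. These sizes are inferred
from the numbers above rather than taken from the PVM source, so treat
the constants in this sketch (and the name frag_len) as assumptions.

```c
#define FRAG_HDR 16   /* assumed per-fragment header, inferred from the log */
#define MSG_HDR  16   /* assumed message header, first fragment only */
#define ELEM      4   /* assumed bytes per packed int or float */

/* Length of one fragment carrying `nelem` 4-byte items. */
int frag_len(int nelem, int is_first)
{
    return FRAG_HDR + (is_first ? MSG_HDR : 0) + nelem * ELEM;
}
```

Under these assumptions frag_len(20, 1) is 112, matching the default and
raw cases, while frag_len(10, 1) and frag_len(10, 0) are 72 and 56,
matching the two in-place fragments. So the in-place message costs one
extra header (128 bytes total instead of 112) as well as one extra
write.
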


Hope this helps you understand your results,

Phil Papadopoulos

P.S.  Here's the test program 

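#include <stdio.h>
#include <stdlib.h>     /* atoi */
#include "pvm3.h"
static int iarray[10];
static float farray[10];
int main(int argc, char **argv)
{
int myparent, master, otid, enc;
      /* master if started from the shell (no PVM parent) or given an argument */
      master = ((myparent = pvm_parent()) == PvmNoParent ) || (argc >= 2);
      if (master)
      {
          /* 0 = PvmDataDefault, 1 = PvmDataRaw, 2 = PvmDataInPlace */
          enc = (argc >= 2) ? atoi(argv[1]) : PvmDataDefault;
          printf("%s -- encoding %d \n", argv[0], enc);
          /* pvm_spawn returns the number of tasks actually started */
          if (pvm_spawn(argv[0],(char **) 0, PvmTaskDefault, (char *)0, 1, &otid) == 1)
          {
              pvm_initsend(enc);
              pvm_pkint(iarray,10,1);
              pvm_pkfloat(farray,10,1);
              pvm_send(otid, 100);
          }
      }
      else
          pvm_recv(myparent,100);     /* spawned copy: just receive the message */
      pvm_exit();
      return 0;
}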



