Newsgroups: comp.parallel.pvm
From: maillet@imag.fr (Eric Maillet )
Subject: Re: XPVM: Messages transmitted back in time ?
Organization: Institut Imag, Grenoble, France
Date: 20 Oct 1995 14:08:10 GMT
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
Message-ID: <468aga$fgv@cosmos.imag.fr>

In article <463qc0$mtb@natasha.rmii.com>, Ken Lancaster <ken@wynde.com> writes:
> I've noticed that the arrows depicting message transfers in an XPVM timeline
> occasionally depict messages traveling backwards in time, i.e. being received
> before they were transmitted.  Einstein would be proud, but it makes me wonder
> if I can trust the timing information reported by XPVM for message transfers,
> or for anything else, for that matter.  
> 

Messages appear to travel back in time because the clocks of
your hosts are not synchronized well enough. Network time
protocols do not provide sufficient precision to date events in
distributed systems. To avoid timing incoherencies like the one
you observed, the maximum discrepancy between any two clocks in
your system has to be less than the minimum possible communication
time. Thus, the faster your network, the more tightly your clocks
have to be synchronized.
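
To make the condition above concrete, here is a minimal Python
sketch. The function names and the numbers are made up for
illustration; offsets are taken as (local clock - reference clock),
in seconds.

```python
def apparent_transit(true_transit, sender_offset, receiver_offset):
    """Transit time as seen when each end stamps with its own clock.

    Offsets are (local clock - reference clock), in seconds.
    """
    # send stamp    = t + sender_offset
    # receive stamp = t + true_transit + receiver_offset
    return true_transit + (receiver_offset - sender_offset)


def travels_back_in_time(true_transit, sender_offset, receiver_offset):
    """True if the receive stamp is earlier than the send stamp."""
    return apparent_transit(true_transit, sender_offset, receiver_offset) < 0


# Hypothetical numbers: 0.5 ms transit time.
# Receiver clock 2 ms behind the sender: discrepancy > transit time.
print(travels_back_in_time(0.0005, 0.0, -0.002))   # True
# Receiver clock 0.2 ms behind: discrepancy < transit time.
print(travels_back_in_time(0.0005, 0.0, -0.0002))  # False
```

The arrow only points backwards when the receiver's clock lags the
sender's by more than the actual transit time.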

To deal with this problem, the "pvm_hostsync" function has been
introduced in PVM. It allows the caller to determine the
discrepancy between its own clock and that of another host. XPVM
uses pvm_hostsync to determine these clock discrepancies.
Events are then dated using local clocks, and the dates are
corrected by the discrepancy between each local clock and a
reference clock (the one of the host XPVM runs on, I think). That
you still observe causal incoherencies may be due to the following:

1) pvm_hostsync only makes a "rough" estimate of the discrepancy

   The estimate is based on the symmetry of a message exchange.
   If there is a lot of traffic on your network, this exchange
   is not likely to be symmetric and the resulting estimate
   is inaccurate.

2) clocks are drifting

   Even if pvm_hostsync could correctly capture the discrepancy,
   this discrepancy changes with time (clock drift). If your
   application executes over a long period of time, the initially
   measured discrepancies may no longer be correct. That's why
   incoherencies due to clock drift only appear later in your
   application's execution.

   The change in discrepancy is roughly proportional to the
   length of your application's execution. The ratio is about
   1e-6, i.e. about one microsecond per second:

      change_in_discrepancy = 1e-6 * execution_time
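
Regarding point 1, here is a small Python sketch of an offset
estimate based on a symmetric exchange. This is NOT the actual
pvm_hostsync code, only the general idea as I understand it; the
names and the simulated delays are assumptions for illustration.

```python
def estimate_offset(t_send, t_remote, t_recv):
    """Estimate (remote clock - local clock) from one round trip.

    t_send, t_recv: local clock when the request left / reply arrived.
    t_remote: remote clock when the request was handled.
    Assumes both directions take the same time, so the remote stamp
    corresponds to the midpoint of the local interval.
    """
    return t_remote - (t_send + t_recv) / 2.0


def simulate_exchange(true_offset, delay_out, delay_back, t_send=100.0):
    """Simulate one exchange with known delays; return the estimate."""
    t_remote = t_send + delay_out + true_offset
    t_recv = t_send + delay_out + delay_back
    return estimate_offset(t_send, t_remote, t_recv)


# Hypothetical values: true offset 10 ms.
symmetric = simulate_exchange(0.010, 0.001, 0.001)
asymmetric = simulate_exchange(0.010, 0.005, 0.001)
print("symmetric estimate : %.6f s" % symmetric)
print("asymmetric estimate: %.6f s" % asymmetric)
```

With this scheme the estimation error is (delay_out - delay_back) / 2,
so a loaded network that delays one direction more than the other
directly shows up in the estimated discrepancy.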

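Regarding point 2, the drift formula can be worked through in a few
lines of Python. The 1e-6 ratio is the rough figure given above; the
1 ms minimum communication time is a made-up example value, and the
clocks are assumed to be perfectly corrected at the start.

```python
DRIFT_RATE = 1e-6  # rough ratio from the text: seconds of drift per second


def change_in_discrepancy(execution_time_s, drift_rate=DRIFT_RATE):
    """Discrepancy accumulated after execution_time_s seconds."""
    return drift_rate * execution_time_s


def time_until_incoherence(min_comm_time_s, drift_rate=DRIFT_RATE):
    """Seconds until drift alone exceeds the minimum communication time,
    assuming the discrepancy was exactly corrected at time zero."""
    return min_comm_time_s / drift_rate


one_hour = 3600.0
print("drift after 1 hour: %.4f s" % change_in_discrepancy(one_hour))
print("1 ms exceeded after: %.0f s" % time_until_incoherence(0.001))
```

So after one hour the discrepancy has changed by about 3.6 ms, and
with a 1 ms minimum communication time the first incoherencies can
show up after roughly 1000 seconds.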

To synchronize clocks in our tracing tools, we call pvm_hostsync
several times. From these measurements we compute both the offset
between the clocks and their relative speed (drift). This allows
us to correctly timestamp events in our applications. For more
information, you may refer to our article in EuroPVM'95,
"Issues in Performance Tracing with Tape/Pvm".
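
As a rough illustration of the idea of using several measurements,
here is a Python sketch that fits a straight line through
(time, offset) samples. A least-squares fit is my choice for the
example, not necessarily what Tape/PVM does; the function names and
sample values are hypothetical.

```python
def fit_offset_and_drift(samples):
    """Least-squares line through (local_time, offset) samples.

    offset is (local clock - reference clock). Returns (offset0, drift)
    such that offset(t) is approximately offset0 + drift * t.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must be taken at different times")
    cov = sum((t - mean_t) * (o - mean_o) for t, o in samples)
    drift = cov / var_t
    offset0 = mean_o - drift * mean_t
    return offset0, drift


def to_reference_time(local_time, offset0, drift):
    """Convert a local timestamp to the reference clock."""
    return local_time - (offset0 + drift * local_time)


# Hypothetical samples: 10 ms initial offset, drifting by 1e-6.
samples = [(0.0, 0.010), (1000.0, 0.011), (2000.0, 0.012)]
offset0, drift = fit_offset_and_drift(samples)
print("offset0 = %.6f s, drift = %.2e" % (offset0, drift))
print("local 3000.0 -> reference %.6f" % to_reference_time(3000.0, offset0, drift))
```

With more than two samples, the fit also averages out part of the
error that individual asymmetric exchanges introduce.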


-- 
------------------------------------------------------------------------------
Eric MAILLET                                            LGI/LMC (INPG)
e-mail : Eric.Maillet@imag.fr              o            46, av. Felix Viallet
NeXT   : maillet@petole.imag.fr         --#>            F-38031 GRENOBLE CEDEX
voice  : (033) 76 57 48 73             ___>____,        FRANCE
------------------------------------------------------------------------------

