This example allows an MPI implementation to overlap communication and
computation.  Note, however, that both activities access memory, and
contention for memory access may slow down both the communication and the
computation.
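
To make the caveat concrete, here is a minimal sketch of a toy cost model in
C.  It is not taken from the example above and does not measure anything; the
function names and the single "slowdown" factor are assumptions made purely
for illustration.  The model supposes that, while communication and
computation run concurrently, memory contention stretches both by the same
factor, and that the longer activity finishes its remaining work at full
speed once the shorter one is done.

```c
#include <assert.h>

/* Toy cost model (an illustrative assumption, not a measurement).
 * t_comp, t_comm: time each activity takes when it runs alone.
 * slowdown: hypothetical factor (>= 1) by which memory contention
 * stretches both activities while they run concurrently. */

/* Time with no overlap: communicate, then compute. */
double serial_time(double t_comp, double t_comm)
{
    return t_comp + t_comm;
}

/* Time with overlap: both activities run slowed until the shorter one
 * finishes, then the longer one completes its remaining work at full
 * speed. */
double overlapped_time(double t_comp, double t_comm, double slowdown)
{
    assert(t_comp >= 0.0 && t_comm >= 0.0 && slowdown >= 1.0);
    double shorter = (t_comp < t_comm) ? t_comp : t_comm;
    double longer  = (t_comp < t_comm) ? t_comm : t_comp;
    return slowdown * shorter + (longer - shorter);
}
```

Under this model, with t_comp = 4 and t_comm = 2, a slowdown of 1 (no
contention) gives an overlapped time of 4, a slowdown of 1.5 gives 5, and a
slowdown of 2 gives 6, which equals the serial time; at that point the
overlap gains nothing.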
