This version uses nonblocking operations for both sending and receiving,
primarily to handle the buffering issues.  In this case, the sends
are posted first, which often allows receiver-pull rendezvous protocols to
avoid synchronization delays (though this is not guaranteed).
<P>
A separate example shows the use of nonblocking operations to express the
overlap of communication and computation.
