This example does allow all data transfers to take place simultaneously.
Whether the underlying MPI implementation and hardware can actually carry
out the transfers concurrently is what this example evaluates.
