Write a program to measure the time it takes to send 1, 2, 4, ..., 1M C
doubles from one processor to another using MPI_Isend, MPI_Irecv, and
MPI_Wait.  Use the same techniques as in the memcpy assignment to average out
variations and overhead in MPI_Wtime.
<P>
Unlike the other examples, where the data is sent ping-pong fashion from one
processor to another, have both processors send to each other at the same
time.  This test measures the effective bandwidth and latency when a
processor is sending and receiving simultaneously.
<P>
Print the size, time, and rate in MB/sec for each test.
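<P>
The rate calculation can be sketched as follows. The helper names are
hypothetical, and two conventions are assumed rather than taken from the
assignment: 1 MB is counted as 10^6 bytes, and the rate counts only the
bytes sent in one direction, even though each processor also receives the
same amount during the exchange.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Transfer rate in MB/sec for n_doubles doubles moved in `seconds`.
 * Assumes 1 MB = 1.0e6 bytes and counts one direction only. */
double rate_mb_per_sec(size_t n_doubles, double seconds)
{
    assert(seconds > 0.0);
    return (double)(n_doubles * sizeof(double)) / seconds / 1.0e6;
}

/* Print one result line: size (in doubles), time (sec), rate (MB/sec). */
void print_result(size_t n_doubles, double seconds)
{
    printf("%10zu %14.6e %12.2f\n",
           n_doubles, seconds, rate_mb_per_sec(n_doubles, seconds));
}
```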
<P>
Make sure that both sender and receiver are ready when you begin the test.
The sample solution uses MPI_Sendrecv, but other choices are possible.
Compare the performance with that obtained using MPI_Send and MPI_Recv.
