Newsgroups: comp.parallel.mpi
From: Nick Nevin <nevin@osc.edu>
Subject: Re: cancel threads with communication
Organization: Ohio Supercomputer Center
Date: 06 Aug 1996 12:51:07 -0400
Message-ID: <vn2bugoh2x0.fsf@alex.osc.edu>


The behaviour you see is caused by the GER (Guaranteed Envelope
Resources) protocol, which LAM 6.0 uses by default.  The forever thread
is actually cancelled (at least it was when I tested it), but the main
thread then blocks in MPI_Finalize waiting for a GER ack, which is only
sent once the message has been received.  In your scenario the message
is never received, so the ack is never sent.  If you turn GER off by
passing the -nger flag to mpirun, the program will terminate.
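
If you would rather leave GER on, one option is to let the receiving
thread finish on its own once it has drained any pending message,
instead of cancelling it asynchronously.  What follows is only a sketch
of that idea and has not been tried against LAM: mailbox_t, receiver,
sender and run_demo are made-up names, and the mailbox is a stand-in
for the MPI_Bsend / MPI_Iprobe / MPI_Recv calls so that the example
builds with plain POSIX threads (link with -lpthread).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical in-process mailbox standing in for the MPI calls. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool has_message;   /* a value is waiting to be received */
    bool stop;          /* shutdown has been requested */
    int  payload;       /* the value in flight */
    int  received;      /* last value taken by the receiver, -1 if none */
} mailbox_t;

/* Receiver: honours the stop request only when nothing is pending. */
static void *receiver(void *arg)
{
    mailbox_t *mb = arg;

    pthread_mutex_lock(&mb->lock);
    for (;;) {
        while (!mb->has_message && !mb->stop)
            pthread_cond_wait(&mb->changed, &mb->lock);
        if (mb->has_message) {
            /* Drain the pending message before looking at stop. */
            mb->received = mb->payload;
            mb->has_message = false;
            continue;
        }
        break;  /* stop requested and no message left */
    }
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Sender: posts one value, then requests shutdown (no pthread_cancel). */
static void *sender(void *arg)
{
    mailbox_t *mb = arg;

    pthread_mutex_lock(&mb->lock);
    mb->payload = 909;
    mb->has_message = true;
    mb->stop = true;
    pthread_cond_signal(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Runs both threads to completion and returns the value received. */
int run_demo(void)
{
    mailbox_t mb;
    pthread_t recv_thread, send_thread;
    int rc;

    mb.has_message = false;
    mb.stop = false;
    mb.payload = 0;
    mb.received = -1;

    rc = pthread_mutex_init(&mb.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&mb.changed, NULL);
    assert(rc == 0);
    rc = pthread_create(&recv_thread, NULL, receiver, &mb);
    assert(rc == 0);
    rc = pthread_create(&send_thread, NULL, sender, &mb);
    assert(rc == 0);

    /* Join both threads, so nothing is torn down under a live thread. */
    pthread_join(send_thread, NULL);
    pthread_join(recv_thread, NULL);

    pthread_cond_destroy(&mb.changed);
    pthread_mutex_destroy(&mb.lock);
    (void)rc;
    return mb.received;
}
```

Calling run_demo() from main returns 909, because the receiver always
takes the pending value before it exits.  In the real program the same
ordering would mean that the forever thread completes its MPI_Recv
before the main thread reaches MPI_Finalize.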

---nick.


     > Hello,
     > I'm working with LAM/MPI in a multithreaded environment (POSIX threads).
     > I've written the following program with two threads:
     > (--- further explanation below -----)


     > #include <stdio.h>
     > #include <pthread.h>
     > #include <stdlib.h>
     > #include <mpi.h>

     > pthread_t forever_thread;
     > pthread_t send_thread;
     > pthread_mutex_t mpi_mutex;


     > pthread_addr_t iprobe_proc(pthread_addr_t dummy)
     > {
     >    int flag, i;
     >    MPI_Status iprobe_status, recv_status;

     >    pthread_setasynccancel(CANCEL_ON);
     >    pthread_setcancel(CANCEL_ON);
     >    for (;;)
     >    {
     >       pthread_mutex_lock(&mpi_mutex);  
     >       MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag,
     >                  &iprobe_status);
     >       pthread_mutex_unlock(&mpi_mutex);
     >       if (flag)
     >       {
     >          pthread_mutex_lock(&mpi_mutex);  
     >          MPI_Recv(&i, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_status);
     >          pthread_mutex_unlock(&mpi_mutex);
     >       }
     >    }   
     >    pthread_exit(0);
     > }


     > pthread_addr_t send_proc(pthread_addr_t dummy)
     > {
     >    int i;

     >    i = 909;
     >    pthread_mutex_lock(&mpi_mutex);  
     >    MPI_Bsend(&i, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
     >    pthread_mutex_unlock(&mpi_mutex);
     >    printf("Habe Integer abgeschickt\n");  /* "Integer has been sent" */
     >    pthread_cancel(forever_thread);
     >    pthread_exit(0);
     > }


     > int main(int argc, char* argv[])
     > {
     >    char buffer[100];
     >    int size;
   
     >    MPI_Init(&argc, &argv);
     >    MPI_Buffer_attach(buffer, 100);
     >    pthread_mutex_init(&mpi_mutex, pthread_mutexattr_default);
     >    pthread_create(&forever_thread, pthread_attr_default,
     >                  (pthread_startroutine_t)iprobe_proc,    
     >                  (pthread_addr_t)0);
     >    pthread_create(&send_thread, pthread_attr_default,
     >                  (pthread_startroutine_t)send_proc, (pthread_addr_t)0);
     >    pthread_join(send_thread, NULL);
     >    pthread_mutex_destroy(&mpi_mutex);
     >    MPI_Buffer_detach(&buffer, &size);
     >    MPI_Finalize();
     >    return 0;
     > }


     > The main thread creates the two threads forever_thread and send_thread
     > and waits for send_thread to finish. The send_thread first sends
     > a number, which should be received by the forever_thread, and then
     > tries to cancel the forever_thread. The program hangs.
     > mpitask shows that this LAM process is blocked. It seems to me that
     > the send_thread can't cancel the forever_thread, but also can't
     > continue. My question is: why?
     > Also, the number is never received!

-=-
Nick Nevin				nevin@osc.edu
Ohio Supercomputer Center		http://www.osc.edu/lam.html



