Newsgroups: comp.parallel
From: genolini@westminster.ac.uk (Francois D. M. GENOLINI)
Subject: Core dump on a large distributed system
Organization: University of Westminster (London UK)
Date: Thu, 9 Feb 1995 17:08:41 GMT
Message-ID: <GENOLINI.95Feb9140351@leopard.westminster.ac.uk>

How do you dump core on a large distributed system (for a single
task, that is)?

What is the meaning of "post-mortem analysis" on parallel machines?

Do you get only the faulty thread to crash, log something in an
obscure file, and have the embedded fault-tolerance system smooth out
the hiccup?

I am not referring to any particular implementation, and would gladly
hear about *ANY* experience or concept related to this area.

Thanks.
--
Francois GENOLINI                     Email: genolini@westminster.ac.uk
Centre for Parallel Computing
University of Westminster, 115 New Cavendish Street, LONDON W1M 8JS, UK


