Newsgroups: comp.parallel
From: lm@slovax.engr.sgi.com (Larry McVoy)
Reply-To: clust-survey@slovax.engr.sgi.com
Subject: Using clusters - please read
Organization: Silicon Graphics Inc., Mountain View, CA
Date: Sun, 5 Feb 1995 01:30:48 GMT
Message-ID: <3gp0a0$2bg@fido.asd.sgi.com>


Do you have, use, or know of a cluster, workstation farm, an IBM SP2,
or some other parallel system made up of cooperating machines?

I need to collect ways in which people use and administer such beasts.
Please take a moment to fill out the survey below and return it to
clust-survey@slovax.engr.sgi.com (which should be the reply-to field of
this posting).

Some information is better than no information - if you know the answer
to just the first question, send that please.

Please forward this survey to anyone you know who uses clusters.

If you know of papers/surveys/archives of similar information, please
forward that to me.

The replies to the survey will be available by sending mail like so:

	% Mail archives@slovax.engr.sgi.com
	Subject: clust-survey
	^D
	%

Thanks in advance for your help.  I hope to use this information to make
SGI's Challenge Array product better serve your needs.

-------------------- survey start -----------------------------------


*** Basics

    How long has this cluster been in use?

    Where is it installed?

    If this is a commercial product, what's the name & vendor of the
    product?

    If this is a home grown "farm", is there a well known name for the
    farm, like the Fermi lab farm?

    How many nodes are there in your cluster?

    Are they SMP or uniprocessor nodes?

    What's the processor?

    How much memory in each node?

*** Applications

    What is the cluster used for? 

    Time shared or batch?

    One dedicated application?

    Please describe as many of the applications as you can.  If you
    know what the memory/bandwidth/latency/etc. requirements are for
    that application, please state those.

*** Naming

    What does the cluster look like on the net?  N names, N IP addresses?

    Do all of the nodes have uniform file system namespaces?  Do I see
    the same files (either cross mounted or replicated) on all nodes?

    Including or not including places like /tmp?

    Is a uniform file name space important to you / your users?

    Is the process name space uniform throughout the cluster?  Can I
    do a ps and see everything and do a kill and have that work no
    matter which node I am on?  

    Is a uniform process name space important to you / your users?

    Is the device name space uniform?  Can I access tape drives, for
    example, from any node in the cluster no matter where they are?

    Is a uniform device name space important to you / your users?

*** Networking details

    What is the interconnect used to talk to the nodes from the outside?

    What is the interconnect used between the nodes?  Please specify
    bandwidth and latency that you *know* to have been achieved on
    your cluster (not what the vendor told you, what you have actually
    seen).

    What bandwidths and latencies do your applications need to perform
    well?

*** Data details

    Do you use the file system to store your data?  Even for big data sets?

    Is the file system throughput fast enough?  

    If you could get 50MB/sec read rate for any file, would that be
    fast enough that you would use the file system?

    Do you need the file system to be coherent, more so than the current 
    NFS open-to-close coherence?  Please be specific if you say "yes".

*** System administration

    How do you (did you) do software installation?

    How do you (did you) do software upgrades?

    How do you get at the consoles of each node?

    How do nodes boot?  Local disk, network, all together, one at a time?

    Do all nodes have disks?  

    If nodes have disks and use disks for / and/or /usr, does the
    cluster have any means of keeping those partitions identical?
    Close to identical?

    Are you happy with this model?

    Can you suggest a model that you think would (a) be better and
    (b) be implemented in a realistic time frame?

    Any other comments?  Such as: admin is not a problem, admin is a
    nightmare, etc.

*** Programming model

    Do you use PVM or MPI?

    Anything else?

    What would you like to use?

*** Dollars

    Approximate price?

    Do you have any idea of how that price is spread out over disks, memory,
    processors, interconnect, tape, etc.?

--
---
Larry McVoy			(415) 390-1804			 lm@sgi.com


