Newsgroups: comp.parallel.pvm
From: pjs@b11.b11.ingr.com (Peter J. Smith)
Subject: What else?  PVM Porting problems
Summary: Frustration reigns but I think I'm close...
Keywords: PVM Porting
Organization: Intergraph Corporation, Huntsville, AL
Date: 15 Sep 1994 23:06:12 -0500
Message-ID: <35b5jk$8ct@b11.b11.ingr.com>

   Greetings,

   	I'm trying to port PVM to a new (old) architecture.  Startup fails when
	I try to bring it up across just two machines (walk before running!),
	and the standard PVM debugging statements show two problems:

		1.) pl_startup() mach_2 timed out after 60 secs
		2.) startack() host mach_2 expected version

	I know the rsh interface is working properly as the remote pvmd
	gets started fine.  I invoke the 'master' pvmd as follows:

	% pvmd3 -d65536 host &

	where "host" is my hostnames file with just one entry:

	mach_2 dx=/usr/pvm/pvm3/lib/XXARCH/pvmd3 ep=/usr/pvm/pvm3/bin/XXARCH
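
	A side note on the -d value: the "debugmask is 65536 (msg,tsk,hst,sel,sch,wai)"
	line in the master log further down only matches the argument if it is read
	as hexadecimal (0x65536) rather than decimal.  A minimal Python sketch of
	that check follows.  The bit-to-name table is an assumption inferred from
	that one log line, not taken from the PVM source, and decode_mask() is just
	an illustrative helper name.

```python
# Assumed bit -> name table, inferred only from the log line
# "debugmask is 65536 (msg,tsk,hst,sel,sch,wai)"; not from the PVM source.
ASSUMED_NAMES = {1: "msg", 2: "tsk", 4: "hst", 5: "sel", 8: "sch", 10: "wai"}


def decode_mask(arg, base):
    """Return the assumed class names whose bits are set in the -d argument."""
    mask = int(arg, base)
    return [name for bit, name in sorted(ASSUMED_NAMES.items())
            if mask & (1 << bit)]


# Read as hex, all six names reported in the log are selected;
# read as decimal, 65536 is a single bit (bit 16) and selects none of them.
print("hex:", decode_mask("65536", 16))
print("dec:", decode_mask("65536", 10))
```

	If that reading is right, a mask meant to select specific classes should
	be written in hex on the command line.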

------

	Can anyone offer some net.wisdom on this?  I'm about to destroy what
	little grey matter I have left...:-)

	My apologies for the length of this message, but I hate it when someone
	says "It's broke" and gives me insufficient data, so I try not to do the
	same.
		Thanks for any help you can offer!

	Peter J. Smith
	pjs@shmoe.b11.ingr.com

	P.S.  The "----- Select ..."  lines below were added by me...
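
	Those marker lines make it easy to see how long each select() blocked.
	A minimal Python sketch, assuming the marker format used in the logs
	below ("Select started @ <ctime>" / "Select returns @ <ctime>"); the
	select_waits() helper and the sample lines are only for illustration:

```python
import re
from datetime import datetime

# Matches the hand-added marker lines, e.g.
# "[t80040000] \t-------  Select started @ Fri Sep 16 02:24:59 1994"
MARK = re.compile(r"\[(t[0-9a-f]+)\]\s+-+\s+Select (started|returns) @ (.+)$")


def select_waits(lines):
    """Yield (tid, seconds) for each started/returns pair, tracked per tid."""
    started = {}
    for line in lines:
        m = MARK.search(line.rstrip())
        if not m:
            continue  # ordinary pvmd debug line, not a marker
        tid, kind, stamp = m.groups()
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        if kind == "started":
            started[tid] = when
        elif tid in started:
            yield tid, (when - started.pop(tid)).total_seconds()


sample = [
    "[t80040000] \t-------  Select started @ Fri Sep 16 02:24:59 1994",
    "[t80040000] work() SELECT returns 0",
    "[t80040000] \t-------  Select returns @ Fri Sep 16 02:25:59 1994",
]
for tid, secs in select_waits(sample):
    print(tid, secs)
```

	Tracking the start time per tid matters because the master log
	interleaves output from t80040000 and t80000000.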

/tmp/pvml.xxxxxx (on master machine mach_1):
-------------------------------------------

[pvmd pid17437] main() debugmask is 65536 (msg,tsk,hst,sel,sch,wai)
[pvmd pid17437] version 3.3.4
[pvmd pid17437] ddpro 2315 tdpro 1317
[t80040000] ready  3.3.4   Fri Sep 16 02:24:59 1994
[t80040000] sendmessage() dst t80000000 code dm_add len 15
[t80040000] hostentry() from host mach_1 src t80040000 dst t80000000 cod dm_add wid 0
[t80040000] wait_new():
[t80040000]  wid 262145 kind hoststart on t0 tid t0 dep 0 peer { } cnt 0
[t80040000] work() select tout is 59.933332
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:24:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() ping timer
[t80040000] work() select tout is 60.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80000000] pl_startup() mach_2 timed out after 60 secs
[t80000000] sendmessage() dst t80040000 code dm_startack len 25
[t80000000] ready  3.3.4   Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 1
[t80040000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80000000] work() select tout is 1.966666
[t80000000] work() wrk_nfds=10
[t80000000] work() rfds=9
[t80000000] work() wfds=
[t80000000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] hostentry() from host pvmd' src t80000000 dst t80040000 cod dm_startack wid 262145
[t80040000] startack() host mach_2 expected version
[t80040000] sendmessage() dst t80040000 code dm_addack len 26
[t80040000] hostentry() from host mach_1 src t80040000 dst t80040000 cod dm_addack wid 0
[t80040000] wait_delete():
[t80040000]  wid 262145 kind hoststart on t80000000 tid t80040000 dep 0 peer { } cnt 0
[t80000000] work() SELECT returns 1
[t80000000] 	-------  Select returns @ Fri Sep 16 02:25:59 1994
[t80040000] work() select tout is 59.816663
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:25:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() select tout is 0.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] work() SELECT returns 0
[t80040000] 	-------  Select returns @ Fri Sep 16 02:26:59 1994
[t80040000] work() ping timer
[t80040000] work() select tout is 60.000000
[t80040000] work() wrk_nfds=11
[t80040000] work() rfds=8,10
[t80040000] work() wfds=
[t80040000] 	-------  Select started @ Fri Sep 16 02:26:59 1994
[t80040000] catch() caught signal 2
[t80040000] pvmbailout(2)
[t80040000] sending FIN|ACK to all pvmds


/tmp/pvml.xxxxxx (on remote machine mach_2; partial - becomes boring quickly):
----------------------------------------------------------------------------

[pvmd pid720] main() debugmask is 65536 (msg,tsk,hst,sel,sch,wai)
[pvmd pid720] version 3.3.4
[pvmd pid720] ddpro 2315 tdpro 1317
[t80080000] ready  3.3.4   Thu Sep 15 21:16:43 1994
[t80080000] work() select tout is 59.966666
[t80080000] work() wrk_nfds=10
[t80080000] work() rfds=7,9
[t80080000] work() wfds=
[t80080000] work() SELECT returns 0
[t80080000] work() select tout is 0.000000
[t80080000] work() wrk_nfds=10
[t80080000] work() rfds=7,9
[t80080000] work() wfds=


