[Linux-cluster] neophyte question: system gets fenced immediately after reboot

Darren Jacobs darren.jacobs at utoronto.ca
Tue Jan 24 22:51:42 UTC 2006


Just went through setting up a basic three-node GFS cluster.  Things 
worked fine right up until I rebooted one of the nodes (wc3).  During 
boot there is a very long pause, during which the following appears on 
wc3's console:

--
Jan 24 17:25:23 wc3 kernel: Pool 6.0.2.27 (built Sep  7 2005 14:47:26) 
installed
Jan 24 17:25:24 wc3 kernel: Removing (8, 34)
Jan 24 17:25:24 wc3 kernel: Removing (8, 33)
Jan 24 17:25:24 wc3 kernel: Removing (8, 18)
Jan 24 17:25:24 wc3 kernel: Removing (8, 17)
Jan 24 17:25:24 wc3 kernel: Removing (8, 34)
Jan 24 17:25:24 wc3 kernel: Removing (8, 33)
Jan 24 17:25:24 wc3 kernel: Removing (8, 18)
Jan 24 17:25:24 wc3 kernel: Removing (8, 17)
--

After that pause the server eventually comes up.  However, 'gulm_tool nodelist' shows the following:

--
[root at wc1 darren]# gulm_tool nodelist wc1
 Name: wc1
  ip    = w.x.y.z
  state = Logged in
  mode = Master
  missed beats = 0
  last beat = 1138142219455878
  delay avg = 10002247
  max delay = 10014980

 Name: wc3
  ip    = w.x.y.b
  state = Expired
  mode = Slave
  missed beats = 3
  last beat = 1138138855408076
  delay avg = 10000466
  max delay = 10009912

 Name: wc2
  ip    = w.x.y.c
  state = Logged in
  mode = Slave
  missed beats = 0
  last beat = 1138142223034096
  delay avg = 10000238
  max delay = 10020033
--
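
If I'm reading the nodelist output right, the delay figures are in 
microseconds, so the nodes are beating roughly every 10 seconds, and 
three missed beats was enough for the master to expire wc3.  I assume 
the knobs for this live in the lock_gulm section of cluster.ccs; mine is 
essentially the stock example, something like the following (the cluster 
name is a placeholder and the heartbeat_rate / allowed_misses values are 
just what I understand the defaults to be, I haven't tuned anything):

--
# cluster.ccs -- names and values below are placeholders/defaults, not my real file
cluster {
    name = "wc"
    lock_gulm {
        servers = ["wc1", "wc2", "wc3"]
        heartbeat_rate = 15
        allowed_misses = 2
    }
}
--

(I realize that simply raising allowed_misses would probably just mask 
whatever is causing wc3 to miss beats in the first place.)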

wc3's state is Expired.  How do I stop this from happening, and/or get 
wc3 to successfully log back into the GULM lock server and rejoin the 
cluster?
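
For what it's worth, I assume bringing wc3 back once it has booted would 
go something like the sequence below, but I'm not sure what else is 
needed while the master still has it marked Expired.  This is just my 
guess at the order, assuming the usual GFS 6.0 init scripts:

--
# on wc3, after it finishes booting (order is my best guess)
service pool start           # assemble the pool devices
service ccsd start           # start the cluster config daemon
service lock_gulmd start     # try to log back in to the GULM master
service gfs start            # mount the GFS filesystems

# then, from the master (wc1), check whether wc3 shows "Logged in" again
gulm_tool nodelist wc1
--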


Regards,

Darren....



