[Linux-cluster] failover questions after upgrade
jason at monsterjam.org
Wed Nov 15 01:06:55 UTC 2006
OK, I upgraded my RPMs to the following versions:
cman-kernheaders-2.6.9-45.8
cman-kernel-2.6.9-45.8
cman-1.0.11-0
cman-kernel-hugemem-2.6.9-45.8
cman-devel-1.0.11-0
cman-kernel-smp-2.6.9-45.8
GFS-6.1.6-1
GFS-kernheaders-2.6.9-60.3
GFS-kernel-smp-2.6.9-60.3
dlm-kernel-smp-2.6.9-44.3
dlm-devel-1.0.1-1
dlm-kernel-hugemem-2.6.9-44.3
dlm-kernheaders-2.6.9-44.3
dlm-1.0.1-1
dlm-kernel-2.6.9-44.3
magma-plugins-1.0.6-0
magma-devel-1.0.6-0
magma-debuginfo-1.0.6-0
magma-1.0.6-0
When I reboot both servers of the 2-node cluster, they come up fine:
[jason at tf2 ~]$ clustat
Member Status: Quorate, Group Member

  Member Name        State    ID
  ------ ----        -----    --
  tf1                Online   0x0000000000000001
  tf2                Online   0x0000000000000002

  Service Name       Owner (Last)   State
  ------- ----       ----- ------   -----
  Apache Service     tf1            started
[jason at tf2 ~]$
But when I reboot tf1 (shutdown -r now), tf2 never takes over the service:
[jason at tf2 ~]$ clustat
Member Status: Quorate, Group Member

  Member Name        State    ID
  ------ ----        -----    --
  tf2                Online   0x0000000000000002

  Service Name       Owner (Last)   State
  ------- ----       ----- ------   -----
  Apache Service     ((null) )      failed
[jason at tf2 ~]$
Here are the logs from tf2:
Nov 14 19:48:21 tf2 clurgmgrd[5345]: <info> Logged in SG "usrm::manager"
Nov 14 19:48:21 tf2 clurgmgrd[5345]: <info> Magma Event: Membership Change
Nov 14 19:48:21 tf2 clurgmgrd[5345]: <info> State change: Local UP
Nov 14 19:48:22 tf2 clurgmgrd[5345]: <info> State change: tf1 UP
Nov 14 19:48:25 tf2 snmpd[5195]: Got trap from peer on fd 13
Nov 14 19:48:44 tf2 kernel: process `omaws32' is using obsolete setsockopt SO_BSDCOMPAT
Nov 14 19:48:58 tf2 Server Administrator: Storage Service EventID: 2164 See readme.txt for a list of validated controller driver versions.
Nov 14 19:49:00 tf2 snmpd[5195]: Got trap from peer on fd 13
Nov 14 19:50:31 tf2 sshd(pam_unix)[6920]: session opened for user jason by (uid=0)
Nov 14 19:51:03 tf2 sshd(pam_unix)[6951]: session opened for user jason by (uid=0)
Nov 14 19:51:39 tf2 clurgmgrd[5345]: <info> Magma Event: Membership Change
Nov 14 19:51:39 tf2 clurgmgrd[5345]: <info> State change: tf1 DOWN
Nov 14 19:52:19 tf2 ntpd[4896]: synchronized to 193.162.159.97, stratum 2
Nov 14 19:52:19 tf2 ntpd[4896]: kernel time sync disabled 0041
Nov 14 19:52:28 tf2 kernel: e100: eth2: e100_watchdog: link down
Nov 14 19:52:34 tf2 kernel: CMAN: removing node tf1 from the cluster : Missed too many heartbeats
Nov 14 19:52:58 tf2 kernel: e100: eth2: e100_watchdog: link up, 100Mbps, full-duplex
Nov 14 19:55:14 tf2 kernel: CMAN: node tf1 rejoining
Nov 14 19:55:45 tf2 clurgmgrd[5345]: <info> Magma Event: Membership Change
Nov 14 19:55:45 tf2 clurgmgrd[5345]: <info> State change: tf1 UP
Then, when tf1 comes back up, my Apache service doesn't come up correctly:
[jason at tf2 ~]$ clustat
Member Status: Quorate, Group Member

  Member Name        State    ID
  ------ ----        -----    --
  tf1                Online   0x0000000000000001
  tf2                Online   0x0000000000000002

  Service Name       Owner (Last)   State
  ------- ----       ----- ------   -----
  Apache Service     (tf1 )         failed
[jason at tf2 ~]$
I see this in the logs on tf1 as it boots up:
Nov 14 19:55:44 tf1 rhnsd[5445]: Red Hat Network Services Daemon starting up.
Nov 14 19:55:44 tf1 rhnsd: rhnsd startup succeeded
Nov 14 19:55:44 tf1 cups-config-daemon: cups-config-daemon startup succeeded
Nov 14 19:55:44 tf1 haldaemon: haldaemon startup succeeded
Nov 14 19:55:44 tf1 clurgmgrd[5488]: <info> Loading Service Data
Nov 14 19:55:44 tf1 rgmanager: clurgmgrd startup succeeded
Nov 14 19:55:44 tf1 fstab-sync[5764]: removed all generated mount points
Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> Initializing Services
Nov 14 19:55:45 tf1 fstab-sync[6152]: added mount point /media/cdrom for /dev/hda
Nov 14 19:55:45 tf1 httpd: httpd shutdown failed
Nov 14 19:55:45 tf1 clurgmgrd[5488]: <notice> stop on script "cluster_apache" returned 1 (generic error)
Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> Services Initialized
Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> Logged in SG "usrm::manager"
Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> Magma Event: Membership Change
Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> State change: Local UP
Nov 14 19:55:46 tf1 fstab-sync[6465]: added mount point /media/floppy for /dev/fd0
Nov 14 19:55:46 tf1 clurgmgrd[5488]: <info> State change: tf2 UP
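To cut the noise out of captures like the ones above, here is a minimal sketch that keeps only the clurgmgrd lines and flags anything logged above <info>. It is not part of rgmanager. The function names (rgmanager_events, non_info) are made up for illustration, and it assumes the syslog layout shown in this message, with each entry on a single line.

```python
import re

# Assumed layout, taken from the log excerpts in this message:
#   "Nov 14 19:55:45 tf1 clurgmgrd[5488]: <notice> message text"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+clurgmgrd\[\d+\]:\s+"
    r"<(?P<level>\w+)>\s+(?P<msg>.*)$"
)


def rgmanager_events(lines):
    """Return (timestamp, host, level, message) for each clurgmgrd line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            events.append(
                (match["stamp"], match["host"], match["level"], match["msg"])
            )
    return events


def non_info(events):
    """Keep only events whose level is something other than <info>."""
    return [event for event in events if event[2] != "info"]


# Sample input copied from the tf1 boot log above.
sample = [
    "Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> Initializing Services",
    "Nov 14 19:55:45 tf1 httpd: httpd shutdown failed",
    'Nov 14 19:55:45 tf1 clurgmgrd[5488]: <notice> stop on script '
    '"cluster_apache" returned 1 (generic error)',
    "Nov 14 19:55:45 tf1 clurgmgrd[5488]: <info> Services Initialized",
]

for stamp, host, level, msg in non_info(rgmanager_events(sample)):
    print(f"{stamp} {host} [{level}] {msg}")
```

Run against the tf1 boot log, this leaves just the "stop on script" notice, which is the only clurgmgrd entry there that is not <info>.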
Any suggestions?
Jason