[Linux-cluster] Cluster doesn't come up while rebooting

Stevan Colaco stevan.colaco at gmail.com
Tue Jul 1 09:06:55 UTC 2008


Hello All,

I need your help with an issue I am facing.

OS: RHEL4 ES Update 6 64bit

I have a deployment with a 2 + 1 cluster (two active nodes and one
passive). I have a service that should fail over, but I ran into
problems when I rebooted all 3 servers: the services ended up disabled.
When I use clusvcadm to enable the service manually, it works. Here are
the logs, starting with mb1:

Jun 25 11:13:15 mb1 clurgmgrd[14825]:  Resource Group Manager Starting
Jun 25 11:13:15 mb1 clurgmgrd[14825]: Loading Service Data
Jun 25 11:13:17 mb1 clurgmgrd[14825]: Initializing Services
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: /dev/sdh1 is not mounted
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-BACKUP with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-BACKUP returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-STORE with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-STORE returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-DBDATA with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-DBDATA returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-CONF with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-CONF returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-REDOLOG with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-REDOLOG returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-INDEX with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-INDEX returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-LOG with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-LOG returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-ZIMBRA-CLUST with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-CLUSTER returned 2 (invalid argument(s))
Jun 25 11:13:22 mb1 clurgmgrd: [14825]: /dev/sdg1 is not mounted
Jun 25 11:13:27 mb1 clurgmgrd: [14825]: /dev/sdf1 is not mounted
Jun 25 11:13:33 mb1 clurgmgrd: [14825]: /dev/sde1 is not mounted
Jun 25 11:13:38 mb1 clurgmgrd: [14825]: /dev/sdd1 is not mounted
Jun 25 11:13:43 mb1 clurgmgrd: [14825]: /dev/sdc1 is not mounted
Jun 25 11:13:45 mb1 rgmanager: clurgmgrd startup failed
Jun 25 11:13:48 mb1 clurgmgrd: [14825]: /dev/sdb1 is not mounted
Jun 25 11:13:53 mb1 clurgmgrd: [14825]: /dev/sda1 is not mounted
Jun 25 11:13:58 mb1 clurgmgrd[14825]: Services Initialized
Jun 25 11:14:01 mb1 clurgmgrd[14825]: Logged in SG "usrm::manager"
Jun 25 11:14:01 mb1 clurgmgrd[14825]: Magma Event: Membership Change
Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: Local UP
Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: mbstandby.ku.edu.kw UP
Jun 25 11:14:03 mb1 clurgmgrd[14825]: Magma Event: Membership Change
Jun 25 11:14:03 mb1 clurgmgrd[14825]: State change: mb2.ku.edu.kw UP


MB2 server logs:

Jun 25 11:13:40 mb2 clurgmgrd[14776]:  Resource Group Manager Starting
Jun 25 11:13:40 mb2 clurgmgrd[14776]: Loading Service Data
Jun 25 11:13:41 mb2 clurgmgrd[14776]: Initializing Services
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-DBDATA with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-DBDATA returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-INDEX with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-INDEX returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-LOG with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-LOG returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-CONF with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-CONF returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: /dev/sdh1 is not mounted
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-BACKUP with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-BACKUP returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-REDOLOG with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-REDOLOG returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-STORE with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-STORE returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-ZIMBRA-CLUST with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-CLUSTER returned 2 (invalid argument(s))
Jun 25 11:13:46 mb2 clurgmgrd: [14776]: /dev/sdf1 is not mounted
Jun 25 11:13:52 mb2 clurgmgrd: [14776]: /dev/sdg1 is not mounted
Jun 25 11:13:57 mb2 clurgmgrd: [14776]: /dev/sde1 is not mounted
Jun 25 11:14:02 mb2 clurgmgrd: [14776]: /dev/sdd1 is not mounted
Jun 25 11:14:07 mb2 clurgmgrd: [14776]: /dev/sdc1 is not mounted
Jun 25 11:14:10 mb2 rgmanager: clurgmgrd startup failed
Jun 25 11:14:12 mb2 clurgmgrd: [14776]: /dev/sdb1 is not mounted
Jun 25 11:14:18 mb2 clurgmgrd: [14776]: /dev/sda1 is not mounted
Jun 25 11:14:23 mb2 clurgmgrd[14776]: Services Initialized
Jun 25 11:14:25 mb2 clurgmgrd[14776]: Logged in SG "usrm::manager"
Jun 25 11:14:25 mb2 clurgmgrd[14776]: Magma Event: Membership Change
Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: Local UP
Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: mb1.ku.edu.kw UP
Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: mbstandby.ku.edu.kw UP

MBSTANDBY server logs:

Jun 25 11:13:26 mbstandby clurgmgrd[15850]:  Resource Group Manager Starting
Jun 25 11:13:26 mbstandby clurgmgrd[15850]: Loading Service Data
Jun 25 11:13:27 mbstandby clurgmgrd[15850]: Initializing Services
Jun 25 11:13:27 mbstandby clurgmgrd: [15850]: /dev/sdl1 is not mounted
Jun 25 11:13:27 mbstandby clurgmgrd: [15850]: /dev/sdp1 is not mounted
Jun 25 11:13:32 mbstandby clurgmgrd: [15850]: /dev/sdk1 is not mounted
Jun 25 11:13:32 mbstandby clurgmgrd: [15850]: /dev/sdn1 is not mounted
Jun 25 11:13:38 mbstandby clurgmgrd: [15850]: /dev/sdj1 is not mounted
Jun 25 11:13:38 mbstandby clurgmgrd: [15850]: /dev/sdo1 is not mounted
Jun 25 11:13:43 mbstandby clurgmgrd: [15850]: /dev/sdi1 is not mounted
Jun 25 11:13:43 mbstandby clurgmgrd: [15850]: /dev/sdm1 is not mounted
Jun 25 11:13:47 mbstandby sshd(pam_unix)[17583]: session opened for user root by (uid=0)
Jun 25 11:13:48 mbstandby clurgmgrd: [15850]: /dev/sdd1 is not mounted
Jun 25 11:13:48 mbstandby clurgmgrd: [15850]: /dev/sdh1 is not mounted
Jun 25 11:13:53 mbstandby clurgmgrd: [15850]: /dev/sdg1 is not mounted
Jun 25 11:13:53 mbstandby clurgmgrd: [15850]: /dev/sdc1 is not mounted
Jun 25 11:13:56 mbstandby rgmanager: clurgmgrd startup failed
Jun 25 11:13:56 mbstandby su(pam_unix)[18378]: session opened for user zimbra by (uid=0)
Jun 25 11:13:56 mbstandby zimbra: -bash: /opt/zimbra/log/startup.log: No such file or directory
Jun 25 11:13:56 mbstandby su(pam_unix)[18378]: session closed for user zimbra
Jun 25 11:13:56 mbstandby rc: Starting zimbra: failed
Jun 25 11:13:58 mbstandby clurgmgrd: [15850]: /dev/sdf1 is not mounted
Jun 25 11:13:58 mbstandby clurgmgrd: [15850]: /dev/sdb1 is not mounted
Jun 25 11:14:04 mbstandby clurgmgrd: [15850]: /dev/sde1 is not mounted
Jun 25 11:14:04 mbstandby clurgmgrd: [15850]: /dev/sda1 is not mounted
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Services Initialized
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Logged in SG "usrm::manager"
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Magma Event: Membership Change
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: State change: Local UP
Jun 25 11:14:12 mbstandby clurgmgrd[15850]: Magma Event: Membership Change
Jun 25 11:14:12 mbstandby clurgmgrd[15850]: State change: mb1.ku.edu.kw UP
Jun 25 11:14:13 mbstandby clurgmgrd[15850]: Resource groups locked; not evaluating
Jun 25 11:14:14 mbstandby clurgmgrd[15850]: Magma Event: Membership Change
Jun 25 11:14:14 mbstandby clurgmgrd[15850]: State change: mb2.ku.edu.kw UP
Jun 25 11:49:22 mbstandby sshd(pam_unix)[9438]: session opened for user root by (uid=0)

I am using filesystem labels (set with e2label) to mount the
filesystems on both the primary and the failover servers. My
cluster.conf is also attached.
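
As a rough aid for reading the logs above, the labels that clurgmgrd
reported it could not match to a device can be pulled out with a short
Python sketch. The sample lines are copied from the mb1 log; the
function name unresolved_labels is hypothetical and used only for
illustration. The output could then be compared by hand against the
labels actually set with e2label on each node.

```python
import re

# A few lines copied from the mb1 log. Long lines may be wrapped by the
# mail client, so the pattern below allows any whitespace between words.
SAMPLE_LOG = """\
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
LABEL=MB2-BACKUP with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-BACKUP returned 2
(invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
LABEL=MB2-STORE with a real device
Jun 25 11:13:22 mb1 clurgmgrd: [14825]: /dev/sdg1 is not mounted
"""

LABEL_RE = re.compile(
    r"Could\s+not\s+match\s+LABEL=(\S+)\s+with\s+a\s+real\s+device"
)


def unresolved_labels(log_text):
    """Return the reported labels in order of appearance, without duplicates."""
    seen = []
    for label in LABEL_RE.findall(log_text):
        if label not in seen:
            seen.append(label)
    return seen


if __name__ == "__main__":
    for label in unresolved_labels(SAMPLE_LOG):
        print(label)
```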

Fencing is not properly set up yet: I am using manual fencing for now
and was testing with HP iLO fencing.

My first question: why does it show "Magma Event: Membership Change"?

Since I initially defined 3 members in the cluster, I did not expect
to see this. Is it because some package is missing, or do I have to
run up2date?

I have installed the following packages:

ccs-1.0.11-1.x86_64.rpm
ccs-devel-1.0.11-1.x86_64.rpm
cman-1.0.17-0.x86_64.rpm
cman-devel-1.0.17-0.x86_64.rpm
cman-kernel-2.6.9-53.5.x86_64.rpm
cman-kernel-smp-2.6.9-53.5.x86_64.rpm
cman-kernheaders-2.6.9-53.5.x86_64.rpm
dlm-1.0.7-1.x86_64.rpm
dlm-devel-1.0.7-1.x86_64.rpm
dlm-kernel-2.6.9-52.2.x86_64.rpm
dlm-kernel-smp-2.6.9-52.2.x86_64.rpm
fence-1.32.50-2.x86_64.rpm
gulm-1.0.10-0.x86_64.rpm
gulm-devel-1.0.10-0.x86_64.rpm
iddev-2.0.0-4.x86_64.rpm
iddev-devel-2.0.0-4.x86_64.rpm
luci-0.11.0-3.x86_64.rpm
magma-1.0.8-1.x86_64.rpm
magma-plugins-1.0.12-0.x86_64.rpm
perl-Net-Telnet-3.03-3.noarch.rpm
rgmanager-1.9.72-1.x86_64.rpm
system-config-cluster-1.0.51-2.0.noarch.rpm

Am I missing any other important package for the cluster? I installed
the packages using rpm -ivh *.rpm. I also stopped the lock_gulmd
service, as I am using the lock_dlm lock manager.

Later I tried defining just an IP address in the service section,
without the mount points or the application service. Even then, the
service does not start on reboot. Here are the logs:

Jun 27 19:44:37 mb1 clurgmgrd[12737]: <notice> Resource Group Manager Starting
Jun 27 19:44:37 mb1 clurgmgrd[12737]: <info> Loading Service Data
Jun 27 19:44:37 mb1 fstab-sync[12738]: removed all generated mount points
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Initializing Services
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Services Initialized
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Logged in SG "usrm::manager"
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Magma Event: Membership Change
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> State change: Local UP
Jun 27 19:44:38 mb1 rgmanager: clurgmgrd startup succeeded
Jun 27 19:44:41 mb1 clurgmgrd[12737]: <info> Magma Event: Membership Change
Jun 27 19:44:41 mb1 clurgmgrd[12737]: <info> State change: mbstandby.ku.edu.kw UP
Jun 27 19:44:43 mb1 clurgmgrd[12737]: <info> Magma Event: Membership Change
Jun 27 19:44:43 mb1 clurgmgrd[12737]: <info> State change: mb2.ku.edu.kw UP

The cluster.conf for this configuration is also attached.
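
To put the two reboots side by side, here is a minimal Python sketch
that measures the time between selected messages in the mb1 logs. The
sample lines are copied from the logs above. The year 2008 is an
assumption taken from the date of this post (syslog timestamps carry
no year), and the function names are hypothetical.

```python
from datetime import datetime

# Lines copied from the mb1 logs: first the reboot with the full
# service, then the reboot with only an IP address in the service.
FULL_SERVICE_LOG = """\
Jun 25 11:13:15 mb1 clurgmgrd[14825]:  Resource Group Manager Starting
Jun 25 11:13:45 mb1 rgmanager: clurgmgrd startup failed
Jun 25 11:13:58 mb1 clurgmgrd[14825]: Services Initialized
"""

IP_ONLY_LOG = """\
Jun 27 19:44:37 mb1 clurgmgrd[12737]: <notice> Resource Group Manager Starting
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Services Initialized
"""

ASSUMED_YEAR = "2008"  # assumption: not present in the syslog lines


def parse_time(line):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp of a line."""
    return datetime.strptime(ASSUMED_YEAR + " " + line[:15],
                             "%Y %b %d %H:%M:%S")


def seconds_between(log_text, start_marker, end_marker):
    """Seconds from the first start_marker line to the next end_marker line."""
    start = end = None
    for line in log_text.splitlines():
        if start is None and start_marker in line:
            start = parse_time(line)
        elif start is not None and end is None and end_marker in line:
            end = parse_time(line)
    if start is None or end is None:
        return None
    return int((end - start).total_seconds())


if __name__ == "__main__":
    start = "Resource Group Manager Starting"
    print(seconds_between(FULL_SERVICE_LOG, start, "startup failed"))
    print(seconds_between(FULL_SERVICE_LOG, start, "Services Initialized"))
    print(seconds_between(IP_ONLY_LOG, start, "Services Initialized"))
```

This only measures what the logs show; on its own it does not explain
why the services ended up disabled.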

Please advise on what the issue could be. Thanks in advance.

Regards,
-Steven
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cluster-with-IP.txt
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080701/a7aa847d/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cluster-with-service.txt
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080701/a7aa847d/attachment-0001.txt>

