[Linux-cluster] Cluster doesn't come up while rebooting

Jonathan Brassow jbrassow at redhat.com
Wed Jul 2 14:51:43 UTC 2008


I wouldn't worry about the "Magma Event: Membership Change" messages.
I think they get printed whenever a machine joins or leaves the
cluster.  (You have to be part of the cluster to see the changes,
which is why each node logs "Local UP" first, followed by the other
members.)  Do you have syslog set to print 'debug'?  That may explain
some of these messages...
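
If you want to compare the join order each node saw, something like
the following could pull the "State change" lines out of syslog.  This
is only a rough sketch: the regular expression is my guess at the
format based on the lines quoted below, the helper name
membership_order is made up for illustration, and the sample lines are
the mb1 entries from the quoted log with the wrapped line rejoined.

```python
import re

# Guessed pattern for rgmanager "State change" syslog lines, e.g.
# "Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: Local UP".
# The optional "<info>"-style tag covers the second set of logs.
STATE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+clurgmgrd\[\d+\]:\s+"
    r"(?:<\w+>\s+)?State change:\s+(?P<member>\S+)\s+(?P<state>\S+)"
)


def membership_order(lines):
    """Return {host: [(member, state), ...]} in the order each host logged them."""
    seen = {}
    for line in lines:
        match = STATE_RE.match(line)
        if match:
            seen.setdefault(match.group("host"), []).append(
                (match.group("member"), match.group("state"))
            )
    return seen


# mb1 entries from the quoted log (wrapped line rejoined by hand).
sample = [
    "Jun 25 11:14:01 mb1 clurgmgrd[14825]: Magma Event: Membership Change",
    "Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: Local UP",
    "Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: mbstandby.ku.edu.kw UP",
    "Jun 25 11:14:03 mb1 clurgmgrd[14825]: Magma Event: Membership Change",
    "Jun 25 11:14:03 mb1 clurgmgrd[14825]: State change: mb2.ku.edu.kw UP",
]

for host, events in membership_order(sample).items():
    print(host, events)
```

On the sample, mb1 reports itself first and then the other two
members, which is the pattern I would expect on every node.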

Just to get this straight: after all machines are up, if you use
'clusvcadm' to start the services, it works?  But if you reboot all
machines, the services don't start on bootup?  What if you reboot just
one machine?

Someone will have to confirm my next few statements, but this is what
I think is happening...  rgmanager does a 'stop' when a machine comes
up.  I'm guessing this is why you are seeing the "is not mounted" and
other messages.  In your cluster.conf, you have the services set to
'autostart="0"', which means they will not start by default(?).  So
you need to start them by hand when the machines come up.  A potential
solution is to ignore the messages you've attached (or figure out why
syslog is being so verbose), and take the 'autostart="0"' out of
cluster.conf.
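
To double-check which services carry an explicit 'autostart="0"', a
small script along these lines might help.  It is a minimal sketch,
not a tested tool: the XML below is a made-up fragment (the real
cluster.conf was an attachment and is not reproduced here), the
service names are hypothetical, and the function name
services_not_autostarting is my own.  It only reports services where
the attribute is literally "0"; it says nothing about what rgmanager
does when the attribute is absent.

```python
import xml.etree.ElementTree as ET

# Hypothetical cluster.conf fragment for illustration only; the real
# file from this thread is not shown, so names and layout are assumed.
SAMPLE_CONF = """\
<cluster name="example" config_version="1">
  <rm>
    <service name="svc-a" autostart="0"/>
    <service name="svc-b" autostart="1"/>
    <service name="svc-c"/>
  </rm>
</cluster>
"""


def services_not_autostarting(conf_xml):
    """Return names of <service> elements that set autostart="0" explicitly."""
    root = ET.fromstring(conf_xml)
    return [
        svc.get("name")
        for svc in root.iter("service")
        if svc.get("autostart") == "0"
    ]


print(services_not_autostarting(SAMPLE_CONF))  # prints ['svc-a']
```

Against a real node you would pass in the contents of
/etc/cluster/cluster.conf instead of SAMPLE_CONF.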

  brassow

On Jul 1, 2008, at 4:06 AM, Stevan Colaco wrote:

> Hello All,
>
> I need your help with an issue I am facing.
>
> OS: RHEL4 ES Update 6 64bit
>
> I have a deployment with a 2 + 1 cluster (2 active and one
> passive). I have a service which is to be failed over, but I ran into
> trouble when I rebooted all 3 servers: the services got disabled. When
> I use clusvcadm to enable a service manually, it works. Here are the
> logs:
>
> Jun 25 11:13:15 mb1 clurgmgrd[14825]:  Resource Group Manager Starting
> Jun 25 11:13:15 mb1 clurgmgrd[14825]: Loading Service Data
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: Initializing Services
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: /dev/sdh1 is not mounted
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-BACKUP with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-BACKUP returned 2
> (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-STORE with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-STORE returned 2
> (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-DBDATA with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-DBDATA returned 2
> (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-CONF with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-CONF returned 2
> (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-REDOLOG with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-REDOLOG returned
> 2 (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-INDEX with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-INDEX returned 2
> (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-LOG with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-LOG returned 2
> (invalid argument(s))
> Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match
> LABEL=MB2-ZIMBRA-CLUST with a real device
> Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-CLUSTER returned
> 2 (invalid argument(s))
> Jun 25 11:13:22 mb1 clurgmgrd: [14825]: /dev/sdg1 is not mounted
> Jun 25 11:13:27 mb1 clurgmgrd: [14825]: /dev/sdf1 is not mounted
> Jun 25 11:13:33 mb1 clurgmgrd: [14825]: /dev/sde1 is not mounted
> Jun 25 11:13:38 mb1 clurgmgrd: [14825]: /dev/sdd1 is not mounted
> Jun 25 11:13:43 mb1 clurgmgrd: [14825]: /dev/sdc1 is not mounted
> Jun 25 11:13:45 mb1 rgmanager: clurgmgrd startup failed
> Jun 25 11:13:48 mb1 clurgmgrd: [14825]: /dev/sdb1 is not mounted
> Jun 25 11:13:53 mb1 clurgmgrd: [14825]: /dev/sda1 is not mounted
> Jun 25 11:13:58 mb1 clurgmgrd[14825]: Services Initialized
> Jun 25 11:14:01 mb1 clurgmgrd[14825]: Logged in SG "usrm::manager"
> Jun 25 11:14:01 mb1 clurgmgrd[14825]: Magma Event: Membership Change
> Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: Local UP
> Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change:  
> mbstandby.ku.edu.kw UP
> Jun 25 11:14:03 mb1 clurgmgrd[14825]: Magma Event: Membership Change
> Jun 25 11:14:03 mb1 clurgmgrd[14825]: State change: mb2.ku.edu.kw UP
>
>
> MB2 server Logs
>
> Jun 25 11:13:40 mb2 clurgmgrd[14776]:  Resource Group Manager Starting
> Jun 25 11:13:40 mb2 clurgmgrd[14776]: Loading Service Data
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: Initializing Services
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-DBDATA with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-DBDATA returned 2
> (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-INDEX with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-INDEX returned 2
> (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-LOG with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-LOG returned 2
> (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-CONF with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-CONF returned 2
> (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: /dev/sdh1 is not mounted
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-BACKUP with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-BACKUP returned 2
> (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-REDOLOG with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-REDOLOG returned
> 2 (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-STORE with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-STORE returned 2
> (invalid argument(s))
> Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match
> LABEL=MB1-ZIMBRA-CLUST with a real device
> Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-CLUSTER returned
> 2 (invalid argument(s))
> Jun 25 11:13:46 mb2 clurgmgrd: [14776]: /dev/sdf1 is not mounted
> Jun 25 11:13:52 mb2 clurgmgrd: [14776]: /dev/sdg1 is not mounted
> Jun 25 11:13:57 mb2 clurgmgrd: [14776]: /dev/sde1 is not mounted
> Jun 25 11:14:02 mb2 clurgmgrd: [14776]: /dev/sdd1 is not mounted
> Jun 25 11:14:07 mb2 clurgmgrd: [14776]: /dev/sdc1 is not mounted
> Jun 25 11:14:10 mb2 rgmanager: clurgmgrd startup failed
> Jun 25 11:14:12 mb2 clurgmgrd: [14776]: /dev/sdb1 is not mounted
> Jun 25 11:14:18 mb2 clurgmgrd: [14776]: /dev/sda1 is not mounted
> Jun 25 11:14:23 mb2 clurgmgrd[14776]: Services Initialized
> Jun 25 11:14:25 mb2 clurgmgrd[14776]: Logged in SG "usrm::manager"
> Jun 25 11:14:25 mb2 clurgmgrd[14776]: Magma Event: Membership Change
> Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: Local UP
> Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: mb1.ku.edu.kw UP
> Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change:  
> mbstandby.ku.edu.kw UP
>
> MBSTANDBY LOGS
>
> Jun 25 11:13:26 mbstandby clurgmgrd[15850]:  Resource Group Manager  
> Starting
> Jun 25 11:13:26 mbstandby clurgmgrd[15850]: Loading Service Data
> Jun 25 11:13:27 mbstandby clurgmgrd[15850]: Initializing Services
> Jun 25 11:13:27 mbstandby clurgmgrd: [15850]: /dev/sdl1 is not mounted
> Jun 25 11:13:27 mbstandby clurgmgrd: [15850]: /dev/sdp1 is not mounted
> Jun 25 11:13:32 mbstandby clurgmgrd: [15850]: /dev/sdk1 is not mounted
> Jun 25 11:13:32 mbstandby clurgmgrd: [15850]: /dev/sdn1 is not mounted
> Jun 25 11:13:38 mbstandby clurgmgrd: [15850]: /dev/sdj1 is not mounted
> Jun 25 11:13:38 mbstandby clurgmgrd: [15850]: /dev/sdo1 is not mounted
> Jun 25 11:13:43 mbstandby clurgmgrd: [15850]: /dev/sdi1 is not mounted
> Jun 25 11:13:43 mbstandby clurgmgrd: [15850]: /dev/sdm1 is not mounted
> Jun 25 11:13:47 mbstandby sshd(pam_unix)[17583]: session opened for
> user root by (uid=0)
> Jun 25 11:13:48 mbstandby clurgmgrd: [15850]: /dev/sdd1 is not mounted
> Jun 25 11:13:48 mbstandby clurgmgrd: [15850]: /dev/sdh1 is not mounted
> Jun 25 11:13:53 mbstandby clurgmgrd: [15850]: /dev/sdg1 is not mounted
> Jun 25 11:13:53 mbstandby clurgmgrd: [15850]: /dev/sdc1 is not mounted
> Jun 25 11:13:56 mbstandby rgmanager: clurgmgrd startup failed
> Jun 25 11:13:56 mbstandby su(pam_unix)[18378]: session opened for user
> zimbra by (uid=0)
> Jun 25 11:13:56 mbstandby zimbra: -bash: /opt/zimbra/log/startup.log:
> No such file or directory
> Jun 25 11:13:56 mbstandby su(pam_unix)[18378]: session closed for  
> user zimbra
> Jun 25 11:13:56 mbstandby rc: Starting zimbra: failed
> Jun 25 11:13:58 mbstandby clurgmgrd: [15850]: /dev/sdf1 is not mounted
> Jun 25 11:13:58 mbstandby clurgmgrd: [15850]: /dev/sdb1 is not mounted
> Jun 25 11:14:04 mbstandby clurgmgrd: [15850]: /dev/sde1 is not mounted
> Jun 25 11:14:04 mbstandby clurgmgrd: [15850]: /dev/sda1 is not mounted
> Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Services Initialized
> Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Logged in SG  
> "usrm::manager"
> Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Magma Event: Membership  
> Change
> Jun 25 11:14:09 mbstandby clurgmgrd[15850]: State change: Local UP
> Jun 25 11:14:12 mbstandby clurgmgrd[15850]: Magma Event: Membership  
> Change
> Jun 25 11:14:12 mbstandby clurgmgrd[15850]: State change:  
> mb1.ku.edu.kw UP
> Jun 25 11:14:13 mbstandby clurgmgrd[15850]: Resource groups locked;
> not evaluating
> Jun 25 11:14:14 mbstandby clurgmgrd[15850]: Magma Event: Membership  
> Change
> Jun 25 11:14:14 mbstandby clurgmgrd[15850]: State change:  
> mb2.ku.edu.kw UP
> Jun 25 11:49:22 mbstandby sshd(pam_unix)[9438]: session opened for
> user root by (uid=0)
>
> I am using e2label labels to mount on the failover as well as the
> primary servers. Attached also is my cluster.conf.
>
> Right now fencing is not set up properly; I am just using manual
> fencing and was testing with HP iLO fencing.
>
> My first question: why does it show "Magma Event: Membership
> Change"?
>
> Since I initially defined 3 members in the cluster, it should not
> give me this. Is it because some package is missing, or do I have to
> run up2date?
>
> I have installed the following packages:
>
> ccs-1.0.11-1.x86_64.rpm
> ccs-devel-1.0.11-1.x86_64.rpm
> cman-1.0.17-0.x86_64.rpm
> cman-devel-1.0.17-0.x86_64.rpm
> cman-kernel-2.6.9-53.5.x86_64.rpm
> cman-kernel-smp-2.6.9-53.5.x86_64.rpm
> cman-kernheaders-2.6.9-53.5.x86_64.rpm
> dlm-1.0.7-1.x86_64.rpm
> dlm-devel-1.0.7-1.x86_64.rpm
> dlm-kernel-2.6.9-52.2.x86_64.rpm
> dlm-kernel-smp-2.6.9-52.2.x86_64.rpm
> fence-1.32.50-2.x86_64.rpm
> gulm-1.0.10-0.x86_64.rpm
> gulm-devel-1.0.10-0.x86_64.rpm
> iddev-2.0.0-4.x86_64.rpm
> iddev-devel-2.0.0-4.x86_64.rpm
> luci-0.11.0-3.x86_64.rpm
> magma-1.0.8-1.x86_64.rpm
> magma-plugins-1.0.12-0.x86_64.rpm
> perl-Net-Telnet-3.03-3.noarch.rpm
> rgmanager-1.9.72-1.x86_64.rpm
> system-config-cluster-1.0.51-2.0.noarch.rpm
>
> Am I missing any other important package for the cluster? I
> installed the packages using rpm -ivh *.rpm.
> I also stopped the lock_gulmd service, as I am using the lock_dlm
> lock manager.
>
> Later I tried using just an IP in the service section, without mount
> points or the application service. Even then, the service doesn't
> start on reboot. Here are the logs:
>
> Jun 27 19:44:37 mb1 clurgmgrd[12737]: <notice> Resource Group  
> Manager Starting
> Jun 27 19:44:37 mb1 clurgmgrd[12737]: <info> Loading Service Data
> Jun 27 19:44:37 mb1 fstab-sync[12738]: removed all generated mount  
> points
> Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Initializing Services
> Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Services Initialized
> Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Logged in SG  
> "usrm::manager"
> Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Magma Event: Membership  
> Change
> Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> State change: Local UP
> Jun 27 19:44:38 mb1 rgmanager: clurgmgrd startup succeeded
> Jun 27 19:44:41 mb1 clurgmgrd[12737]: <info> Magma Event: Membership  
> Change
> Jun 27 19:44:41 mb1 clurgmgrd[12737]: <info> State change:
> mbstandby.ku.edu.kw UP
> Jun 27 19:44:43 mb1 clurgmgrd[12737]: <info> Magma Event: Membership  
> Change
> Jun 27 19:44:43 mb1 clurgmgrd[12737]: <info> State change:  
> mb2.ku.edu.kw UP
>
> The cluster.conf for this setup is also attached.
>
> Please advise on what the issue could be. Thanks in advance.
>
> Regards,
> -Steven
> <cluster-with-IP.txt><cluster-with-service.txt>