[Linux-cluster] really reliable?
Steven Dake
sdake at redhat.com
Wed Apr 15 01:19:59 UTC 2009
On Tue, 2009-04-14 at 20:16 -0500, Vu Pham wrote:
> Ryan Golhar wrote:
> > I'm running RHEL 5.3 64-bit. So far, I only want to see that the
> > cluster can run. I'll worry about getting GFS after I'm confident this
> > works.
> >
> > I've got three nodes: pico, vail, and whistler. They each have two NIC
> > cards, one that provides a public IP address, and another that provides
> > private communications. All cluster traffic will go over the private
> > network, 192.168.20.0.
> >
> > I've installed only the following components:
> > system-config-cluster-1.0.52-1.1, cman-2.0.98-1, and rgmanager-2.0.38-2.
> >
> > I've created my cluster.conf file to include these three nodes and
> > fence them using a brocade fibre switch (for GFS).
> >
> > When I start the cluster services on all 3 nodes using the manual
> > method of:
> >
> > /sbin/ccsd; /usr/sbin/cman_tool join
> >
> > The nodes successfully form a cluster. I am able to leave the cluster
> > and kill ccsd as well.
> >
> > If I try to start the cman service I see:
> >
> > [root@pico cluster]# /sbin/service cman start
> > Starting cluster:
> > Loading modules... done
> > Mounting configfs... done
> > Starting ccsd... done
> > Starting cman... done
> > Starting daemons... done
> > Starting fencing...
> >
> >
> > And it just hangs. I know my fencing is set up correctly because I've
> > had nodes fence other nodes before (when I was trying with 6 members).
> > If I let it sit for long enough sometimes it finishes successfully. I'm
> > not sure what it's doing because fence_tool is called and it's a binary...
> >
>
> Ryan,
>
> Anything suspicious in the log when it hangs at fencing ?
> Could you show your cluster.conf ?
>
A hang in fencing may indicate that the cluster does not have quorum.
Run cman_tool nodes to list the nodes and check whether at least half+1
of them have joined the cluster.
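
To make the half+1 rule concrete, here is a minimal sketch of the
arithmetic. It assumes every node carries exactly one vote, which is an
assumption on my part and depends on your cluster.conf. The function
names are hypothetical and are not part of cman.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that forms a strict majority (half + 1)."""
    # Integer division rounds down, so 3 expected votes gives 3 // 2 + 1 = 2.
    return expected_votes // 2 + 1


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the majority threshold."""
    return votes_present >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # Assumption: three nodes (pico, vail, whistler), one vote each.
    expected = 3
    for present in range(expected + 1):
        state = "quorate" if has_quorum(present, expected) else "no quorum"
        print(f"{present} of {expected} nodes joined: {state}")
```

Under that one-vote assumption, a three-node cluster needs two members
before it is quorate, so a single node started on its own would be
expected to wait at the fencing step until a second node joins.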
Regards
-steve
> Vu
>
> > Ryan
> >
> >
> > Gordan Bobic wrote:
> >> What distro are you using? I've found that:
> >>
> > >> 1) Distros other than RHEL/CentOS can be quirky when it comes to using
> > >> RHCS. I've even run into problems on Fedora more than once (not to
> > >> mention that FC hasn't shipped GFS1 since FC5 and GFS2 hasn't been
> > >> deemed production stable until last month - and we're up to FC10 now).
> >>
> > >> 2) Starting RHCS components using anything except the intended init
> > >> scripts tends to cause problems.
> >>
> > >> 3) The source of 99% of the remaining problems (i.e. those not
> > >> covered by 1) and 2) above) is incorrectly configured fencing.
> >>
> >> Does your setup fall under either of the first two categories?
> >> Have you verified beyond doubt that your fencing is configured correctly
> >> and that the fencing script gets verification upon success?
> >>
> >> Gordan
> >>
> > >> On Tue, 14 Apr 2009 12:17:44 -0400, Ryan Golhar <golharam at umdnj.edu> wrote:
> >>> Hi all,
> >>>
> > >>> Is Red Hat Cluster Suite really reliable? I've been having so much
> > >>> trouble getting a cluster up and running, I'm beginning to
> > >>> second-guess my decision to use this software stack.
> >>>
> >>> I have 3 nodes (eventually 10) running and set up. The fencing
> >>> method is by a brocade fibre switch. The ultimate goal of this
> > >>> cluster is to share a SAN connected by fibre.
> >>>
> >>> I've installed just the bare minimum (before even getting to GFS) to
> >>> test the cluster software. Just starting cman cluster services fails
> >>> on two of the nodes.
> >>>
> >>> Even when I try to reboot the nodes, I can't because the whole system
> >>> hangs on various processes that don't ever shut down. I have to
> >>> physically reboot these boxes.
> >>>
> > >>> The logs fill up with errors about not being able to connect to cman,
> > >>> etc.
> > >>>
> > >>> I've been at it for a while now and am not sure this is the best route
> > >>> anymore.
> >>>
> >>> Ryan
> >>
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster