[Linux-cluster] having problems trying to setup a two node cluster
Brynnen R Owen
owen at isrl.uiuc.edu
Wed Dec 1 18:49:20 UTC 2004
One possibility is that the hostnames used in your cluster.conf
file resolve to 127.0.0.1 in /etc/hosts; in that case the system will
try to broadcast on the lo device.
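A quick way to check for this, sketched against a simulated hosts file (the name "node1" and the 192.168.2.10 address are placeholders, not from the thread; on a real system `getent hosts <nodename>` shows what the resolver actually returns):

```shell
# Simulated /etc/hosts illustrating the failure mode described above.
cat > /tmp/hosts.example <<'EOF'
127.0.0.1   localhost localhost.localdomain node1
192.168.2.10   node1
EOF

# If a cluster node name shares a line with 127.0.0.1, cman binds to lo.
if grep -q '^127\.0\.0\.1.*node1' /tmp/hosts.example; then
  echo "node1 resolves to loopback: move it to the real interface line"
fi
```

The fix is simply to remove the node name from the 127.0.0.1 line so it only resolves to the real interface address.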
By the way, we're running the CVS code from Nov 21 with broadcast on
dual NICs.
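If the fixed multicast route quoted below does solve it, it can be made persistent across reboots. One Red Hat-style sketch (the file path is standard sysconfig convention; eth1 matches the private interconnect described below):

```
# /etc/sysconfig/network-scripts/route-eth1
# Send multicast out the private interconnect instead of the default
# gateway, mirroring "route add -net 224.0.0.0/8 dev eth1" below.
224.0.0.0/8 dev eth1
```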
On Wed, Dec 01, 2004 at 10:39:49AM -0800, Rick Stevens wrote:
> vahram wrote:
> >Rick Stevens wrote:
> >
> >>
> >>I had a similar issue. The problem was with the multicast routing.
> >>I was using two NICs on each node...one public (eth0) and one private
> >>(eth1), with the default gateway going out eth0.
> >>
> >>The route for the multicast (224.x.x.x) was going out the default
> >>gateway and not reaching the other machine. Putting in a fixed route
> >>for multicast:
> >>
> >> route add -net 224.0.0.0/8 dev eth1
> >>
> >>it all started working. This was my fix; it may not work for you.
> >>Also, I use the CVS code from http://sources.redhat.com/cluster rather
> >>than the source RPMs from the location you specified.
> >>----------------------------------------------------------------------
> >>- Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com -
> >>- VitalStream, Inc. http://www.vitalstream.com -
> >>- -
> >>- Veni, Vidi, VISA: I came, I saw, I did a little shopping. -
> >>----------------------------------------------------------------------
> >>
> >>--
> >>Linux-cluster mailing list
> >>Linux-cluster at redhat.com
> >>http://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> >Yep, both boxes have two NICs: eth0 is public and eth1 is private
> >(192.168.2.x). I tried adding the route, and that didn't fix it. I've
> >also tried disabling the private NIC and running with only the public
> >NIC, and that didn't fix it either. One other interesting thing I
> >noticed: when I run cman_tool join on nodeA, netstat shows ccsd doing
> >this:
> >
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:739   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:738   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:737   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:736   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:743   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:742   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:741   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:740   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:727   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:731   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:730   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:729   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:728   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:735   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:734   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:733   TIME_WAIT   -
> >tcp   0   0 127.0.0.1:50006   127.0.0.1:732   TIME_WAIT   -
> >
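That netstat output can be checked mechanically. A small sketch that filters for loopback-only ccsd connections (the two sample lines are copied from the output above; on a live system you would pipe real `netstat -tn` output through the same awk filter):

```shell
# Count connections where both the local and the peer address are on
# loopback, which is the symptom of the name-resolution problem above.
printf '%s\n' \
  'tcp 0 0 127.0.0.1:50006 127.0.0.1:739 TIME_WAIT -' \
  'tcp 0 0 127.0.0.1:50006 127.0.0.1:738 TIME_WAIT -' |
awk '$4 ~ /^127\.0\.0\.1:/ && $5 ~ /^127\.0\.0\.1:/ {n++}
     END {print n " loopback-only connections"}'
```

If every ccsd connection shows 127.0.0.1 on both sides, ccsd never leaves the box, which points back at /etc/hosts rather than at the network.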
>
> Looking back at your cluster.conf, I see you're using broadcast. I used
> multicast because, in the first CVS checkout I did, broadcast didn't
> work properly. It's possible your SRPMs have the same flaw. Why not
> try multicast and see if that works? Add the route I mentioned; here's
> my cluster.conf, which you can crib:
>
> <?xml version="1.0"?>
> <cluster name="test" config_version="1">
>
>   <cman two-node="1" expected_votes="1">
>     <multicast addr="224.0.0.1"/>
>   </cman>
>
>   <nodes>
>     <node name="gfs-01-001" votes="1">
>       <multicast addr="224.0.0.1" interface="eth1"/>
>       <fence>
>         <method name="single">
>           <device name="human" ipaddr="gfs-01-001"/>
>         </method>
>       </fence>
>     </node>
>
>     <node name="gfs-01-002" votes="1">
>       <multicast addr="224.0.0.1" interface="eth1"/>
>       <fence>
>         <method name="single">
>           <device name="human" ipaddr="gfs-01-002"/>
>         </method>
>       </fence>
>     </node>
>   </nodes>
>
>   <fence_devices>
>     <device name="human" agent="fence_manual"/>
>   </fence_devices>
> </cluster>
>
> ----------------------------------------------------------------------
> - Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com -
> - VitalStream, Inc. http://www.vitalstream.com -
> - -
> - What's small, yellow and very, VERY dangerous? The root canary! -
> ----------------------------------------------------------------------
>
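One more sanity check before copying a cluster.conf like the one above: ccsd wants well-formed XML. A minimal sketch, assuming python3 is available purely as a convenient stdlib XML parser (the file path and the trimmed-down config are examples, not the full file):

```shell
# Write a trimmed example config and confirm it parses as XML.
cat > /tmp/cluster.conf.example <<'EOF'
<?xml version="1.0"?>
<cluster name="test" config_version="1">
  <cman two-node="1" expected_votes="1">
    <multicast addr="224.0.0.1"/>
  </cman>
</cluster>
EOF
python3 -c 'import xml.dom.minidom; xml.dom.minidom.parse("/tmp/cluster.conf.example"); print("well-formed")'
```

A parse error here (a stray angle bracket, an unclosed tag) is cheaper to find this way than by watching cman_tool join fail.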
--
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<> Brynnen Owen ( this space for rent )<>
<> owen at uiuc.edu ( )<>
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>