[Linux-cluster] Cman hang

Thu Aug 13 15:23:54 UTC 2009

On Thu, Aug 13, 2009 at 5:13 PM, NTOUGHE GUY-SERGE <ntoughe at hotmail.com> wrote:

>  Hi, this is my cluster.conf
> <?xml version="1.0"?>
> <cluster alias="arevclust" config_version="21" name="arevclust">
>   <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>   <clusternodes>
>   <clusternode name="host1" nodeid="1" votes="1">
>   <fence>
>   <method name="2">
>

>   <device name=""/>
>
You should configure a valid fencing method; the device name here is empty.
If you don't have a fence device yet, use fence_manual until you get one.

>
>   </method>
>   </fence>
>   <multicast addr="" interface=""/>
>
I am not sure, but I think you should remove this <multicast ....> tag; its
addr and interface attributes are empty.

Greetings,
Juanra

>
>   </clusternode>
>

>   <clusternode name="host2" nodeid="2" votes="1">
>   <fence>
>   <method name="1">
>   <device name=""/>
>   </method>
>   <method name=""/>
>   </fence>
>   <multicast addr="" interface=""/>
>   </clusternode>
>   </clusternodes>
>   <cman expected_votes="" two_node="">
>   <multicast addr=""/>
>   </cman>
>   <fencedevices>
>   <fencedevice agent="fence_brocade" ipaddr="" login="" name="" passwd=""/>
>   </fencedevices>
>   <rm>
>   <failoverdomains>
>   </failoverdomains>
>   <resources>
>   </resources>
>   </rm>
> </cluster>
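
Putting the two comments above together, here is a minimal sketch of how the file might look for the cman 2.0.98 packages on RHEL 5.3. It is not a tested configuration: the fence device name "manual" and the bumped config_version are my assumptions, and fence_manual is only a stopgap until real fencing hardware is configured.

```xml
<?xml version="1.0"?>
<!-- Sketch only: cluster and node names are taken from the file above;
     the fence device name "manual" is an assumption. -->
<cluster alias="arevclust" config_version="22" name="arevclust">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="host1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="manual" nodename="host1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="host2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="manual" nodename="host2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <!-- Two-node mode: two_node="1" goes together with expected_votes="1". -->
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="manual"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
```

With no <multicast> tags at all, cman should choose a default multicast address by itself; the empty addr="" attributes are probably what triggers the "Multicast and node address families differ" error. Note also that the GFS2 lock table has the form clustername:fsname, so it has to start with arevclust, not arvclust.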
>
> Regards
>
>
>
>
> ntoughe at hotmail.com
>
>
>
>
> > From: linux-cluster-request at redhat.com
> > Subject: Linux-cluster Digest, Vol 64, Issue 16
> > To: linux-cluster at redhat.com
> > Date: Thu, 13 Aug 2009 11:02:36 -0400
> >
> >
> > Today's Topics:
> >
> > 1. Re: do I have a fence DRAC device? (ESGLinux)
> > 2. clusterservice stays in 'recovering' state (mark benschop)
> > 3. Re: Is there any backup heartbeat channel (Hakan VELIOGLU)
> > 4. Re: Is there any backup heartbeat channel
> > (Juan Ramon Martin Blanco)
> > 5. RHCS on KVM (Nehemias Jahcob)
> > 6. Cman hang (NTOUGHE GUY-SERGE)
> > 7. Re: gfs2 mount hangs (David Teigland)
> > 8. Re: Qdisk question (Lon Hohberger)
> > 9. Re: Cman hang (Juan Ramon Martin Blanco)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Thu, 13 Aug 2009 13:27:16 +0200
> > From: ESGLinux <esggrupos at gmail.com>
> > Subject: Re: [Linux-cluster] do I have a fence DRAC device?
> > To: linux clustering <linux-cluster at redhat.com>
> > Message-ID:
> > <3128ba140908130427i6ab85406ye6da34073e6a6e97 at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi,
> > I couldn't reboot my system yet but I have installed the OpenManage
> > packages:
> >
> > srvadmin-omacore-5.4.0-260
> > srvadmin-iws-5.4.0-260
> > srvadmin-syscheck-5.4.0-260
> > srvadmin-rac5-components-5.4.0-260
> > srvadmin-deng-5.4.0-260
> > srvadmin-ipmi-5.4.0-260.DUP
> > srvadmin-racadm5-5.4.0-260
> > srvadmin-omauth-5.4.0-260.rhel5
> > srvadmin-hapi-5.4.0-260
> > srvadmin-cm-5.4.0-260
> > srvadmin-racdrsc5-5.4.0-260
> > srvadmin-omilcore-5.4.0-260
> > srvadmin-isvc-5.4.0-260
> > srvadmin-storage-5.4.0-260
> > srvadmin-jre-5.4.0-260
> > srvadmin-omhip-5.4.0-260
> >
> > Now I have the racadm command, but when I try to execute it I get this:
> >
> > racadm config -g cfgSerial -o cfgSerialTelnetEnable 1
> > ERROR: RACADM is unable to process the requested subcommand because there is no
> > local RAC configuration to communicate with.
> >
> > Local RACADM subcommand execution requires the following:
> >
> > 1. A Remote Access Controller (RAC) must be present on the managed server
> > 2. Appropriate managed node software must be installed and running on the
> > server
> >
> >
> > What do I need to install or start? Or can't I get this to work until I
> > configure the BIOS?
> >
> > Greetings
> >
> > ESG
> >
> >
> > 2009/8/11 <bergman at merctech.com>
> >
> > >
> > >
> > > In the message dated: Tue, 11 Aug 2009 14:14:03 +0200,
> > > The pithy ruminations from Juan Ramon Martin Blanco on
> > > <Re: [Linux-cluster] do I have a fence DRAC device?> were:
> > > => On Tue, Aug 11, 2009 at 2:03 PM, ESGLinux <esggrupos at gmail.com> wrote:
> > > =>
> > > => > Thanks
> > > => > I'll check it when I can reboot the server.
> > > => >
> > > => > greetings,
> > > => >
> > > => You have an IPMI BMC on the first network interface; it can be configured
> > > => at boot time (I don't remember whether from inside the BIOS or by pressing
> > > => Ctrl+something during boot).
> > > =>
> > >
> > > Based on my notes, here's how I configured the DRAC interface on a Dell
> > > 1950 for use as a fence device:
> > >
> > > Configuring the card from Linux depends on the installation of Dell's
> > > OMSA package. Once that's installed, use the following commands:
> > >
> > > racadm config -g cfgSerial -o cfgSerialTelnetEnable 1
> > > racadm config -g cfgLanNetworking -o cfgDNSRacName HOSTNAME_FOR_INTERFACE
> > > racadm config -g cfgLanNetworking -o cfgDNSDomainName DOMAINNAME_FOR_INTERFACE
> > > racadm config -g cfgUserAdmin -o cfgUserAdminPassword -i 2 PASSWORD
> > > racadm config -g cfgLanNetworking -o cfgNicEnable 1
> > > racadm config -g cfgLanNetworking -o cfgNicIpAddress WWW.XXX.YYY.ZZZ
> > > racadm config -g cfgLanNetworking -o cfgNicNetmask WWW.XXX.YYY.ZZZ
> > > racadm config -g cfgLanNetworking -o cfgNicGateway WWW.XXX.YYY.ZZZ
> > > racadm config -g cfgLanNetworking -o cfgNicUseDhcp 0
> > >
> > >
> > > I also save a backup of the configuration with:
> > >
> > > racadm getconfig -f ~/drac_config
> > >
> > >
> > > Hope this helps,
> > >
> > > Mark
> > >
> > > ----
> > > Mark Bergman voice: 215-662-7310
> > > mark.bergman at uphs.upenn.edu fax: 215-614-0266
> > > System Administrator Section of Biomedical Image Analysis
> > > Department of Radiology University of Pennsylvania
> > > PGP Key: https://www.rad.upenn.edu/sbia/bergman
> > >
> > >
> > > => Greetings,
> > > => Juanra
> > > =>
> > > => >
> > > => > ESG
> > > => >
> > > => > 2009/8/10 Paras pradhan <pradhanparas at gmail.com>
> > > => >
> > > => > On Mon, Aug 10, 2009 at 5:24 AM, ESGLinux <esggrupos at gmail.com> wrote:
> > > => >> > Hi all,
> > > => >> > I was designing a 2 node cluster and I was going to use 2 DELL
> > > => >> > PowerEdge 1950 servers. I was going to buy a DRAC card to use for
> > > => >> > fencing, but running several commands on the servers I have noticed
> > > => >> > that when I run this command:
> > > => >> > #ipmitool lan print
> > > => >> > Set in Progress : Set Complete
> > > => >> > Auth Type Support : NONE MD2 MD5 PASSWORD
> > > => >> > Auth Type Enable : Callback : MD2 MD5
> > > => >> > : User : MD2 MD5
> > > => >> > : Operator : MD2 MD5
> > > => >> > : Admin : MD2 MD5
> > > => >> > : OEM : MD2 MD5
> > > => >> > IP Address Source : Static Address
> > > => >> > IP Address : 0.0.0.0
> > > => >> > Subnet Mask : 0.0.0.0
> > > => >> > MAC Address : 00:1e:c9:ae:6f:7e
> > > => >> > SNMP Community String : public
> > > => >> > IP Header : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
> > > => >> > Default Gateway IP : 0.0.0.0
> > > => >> > Default Gateway MAC : 00:00:00:00:00:00
> > > => >> > Backup Gateway IP : 0.0.0.0
> > > => >> > Backup Gateway MAC : 00:00:00:00:00:00
> > > => >> > 802.1q VLAN ID : Disabled
> > > => >> > 802.1q VLAN Priority : 0
> > > => >> > RMCP+ Cipher Suites : 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
> > > => >> > Cipher Suite Priv Max : aaaaaaaaaaaaaaa
> > > => >> > : X=Cipher Suite Unused
> > > => >> > : c=CALLBACK
> > > => >> > : u=USER
> > > => >> > : o=OPERATOR
> > > => >> > : a=ADMIN
> > > => >> > : O=OEM
> > > => >> > Does this mean that I already have an IPMI card (not configured)
> > > => >> > that I can use for fencing? If the answer is yes, where the hell must
> > > => >> > I configure it? I don't see where I can do it.
> > > => >> > If I don't have a fencing device, which one do you recommend?
> > > => >> > Thanks in advance
> > > => >> > ESG
> > > => >> >
> > > => >> > --
> > > => >> > Linux-cluster mailing list
> > > => >> > Linux-cluster at redhat.com
> > > => >> > https://www.redhat.com/mailman/listinfo/linux-cluster
> > > => >> >
> > > => >>
> > > => >> Yes, you have IPMI, and if you are using a Dell 1950, DRAC should be
> > > => >> there too. You can see whether you have DRAC when the server starts,
> > > => >> before the OS loads.
> > > => >>
> > > => >> I have 1850s and I am using DRAC for fencing.
> > > => >>
> > > => >>
> > > => >> Paras.
> > > => >>
> > > => >>
> > > => >
> > > => >
> > >
> > >
> > >
> > >
> > -------------- next part --------------
> > An HTML attachment was scrubbed...
> > URL:
> https://www.redhat.com/archives/linux-cluster/attachments/20090813/a4558d27/attachment.html
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Thu, 13 Aug 2009 14:45:13 +0200
> > From: mark benschop <mark.benschop.lists at gmail.com>
> > Subject: [Linux-cluster] clusterservice stays in 'recovering' state
> > To: linux-cluster at redhat.com
> > Message-ID:
> > <f97c3a70908130545n11ce442ej17d74c9cdc450e45 at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi All,
> >
> > I have a problem with a cluster service. The service was started while one
> > of its resources, an NFS export, was not accessible.
> > Therefore the service never started properly but went into the 'recovering'
> > state.
> > In the meantime the NFS exports have been set up properly, but to no avail.
> > Stopping the cluster service with clusvcadm -d <service> results in
> > the service going down but staying in the 'recovering' state.
> > Starting it again doesn't work. The service doesn't start and stays in the
> > 'recovering' state.
> > I suspect rgmanager lost track of it somehow.
> >
> > Does anybody have any idea what the problem could be and how to resolve it?
> >
> > Thanks in advance,
> > Mark
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Thu, 13 Aug 2009 16:13:12 +0300
> > From: Hakan VELIOGLU <veliogluh at itu.edu.tr>
> > Subject: Re: [Linux-cluster] Is there any backup heartbeat channel
> > To: linux-cluster at redhat.com
> > Message-ID: <20090813161312.11546h2sp6psr814 at webmail.itu.edu.tr>
> > Content-Type: text/plain; charset=ISO-8859-9; DelSp="Yes";
> > format="flowed"
> >
> > Thanks for all the answers.
> >
> > I think there is really no backup heartbeat channel. Maybe the reason
> > is GFS. DLM works on the heartbeat channel. If you lose your heartbeat
> > you lose your lock consistency, so it is better to fence the other
> > node. For this reason I think that if you don't have enough network
> > interfaces on the server and switch, losing the heartbeat network may
> > shut down all the cluster members.
> >
> > Hakan VELİOĞLU
> >
> >
> > ----- Message from robejrm at gmail.com ---------
> > Date: Thu, 13 Aug 2009 10:42:11 +0200
> > From: Juan Ramon Martin Blanco <robejrm at gmail.com>
> > Reply-To: linux clustering <linux-cluster at redhat.com>
> > Subject: Re: [Linux-cluster] Is there any backup heartbeat channel
> > To: linux clustering <linux-cluster at redhat.com>
> >
> >
> > > 2009/8/13 Hakan VELIOGLU <veliogluh at itu.edu.tr>
> > >
> > >> ----- Message from raju.rajsand at gmail.com ---------
> > >> Date: Thu, 13 Aug 2009 08:57:15 +0530
> > >> From: Rajagopal Swaminathan <raju.rajsand at gmail.com>
> > >> Reply-To: linux clustering <linux-cluster at redhat.com>
> > >> Subject: Re: [Linux-cluster] Is there any backup heartbeat channel
> > >> To: linux clustering <linux-cluster at redhat.com>
> > >>
> > >>
> > >> Greetings,
> > >>>
> > >>> 2009/8/12 Hakan VELIOGLU <veliogluh at itu.edu.tr>:
> > >>>
> > >>>> Hi list,
> > >>>>
> > >>>> I am trying a two node cluster with RH 5.3 on Sun X4150 hardware. I use a
> > >>>>
> > >>>
> > >>> IIRC, Sun x4150 has four ethernet ports. Two can be used for outside
> > >>> networking and two can be bonded and used for heartbeat.
> > >>>
> > >> I think I didn't explain my networking well. I use two Ethernet ports
> > >> for the Xen VMs; these are trunked and bonded. That leaves two. Our
> > >> network topology (which is out of my control) provides one port for
> > >> server control (SSH).
> > >
> > > So you can't use a bonded port for both server management and cluster
> > > communications, can you? You can configure active-passive bonding and
> > > then have many virtual interfaces on top of that, e.g. bond0:0 and
> > > bond0:1, and assign them the IP addresses you need.
> > >
> > >
> > >> I use the other one with a crossover cable for heartbeat. So there is no
> > >> way to bond these two interfaces. Of course, if I buy an extra switch I
> > >> may do this.
> > >
> > > You can connect them to the same switch (though you lose some
> > > redundancy), or you can use two crossover cables and move the management
> > > IP to the same ports you are using for the VMs.
> > >
> > > Greetings,
> > > Juanra
> > >
> > >>
> > >> I don't really understand why there is no backup heartbeat channel. LVS
> > >> and MS cluster have this ability.
> > >>
> > >>>
> > >>> ALOM can be used for fencing and can be on a separate subnet if required.
> > >>>
> > >> I used this with fence_ipmilan.
> > >>
> > >>>
> > >>> Regards
> > >>>
> > >>> Rajagopal
> > >>>
> > >>>
> > >>>
> > >>
> > >> ----- End of message from raju.rajsand at gmail.com -----
> > >>
> > >>
> > >>
> > >>
> > >>
> > >
> >
> >
> > ----- End of message from robejrm at gmail.com -----
> >
> >
> >
> >
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Thu, 13 Aug 2009 15:29:43 +0200
> > From: Juan Ramon Martin Blanco <robejrm at gmail.com>
> > Subject: Re: [Linux-cluster] Is there any backup heartbeat channel
> > To: linux clustering <linux-cluster at redhat.com>
> > Message-ID:
> > <8a5668960908130629n6ec05a88n463a3b03da331dae at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-9"
> >
> > 2009/8/13 Hakan VELIOGLU <veliogluh at itu.edu.tr>
> >
> > > Thanks for all the answers.
> > >
> > > I think there is really no backup heartbeat channel. Maybe the reason is
> > > GFS. DLM works on the heartbeat channel. If you lose your heartbeat you
> > > lose your lock consistency, so it is better to fence the other node. For
> > > this reason I think that if you don't have enough network interfaces on
> > > the server and switch, losing the heartbeat network may shut down all the
> > > cluster members.
> > >
> > There is no backup heartbeat channel because you should do the backup at
> > the operating system level, i.e. with bonding.
> > That's why you should use a bonded interface with at least 2 Ethernet
> > slaves for the heartbeat channel; going further (for better redundancy),
> > each of the slaves should be on a different network card, and you should
> > connect each slave to a different switch.
> > But what I am trying to explain is that you can also use that bonded logical
> > interface for things other than heartbeat. ;)
> >
> > Greetings,
> > Juanra
> >
> >
> >
> > ------------------------------
> >
> > Message: 5
> > Date: Thu, 13 Aug 2009 10:07:13 -0400
> > From: Nehemias Jahcob <nehemiasjahcob at gmail.com>
> > Subject: [Linux-cluster] RHCS on KVM
> > To: linux clustering <linux-cluster at redhat.com>
> > Message-ID:
> > <5f61ab380908130707q5c936504k7351d0d6b3459090 at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi.
> >
> > How do I create a 2-node cluster on RHEL 5.4 (or Fedora 10) with KVM?
> >
> > With Xen, I follow this guide:
> > http://sources.redhat.com/cluster/wiki/VMClusterCookbook?highlight=(CategoryHowTo)
> >
> > Do you have a guide to implementing RHCS on KVM?
> >
> > Thank you all.
> > NJ
> >
> > ------------------------------
> >
> > Message: 6
> > Date: Thu, 13 Aug 2009 14:16:47 +0000
> > From: NTOUGHE GUY-SERGE <ntoughe at hotmail.com>
> > Subject: [Linux-cluster] Cman hang
> > To: <linux-cluster at redhat.com>
> > Message-ID: <BAY119-W410E2F250E8B461752CFC9A5050 at phx.gbl>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> >
> >
> > Hi gurus,
> >
> > I installed RHEL 5.3 on 2 servers that form a two-node cluster.
> > Kernel and cluster package versions:
> > kernel-headers-2.6.18-128.el5
> > kernel-devel-2.6.18-128.el5
> > kernel-2.6.18-128.el5
> > cman-devel-2.0.98-1.el5_3.1
> > cman-2.0.98-1.el5_3.1
> > cluster-cim-0.12.1-2.el5
> > lvm2-cluster-2.02.40-7.el5
> > cluster-snmp-0.12.1-2.el5
> > modcluster-0.12.1-2.el5
> > When I try to start cman, I get the following message:
> > cman not started: Multicast and node address families differ.
> > /usr/sbin/cman_tool: aisexec daemon didn't start
> > [FAILED]
> >
> > I tried to mount gfs2
> > and I got these messages:
> > # mount -t gfs2 /dev/VolGroup01/LogVol01 /appli/prod --o lockTablename=arvclust:/appli/prod, Lockproto=lock_dlm
> >
> > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> >
> > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> >
> > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> >
> > /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused
> >
> > Do you have any clues?
> > Please, it's urgent; I have spent a long time looking for a solution.
> > Regards
> >
> >
> >
> >
> >
> >
> >
> >
> > ntoughe at hotmail.com
> >
> >
> >
> > ------------------------------
> >
> > Message: 7
> > Date: Thu, 13 Aug 2009 09:14:24 -0500
> > From: David Teigland <teigland at redhat.com>
> > Subject: Re: [Linux-cluster] gfs2 mount hangs
> > To: Wengang Wang <wen.gang.wang at oracle.com>
> > Cc: linux clustering <linux-cluster at redhat.com>
> > Message-ID: <20090813141424.GA8148 at redhat.com>
> > Content-Type: text/plain; charset=us-ascii
> >
> > On Thu, Aug 13, 2009 at 02:22:11PM +0800, Wengang Wang wrote:
> > > <cman two_node="1" expected_votes="2"/>
> >
> > That's not a valid combination, two_node="1" requires expected_votes="1".
> >
> > You didn't mention which userspace cluster version/release you're using, or
> > include any status about the cluster. Before trying to mount gfs on either
> > node, collect from both nodes:
> >
> > cman_tool status
> > cman_tool nodes
> > group_tool
> >
> > Then mount on the first node and collect the same information, then try
> > mounting on the second node, collect the same information, and look for any
> > errors in /var/log/messages.
> >
> > Since you're using new kernels, you need to be using the cluster 3.0 userspace
> > code. You're using the old manual fencing config. There is no more
> > fence_manual; the new way to configure manual fencing is to not configure any
> > fencing at all. So, your cluster.conf should look like this:
> >
> > <?xml version="1.0"?>
> > <cluster name="testgfs2" config_version="1">
> > <cman two_node="1" expected_votes="1"/>
> > <clusternodes>
> > <clusternode name="cool" nodeid="1"/>
> > <clusternode name="desk" nodeid="2"/>
> > </clusternodes>
> > </cluster>
> >
> > Dave
> >
> >
> >
> > ------------------------------
> >
> > Message: 8
> > Date: Thu, 13 Aug 2009 10:39:46 -0400
> > From: Lon Hohberger <lhh at redhat.com>
> > Subject: Re: [Linux-cluster] Qdisk question
> > To: linux clustering <linux-cluster at redhat.com>
> > Message-ID: <1250174386.23376.1440.camel at localhost.localdomain>
> > Content-Type: text/plain
> >
> > On Thu, 2009-08-13 at 00:45 +0200, brem belguebli wrote:
> >
> > > My understanding of qdisk is that it is used as a tie-breaker, but it
> > > looks like it is more a heartbeat vector than a simple tie-breaker.
> >
> > Right, it's a secondary membership algorithm.
> >
> >
> > > Until here, no real problem indeed, if the site gets apart from the
> > > other prod site and also from the third site (hosting the iscsi target
> > > qdisk) the 2 nodes from the failing site get evicted from the cluster.
> > >
> > >
> > > But, what if my third site gets isolated while the 2 prod ones are
> > > fine ?
> >
> > Qdisk votes will not be presented to CMAN any more, but the two sites
> > should remain online if they still have a "majority" of votes.
> >
> >
> > > The real question is what happens in case all the nodes lose access
> > > to the qdisk while they're still able to see each other?
> >
> > Qdisk is just a vote like other voting mechanisms. If all nodes lose
> > access at the same time, it should behave like a node death. However,
> > the default action if _one_ node loses access is to kill that node (even
> > if CMAN still sees it).
> >
> >
> > > The 4 nodes have each 1 vote and the qdisk 1 vote. The expected quorum
> > > is 3.
> >
> >
> > > If I lose the qdisk, the number of votes falls to 4, the cluster is
> > > quorate (4>3) but it looks like everything goes bad: each node
> > > deactivates itself as it can't write its alive status (--> heartbeat
> > > vector) to the qdisk, even if the network heartbeating is working
> > > fine.
> >
> > What happens specifically? Most of the actions qdiskd performs are
> > configurable. For example, if the nodes are rebooting, you can turn
> > that behavior off.
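
As an illustration of the configurable behaviour Lon mentions, here is a hypothetical quorumd fragment for cluster.conf. The label, interval and tko values are placeholders of mine, and the reboot and allow_kill attributes should be checked against qdisk(5) for the installed version before use.

```xml
<!-- Hypothetical sketch: label, interval and tko are placeholder values. -->
<!-- reboot="0" asks qdiskd not to reboot the node on a negative score change; -->
<!-- allow_kill="0" asks it not to have CMAN kill nodes that stop writing to the disk. -->
<quorumd label="myqdisk" interval="1" tko="10" votes="1"
         reboot="0" allow_kill="0"/>
```

Turning these off trades safety for availability, so it only seems reasonable when the network heartbeat and fencing are trusted to handle real node failures.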
> >
> >
> >
> > I wrote a simple 'ping' tiebreaker based on the behaviors in RHEL3. It
> > functions in many ways in the same manner as qdiskd with respect to vote
> > advertisement to CMAN, but without needing a disk - maybe you would find
> > it useful?
> >
> > http://people.redhat.com/lhh/qnet.tar.gz
> >
> > -- Lon
> >
> >
> >
> > ------------------------------
> >
> > Message: 9
> > Date: Thu, 13 Aug 2009 17:02:15 +0200
> > From: Juan Ramon Martin Blanco <robejrm at gmail.com>
> > Subject: Re: [Linux-cluster] Cman hang
> > To: linux clustering <linux-cluster at redhat.com>
> > Message-ID:
> > <8a5668960908130802p4f5168cbueda86d1e6f1324bb at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > On Thu, Aug 13, 2009 at 4:16 PM, NTOUGHE GUY-SERGE <ntoughe at hotmail.com> wrote:
> >
> > >
> > > Hi gurus,
> > >
> > > I installed RHEL 5.3 on 2 servers that form a two-node cluster.
> > > Kernel and cluster package versions:
> > > kernel-headers-2.6.18-128.el5
> > > kernel-devel-2.6.18-128.el5
> > > kernel-2.6.18-128.el5
> > > cman-devel-2.0.98-1.el5_3.1
> > > cman-2.0.98-1.el5_3.1
> > > cluster-cim-0.12.1-2.el5
> > > lvm2-cluster-2.02.40-7.el5
> > > cluster-snmp-0.12.1-2.el5
> > > modcluster-0.12.1-2.el5
> > > When I try to start cman, I get the following message:
> > > cman not started: Multicast and node address families differ.
> > > /usr/sbin/cman_tool: aisexec daemon didn't start
> > > [FAILED]
> > >
> > Please, show us your cluster.conf file so we can help.
> >
> > Regards,
> > Juanra
> >
> >
> > ------------------------------
> >
> >
> > End of Linux-cluster Digest, Vol 64, Issue 16
> > *********************************************
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20090813/f9557411/attachment.htm>