From mgrac at redhat.com Mon Sep 2 14:33:39 2013
From: mgrac at redhat.com (Marek Grac)
Date: Mon, 02 Sep 2013 16:33:39 +0200
Subject: [Linux-cluster] fence-agents-4.0.3 stable release
Message-ID: <5224A1C3.4070401@redhat.com>

Welcome to the fence-agents 4.0.3 release. This release includes minor bug fixes:

* Fixed a login failure on IBM BladeCenter caused by the 'Last login: ' line
* Fixed the fencing_snmp library and its handling of long options (--a instead of --address)
* In fence_scsi, documentation for "delay" was added and an error in the XML metadata was fixed
* Users of fence_ilo were not able to log in if the password contained the character "
* The fence agent for Brocade switches was rewritten to use the fencing library, so it now has all the standard features (timeouts, SSH support, ...)

The new source tarball can be downloaded here:
https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-4.0.3.tar.xz

To report bugs or issues:
https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other system administrators and power users.

Thanks and congratulations to everyone who contributed to this milestone.

m,

From Micah.Schaefer at jhuapl.edu Thu Sep 5 15:24:17 2013
From: Micah.Schaefer at jhuapl.edu (Schaefer, Micah)
Date: Thu, 5 Sep 2013 11:24:17 -0400
Subject: [Linux-cluster] GFS2 File Permissions

Hello,
I am running a cluster with two nodes. Each node imports an iSCSI block device, and through clustered logical volume management they share several logical volumes formatted with GFS2.

I have attempted to synchronize the user IDs and group IDs between the two servers, to provide consistent access to the shared volumes.

Since I changed the entries in /etc/passwd and /etc/group on the second node, I have been receiving "permission denied" when accessing any of the shared files.

I have verified that the user ID and group ID of the files match the user account's IDs, and I am at a loss.

Is there something I am missing, and is there a better way of accomplishing this task?

Regards,
-------
Micah Schaefer
JHU/ APL

From swhiteho at redhat.com Thu Sep 5 15:32:12 2013
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Thu, 05 Sep 2013 16:32:12 +0100
Subject: [Linux-cluster] GFS2 File Permissions
Message-ID: <1378395132.2698.10.camel@menhir>

Hi,

Well, it should work in the absence of any other complicating factors (such as SELinux), and if the uid/gid are the same in both cases. Can you post an example with the full permissions? I assume that you are not using ACLs, just normal Unix permissions?

Steve.
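As a general note for readers hitting the same symptom: GFS2 stores only the numeric uid/gid on disk, so the name-to-number mapping in /etc/passwd and /etc/group has to agree on every node. A minimal comparison, run on each node (the account name, group name, and mount point below are placeholders), might look like:

    # Confirm the account resolves to the same numeric uid/gid on every node
    getent passwd <username>
    getent group <groupname>
    # Inspect ownership numerically rather than by name
    ls -ln /path/to/gfs2/mountpoint

If the numbers differ between nodes, files chowned on one node will show up under the "wrong" name, or as raw numbers, on the other.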
From Micah.Schaefer at jhuapl.edu Thu Sep 5 15:39:28 2013
From: Micah.Schaefer at jhuapl.edu (Schaefer, Micah)
Date: Thu, 5 Sep 2013 11:39:28 -0400
Subject: [Linux-cluster] GFS2 File Permissions
In-Reply-To: <1378395132.2698.10.camel@menhir>

Thanks for the fast response. SELinux is disabled, and I am not using any ACLs, just standard Unix permissions.

Does GFS2 care about or handle permissions at all?

I also tried mounting the volumes with the acl option, with no difference.

-bash-4.1$ sudo ls -alnd ~/
drwxrwx--- 21 500 500 3864 Sep 5 10:21 /itc/data/home/user/

-bash-4.1$ ls ~/
ls: cannot access /itc/data/home/user/: Permission denied

-bash-4.1$ id
uid=500(schaemj1) gid=500(user) groups=500(user),10(wheel),48(apache)

-bash-4.1$ sestatus
SELinux status: disabled

-bash-4.1$ sudo getfacl ~/
getfacl: Removing leading '/' from absolute path names
# file: itc/data/home/user/
# owner: user
# group: user
user::rwx
group::rwx
other::---

**** note: 'user' was substituted to sanitize the user name ****

-bash-4.1$ sudo mount | grep gfs2
/dev/mapper/vg_itc--stor1-lv_html on /itc/html type gfs2 (rw,relatime,hostdata=jid=0)
/dev/mapper/vg_itc--stor1-lv_db on /itc/db type gfs2 (rw,relatime,hostdata=jid=0)
/dev/mapper/vg_itc--stor1-lv_data on /itc/data type gfs2 (rw,relatime,hostdata=jid=0,acl)

From tmg at redhat.com Thu Sep 5 16:18:13 2013
From: tmg at redhat.com (Thom Gardner)
Date: Thu, 5 Sep 2013 12:18:13 -0400
Subject: [Linux-cluster] GFS2 File Permissions
Message-ID: <20130905161813.GA4036@Hungry.rdu.redhat.com>

On Thu, Sep 05, 2013 at 11:39:28AM -0400, Schaefer, Micah wrote:
> Does GFS2 care about or handle permissions at all?

Sure it does.

> -bash-4.1$ sudo ls -alnd ~/
> drwxrwx--- 21 500 500 3864 Sep 5 10:21 /itc/data/home/user/
>
> -bash-4.1$ id
> uid=500(schaemj1) gid=500(user) groups=500(user),10(wheel),48(apache)

Wow, that really should work. One more thing: check the permissions on the directories leading up to your $HOME (i.e. /itc/data/home, /itc/data, and /itc, maybe even /). If any of those block your UID, you won't be able to see /itc/data/home/user either.

/me pines for the olden days of Unix when this weren't so....

tg.

From Micah.Schaefer at jhuapl.edu Thu Sep 5 16:29:59 2013
From: Micah.Schaefer at jhuapl.edu (Schaefer, Micah)
Date: Thu, 5 Sep 2013 12:29:59 -0400
Subject: [Linux-cluster] GFS2 File Permissions
In-Reply-To: <20130905161813.GA4036@Hungry.rdu.redhat.com>

Permissions on the leading directories were the issue. I assigned group ownership to the leading directories and added the execute bit, and it is working.

Thanks for your help. I'm glad it was a simple (careless on my part) issue and not a bigger one.
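The fix described above comes down to making every directory component of the path traversable (the execute bit) for the group in question. A quick way to audit a path like the one in this thread, and an illustrative sketch of the kind of repair applied (the exact group and modes you want may differ):

    # namei walks the path and prints owner, group and mode of each component
    namei -l /itc/data/home/user
    # sketch of the repair: give the group ownership of the leading directories
    # and the traversal (execute) bit; adjust to your own policy
    chgrp user /itc /itc/data /itc/data/home
    chmod g+x /itc /itc/data /itc/data/home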
From shashikanth.komandoor at gmail.com Fri Sep 6 08:23:40 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 13:53:40 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

Hi Team,

I am trying to set up a RHEL 6 cluster with fencing, using hp_ilo as the fence device. I configured /etc/cluster/cluster.conf with the proper credentials.

However, I am unable to fence node1. When I run the "fence_ilo" command manually from the second node, it reports "Unable to connect/login to the fence device". Please suggest how I should proceed so that I can get the RHEL cluster working.

Thanks in advance.

--
Thanks & Regards,
Shashi Kanth.K
9052671936

From emi2fast at gmail.com Fri Sep 6 09:10:40 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 11:10:40 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

iLO version? cluster.conf?

Without this kind of information nobody can help you.

From shashikanth.komandoor at gmail.com Fri Sep 6 12:15:26 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 17:45:26 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

Hi Emmanuel,

Thank you for your immediate response. I have attached my /etc/cluster/cluster.conf file to this mail.

I am working on "HP ProLiant BL685c G7" blades, and the iLO version is 1.28.

Please let me know if you need any more details. Thanks in advance.

(Attachment: cluster.conf, application/octet-stream, 1950 bytes)

From emi2fast at gmail.com Fri Sep 6 12:28:35 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 14:28:35 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

Hello,

Try:

    fence_ilo -a ilo_ipaddress -l username -p ilopassword -v -o status

and send me the output.

Thanks

From adel.benzarrouk at gmail.com Fri Sep 6 12:42:37 2013
From: adel.benzarrouk at gmail.com (Adel Ben Zarrouk)
Date: Fri, 6 Sep 2013 13:42:37 +0100
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

Hello,

The HP ProLiant BL685c G7 comes with iLO 3, so you must enable "lanplus" and use the "fence_ipmilan" agent instead.

--Adel
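In cluster.conf terms, this suggestion corresponds to a fencedevice entry with agent="fence_ipmilan" and the lanplus attribute enabled. A command-line sketch of the equivalent manual test (the address and credentials are placeholders, and option spellings vary a little between fence-agents releases, so confirm against fence_ipmilan -h or its man page):

    # IPMI-over-LAN status query against an iLO 3, using the lanplus interface
    fence_ipmilan -a <ilo_ipaddress> -l <username> -p <password> -P -o status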
From shashikanth.komandoor at gmail.com Fri Sep 6 12:56:08 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 18:26:08 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

I tried the same command, but I am getting the output "Unable to connect/login to fencing device".

From shashikanth.komandoor at gmail.com Fri Sep 6 12:59:23 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 18:29:23 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

I tried fence_ipmilan, and the output below is shown:

    ipmilan: Failed to connect after 20 seconds
    Chassis power = Unknown
    Failed

From emi2fast at gmail.com Fri Sep 6 13:02:22 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 15:02:22 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

If you tried the same command, are you sure your username and password are OK?

From emi2fast at gmail.com Fri Sep 6 13:05:09 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 15:05:09 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

Check the user privileges in your iLO console and your username and password; try to log in to the iLO over SSH (ssh username@<ilo_ip>).
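Since logging in to the iLO over SSH comes up here: on this generation of iLO the SSH session lands at the iLO's own command line rather than a Linux shell, and the server's power state can be read from there. A rough sketch of what that looks like (the prompt and exact wording vary by iLO firmware):

    ssh <username>@<ilo_ipaddress>
    # at the iLO CLI prompt (e.g. </>hpiLO->), query the blade's power state
    power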
From shashikanth.komandoor at gmail.com Fri Sep 6 13:23:42 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 18:53:42 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

The credentials are correct: I am able to log in to the iLO using SSH and to power the OS on and off.

From shashikanth.komandoor at gmail.com Fri Sep 6 13:24:52 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 18:54:52 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing

The credentials are correct: I am able to log in to the iLO using SSH and to power the OS on and off.

But what user privileges are you talking about? If it is about connectivity, ports 22 and 443 are open between the iLO IPs and the OS IPs.
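Worth noting at this point: fence_ipmilan does not go through the iLO's SSH or web interface at all. It speaks IPMI-over-LAN, which by convention uses UDP port 623, so if only TCP 22 and 443 are open between the hosts and the iLOs, a "Failed to connect after 20 seconds" timeout is exactly what one would expect. A direct test that bypasses the fence agent (ipmitool must be installed; the address and credentials are placeholders):

    # Talk straight to the iLO 3's management processor over the lanplus interface
    ipmitool -I lanplus -H <ilo_ipaddress> -U <username> -P <password> chassis power status

If this works, the problem lies in the fence agent arguments or cluster.conf; if it times out, open UDP 623 (and check that IPMI over LAN is enabled on the iLO) before going further.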
> > > On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura wrote: > >> check the user privileges in your ilo console or your username and >> password, try to login to your in ssh username at ipilo >> >> >> 2013/9/6 Shashikanth Komandoor >> >>> I tried with fence_ipmilan the below output is shown >>> *ipmilan: Failed to connect after 20 seconds* >>> *Chassis power = Unknown* >>> *Failed* >>> >>> >>> >>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>> adel.benzarrouk at gmail.com> wrote: >>> >>>> Hello, >>>> >>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must enable the >>>> "lanplus" and use the agent "fence_ipmilan" instead. >>>> >>>> --Adel >>>> >>>> >>>> >>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura wrote: >>>> >>>>> Hello >>>>> >>>>> Try with fence_ilo -a ilo_ipaddress -l username -p ilopassword -v -o >>>>> status and send me the output >>>>> >>>>> Thanks >>>>> >>>>> >>>>> 2013/9/6 Shashikanth Komandoor >>>>> >>>>>> Hi Emmanuel, >>>>>> >>>>>> Thank you for your immediate response. I have attached my >>>>>> /etc/cluster/cluster.conf file with the mail. >>>>>> >>>>>> And I am working over the *"HP Proliant BL685c G7" *blades >>>>>> and the ILO version is 1.28. >>>>>> >>>>>> Please let me know if you need any more details. Thanks >>>>>> in advance. >>>>>> >>>>>> >>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura wrote: >>>>>> >>>>>>> Ilo version? >>>>>>> cluster.conf? >>>>>>> >>>>>>> whith this kind of information nobody can't help you >>>>>>> >>>>>>> >>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>> >>>>>>>> Hi Team, >>>>>>>> >>>>>>>> I am trying to implement the RHEL 6 Cluster along with >>>>>>>> fencing using the fence device as hp_ilo. I configured the >>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>> >>>>>>>> But I am unable to fence the node1. When I am trying to >>>>>>>> run the command manually using "fence_ilo" form the second node, it is >>>>>>>> showing as "Unable to connect/login to the fence device". Please suggest me >>>>>>>> accordingly so that I can work over the RHEL cluster. >>>>>>>> >>>>>>>> Thanks in advance. 
>>>>>>>> >>>>>>>> -- >>>>>>>> Thanks & Regards, >>>>>>>> Shashi Kanth.K >>>>>>>> 9052671936 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Linux-cluster mailing list >>>>>>>> Linux-cluster at redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>> >>>>>>> -- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Thanks & Regards, >>>>>> Shashi Kanth.K >>>>>> 9052671936 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> Thanks & Regards, >>> Shashi Kanth.K >>> 9052671936 >>> >>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> >> -- >> esta es mi vida e me la vivo hasta que dios quiera >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > Thanks & Regards, > Shashi Kanth.K > 9052671936 > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Fri Sep 6 13:46:39 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Fri, 6 Sep 2013 15:46:39 +0200 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: in others words, your ilo user need to has the privileges of power down and power up the blade, if remember well fence_ipmilan calls ipmilan, you can use ipmilan command directly with -d options for debug 2013/9/6 emmanuel segura > > http://vinternals.wordpress.com/2009/08/22/create-users-with-the-hp-ilo-cli/ > > > 2013/9/6 Shashikanth Komandoor > >> The credentials are correct because I am able to login to the hpilo using >> ssh and able power on and power off the OS. >> >> But what user privileges are you talking about? If it is about >> connectivity permissions there are port 22 and 443 opened between ILO IPs >> and the OS IPs. >> >> >> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura wrote: >> >>> check the user privileges in your ilo console or your username and >>> password, try to login to your in ssh username at ipilo >>> >>> >>> 2013/9/6 Shashikanth Komandoor >>> >>>> I tried with fence_ipmilan the below output is shown >>>> *ipmilan: Failed to connect after 20 seconds* >>>> *Chassis power = Unknown* >>>> *Failed* >>>> >>>> >>>> >>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>> adel.benzarrouk at gmail.com> wrote: >>>> >>>>> Hello, >>>>> >>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must enable the >>>>> "lanplus" and use the agent "fence_ipmilan" instead. 
>>>>> >>>>> --Adel >>>>> >>>>> >>>>> >>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura wrote: >>>>> >>>>>> Hello >>>>>> >>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p ilopassword -v -o >>>>>> status and send me the output >>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>> 2013/9/6 Shashikanth Komandoor >>>>>> >>>>>>> Hi Emmanuel, >>>>>>> >>>>>>> Thank you for your immediate response. I have attached >>>>>>> my /etc/cluster/cluster.conf file with the mail. >>>>>>> >>>>>>> And I am working over the *"HP Proliant BL685c G7" *blades >>>>>>> and the ILO version is 1.28. >>>>>>> >>>>>>> Please let me know if you need any more details. Thanks >>>>>>> in advance. >>>>>>> >>>>>>> >>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura wrote: >>>>>>> >>>>>>>> Ilo version? >>>>>>>> cluster.conf? >>>>>>>> >>>>>>>> whith this kind of information nobody can't help you >>>>>>>> >>>>>>>> >>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>> >>>>>>>>> Hi Team, >>>>>>>>> >>>>>>>>> I am trying to implement the RHEL 6 Cluster along with >>>>>>>>> fencing using the fence device as hp_ilo. I configured the >>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>> >>>>>>>>> But I am unable to fence the node1. When I am trying to >>>>>>>>> run the command manually using "fence_ilo" form the second node, it is >>>>>>>>> showing as "Unable to connect/login to the fence device". Please suggest me >>>>>>>>> accordingly so that I can work over the RHEL cluster. >>>>>>>>> >>>>>>>>> Thanks in advance. >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thanks & Regards, >>>>>>>>> Shashi Kanth.K >>>>>>>>> 9052671936 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Linux-cluster mailing list >>>>>>>>> Linux-cluster at redhat.com >>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>> >>>>>>>> -- >>>>>>>> Linux-cluster mailing list >>>>>>>> Linux-cluster at redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Thanks & Regards, >>>>>>> Shashi Kanth.K >>>>>>> 9052671936 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> Thanks & Regards, >>>> Shashi Kanth.K >>>> 9052671936 >>>> >>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> esta es mi vida e me la vivo hasta que dios quiera >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> >> -- >> Thanks & Regards, >> Shashi Kanth.K >> 9052671936 >> >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > -- esta es mi vida e me la vivo hasta que dios quiera 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From shashikanth.komandoor at gmail.com Fri Sep 6 13:49:01 2013 From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor) Date: Fri, 6 Sep 2013 19:19:01 +0530 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: unable to access the above URL.. On Fri, Sep 6, 2013 at 7:02 PM, emmanuel segura wrote: > > http://vinternals.wordpress.com/2009/08/22/create-users-with-the-hp-ilo-cli/ > > > 2013/9/6 Shashikanth Komandoor > >> The credentials are correct because I am able to login to the hpilo using >> ssh and able power on and power off the OS. >> >> But what user privileges are you talking about? If it is about >> connectivity permissions there are port 22 and 443 opened between ILO IPs >> and the OS IPs. >> >> >> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura wrote: >> >>> check the user privileges in your ilo console or your username and >>> password, try to login to your in ssh username at ipilo >>> >>> >>> 2013/9/6 Shashikanth Komandoor >>> >>>> I tried with fence_ipmilan the below output is shown >>>> *ipmilan: Failed to connect after 20 seconds* >>>> *Chassis power = Unknown* >>>> *Failed* >>>> >>>> >>>> >>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>> adel.benzarrouk at gmail.com> wrote: >>>> >>>>> Hello, >>>>> >>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must enable the >>>>> "lanplus" and use the agent "fence_ipmilan" instead. >>>>> >>>>> --Adel >>>>> >>>>> >>>>> >>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura wrote: >>>>> >>>>>> Hello >>>>>> >>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p ilopassword -v -o >>>>>> status and send me the output >>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>> 2013/9/6 Shashikanth Komandoor >>>>>> >>>>>>> Hi Emmanuel, >>>>>>> >>>>>>> Thank you for your immediate response. I have attached >>>>>>> my /etc/cluster/cluster.conf file with the mail. >>>>>>> >>>>>>> And I am working over the *"HP Proliant BL685c G7" *blades >>>>>>> and the ILO version is 1.28. >>>>>>> >>>>>>> Please let me know if you need any more details. Thanks >>>>>>> in advance. >>>>>>> >>>>>>> >>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura wrote: >>>>>>> >>>>>>>> Ilo version? >>>>>>>> cluster.conf? >>>>>>>> >>>>>>>> whith this kind of information nobody can't help you >>>>>>>> >>>>>>>> >>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>> >>>>>>>>> Hi Team, >>>>>>>>> >>>>>>>>> I am trying to implement the RHEL 6 Cluster along with >>>>>>>>> fencing using the fence device as hp_ilo. I configured the >>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>> >>>>>>>>> But I am unable to fence the node1. When I am trying to >>>>>>>>> run the command manually using "fence_ilo" form the second node, it is >>>>>>>>> showing as "Unable to connect/login to the fence device". Please suggest me >>>>>>>>> accordingly so that I can work over the RHEL cluster. >>>>>>>>> >>>>>>>>> Thanks in advance. 
>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thanks & Regards, >>>>>>>>> Shashi Kanth.K >>>>>>>>> 9052671936 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Linux-cluster mailing list >>>>>>>>> Linux-cluster at redhat.com >>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>> >>>>>>>> -- >>>>>>>> Linux-cluster mailing list >>>>>>>> Linux-cluster at redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Thanks & Regards, >>>>>>> Shashi Kanth.K >>>>>>> 9052671936 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> Thanks & Regards, >>>> Shashi Kanth.K >>>> 9052671936 >>>> >>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> esta es mi vida e me la vivo hasta que dios quiera >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> >> -- >> Thanks & Regards, >> Shashi Kanth.K >> 9052671936 >> >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Thanks & Regards, Shashi Kanth.K 9052671936 -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Fri Sep 6 14:06:22 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Fri, 6 Sep 2013 16:06:22 +0200 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: Did you tried ipmilan with -d options, if you have nmap scan the ilo port, if i remember you a remote port on ilo, but i don't remember wich port 2013/9/6 Shashikanth Komandoor > unable to access the above URL.. > > > On Fri, Sep 6, 2013 at 7:02 PM, emmanuel segura wrote: > >> >> http://vinternals.wordpress.com/2009/08/22/create-users-with-the-hp-ilo-cli/ >> >> >> 2013/9/6 Shashikanth Komandoor >> >>> The credentials are correct because I am able to login to the hpilo >>> using ssh and able power on and power off the OS. >>> >>> But what user privileges are you talking about? If it is about >>> connectivity permissions there are port 22 and 443 opened between ILO IPs >>> and the OS IPs. 
>>> >>> >>> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura wrote: >>> >>>> check the user privileges in your ilo console or your username and >>>> password, try to login to your in ssh username at ipilo >>>> >>>> >>>> 2013/9/6 Shashikanth Komandoor >>>> >>>>> I tried with fence_ipmilan the below output is shown >>>>> *ipmilan: Failed to connect after 20 seconds* >>>>> *Chassis power = Unknown* >>>>> *Failed* >>>>> >>>>> >>>>> >>>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>>> adel.benzarrouk at gmail.com> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must enable the >>>>>> "lanplus" and use the agent "fence_ipmilan" instead. >>>>>> >>>>>> --Adel >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura wrote: >>>>>> >>>>>>> Hello >>>>>>> >>>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p ilopassword -v -o >>>>>>> status and send me the output >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> >>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>> >>>>>>>> Hi Emmanuel, >>>>>>>> >>>>>>>> Thank you for your immediate response. I have attached >>>>>>>> my /etc/cluster/cluster.conf file with the mail. >>>>>>>> >>>>>>>> And I am working over the *"HP Proliant BL685c G7" *blades >>>>>>>> and the ILO version is 1.28. >>>>>>>> >>>>>>>> Please let me know if you need any more details. Thanks >>>>>>>> in advance. >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura >>>>>>> > wrote: >>>>>>>> >>>>>>>>> Ilo version? >>>>>>>>> cluster.conf? >>>>>>>>> >>>>>>>>> whith this kind of information nobody can't help you >>>>>>>>> >>>>>>>>> >>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>> >>>>>>>>>> Hi Team, >>>>>>>>>> >>>>>>>>>> I am trying to implement the RHEL 6 Cluster along with >>>>>>>>>> fencing using the fence device as hp_ilo. I configured the >>>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>>> >>>>>>>>>> But I am unable to fence the node1. When I am trying to >>>>>>>>>> run the command manually using "fence_ilo" form the second node, it is >>>>>>>>>> showing as "Unable to connect/login to the fence device". Please suggest me >>>>>>>>>> accordingly so that I can work over the RHEL cluster. >>>>>>>>>> >>>>>>>>>> Thanks in advance. 
>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thanks & Regards, >>>>>>>>>> Shashi Kanth.K >>>>>>>>>> 9052671936 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Linux-cluster mailing list >>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Linux-cluster mailing list >>>>>>>>> Linux-cluster at redhat.com >>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Thanks & Regards, >>>>>>>> Shashi Kanth.K >>>>>>>> 9052671936 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Linux-cluster mailing list >>>>>>>> Linux-cluster at redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>> >>>>>>> -- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Thanks & Regards, >>>>> Shashi Kanth.K >>>>> 9052671936 >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> esta es mi vida e me la vivo hasta que dios quiera >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> Thanks & Regards, >>> Shashi Kanth.K >>> 9052671936 >>> >>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> >> -- >> esta es mi vida e me la vivo hasta que dios quiera >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > Thanks & Regards, > Shashi Kanth.K > 9052671936 > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From shashikanth.komandoor at gmail.com Fri Sep 6 14:19:06 2013 From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor) Date: Fri, 6 Sep 2013 19:49:06 +0530 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: I tried using ipmilan instead of fence_ipmilan, it showed *-bash: ipmilan: command not found* * * Then I tried fence_ipmilan command using the option -d but it showed *fence_ipmilan: invalid option -- 'd'* On Fri, Sep 6, 2013 at 7:36 PM, emmanuel segura wrote: > Did you tried ipmilan with -d options, if you have nmap scan the ilo port, > if i remember you a remote port on ilo, but i don't remember wich port > > > 2013/9/6 Shashikanth Komandoor > >> unable to access the above URL.. 
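For the record, the remote IPMI service that fence_ipmilan relies on listens on UDP port 623 (the standard RMCP/IPMI port), so the probe being suggested would look something like this (the address is a placeholder, and UDP scans need root):

    # Probe the IPMI port on the iLO
    sudo nmap -sU -p 623 <ilo_ipaddress>

UDP scanning is inherently fuzzy: "open" is definitive, "open|filtered" only means nothing answered, "closed" means the iLO answered but nothing is listening there, and "filtered" points at a firewall or ACL in the path.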
From emi2fast at gmail.com Fri Sep 6 14:30:33 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 16:30:33 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

The ipmilan binary belongs to the OpenIPMI package:

    rpm -qf /usr/bin/ipmilan
    OpenIPMI-2.0.16-11.el5_7.2

2013/9/6 Shashikanth Komandoor
> I tried running ipmilan instead of fence_ipmilan, and it showed
> "-bash: ipmilan: command not found".

--
esta es mi vida e me la vivo hasta que dios quiera
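[If the binary is missing, yum can locate and install the owning package. This is only a sketch, assuming a RHEL 6 / CentOS 6 host with its base repository enabled; the thread does not show these commands being run.]

    # Find which available package provides the ipmilan binary, then install it:
    yum whatprovides '*/ipmilan'
    yum install OpenIPMI

    # ipmitool is packaged separately and is handy for testing IPMI by hand:
    yum install ipmitool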
From shashikanth.komandoor at gmail.com Fri Sep 6 14:42:54 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 20:12:54 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

I installed the package and ran the ipmilan command with the -d option,
and it showed:

    Unable to open configuration file '/etc/ipmi_lan.conf'

On Fri, Sep 6, 2013 at 8:00 PM, emmanuel segura wrote:
> rpm -qf /usr/bin/ipmilan
> OpenIPMI-2.0.16-11.el5_7.2

--
Thanks & Regards,
Shashi Kanth.K
9052671936
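[The ipmilan program shipped in OpenIPMI is a standalone IPMI-over-LAN daemon with its own configuration file; it is not part of the fence agent and is not needed for fencing. For a manual test outside the cluster stack, ipmitool (if installed) can query the chassis power state over lanplus, which is essentially the same operation the fence agent performs for its status check. The address and credentials below are placeholders, not values from this thread.]

    # Query the power state directly over IPMI 2.0 (lanplus):
    ipmitool -I lanplus -H 10.0.0.10 -U Administrator -P PASSWORD chassis power status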
From emi2fast at gmail.com Fri Sep 6 14:56:55 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 16:56:55 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

man ipmilan

2013/9/6 Shashikanth Komandoor
> I installed the package and ran the ipmilan command with the -d option,
> and it showed "Unable to open configuration file '/etc/ipmi_lan.conf'".

--
esta es mi vida e me la vivo hasta que dios quiera
From adel.benzarrouk at gmail.com Fri Sep 6 15:01:56 2013
From: adel.benzarrouk at gmail.com (Adel Ben Zarrouk)
Date: Fri, 6 Sep 2013 16:01:56 +0100
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

Please test using the following command:

    fence_ipmilan -a IP -p PASSWORD -P -l Administrator -o reboot

On Fri, Sep 6, 2013 at 3:42 PM, Shashikanth Komandoor wrote:
> I installed the package and ran the ipmilan command with the -d option,
> and it showed "Unable to open configuration file '/etc/ipmi_lan.conf'".
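[Once a command like the one above works from both nodes, the same parameters have to end up in cluster.conf so that fenced can use them. The fragment below is only an illustrative guess at the relevant lines; the device name, address and credentials are placeholders and are not taken from the cluster.conf attached earlier in the thread.]

    # Illustrative only -- names, address and password are placeholders.
    # A fence device entry for an iLO 3 would use fence_ipmilan with
    # lanplus enabled, e.g.:
    #
    #   <fencedevice agent="fence_ipmilan" name="ilo_node1"
    #                ipaddr="10.0.0.10" login="Administrator"
    #                passwd="PASSWORD" lanplus="1"/>
    #
    # After editing cluster.conf (and bumping config_version), validate it
    # and propagate it to the other node:
    ccs_config_validate
    cman_tool version -r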
From shashikanth.komandoor at gmail.com Fri Sep 6 15:11:44 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 20:41:44 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

I ran it, but the following output came:

    Rebooting machine @ IPMI:...Failed

On Fri, Sep 6, 2013 at 8:31 PM, Adel Ben Zarrouk wrote:
> Please test using the following command:
>
>     fence_ipmilan -a IP -p PASSWORD -P -l Administrator -o reboot

--
Thanks & Regards,
Shashi Kanth.K
9052671936
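[A failure at this point usually means the iLO is reachable but rejects the IPMI session. On recent iLO firmware there is an "IPMI/DCMI over LAN" access setting that has to be enabled, and the iLO account needs power-control privileges; both are worth checking in the iLO administration pages. A hedged way to see where the exchange stops, again with a placeholder address and credentials:]

    # -o status is non-destructive and -v shows how far the IPMI handshake
    # gets (connect, authentication, or the power query itself):
    fence_ipmilan -a 10.0.0.10 -l Administrator -p PASSWORD -P -o status -v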
From shashikanth.komandoor at gmail.com Fri Sep 6 15:13:58 2013
From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor)
Date: Fri, 6 Sep 2013 20:43:58 +0530
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

I could not get anything useful from the man page. Can you suggest
anything? Do we need to create the /etc/ipmi_lan.conf file, and does it
need any entries? It was not created automatically.

On Fri, Sep 6, 2013 at 8:26 PM, emmanuel segura wrote:
> man ipmilan

--
Thanks & Regards,
Shashi Kanth.K
9052671936
From emi2fast at gmail.com Fri Sep 6 15:20:34 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Fri, 6 Sep 2013 17:20:34 +0200
Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing
In-Reply-To:
References:
Message-ID:

Try:

    fence_ipmilan -A password -a IP -p PASSWORD -P -l Administrator -o reboot -v

and see man fence_ipmilan. When a command does not work, read the man
page and try different options.

2013/9/6 Shashikanth Komandoor
> I ran it, but the following output came:
>
>     Rebooting machine @ IPMI:...Failed
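[The -A switch selects the IPMI authentication type; if one type is rejected, the others are worth trying before changing anything on the iLO side. A short sketch of that idea, again with placeholder address and credentials, and with -o status so nothing actually reboots:]

    # Cycle through the auth types the agent understands (none, password, md5);
    # -P keeps lanplus on, which iLO 3 requires, and -v shows where it fails.
    for auth in none password md5; do
        fence_ipmilan -a 10.0.0.10 -l Administrator -p PASSWORD -P -A "$auth" -o status -v
    done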
>>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura < >>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> check the user privileges in your ilo console or your username >>>>>>>>>> and password, try to login to your in ssh username at ipilo >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>>> >>>>>>>>>>> I tried with fence_ipmilan the below output is shown >>>>>>>>>>> *ipmilan: Failed to connect after 20 seconds* >>>>>>>>>>> *Chassis power = Unknown* >>>>>>>>>>> *Failed* >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>>>>>>>>> adel.benzarrouk at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must >>>>>>>>>>>> enable the "lanplus" and use the agent "fence_ipmilan" instead. >>>>>>>>>>>> >>>>>>>>>>>> --Adel >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura < >>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hello >>>>>>>>>>>>> >>>>>>>>>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p ilopassword >>>>>>>>>>>>> -v -o status and send me the output >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Emmanuel, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for your immediate response. I have >>>>>>>>>>>>>> attached my /etc/cluster/cluster.conf file with the mail. >>>>>>>>>>>>>> >>>>>>>>>>>>>> And I am working over the *"HP Proliant BL685c >>>>>>>>>>>>>> G7" *blades and the ILO version is 1.28. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please let me know if you need any more details. >>>>>>>>>>>>>> Thanks in advance. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura < >>>>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ilo version? >>>>>>>>>>>>>>> cluster.conf? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> whith this kind of information nobody can't help you >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Team, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am trying to implement the RHEL 6 Cluster along >>>>>>>>>>>>>>>> with fencing using the fence device as hp_ilo. I configured the >>>>>>>>>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> But I am unable to fence the node1. When I am >>>>>>>>>>>>>>>> trying to run the command manually using "fence_ilo" form the second node, >>>>>>>>>>>>>>>> it is showing as "Unable to connect/login to the fence device". Please >>>>>>>>>>>>>>>> suggest me accordingly so that I can work over the RHEL cluster. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks in advance. 
mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> esta es mi vida e me la vivo hasta que dios quiera >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> Thanks & Regards, >>> Shashi Kanth.K >>> 9052671936 >>> >>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > Thanks & Regards, > Shashi Kanth.K > 9052671936 > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From adel.benzarrouk at gmail.com Fri Sep 6 15:22:42 2013 From: adel.benzarrouk at gmail.com (Adel Ben Zarrouk) Date: Fri, 6 Sep 2013 16:22:42 +0100 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: Please make sure the following: -The user Administrator have the right to execute such command -The host can ping the ip @ of the ilo -You can ssh the ilo from the host -I suggest to create another ilo user with full access for testing purpose Thanks --Adel On Fri, Sep 6, 2013 at 4:13 PM, Shashikanth Komandoor < shashikanth.komandoor at gmail.com> wrote: > I could not get any point from man. Can you suggest anything? Do we need > to create that file? Do we need to provide any entries in that file? It has > not automatically created. > > > On Fri, Sep 6, 2013 at 8:26 PM, emmanuel segura wrote: > >> man ipmilan >> >> >> 2013/9/6 Shashikanth Komandoor >> >>> I installed the package and tried to run the command ipmilan using the >>> option -d it showed >>> *Unable to open configuration file '/etc/ipmi_lan.conf'* >>> >>> >>> On Fri, Sep 6, 2013 at 8:00 PM, emmanuel segura wrote: >>> >>>> rpm -qf /usr/bin/ipmilan >>>> OpenIPMI-2.0.16-11.el5_7.2 >>>> >>>> >>>> >>>> 2013/9/6 Shashikanth Komandoor >>>> >>>>> I tried using ipmilan instead of fence_ipmilan, it showed >>>>> >>>>> *-bash: ipmilan: command not found* >>>>> * >>>>> * >>>>> Then I tried fence_ipmilan command using the option -d but it showed >>>>> >>>>> *fence_ipmilan: invalid option -- 'd'* >>>>> >>>>> >>>>> On Fri, Sep 6, 2013 at 7:36 PM, emmanuel segura wrote: >>>>> >>>>>> Did you tried ipmilan with -d options, if you have nmap scan the ilo >>>>>> port, if i remember you a remote port on ilo, but i don't remember wich port >>>>>> >>>>>> >>>>>> 2013/9/6 Shashikanth Komandoor >>>>>> >>>>>>> unable to access the above URL.. >>>>>>> >>>>>>> >>>>>>> On Fri, Sep 6, 2013 at 7:02 PM, emmanuel segura wrote: >>>>>>> >>>>>>>> >>>>>>>> http://vinternals.wordpress.com/2009/08/22/create-users-with-the-hp-ilo-cli/ >>>>>>>> >>>>>>>> >>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>> >>>>>>>>> The credentials are correct because I am able to login to the >>>>>>>>> hpilo using ssh and able power on and power off the OS. >>>>>>>>> >>>>>>>>> But what user privileges are you talking about? If it is about >>>>>>>>> connectivity permissions there are port 22 and 443 opened between ILO IPs >>>>>>>>> and the OS IPs. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura < >>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> check the user privileges in your ilo console or your username >>>>>>>>>> and password, try to login to your in ssh username at ipilo >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>>> >>>>>>>>>>> I tried with fence_ipmilan the below output is shown >>>>>>>>>>> *ipmilan: Failed to connect after 20 seconds* >>>>>>>>>>> *Chassis power = Unknown* >>>>>>>>>>> *Failed* >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>>>>>>>>> adel.benzarrouk at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must >>>>>>>>>>>> enable the "lanplus" and use the agent "fence_ipmilan" instead. >>>>>>>>>>>> >>>>>>>>>>>> --Adel >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura < >>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hello >>>>>>>>>>>>> >>>>>>>>>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p ilopassword >>>>>>>>>>>>> -v -o status and send me the output >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Emmanuel, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for your immediate response. I have >>>>>>>>>>>>>> attached my /etc/cluster/cluster.conf file with the mail. >>>>>>>>>>>>>> >>>>>>>>>>>>>> And I am working over the *"HP Proliant BL685c >>>>>>>>>>>>>> G7" *blades and the ILO version is 1.28. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please let me know if you need any more details. >>>>>>>>>>>>>> Thanks in advance. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura < >>>>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ilo version? >>>>>>>>>>>>>>> cluster.conf? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> whith this kind of information nobody can't help you >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Team, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am trying to implement the RHEL 6 Cluster along >>>>>>>>>>>>>>>> with fencing using the fence device as hp_ilo. I configured the >>>>>>>>>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> But I am unable to fence the node1. When I am >>>>>>>>>>>>>>>> trying to run the command manually using "fence_ilo" form the second node, >>>>>>>>>>>>>>>> it is showing as "Unable to connect/login to the fence device". Please >>>>>>>>>>>>>>>> suggest me accordingly so that I can work over the RHEL cluster. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks in advance. 
mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> esta es mi vida e me la vivo hasta que dios quiera >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> Thanks & Regards, >>> Shashi Kanth.K >>> 9052671936 >>> >>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> >> -- >> esta es mi vida e me la vivo hasta que dios quiera >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > Thanks & Regards, > Shashi Kanth.K > 9052671936 > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shashikanth.komandoor at gmail.com Sat Sep 7 04:45:26 2013 From: shashikanth.komandoor at gmail.com (Shashikanth Komandoor) Date: Sat, 7 Sep 2013 10:15:26 +0530 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: --The user Administrator has the right to execute that command --The host can ping the ilo ip address --I am able to ssh to the ilo ip from the host and i can even power off the machine. Do you still ask me to create one more user? On Fri, Sep 6, 2013 at 8:52 PM, Adel Ben Zarrouk wrote: > Please make sure the following: > > -The user Administrator have the right to execute such command > -The host can ping the ip @ of the ilo > -You can ssh the ilo from the host > -I suggest to create another ilo user with full access for testing purpose > > Thanks > > --Adel > > > > On Fri, Sep 6, 2013 at 4:13 PM, Shashikanth Komandoor < > shashikanth.komandoor at gmail.com> wrote: > >> I could not get any point from man. Can you suggest anything? Do we need >> to create that file? Do we need to provide any entries in that file? It has >> not automatically created. >> >> >> On Fri, Sep 6, 2013 at 8:26 PM, emmanuel segura wrote: >> >>> man ipmilan >>> >>> >>> 2013/9/6 Shashikanth Komandoor >>> >>>> I installed the package and tried to run the command ipmilan using the >>>> option -d it showed >>>> *Unable to open configuration file '/etc/ipmi_lan.conf'* >>>> >>>> >>>> On Fri, Sep 6, 2013 at 8:00 PM, emmanuel segura wrote: >>>> >>>>> rpm -qf /usr/bin/ipmilan >>>>> OpenIPMI-2.0.16-11.el5_7.2 >>>>> >>>>> >>>>> >>>>> 2013/9/6 Shashikanth Komandoor >>>>> >>>>>> I tried using ipmilan instead of fence_ipmilan, it showed >>>>>> >>>>>> *-bash: ipmilan: command not found* >>>>>> * >>>>>> * >>>>>> Then I tried fence_ipmilan command using the option -d but it showed >>>>>> >>>>>> *fence_ipmilan: invalid option -- 'd'* >>>>>> >>>>>> >>>>>> On Fri, Sep 6, 2013 at 7:36 PM, emmanuel segura wrote: >>>>>> >>>>>>> Did you tried ipmilan with -d options, if you have nmap scan the ilo >>>>>>> port, if i remember you a remote port on ilo, but i don't remember wich port >>>>>>> >>>>>>> >>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>> >>>>>>>> unable to access the above URL.. 
>>>>>>>> >>>>>>>> >>>>>>>> On Fri, Sep 6, 2013 at 7:02 PM, emmanuel segura >>>>>>> > wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> http://vinternals.wordpress.com/2009/08/22/create-users-with-the-hp-ilo-cli/ >>>>>>>>> >>>>>>>>> >>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>> >>>>>>>>>> The credentials are correct because I am able to login to the >>>>>>>>>> hpilo using ssh and able power on and power off the OS. >>>>>>>>>> >>>>>>>>>> But what user privileges are you talking about? If it is about >>>>>>>>>> connectivity permissions there are port 22 and 443 opened between ILO IPs >>>>>>>>>> and the OS IPs. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura < >>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> check the user privileges in your ilo console or your username >>>>>>>>>>> and password, try to login to your in ssh username at ipilo >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>>>> >>>>>>>>>>>> I tried with fence_ipmilan the below output is shown >>>>>>>>>>>> *ipmilan: Failed to connect after 20 seconds* >>>>>>>>>>>> *Chassis power = Unknown* >>>>>>>>>>>> *Failed* >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>>>>>>>>>> adel.benzarrouk at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hello, >>>>>>>>>>>>> >>>>>>>>>>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must >>>>>>>>>>>>> enable the "lanplus" and use the agent "fence_ipmilan" >>>>>>>>>>>>> instead. >>>>>>>>>>>>> >>>>>>>>>>>>> --Adel >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura < >>>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hello >>>>>>>>>>>>>> >>>>>>>>>>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p >>>>>>>>>>>>>> ilopassword -v -o status and send me the output >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Emmanuel, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you for your immediate response. I have >>>>>>>>>>>>>>> attached my /etc/cluster/cluster.conf file with the mail. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And I am working over the *"HP Proliant BL685c >>>>>>>>>>>>>>> G7" *blades and the ILO version is 1.28. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please let me know if you need any more details. >>>>>>>>>>>>>>> Thanks in advance. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura < >>>>>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Ilo version? >>>>>>>>>>>>>>>> cluster.conf? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> whith this kind of information nobody can't help you >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Team, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am trying to implement the RHEL 6 Cluster along >>>>>>>>>>>>>>>>> with fencing using the fence device as hp_ilo. I configured the >>>>>>>>>>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> But I am unable to fence the node1. When I am >>>>>>>>>>>>>>>>> trying to run the command manually using "fence_ilo" form the second node, >>>>>>>>>>>>>>>>> it is showing as "Unable to connect/login to the fence device". 
Please >>>>>>>>>>>>>>>>> suggest me accordingly so that I can work over the RHEL cluster. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks in advance. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Thanks & Regards, >>>>>>>>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>>>>>>>> 9052671936 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Thanks & Regards, >>>>>>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>>>>>> 9052671936 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Thanks & Regards, >>>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>>> 9052671936 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thanks & Regards, >>>>>>>>>> Shashi Kanth.K >>>>>>>>>> 9052671936 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Linux-cluster mailing list >>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Linux-cluster mailing list >>>>>>>>> Linux-cluster at redhat.com >>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Thanks & Regards, >>>>>>>> Shashi Kanth.K >>>>>>>> 9052671936 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Linux-cluster mailing list >>>>>>>> Linux-cluster at redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>> >>>>>>> 
-- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Thanks & Regards, >>>>>> Shashi Kanth.K >>>>>> 9052671936 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> Thanks & Regards, >>>> Shashi Kanth.K >>>> 9052671936 >>>> >>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> esta es mi vida e me la vivo hasta que dios quiera >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> >> -- >> Thanks & Regards, >> Shashi Kanth.K >> 9052671936 >> >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Thanks & Regards, Shashi Kanth.K 9052671936 -------------- next part -------------- An HTML attachment was scrubbed... URL: From adel.benzarrouk at gmail.com Sat Sep 7 06:04:26 2013 From: adel.benzarrouk at gmail.com (Adel Ben Zarrouk) Date: Sat, 7 Sep 2013 07:04:26 +0100 Subject: [Linux-cluster] Issue regarding RHEL 6 Cluster Fencing In-Reply-To: References: Message-ID: Please attach the sos report to check with you the config and logs On Sep 7, 2013 5:53 AM, "Shashikanth Komandoor" < shashikanth.komandoor at gmail.com> wrote: > --The user Administrator has the right to execute that command > --The host can ping the ilo ip address > --I am able to ssh to the ilo ip from the host and i can even power off > the machine. > > Do you still ask me to create one more user? > > > On Fri, Sep 6, 2013 at 8:52 PM, Adel Ben Zarrouk < > adel.benzarrouk at gmail.com> wrote: > >> Please make sure the following: >> >> -The user Administrator have the right to execute such command >> -The host can ping the ip @ of the ilo >> -You can ssh the ilo from the host >> -I suggest to create another ilo user with full access for testing purpose >> >> Thanks >> >> --Adel >> >> >> >> On Fri, Sep 6, 2013 at 4:13 PM, Shashikanth Komandoor < >> shashikanth.komandoor at gmail.com> wrote: >> >>> I could not get any point from man. Can you suggest anything? Do we need >>> to create that file? Do we need to provide any entries in that file? It has >>> not automatically created. 
>>> >>> >>> On Fri, Sep 6, 2013 at 8:26 PM, emmanuel segura wrote: >>> >>>> man ipmilan >>>> >>>> >>>> 2013/9/6 Shashikanth Komandoor >>>> >>>>> I installed the package and tried to run the command ipmilan using the >>>>> option -d it showed >>>>> *Unable to open configuration file '/etc/ipmi_lan.conf'* >>>>> >>>>> >>>>> On Fri, Sep 6, 2013 at 8:00 PM, emmanuel segura wrote: >>>>> >>>>>> rpm -qf /usr/bin/ipmilan >>>>>> OpenIPMI-2.0.16-11.el5_7.2 >>>>>> >>>>>> >>>>>> >>>>>> 2013/9/6 Shashikanth Komandoor >>>>>> >>>>>>> I tried using ipmilan instead of fence_ipmilan, it showed >>>>>>> >>>>>>> *-bash: ipmilan: command not found* >>>>>>> * >>>>>>> * >>>>>>> Then I tried fence_ipmilan command using the option -d but it showed >>>>>>> >>>>>>> *fence_ipmilan: invalid option -- 'd'* >>>>>>> >>>>>>> >>>>>>> On Fri, Sep 6, 2013 at 7:36 PM, emmanuel segura wrote: >>>>>>> >>>>>>>> Did you tried ipmilan with -d options, if you have nmap scan the >>>>>>>> ilo port, if i remember you a remote port on ilo, but i don't remember wich >>>>>>>> port >>>>>>>> >>>>>>>> >>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>> >>>>>>>>> unable to access the above URL.. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Sep 6, 2013 at 7:02 PM, emmanuel segura < >>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://vinternals.wordpress.com/2009/08/22/create-users-with-the-hp-ilo-cli/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>>> >>>>>>>>>>> The credentials are correct because I am able to login to the >>>>>>>>>>> hpilo using ssh and able power on and power off the OS. >>>>>>>>>>> >>>>>>>>>>> But what user privileges are you talking about? If it is about >>>>>>>>>>> connectivity permissions there are port 22 and 443 opened between ILO IPs >>>>>>>>>>> and the OS IPs. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Sep 6, 2013 at 6:35 PM, emmanuel segura < >>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> check the user privileges in your ilo console or your username >>>>>>>>>>>> and password, try to login to your in ssh username at ipilo >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor >>>>>>>>>>> > >>>>>>>>>>>> >>>>>>>>>>>>> I tried with fence_ipmilan the below output is shown >>>>>>>>>>>>> *ipmilan: Failed to connect after 20 seconds* >>>>>>>>>>>>> *Chassis power = Unknown* >>>>>>>>>>>>> *Failed* >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Sep 6, 2013 at 6:12 PM, Adel Ben Zarrouk < >>>>>>>>>>>>> adel.benzarrouk at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>> >>>>>>>>>>>>>> The *HP Proliant BL685c G7 *comes with ILO3 ,so, you must >>>>>>>>>>>>>> enable the "lanplus" and use the agent "fence_ipmilan" >>>>>>>>>>>>>> instead. >>>>>>>>>>>>>> >>>>>>>>>>>>>> --Adel >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Sep 6, 2013 at 1:28 PM, emmanuel segura < >>>>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Try with fence_ilo -a ilo_ipaddress -l username -p >>>>>>>>>>>>>>> ilopassword -v -o status and send me the output >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Emmanuel, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for your immediate response. I have >>>>>>>>>>>>>>>> attached my /etc/cluster/cluster.conf file with the mail. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And I am working over the *"HP Proliant BL685c >>>>>>>>>>>>>>>> G7" *blades and the ILO version is 1.28. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let me know if you need any more >>>>>>>>>>>>>>>> details. Thanks in advance. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Sep 6, 2013 at 2:40 PM, emmanuel segura < >>>>>>>>>>>>>>>> emi2fast at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Ilo version? >>>>>>>>>>>>>>>>> cluster.conf? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> whith this kind of information nobody can't help you >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2013/9/6 Shashikanth Komandoor < >>>>>>>>>>>>>>>>> shashikanth.komandoor at gmail.com> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi Team, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am trying to implement the RHEL 6 Cluster >>>>>>>>>>>>>>>>>> along with fencing using the fence device as hp_ilo. I configured the >>>>>>>>>>>>>>>>>> /etc/cluster/cluster.conf properly with the proper credentials. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> But I am unable to fence the node1. When I am >>>>>>>>>>>>>>>>>> trying to run the command manually using "fence_ilo" form the second node, >>>>>>>>>>>>>>>>>> it is showing as "Unable to connect/login to the fence device". Please >>>>>>>>>>>>>>>>>> suggest me accordingly so that I can work over the RHEL cluster. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks in advance. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> Thanks & Regards, >>>>>>>>>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>>>>>>>>> 9052671936 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Thanks & Regards, >>>>>>>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>>>>>>> 9052671936 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Thanks & Regards, >>>>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>>>> 9052671936 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>>> 
https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thanks & Regards, >>>>>>>>>>> Shashi Kanth.K >>>>>>>>>>> 9052671936 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Linux-cluster mailing list >>>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Linux-cluster mailing list >>>>>>>>>> Linux-cluster at redhat.com >>>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thanks & Regards, >>>>>>>>> Shashi Kanth.K >>>>>>>>> 9052671936 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Linux-cluster mailing list >>>>>>>>> Linux-cluster at redhat.com >>>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>>>> >>>>>>>> -- >>>>>>>> Linux-cluster mailing list >>>>>>>> Linux-cluster at redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Thanks & Regards, >>>>>>> Shashi Kanth.K >>>>>>> 9052671936 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> esta es mi vida e me la vivo hasta que dios quiera >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Thanks & Regards, >>>>> Shashi Kanth.K >>>>> 9052671936 >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> >>>> -- >>>> esta es mi vida e me la vivo hasta que dios quiera >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> >>> -- >>> Thanks & Regards, >>> Shashi Kanth.K >>> 9052671936 >>> >>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > -- > Thanks & Regards, > Shashi Kanth.K > 9052671936 > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsosic at srce.hr Sun Sep 8 02:49:29 2013 From: jsosic at srce.hr (Jakov Sosic) Date: Sun, 08 Sep 2013 04:49:29 +0200 Subject: [Linux-cluster] BUG in PID dir creation??? Message-ID: <522BE5B9.5040706@srce.hr> Hi. 
I'm running CentOS 6.4 with resource-agents-3.9.2-21.el6_4.3.x86_64 I've noticed a problem that resource agents don't create pid directories correctly. Let's take samba for example. So, samba resource agent in start() function calls function 'create_pid_directory', whose body is in file: /usr/share/cluster/utils/config-utils.sh and relevant lines looks like this: declare RA_COMMON_pid_dir=/var/run/cluster create_pid_directory() { declare program_name="$(basename $0 | sed 's/^\(.*\)\..*/\1/')" declare dirname="$RA_COMMON_pid_dir/$program_name" if [ -d "$dirname" ]; then return 0; fi chmod 711 "$RA_COMMON_pid_dir" mkdir -p "$dirname" return 0; } So, if a samba.sh resource agent calls create_pid_directory(), dirname variable will have the value of '/var/run/cluster/samba', thus, that directory will be created. BUT, samba.sh resource agent also has the following line: declare SAMBA_pid_dir="`generate_name_for_pid_dir`" Looking at generate_name_for_pid_dir() function, it's obvious that the value of SAMBA_pid_dir is '/var/run/cluster/samba/samba:${name}'. For this snippet of cluster.conf: SAMBA_pid_dir is /var/run/cluster/samba/samba:ha_smb That means that service won't be able to put pid file in that directory because directory does not exit... create_pid_directory() creates only the parent directory, so the user has to create it manually. I think this is a huge bug, but am really mystified how did it survive for so long? The solution is pretty straightforward and simple: --- /usr/share/cluster/utils/config-utils.sh.orig 2013-09-08 04:48:47.000000000 +0200 +++ /usr/share/cluster/utils/config-utils.sh 2013-09-08 04:49:02.000000000 +0200 @@ -230,7 +230,7 @@ fi chmod 711 "$RA_COMMON_pid_dir" - mkdir -p "$dirname" + mkdir -p "$dirname/$OCF_RESOURCE_INSTANCE" if [ "$program_name" = "mysql" ]; then chown mysql.root "$dirname" Should I file a bug report ?! -- Jakov Sosic www.srce.unizg.hr From jsosic at srce.hr Sun Sep 8 02:52:41 2013 From: jsosic at srce.hr (Jakov Sosic) Date: Sun, 08 Sep 2013 04:52:41 +0200 Subject: [Linux-cluster] BUG in PID dir creation??? In-Reply-To: <522BE5B9.5040706@srce.hr> References: <522BE5B9.5040706@srce.hr> Message-ID: <522BE679.6060302@srce.hr> On 09/08/2013 04:49 AM, Jakov Sosic wrote: > Should I file a bug report ?! This is a fixed patch: --- /usr/share/cluster/utils/config-utils.sh.orig 2013-09-08 04:48:47.000000000 +0200 +++ /usr/share/cluster/utils/config-utils.sh 2013-09-08 04:51:31.000000000 +0200 @@ -225,12 +225,12 @@ declare program_name="$(basename $0 | sed 's/^\(.*\)\..*/\1/')" declare dirname="$RA_COMMON_pid_dir/$program_name" - if [ -d "$dirname" ]; then + if [ -d "$dirname/$OCF_RESOURCE_INSTANCE" ]; then return 0; fi chmod 711 "$RA_COMMON_pid_dir" - mkdir -p "$dirname" + mkdir -p "$dirname/$OCF_RESOURCE_INSTANCE" if [ "$program_name" = "mysql" ]; then chown mysql.root "$dirname" -- Jakov Sosic www.srce.unizg.hr From pascal at hacksrus.net Wed Sep 11 11:03:07 2013 From: pascal at hacksrus.net (Pascal Ehlert) Date: Wed, 11 Sep 2013 13:03:07 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot Message-ID: <52304DEB.9090806@hacksrus.net> Hi, I have recently setup an HA cluster with two nodes, IPMI based fencing and no quorum disk. Things worked nicely during the first tests, but to my very annoyance it blew up last night when I did another test of shutting down the network interface on my secondary node (node 2). The node was fenced as expected and came back online. 
This however resulted in an immediate fencing of the other node. Fencing went back and forth until I manually powered of node 2 and let node 1 a few minutes to settle down. Now when I switch node 2 back on, it looks like it joins the cluster and is kicked out immediately again, which again results in fencing of node 2. I have purposely set the post_join_delay to a high value, but it didn't help. Below are my cluster.conf and log files. My own guess would be that the problem is associated with the fact that the node tries to do a stateful merge, when it really should be joining without state after a clean reboot. (see fence_tool dump line 9). -------------- root at rmg-de-1:~# cat /etc/pve/cluster.conf -------------- -------------- root at rmg-de-1:~# fence_tool dump | tail -n 40 1378890849 daemon node 1 max 1.1.1.0 run 1.1.1.1 1378890849 daemon node 1 join 1378855487 left 0 local quorum 1378855487 1378890849 receive_start 1:12 len 152 1378890849 match_change 1:12 matches cg 12 1378890849 wait_messages cg 12 need 1 of 2 1378890850 receive_protocol from 2 max 1.1.1.0 run 1.1.1.1 1378890850 daemon node 2 max 0.0.0.0 run 0.0.0.0 1378890850 daemon node 2 join 1378890849 left 1378859110 local quorum 1378855487 1378890850 daemon node 2 stateful merge 1378890850 daemon node 2 kill due to stateful merge 1378890850 telling cman to remove nodeid 2 from cluster 1378890862 cluster node 2 removed seq 832 1378890862 fenced:daemon conf 1 0 1 memb 1 join left 2 1378890862 fenced:daemon ring 1:832 1 memb 1 1378890862 fenced:default conf 1 0 1 memb 1 join left 2 1378890862 add_change cg 13 remove nodeid 2 reason 3 1378890862 add_change cg 13 m 1 j 0 r 1 f 1 1378890862 add_victims node 2 1378890862 check_ringid cluster 832 cpg 1:828 1378890862 fenced:default ring 1:832 1 memb 1 1378890862 check_ringid done cluster 832 cpg 1:832 1378890862 check_quorum done 1378890862 send_start 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 1378890862 cpg_mcast_joined retried 1 start 1378890862 receive_start 1:13 len 152 1378890862 match_change 1:13 skip cg 12 already start 1378890862 match_change 1:13 matches cg 13 1378890862 wait_messages cg 13 got all 1 1378890862 set_master from 1 to complete node 1 1378890862 delay post_join_delay 360 quorate_from_last_update 0 1378891222 delay of 360s leaves 1 victims 1378891222 rmg-de-2 not a cluster member after 360 sec post_join_delay 1378891222 fencing node rmg-de-2 1378891236 fence rmg-de-2 dev 0.0 agent fence_ipmilan result: success 1378891236 fence rmg-de-2 success 1378891236 send_victim_done cg 13 flags 2 victim nodeid 2 1378891236 send_complete 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 1378891236 receive_victim_done 1:13 flags 2 len 80 1378891236 receive_victim_done 1:13 remove victim 2 time 1378891236 how 1 1378891236 receive_complete 1:13 len 152: -------------- -------------- root at rmg-de-1:~# tail -n 100 /var/log/cluster/corosync.log Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE Sep 11 11:14:09 corosync [CLM ] New Configuration: Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) Sep 11 11:14:09 corosync [CLM ] Members Left: Sep 11 11:14:09 corosync [CLM ] Members Joined: Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE Sep 11 11:14:09 corosync [CLM ] New Configuration: Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) Sep 11 11:14:09 corosync [CLM ] Members Left: Sep 11 11:14:09 corosync [CLM ] Members Joined: Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) Sep 11 11:14:09 corosync [TOTEM ] A processor 
joined or left the membership and a new membership was formed. Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 Sep 11 11:14:09 corosync [CPG ] chosen downlist: sender r(0) ip(10.xx.xx.1) ; members(old:1 left:0) Sep 11 11:14:09 corosync [MAIN ] Completed service synchronization, ready to provide service. Sep 11 11:14:20 corosync [TOTEM ] A processor failed, forming new configuration. Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE Sep 11 11:14:22 corosync [CLM ] New Configuration: Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) Sep 11 11:14:22 corosync [CLM ] Members Left: Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.2) Sep 11 11:14:22 corosync [CLM ] Members Joined: Sep 11 11:14:22 corosync [QUORUM] Members[1]: 1 Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE Sep 11 11:14:22 corosync [CLM ] New Configuration: Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) Sep 11 11:14:22 corosync [CLM ] Members Left: Sep 11 11:14:22 corosync [CLM ] Members Joined: Sep 11 11:14:22 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Sep 11 11:14:22 corosync [CPG ] chosen downlist: sender r(0) ip(10.xx.xx.1) ; members(old:2 left:1) Sep 11 11:14:22 corosync [MAIN ] Completed service synchronization, ready to provide service. -------------- -------------- root at rmg-de-1:~# dlm_tool ls dlm lockspaces name rgmanager id 0x5231f3eb flags 0x00000000 change member 1 joined 0 remove 1 failed 1 seq 12,13 members 1 -------------- Unfortunately I only have the output of the currently operational node, as the other one is fenced very quickly and the logs are hard to retrieve. If someone has an idea however, I'll do my best to provide these as well. Thanks, Pascal From tmg at redhat.com Wed Sep 11 12:06:30 2013 From: tmg at redhat.com (Thom Gardner) Date: Wed, 11 Sep 2013 08:06:30 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <52304DEB.9090806@hacksrus.net> References: <52304DEB.9090806@hacksrus.net> Message-ID: <20130911120630.GA3321@Hungry.redhat.com> Classic fence loop. Try this doc: https://access.redhat.com/site/solutions/272913 tg. On Wed, Sep 11, 2013 at 01:03:07PM +0200, Pascal Ehlert wrote: > Hi, > > I have recently setup an HA cluster with two nodes, IPMI based fencing > and no quorum disk. Things worked nicely during the first tests, but to my > very annoyance it blew up last night when I did another test of shutting > down the network interface on my secondary node (node 2). > > The node was fenced as expected and came back online. This however > resulted in an immediate fencing of the other node. > Fencing went back and forth until I manually powered of node 2 and let > node 1 a few minutes to settle down. > > Now when I switch node 2 back on, it looks like it joins the cluster and > is kicked out immediately again, which again results in fencing of node > 2. I have purposely set the post_join_delay to a high value, but it > didn't help. > > Below are my cluster.conf and log files. My own guess would be that the > problem is associated with the fact that the node tries to do a stateful > merge, when it really should be joining without state after a clean > reboot. (see fence_tool dump line 9). 
> > -------------- > root at rmg-de-1:~# cat /etc/pve/cluster.conf > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- > > -------------- > root at rmg-de-1:~# fence_tool dump | tail -n 40 > 1378890849 daemon node 1 max 1.1.1.0 run 1.1.1.1 > 1378890849 daemon node 1 join 1378855487 left 0 local quorum 1378855487 > 1378890849 receive_start 1:12 len 152 > 1378890849 match_change 1:12 matches cg 12 > 1378890849 wait_messages cg 12 need 1 of 2 > 1378890850 receive_protocol from 2 max 1.1.1.0 run 1.1.1.1 > 1378890850 daemon node 2 max 0.0.0.0 run 0.0.0.0 > 1378890850 daemon node 2 join 1378890849 left 1378859110 local quorum 1378855487 > 1378890850 daemon node 2 stateful merge > 1378890850 daemon node 2 kill due to stateful merge > 1378890850 telling cman to remove nodeid 2 from cluster > 1378890862 cluster node 2 removed seq 832 > 1378890862 fenced:daemon conf 1 0 1 memb 1 join left 2 > 1378890862 fenced:daemon ring 1:832 1 memb 1 > 1378890862 fenced:default conf 1 0 1 memb 1 join left 2 > 1378890862 add_change cg 13 remove nodeid 2 reason 3 > 1378890862 add_change cg 13 m 1 j 0 r 1 f 1 > 1378890862 add_victims node 2 > 1378890862 check_ringid cluster 832 cpg 1:828 > 1378890862 fenced:default ring 1:832 1 memb 1 > 1378890862 check_ringid done cluster 832 cpg 1:832 > 1378890862 check_quorum done > 1378890862 send_start 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 > 1378890862 cpg_mcast_joined retried 1 start > 1378890862 receive_start 1:13 len 152 > 1378890862 match_change 1:13 skip cg 12 already start > 1378890862 match_change 1:13 matches cg 13 > 1378890862 wait_messages cg 13 got all 1 > 1378890862 set_master from 1 to complete node 1 > 1378890862 delay post_join_delay 360 quorate_from_last_update 0 > 1378891222 delay of 360s leaves 1 victims > 1378891222 rmg-de-2 not a cluster member after 360 sec post_join_delay > 1378891222 fencing node rmg-de-2 > 1378891236 fence rmg-de-2 dev 0.0 agent fence_ipmilan result: success > 1378891236 fence rmg-de-2 success > 1378891236 send_victim_done cg 13 flags 2 victim nodeid 2 > 1378891236 send_complete 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 > 1378891236 receive_victim_done 1:13 flags 2 len 80 > 1378891236 receive_victim_done 1:13 remove victim 2 time 1378891236 how 1 > 1378891236 receive_complete 1:13 len 152: > -------------- > > -------------- > root at rmg-de-1:~# tail -n 100 /var/log/cluster/corosync.log > Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:09 corosync [CLM ] New Configuration: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:09 corosync [CLM ] Members Left: > Sep 11 11:14:09 corosync [CLM ] Members Joined: > Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:09 corosync [CLM ] New Configuration: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:09 corosync [CLM ] Members Left: > Sep 11 11:14:09 corosync [CLM ] Members Joined: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. > Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 > Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 > Sep 11 11:14:09 corosync [CPG ] chosen downlist: sender r(0) ip(10.xx.xx.1) ; members(old:1 left:0) > Sep 11 11:14:09 corosync [MAIN ] Completed service synchronization, ready to provide service. > Sep 11 11:14:20 corosync [TOTEM ] A processor failed, forming new configuration. 
> Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:22 corosync [CLM ] New Configuration: > Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:22 corosync [CLM ] Members Left: > Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:22 corosync [CLM ] Members Joined: > Sep 11 11:14:22 corosync [QUORUM] Members[1]: 1 > Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:22 corosync [CLM ] New Configuration: > Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:22 corosync [CLM ] Members Left: > Sep 11 11:14:22 corosync [CLM ] Members Joined: > Sep 11 11:14:22 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. > Sep 11 11:14:22 corosync [CPG ] chosen downlist: sender r(0) ip(10.xx.xx.1) ; members(old:2 left:1) > Sep 11 11:14:22 corosync [MAIN ] Completed service synchronization, ready to provide service. > -------------- > > -------------- > root at rmg-de-1:~# dlm_tool ls > dlm lockspaces > name rgmanager > id 0x5231f3eb > flags 0x00000000 > change member 1 joined 0 remove 1 failed 1 seq 12,13 > members 1 > -------------- > > Unfortunately I only have the output of the currently operational node, > as the other one is fenced very quickly and the logs are hard to > retrieve. If someone has an idea however, I'll do my best to provide > these as well. > > Thanks, > > Pascal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From pascal at hacksrus.net Wed Sep 11 12:34:03 2013 From: pascal at hacksrus.net (Pascal Ehlert) Date: Wed, 11 Sep 2013 14:34:03 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <20130911120630.GA3321@Hungry.redhat.com> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> Message-ID: <5230633B.6090507@hacksrus.net> > Classic fence loop. Try this doc: > https://access.redhat.com/site/solutions/272913 I don't have a Red Hat subscription (apologies if that is expected to participate in this list). My understanding was, that this scenario should only happen in the case that networking between the two nodes does not work properly. Would you mind explaining why it happens to me where the nodes can (and do) communicate with each other and the post_join_delay is very high? On Wed, Sep 11, 2013 at 01:03:07PM +0200, Pascal Ehlert wrote: >> Hi, >> >> I have recently setup an HA cluster with two nodes, IPMI based fencing >> and no quorum disk. Things worked nicely during the first tests, but to my >> very annoyance it blew up last night when I did another test of shutting >> down the network interface on my secondary node (node 2). >> >> The node was fenced as expected and came back online. This however >> resulted in an immediate fencing of the other node. >> Fencing went back and forth until I manually powered of node 2 and let >> node 1 a few minutes to settle down. >> >> Now when I switch node 2 back on, it looks like it joins the cluster and >> is kicked out immediately again, which again results in fencing of node >> 2. I have purposely set the post_join_delay to a high value, but it >> didn't help. >> >> Below are my cluster.conf and log files. My own guess would be that the >> problem is associated with the fact that the node tries to do a stateful >> merge, when it really should be joining without state after a clean >> reboot. (see fence_tool dump line 9). 
>> >> -------------- >> root at rmg-de-1:~# cat /etc/pve/cluster.conf >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> -------------- >> >> -------------- >> root at rmg-de-1:~# fence_tool dump | tail -n 40 >> 1378890849 daemon node 1 max 1.1.1.0 run 1.1.1.1 >> 1378890849 daemon node 1 join 1378855487 left 0 local quorum 1378855487 >> 1378890849 receive_start 1:12 len 152 >> 1378890849 match_change 1:12 matches cg 12 >> 1378890849 wait_messages cg 12 need 1 of 2 >> 1378890850 receive_protocol from 2 max 1.1.1.0 run 1.1.1.1 >> 1378890850 daemon node 2 max 0.0.0.0 run 0.0.0.0 >> 1378890850 daemon node 2 join 1378890849 left 1378859110 local quorum 1378855487 >> 1378890850 daemon node 2 stateful merge >> 1378890850 daemon node 2 kill due to stateful merge >> 1378890850 telling cman to remove nodeid 2 from cluster >> 1378890862 cluster node 2 removed seq 832 >> 1378890862 fenced:daemon conf 1 0 1 memb 1 join left 2 >> 1378890862 fenced:daemon ring 1:832 1 memb 1 >> 1378890862 fenced:default conf 1 0 1 memb 1 join left 2 >> 1378890862 add_change cg 13 remove nodeid 2 reason 3 >> 1378890862 add_change cg 13 m 1 j 0 r 1 f 1 >> 1378890862 add_victims node 2 >> 1378890862 check_ringid cluster 832 cpg 1:828 >> 1378890862 fenced:default ring 1:832 1 memb 1 >> 1378890862 check_ringid done cluster 832 cpg 1:832 >> 1378890862 check_quorum done >> 1378890862 send_start 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 >> 1378890862 cpg_mcast_joined retried 1 start >> 1378890862 receive_start 1:13 len 152 >> 1378890862 match_change 1:13 skip cg 12 already start >> 1378890862 match_change 1:13 matches cg 13 >> 1378890862 wait_messages cg 13 got all 1 >> 1378890862 set_master from 1 to complete node 1 >> 1378890862 delay post_join_delay 360 quorate_from_last_update 0 >> 1378891222 delay of 360s leaves 1 victims >> 1378891222 rmg-de-2 not a cluster member after 360 sec post_join_delay >> 1378891222 fencing node rmg-de-2 >> 1378891236 fence rmg-de-2 dev 0.0 agent fence_ipmilan result: success >> 1378891236 fence rmg-de-2 success >> 1378891236 send_victim_done cg 13 flags 2 victim nodeid 2 >> 1378891236 send_complete 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 >> 1378891236 receive_victim_done 1:13 flags 2 len 80 >> 1378891236 receive_victim_done 1:13 remove victim 2 time 1378891236 how 1 >> 1378891236 receive_complete 1:13 len 152: >> -------------- >> >> -------------- >> root at rmg-de-1:~# tail -n 100 /var/log/cluster/corosync.log >> Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE >> Sep 11 11:14:09 corosync [CLM ] New Configuration: >> Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) >> Sep 11 11:14:09 corosync [CLM ] Members Left: >> Sep 11 11:14:09 corosync [CLM ] Members Joined: >> Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE >> Sep 11 11:14:09 corosync [CLM ] New Configuration: >> Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) >> Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) >> Sep 11 11:14:09 corosync [CLM ] Members Left: >> Sep 11 11:14:09 corosync [CLM ] Members Joined: >> Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) >> Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. 
>> Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 >> Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 >> Sep 11 11:14:09 corosync [CPG ] chosen downlist: sender r(0) ip(10.xx.xx.1) ; members(old:1 left:0) >> Sep 11 11:14:09 corosync [MAIN ] Completed service synchronization, ready to provide service. >> Sep 11 11:14:20 corosync [TOTEM ] A processor failed, forming new configuration. >> Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE >> Sep 11 11:14:22 corosync [CLM ] New Configuration: >> Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) >> Sep 11 11:14:22 corosync [CLM ] Members Left: >> Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.2) >> Sep 11 11:14:22 corosync [CLM ] Members Joined: >> Sep 11 11:14:22 corosync [QUORUM] Members[1]: 1 >> Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE >> Sep 11 11:14:22 corosync [CLM ] New Configuration: >> Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) >> Sep 11 11:14:22 corosync [CLM ] Members Left: >> Sep 11 11:14:22 corosync [CLM ] Members Joined: >> Sep 11 11:14:22 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. >> Sep 11 11:14:22 corosync [CPG ] chosen downlist: sender r(0) ip(10.xx.xx.1) ; members(old:2 left:1) >> Sep 11 11:14:22 corosync [MAIN ] Completed service synchronization, ready to provide service. >> -------------- >> >> -------------- >> root at rmg-de-1:~# dlm_tool ls >> dlm lockspaces >> name rgmanager >> id 0x5231f3eb >> flags 0x00000000 >> change member 1 joined 0 remove 1 failed 1 seq 12,13 >> members 1 >> -------------- >> >> Unfortunately I only have the output of the currently operational node, >> as the other one is fenced very quickly and the logs are hard to >> retrieve. If someone has an idea however, I'll do my best to provide >> these as well. >> >> Thanks, >> >> Pascal >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster From lists at alteeve.ca Wed Sep 11 12:37:19 2013 From: lists at alteeve.ca (Digimer) Date: Wed, 11 Sep 2013 08:37:19 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <5230633B.6090507@hacksrus.net> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> <5230633B.6090507@hacksrus.net> Message-ID: <523063FF.5050908@alteeve.ca> On 11/09/13 08:34, Pascal Ehlert wrote: >> Classic fence loop. Try this doc: >> https://access.redhat.com/site/solutions/272913 > I don't have a Red Hat subscription (apologies if that is expected to > participate in this list). > > My understanding was, that this scenario should only happen in the case > that networking between the two nodes does not work properly. Would you > mind explaining why it happens to me where the nodes can (and do) > communicate with each other and the post_join_delay is very high? It's not required or expected. The problem is that, if you enable cman on boot, the fenced node will try to join the cluster, fail to reach it's peer after post_join_delay (default 6 seconds, iirc) and fence it's peer. That peer reboots, starts cman, tries to connect, fenced it's peer... The easiest way to avoid this in 2-node clusters is to not let cman/rgmanager start automatically. That way, if a node is fenced, it will boot back up and you can log into remotely (assuming it's not totally dead). When you know things are fixed, manually start cman. 
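As a concrete sketch of that approach (the commands assume a RHEL 6 style
init setup with the stock cman/rgmanager init scripts; on a Debian/Proxmox
node the equivalent would be update-rc.d, and service names may differ):

    # keep the cluster stack from starting at boot, on both nodes
    chkconfig cman off
    chkconfig rgmanager off

    # after investigating a fence event, rejoin the node by hand
    service cman start
    service rgmanager start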
hth -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From pascal at hacksrus.net Wed Sep 11 12:50:11 2013 From: pascal at hacksrus.net (Pascal Ehlert) Date: Wed, 11 Sep 2013 14:50:11 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <523063FF.5050908@alteeve.ca> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> <5230633B.6090507@hacksrus.net> <523063FF.5050908@alteeve.ca> Message-ID: <52306703.1030101@hacksrus.net> > The problem is that, if you enable cman on boot, the fenced node will > try to join the cluster, fail to reach it's peer after post_join_delay > (default 6 seconds, iirc) and fence it's peer. That peer reboots, > starts cman, tries to connect, fenced it's peer... > > The easiest way to avoid this in 2-node clusters is to not let > cman/rgmanager start automatically. That way, if a node is fenced, it > will boot back up and you can log into remotely (assuming it's not > totally dead). When you know things are fixed, manually start cman. > I my case however, the node which is trying to join is fully operational and has network access. Also if you look at the configuration that I had in my original email, my post_join_delay is 360 (for testing purposes), so there is no way that a timeout occurs. I might be wrong here, but judging from corosync's log file, the other node even joins the cluster successfully, before being marked for fencing by dlm_controld: Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE Sep 11 11:14:09 corosync [CLM ] New Configuration: Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) Sep 11 11:14:09 corosync [CLM ] Members Left: Sep 11 11:14:09 corosync [CLM ] Members Joined: Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Wed Sep 11 16:04:24 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Wed, 11 Sep 2013 18:04:24 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <52304DEB.9090806@hacksrus.net> References: <52304DEB.9090806@hacksrus.net> Message-ID: Hello Pascal For disable startup fencing you need clean_start=1 in the fence_daemon tag, i saw in your previous mail you are using expected_votes="1", with this setting every cluster node will be partitioned into two clusters and operate independently, i recommended using a quorim disk with master_wins parameter 2013/9/11 Pascal Ehlert > Hi, > > I have recently setup an HA cluster with two nodes, IPMI based fencing > and no quorum disk. Things worked nicely during the first tests, but to my > very annoyance it blew up last night when I did another test of shutting > down the network interface on my secondary node (node 2). > > The node was fenced as expected and came back online. This however > resulted in an immediate fencing of the other node. > Fencing went back and forth until I manually powered of node 2 and let > node 1 a few minutes to settle down. 
> > Now when I switch node 2 back on, it looks like it joins the cluster and > is kicked out immediately again, which again results in fencing of node > 2. I have purposely set the post_join_delay to a high value, but it > didn't help. > > Below are my cluster.conf and log files. My own guess would be that the > problem is associated with the fact that the node tries to do a stateful > merge, when it really should be joining without state after a clean > reboot. (see fence_tool dump line 9). > > -------------- > root at rmg-de-1:~# cat /etc/pve/cluster.conf > > > two_node="1"/> > > login="FENCING" name="fenceNode1" passwd="abc"/> > login="FENCING" name="fenceNode2" passwd="abc"/> > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- > > -------------- > root at rmg-de-1:~# fence_tool dump | tail -n 40 > 1378890849 daemon node 1 max 1.1.1.0 run 1.1.1.1 > 1378890849 daemon node 1 join 1378855487 left 0 local quorum 1378855487 > 1378890849 receive_start 1:12 len 152 > 1378890849 match_change 1:12 matches cg 12 > 1378890849 wait_messages cg 12 need 1 of 2 > 1378890850 receive_protocol from 2 max 1.1.1.0 run 1.1.1.1 > 1378890850 daemon node 2 max 0.0.0.0 run 0.0.0.0 > 1378890850 daemon node 2 join 1378890849 left 1378859110 local quorum > 1378855487 > 1378890850 daemon node 2 stateful merge > 1378890850 daemon node 2 kill due to stateful merge > 1378890850 telling cman to remove nodeid 2 from cluster > 1378890862 cluster node 2 removed seq 832 > 1378890862 fenced:daemon conf 1 0 1 memb 1 join left 2 > 1378890862 fenced:daemon ring 1:832 1 memb 1 > 1378890862 fenced:default conf 1 0 1 memb 1 join left 2 > 1378890862 add_change cg 13 remove nodeid 2 reason 3 > 1378890862 add_change cg 13 m 1 j 0 r 1 f 1 > 1378890862 add_victims node 2 > 1378890862 check_ringid cluster 832 cpg 1:828 > 1378890862 fenced:default ring 1:832 1 memb 1 > 1378890862 check_ringid done cluster 832 cpg 1:832 > 1378890862 check_quorum done > 1378890862 send_start 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 > 1378890862 cpg_mcast_joined retried 1 start > 1378890862 receive_start 1:13 len 152 > 1378890862 match_change 1:13 skip cg 12 already start > 1378890862 match_change 1:13 matches cg 13 > 1378890862 wait_messages cg 13 got all 1 > 1378890862 set_master from 1 to complete node 1 > 1378890862 delay post_join_delay 360 quorate_from_last_update 0 > 1378891222 delay of 360s leaves 1 victims > 1378891222 rmg-de-2 not a cluster member after 360 sec post_join_delay > 1378891222 fencing node rmg-de-2 > 1378891236 fence rmg-de-2 dev 0.0 agent fence_ipmilan result: success > 1378891236 fence rmg-de-2 success > 1378891236 send_victim_done cg 13 flags 2 victim nodeid 2 > 1378891236 send_complete 1:13 flags 2 started 6 m 1 j 0 r 1 f 1 > 1378891236 receive_victim_done 1:13 flags 2 len 80 > 1378891236 receive_victim_done 1:13 remove victim 2 time 1378891236 how 1 > 1378891236 receive_complete 1:13 len 152: > -------------- > > -------------- > root at rmg-de-1:~# tail -n 100 /var/log/cluster/corosync.log > Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:09 corosync [CLM ] New Configuration: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:09 corosync [CLM ] Members Left: > Sep 11 11:14:09 corosync [CLM ] Members Joined: > Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:09 corosync [CLM ] New Configuration: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:09 corosync [CLM ] Members Left: > 
Sep 11 11:14:09 corosync [CLM ] Members Joined: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the > membership and a new membership was formed. > Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 > Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 > Sep 11 11:14:09 corosync [CPG ] chosen downlist: sender r(0) > ip(10.xx.xx.1) ; members(old:1 left:0) > Sep 11 11:14:09 corosync [MAIN ] Completed service synchronization, ready > to provide service. > Sep 11 11:14:20 corosync [TOTEM ] A processor failed, forming new > configuration. > Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:22 corosync [CLM ] New Configuration: > Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:22 corosync [CLM ] Members Left: > Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:22 corosync [CLM ] Members Joined: > Sep 11 11:14:22 corosync [QUORUM] Members[1]: 1 > Sep 11 11:14:22 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:22 corosync [CLM ] New Configuration: > Sep 11 11:14:22 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:22 corosync [CLM ] Members Left: > Sep 11 11:14:22 corosync [CLM ] Members Joined: > Sep 11 11:14:22 corosync [TOTEM ] A processor joined or left the > membership and a new membership was formed. > Sep 11 11:14:22 corosync [CPG ] chosen downlist: sender r(0) > ip(10.xx.xx.1) ; members(old:2 left:1) > Sep 11 11:14:22 corosync [MAIN ] Completed service synchronization, ready > to provide service. > -------------- > > -------------- > root at rmg-de-1:~# dlm_tool ls > dlm lockspaces > name rgmanager > id 0x5231f3eb > flags 0x00000000 > change member 1 joined 0 remove 1 failed 1 seq 12,13 > members 1 > -------------- > > Unfortunately I only have the output of the currently operational node, > as the other one is fenced very quickly and the logs are hard to > retrieve. If someone has an idea however, I'll do my best to provide > these as well. > > Thanks, > > Pascal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Wed Sep 11 17:27:10 2013 From: lists at alteeve.ca (Digimer) Date: Wed, 11 Sep 2013 13:27:10 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: References: <52304DEB.9090806@hacksrus.net> Message-ID: <5230A7EE.9010905@alteeve.ca> On 11/09/13 12:04, emmanuel segura wrote: > Hello Pascal > > For disable startup fencing you need clean_start=1 in the fence_daemon > tag, i saw in your previous mail you are using expected_votes="1", with > this setting every cluster node will be partitioned into two clusters > and operate independently, i recommended using a quorim disk with > master_wins parameter This is a very bad idea and is asking for a split-brain, the main reason fencing exists at all. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? 
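For readers following the thread, the two settings being debated live in
cluster.conf roughly as below (an illustrative fragment with placeholder
values only, not an endorsement of clean_start):

    <fence_daemon clean_start="1" post_join_delay="60"/>
    <quorumd interval="1" tko="10" votes="1" label="qdisk" master_wins="1"/>

With master_wins, only the qdiskd master carries the quorum-disk vote, which
is what makes a quorum disk usable in a two-node cluster without heuristics.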
From lists at alteeve.ca Wed Sep 11 17:31:47 2013 From: lists at alteeve.ca (Digimer) Date: Wed, 11 Sep 2013 13:31:47 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <52306703.1030101@hacksrus.net> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> <5230633B.6090507@hacksrus.net> <523063FF.5050908@alteeve.ca> <52306703.1030101@hacksrus.net> Message-ID: <5230A903.2050605@alteeve.ca> On 11/09/13 08:50, Pascal Ehlert wrote: >> The problem is that, if you enable cman on boot, the fenced node will >> try to join the cluster, fail to reach it's peer after post_join_delay >> (default 6 seconds, iirc) and fence it's peer. That peer reboots, >> starts cman, tries to connect, fenced it's peer... >> >> The easiest way to avoid this in 2-node clusters is to not let >> cman/rgmanager start automatically. That way, if a node is fenced, it >> will boot back up and you can log into remotely (assuming it's not >> totally dead). When you know things are fixed, manually start cman. >> > I my case however, the node which is trying to join is fully operational > and has network access. Also if you look at the configuration that I had > in my original email, my post_join_delay is 360 (for testing purposes), > so there is no way that a timeout occurs. > > I might be wrong here, but judging from corosync's log file, the other > node even joins the cluster successfully, before being marked for > fencing by dlm_controld: > > Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE > Sep 11 11:14:09 corosync [CLM ] New Configuration: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1) > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:09 corosync [CLM ] Members Left: > Sep 11 11:14:09 corosync [CLM ] Members Joined: > Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2) > Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. > Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 > Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2 Setting post_join_delay to 360 will buy you 6 minutes from the start of cman until the fence occurs. That log message does show the node joining. Can you reliably reproduce this? If so, can you please 'tail -f -n 0 /var/log/messages' on both nodes, break the cluster and wait for the node to restart, 'tail' the rebooted node's /var/log/messages, wait the six minutes and then, after the second fence occurs, post both node's logs? -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From lists at alteeve.ca Wed Sep 11 17:33:57 2013 From: lists at alteeve.ca (Digimer) Date: Wed, 11 Sep 2013 13:33:57 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <5230915C.5040104@ucl.ac.uk> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> <5230633B.6090507@hacksrus.net> <523063FF.5050908@alteeve.ca> <5230915C.5040104@ucl.ac.uk> Message-ID: <5230A985.6020605@alteeve.ca> On 11/09/13 11:50, Alan Brown wrote: > On 11/09/13 13:37, Digimer wrote: > >> The problem is that, if you enable cman on boot, the fenced node will >> try to join the cluster, fail to reach it's peer after post_join_delay >> (default 6 seconds, iirc) and fence it's peer. That peer reboots, starts >> cman, tries to connect, fenced it's peer... 
> > Qdisk is a good way of preventing this kind of problem. If you have a SAN. >> The easiest way to avoid this in 2-node clusters is to not let >> cman/rgmanager start automatically. > > For some values of "easy" > > Your solution means every startup requires manual intervention. > > Qdisk will let the cluster come up/restart nodes without needing human > help at startup. The way I see it, and I've had the clusters in production for years in various locations, fencing happens extremely rarely. If a node gets fenced, *something* went wrong and I will want to investigate before I rejoin the node. So the fact that I have to manually start cman/rgmanager is a trivial cost. Out of about 20 2-node clusters, I've had maybe three or four fence events in four years, and all of them where from failing equipment. So in all cases, not rejoining the cluster was safest anyway. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From emi2fast at gmail.com Wed Sep 11 22:21:28 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Thu, 12 Sep 2013 00:21:28 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <5230A7EE.9010905@alteeve.ca> References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> Message-ID: clean_start=1 disable the startup fencing and if you use a quorum disk in your cluster without expected_votes=1, when a node start after it has been fenced, the node dosn't try to fence di remain node and try to start the service, because rgmanager need a cluster quorate, so many people around say clean_start=1 is dangerous, but no one give a clear reason, in my production cluster a i have clvm+vg in exclusive mode+(clean_start=1)+(master_wins). so if you can explain me where is the problem :) i apriciate 2013/9/11 Digimer > On 11/09/13 12:04, emmanuel segura wrote: > >> Hello Pascal >> >> For disable startup fencing you need clean_start=1 in the fence_daemon >> tag, i saw in your previous mail you are using expected_votes="1", with >> this setting every cluster node will be partitioned into two clusters >> and operate independently, i recommended using a quorim disk with >> master_wins parameter >> > > This is a very bad idea and is asking for a split-brain, the main reason > fencing exists at all. > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without > access to education? > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Wed Sep 11 22:24:17 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Thu, 12 Sep 2013 00:24:17 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <5230A7EE.9010905@alteeve.ca> References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> Message-ID: Fixed previous mail clean_start=1 disable the startup fencing and if you use a quorum disk in your cluster without expected_votes=1, when a node start after it has been fenced, the node dosn't try to fence di remain node and doesn't try to start the service, because rgmanager need a cluster quorate, so many people around say clean_start=1 is dangerous, but no one give a clear reason, in my production cluster a i have clvm+vg in exclusive mode+(clean_start=1)+(master_ wins). 
so if you can explain me where is the problem :) i apriciate 2013/9/11 Digimer > On 11/09/13 12:04, emmanuel segura wrote: > >> Hello Pascal >> >> For disable startup fencing you need clean_start=1 in the fence_daemon >> tag, i saw in your previous mail you are using expected_votes="1", with >> this setting every cluster node will be partitioned into two clusters >> and operate independently, i recommended using a quorim disk with >> master_wins parameter >> > > This is a very bad idea and is asking for a split-brain, the main reason > fencing exists at all. > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without > access to education? > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Wed Sep 11 22:42:12 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Thu, 12 Sep 2013 00:42:12 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> Message-ID: *clean_start:* Used to prevent any start up fencing the daemon might do. It indicates that the daemon should assume all nodes are in a clean state to start. So if the cluster network has problem and my cluster is in the spli-brain situation, fencing loop happen, for avoid this situation you have the following choices 1: disable you cluster sotfware to start in the boot time 2: use clean_start=1 with qdisk without expected_votes=1(rgmanager never start the service on node without enaugh vote) 3:change the cluster software :) Nobody explain me with a technical reason why clean_start=1 is dangerous, so configurare and test well with different parameters your cluster and like that you can understand RHCS 2013/9/12 emmanuel segura > Fixed previous mail > > clean_start=1 disable the startup fencing and if you use a quorum disk in > your cluster without expected_votes=1, when a node start after it has been > fenced, the node dosn't try to fence di remain node and doesn't try to > start the service, because rgmanager need a cluster quorate, so many people > around say clean_start=1 is dangerous, but no one give a clear reason, in > my production cluster a i have clvm+vg in exclusive > mode+(clean_start=1)+(master_ > wins). so if you can explain me where is the problem :) i apriciate > > > > 2013/9/11 Digimer > >> On 11/09/13 12:04, emmanuel segura wrote: >> >>> Hello Pascal >>> >>> For disable startup fencing you need clean_start=1 in the fence_daemon >>> tag, i saw in your previous mail you are using expected_votes="1", with >>> this setting every cluster node will be partitioned into two clusters >>> and operate independently, i recommended using a quorim disk with >>> master_wins parameter >>> >> >> This is a very bad idea and is asking for a split-brain, the main reason >> fencing exists at all. >> >> >> -- >> Digimer >> Papers and Projects: https://alteeve.ca/w/ >> What if the cure for cancer is trapped in the mind of a person without >> access to education? >> > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at alteeve.ca Wed Sep 11 22:44:45 2013 From: lists at alteeve.ca (Digimer) Date: Wed, 11 Sep 2013 18:44:45 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> Message-ID: <5230F25D.7000107@alteeve.ca> The problem that Pascal has is that the node sees the peer, joins and fences anyway. So in this case, clean_start won't help. Even with a SAN/qdisk though, it's not needed to enable this. If the remaining node can't talk to qdisk, it won't have quorum and will not be offering services, so fencing it won't hurt. It's *always* better to put nodes into a known state, regardless of quorum. Consider that the hung/failed node was in the middle of a write to the SAN and froze. Now imagine at some point in the future it recovers, having no idea that time passed it has no reason to doubt that it's locks are still valid so it just finishes the writes. Congrats, you could have just corrupted your storage. _Never_ assume _anything_. "The only thing you don't know is what you don't know." digimer On 11/09/13 18:24, emmanuel segura wrote: > Fixed previous mail > > clean_start=1 disable the startup fencing and if you use a quorum disk > in your cluster without expected_votes=1, when a node start after it has > been fenced, the node dosn't try to fence di remain node and doesn't try > to start the service, because rgmanager need a cluster quorate, so many > people around say clean_start=1 is dangerous, but no one give a clear > reason, in my production cluster a i have clvm+vg in exclusive > mode+(clean_start=1)+(master_ > wins). so if you can explain me where is the problem :) i apriciate > > > > 2013/9/11 Digimer > > > On 11/09/13 12:04, emmanuel segura wrote: > > Hello Pascal > > For disable startup fencing you need clean_start=1 in the > fence_daemon > tag, i saw in your previous mail you are using > expected_votes="1", with > this setting every cluster node will be partitioned into two > clusters > and operate independently, i recommended using a quorim disk with > master_wins parameter > > > This is a very bad idea and is asking for a split-brain, the main > reason fencing exists at all. > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person > without access to education? > > > > > -- > esta es mi vida e me la vivo hasta que dios quiera -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From emi2fast at gmail.com Wed Sep 11 22:57:49 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Thu, 12 Sep 2013 00:57:49 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <5230F25D.7000107@alteeve.ca> References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> <5230F25D.7000107@alteeve.ca> Message-ID: Consider that the hung/failed node was in the middle of a write to the SAN and froze. Now imagine at some point in the future it recovers, having no idea that time passed it has no reason to doubt that it's locks are still valid so it just finishes the writes. Congrats, you could have just corrupted your storage. 
UMMMMMM I use ext3(LV)->(VG=exclusive=true with clvmd)->(pv)->(multipath)->(SAN), so as you know the redhat cluster only support failover resource, so your example is not very clear, how can i corrupte the storare with clean_start=1? 2013/9/12 Digimer > The problem that Pascal has is that the node sees the peer, joins and > fences anyway. So in this case, clean_start won't help. > > Even with a SAN/qdisk though, it's not needed to enable this. If the > remaining node can't talk to qdisk, it won't have quorum and will not be > offering services, so fencing it won't hurt. It's *always* better to put > nodes into a known state, regardless of quorum. > > Consider that the hung/failed node was in the middle of a write to the SAN > and froze. Now imagine at some point in the future it recovers, having no > idea that time passed it has no reason to doubt that it's locks are still > valid so it just finishes the writes. Congrats, you could have just > corrupted your storage. > > _Never_ assume _anything_. > > "The only thing you don't know is what you don't know." > > digimer > > On 11/09/13 18:24, emmanuel segura wrote: > >> Fixed previous mail >> >> clean_start=1 disable the startup fencing and if you use a quorum disk >> in your cluster without expected_votes=1, when a node start after it has >> been fenced, the node dosn't try to fence di remain node and doesn't try >> to start the service, because rgmanager need a cluster quorate, so many >> people around say clean_start=1 is dangerous, but no one give a clear >> reason, in my production cluster a i have clvm+vg in exclusive >> mode+(clean_start=1)+(master_ >> wins). so if you can explain me where is the problem :) i apriciate >> >> >> >> 2013/9/11 Digimer > >> >> On 11/09/13 12:04, emmanuel segura wrote: >> >> Hello Pascal >> >> For disable startup fencing you need clean_start=1 in the >> fence_daemon >> tag, i saw in your previous mail you are using >> expected_votes="1", with >> this setting every cluster node will be partitioned into two >> clusters >> and operate independently, i recommended using a quorim disk with >> master_wins parameter >> >> >> This is a very bad idea and is asking for a split-brain, the main >> reason fencing exists at all. >> >> >> -- >> Digimer >> Papers and Projects: https://alteeve.ca/w/ >> What if the cure for cancer is trapped in the mind of a person >> without access to education? >> >> >> >> >> -- >> esta es mi vida e me la vivo hasta que dios quiera >> > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without > access to education? > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tmg at redhat.com Wed Sep 11 23:01:04 2013 From: tmg at redhat.com (Thom Gardner) Date: Wed, 11 Sep 2013 19:01:04 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> Message-ID: <20130911230104.GB3321@Hungry.redhat.com> On Thu, Sep 12, 2013 at 12:21:28AM +0200, emmanuel segura wrote: > clean_start=1 disable the startup fencing and if you use a quorum disk in your > cluster without expected_votes=1, when a node start after it has been fenced, > the node dosn't try to fence di remain node and try to start the service, > because rgmanager need a cluster quorate, so many people around say clean_start > =1 is dangerous, but no one give a clear reason, Listen to Digimer. clean_start=1 is dangerous. It will allow a node to join a cluster "with state" and thus opens the door to split-brain. We (Red Hat) will not support a production cluster with clean_start turned on. > in my production cluster a i > have clvm+vg in exclusive mode+(clean_start=1)+(master_wins). so if you can > explain me where is the problem :) i apriciate > It specifically targets a safety mechanism, and basically turns off the check for a node trying to join the cluster "with state", so it will gladly allow you to split-brain a FS or something really ugly like that. It's there for testing/debugging purposes only, and should never be used on a production system. As for leaving services turned off, Digimer is spot on with that one, too (that was you, wasn't it?). It is one of the ways we recommend getting around this fence loop problem. It's also the simplest one and, IMHO, the one with the most tolerable list of potential side effects, which again Digimer did a fine job of explaining (basically you have to start your cluster services manually, but if you have a fence event, you're going to probably be fixing something anyway on that machine, so, it's probably good that they're not coming up on their own, and it's no big thing to start things up when you're done). L8r, tg. > 2013/9/11 Digimer > > On 11/09/13 12:04, emmanuel segura wrote: > > Hello Pascal > > For disable startup fencing you need clean_start=1 in the fence_daemon > tag, i saw in your previous mail you are using expected_votes="1", with > this setting every cluster node will be partitioned into two clusters > and operate independently, i recommended using a quorim disk with > master_wins parameter > > > This is a very bad idea and is asking for a split-brain, the main reason > fencing exists at all. > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without > access to education? > > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From emi2fast at gmail.com Wed Sep 11 23:14:46 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Thu, 12 Sep 2013 01:14:46 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <20130911230104.GB3321@Hungry.redhat.com> References: <52304DEB.9090806@hacksrus.net> <5230A7EE.9010905@alteeve.ca> <20130911230104.GB3321@Hungry.redhat.com> Message-ID: The solution is don't start the cluster in the boot time? :) Nice Redhat support expected_votes="1" but doesn't support clean_start=1? 
:) Nice I would like to have more details when redhat doesn't support one cluster option with e technical example, no only Redhat doesn't support 2013/9/12 Thom Gardner > On Thu, Sep 12, 2013 at 12:21:28AM +0200, emmanuel segura wrote: > > clean_start=1 disable the startup fencing and if you use a quorum disk > in your > > cluster without expected_votes=1, when a node start after it has been > fenced, > > the node dosn't try to fence di remain node and try to start the service, > > because rgmanager need a cluster quorate, so many people around say > clean_start > > =1 is dangerous, but no one give a clear reason, > > Listen to Digimer. clean_start=1 is dangerous. It will allow a node > to join a cluster "with state" and thus opens the door to split-brain. > We (Red Hat) will not support a production cluster with clean_start > turned on. > > > in my production > cluster a i > > have clvm+vg in exclusive mode+(clean_start=1)+(master_wins). so if you > can > > explain me where is the problem :) i apriciate > > > > It specifically targets a safety mechanism, and basically turns > off the check for a node trying to join the cluster "with state", > so it will gladly allow you to split-brain a FS or something really > ugly like that. It's there for testing/debugging purposes only, > and should never be used on a production system. > > As for leaving services turned off, Digimer is spot on with that one, > too (that was you, wasn't it?). It is one of the ways we recommend > getting around this fence loop problem. It's also the simplest one > and, IMHO, the one with the most tolerable list of potential side > effects, which again Digimer did a fine job of explaining (basically > you have to start your cluster services manually, but if you have a > fence event, you're going to probably be fixing something anyway on > that machine, so, it's probably good that they're not coming up on > their own, and it's no big thing to start things up when you're done). > > L8r, > tg. > > > 2013/9/11 Digimer > > > > On 11/09/13 12:04, emmanuel segura wrote: > > > > Hello Pascal > > > > For disable startup fencing you need clean_start=1 in the > fence_daemon > > tag, i saw in your previous mail you are using > expected_votes="1", with > > this setting every cluster node will be partitioned into two > clusters > > and operate independently, i recommended using a quorim disk with > > master_wins parameter > > > > > > This is a very bad idea and is asking for a split-brain, the main > reason > > fencing exists at all. > > > > > > -- > > Digimer > > Papers and Projects: https://alteeve.ca/w/ > > What if the cure for cancer is trapped in the mind of a person > without > > access to education? > > > > > > > > > > -- > > esta es mi vida e me la vivo hasta que dios quiera > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pascal at hacksrus.net Thu Sep 12 06:57:22 2013 From: pascal at hacksrus.net (Pascal Ehlert) Date: Thu, 12 Sep 2013 08:57:22 +0200 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <5230A903.2050605@alteeve.ca> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> <5230633B.6090507@hacksrus.net> <523063FF.5050908@alteeve.ca> <52306703.1030101@hacksrus.net> <5230A903.2050605@alteeve.ca> Message-ID: <523165D2.4020005@hacksrus.net> On 11/09/13 7:31 PM, Digimer wrote: > That log message does show the node joining. Can you reliably > reproduce this? If so, can you please 'tail -f -n 0 /var/log/messages' > on both nodes, break the cluster and wait for the node to restart, > 'tail' the rebooted node's /var/log/messages, wait the six minutes and > then, after the second fence occurs, post both node's logs? > I was indeed able to reliably reproduce this and that's where my confusion came from. I don't understand why the node seems to be joining (and leaving immediately afterwards as per the log), all within the 360secs post join fence delay and still gets fenced. As this is a semi-production system (we had to move quickly), I went with a qdisk based approach now, using a small iscsi disk from a remote site. This works very well and reliable as far as I can tell from the testing that I have done so far. I would still be interested to hear why the initial approach failed. How would have manually starting the cluster services a difference anyway? Does that mean that one should join the cluster and fence domain first to ensure a stateless join and only then start rgmanager? Isn't that something that could be achieved with some delays in the startup scripts as well? Either way, thank you all for helping out this quick! From lists at alteeve.ca Thu Sep 12 17:25:29 2013 From: lists at alteeve.ca (Digimer) Date: Thu, 12 Sep 2013 13:25:29 -0400 Subject: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot In-Reply-To: <523165D2.4020005@hacksrus.net> References: <52304DEB.9090806@hacksrus.net> <20130911120630.GA3321@Hungry.redhat.com> <5230633B.6090507@hacksrus.net> <523063FF.5050908@alteeve.ca> <52306703.1030101@hacksrus.net> <5230A903.2050605@alteeve.ca> <523165D2.4020005@hacksrus.net> Message-ID: <5231F909.5080500@alteeve.ca> On 12/09/13 02:57, Pascal Ehlert wrote: > > On 11/09/13 7:31 PM, Digimer wrote: >> That log message does show the node joining. Can you reliably >> reproduce this? If so, can you please 'tail -f -n 0 /var/log/messages' >> on both nodes, break the cluster and wait for the node to restart, >> 'tail' the rebooted node's /var/log/messages, wait the six minutes and >> then, after the second fence occurs, post both node's logs? >> > I was indeed able to reliably reproduce this and that's where my > confusion came from. I don't understand why the node seems to be joining > (and leaving immediately afterwards as per the log), all within the > 360secs post join fence delay and still gets fenced. > > As this is a semi-production system (we had to move quickly), I went > with a qdisk based approach now, using a small iscsi disk from a remote > site. This works very well and reliable as far as I can tell from the > testing that I have done so far. I would still be interested to hear why > the initial approach failed. > > How would have manually starting the cluster services a difference > anyway? 
Does that mean that one should join the cluster and fence domain > first to ensure a stateless join and only then start rgmanager? Isn't > that something that could be achieved with some delays in the startup > scripts as well? > > Either way, thank you all for helping out this quick! I honestly don't know why it wound join -> fence; That's most likely a network issue but I couldn't guess any more than that. Regardless, you have an issue as this behaviour is certainly not normal. You may have masked it with qdisk, but please don't leave things as they are. This is worthy of further investigation. In this case, manually starting the cluster would probably not change anything. It would, however, allow you to more easily debug because you could get the logs tail'ing before attempting to start the cluster. We'll really need to see the logs in order to go much further. If you can schedule a maintenance window, please reproduce this and post the logs here. I am very curious as to what might be going on. In the meantime, run 'cman_tool status', record the multicast address and make sure that group is persistent in your switches. There is a small chance that one of the services under rgmanager's control that is causing an interruption. Again; guessing. digimer -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From jpokorny at redhat.com Tue Sep 17 12:06:04 2013 From: jpokorny at redhat.com (Jan =?utf-8?Q?Pokorn=C3=BD?=) Date: Tue, 17 Sep 2013 14:06:04 +0200 Subject: [Linux-cluster] BUG in PID dir creation??? In-Reply-To: <522BE679.6060302@srce.hr> References: <522BE5B9.5040706@srce.hr> <522BE679.6060302@srce.hr> Message-ID: <20130917120604.GB10365@redhat.com> Hello Jakov, On 08/09/13 04:52 +0200, Jakov Sosic wrote: > On 09/08/2013 04:49 AM, Jakov Sosic wrote: > >> Should I file a bug report ?! provided that you have even a patch, this would probably get more attention as a pull request for resource-agents repository hosted on GitHub: https://github.com/ClusterLabs/resource-agents/ -- Jan From james.hofmeister at hp.com Mon Sep 23 21:30:11 2013 From: james.hofmeister at hp.com (Hofmeister, James (HP ESSN BCS Linux ERT)) Date: Mon, 23 Sep 2013 21:30:11 +0000 Subject: [Linux-cluster] gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster Message-ID: <5CBE4DF16DF0DE4A99CCC64ACC08A8791434BBAC@G5W2714.americas.hpqcorp.net> Hello Folks, RE: gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster Has this been seen at other sites? Call Trace: [] :gfs:gfs_assert_i+0x67/0x92 [] :gfs:unlinked_scan_elements+0x99/0x180 [] :gfs:gfs_dreread+0x87/0xc6 [] :gfs:foreach_descriptor+0x229/0x305 [] :gfs:fill_super+0x0/0x642 [] :gfs:gfs_recover_dump+0xdd/0x14e [] :gfs:gfs_make_fs_rw+0xc0/0x11a [] :gfs:init_journal+0x279/0x34c [] :gfs:fill_super+0x48e/0x642 [] get_sb_bdev+0x10a/0x16c [] vfs_kern_mount+0x93/0x11a [] do_kern_mount+0x36/0x4d [] do_mount+0x6a9/0x719 [] enqueue_task+0x41/0x56 [] do_sock_read+0xcf/0x110 [] sock_aio_read+0x4f/0x5e [] do_sync_read+0xc7/0x104 [] zone_statistics+0x3e/0x6d [] __alloc_pages+0x78/0x308 [] sys_mount+0x8a/0xcd Sep 18 04:09:51 hpium2 syslogd 1.4.1: restart. Sep 18 04:09:51 hpium2 kernel: klogd 1.4.1, log source = /proc/kmsg started. Regards, James Hofmeister Hewlett Packard Linux Engineering Resolution Team -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ekuric at redhat.com Tue Sep 24 08:59:33 2013 From: ekuric at redhat.com (Elvir Kuric) Date: Tue, 24 Sep 2013 10:59:33 +0200 Subject: [Linux-cluster] gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster In-Reply-To: <5CBE4DF16DF0DE4A99CCC64ACC08A8791434BBAC@G5W2714.americas.hpqcorp.net> References: <5CBE4DF16DF0DE4A99CCC64ACC08A8791434BBAC@G5W2714.americas.hpqcorp.net> Message-ID: <52415475.2070702@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 09/23/2013 11:30 PM, Hofmeister, James (HP ESSN BCS Linux ERT) wrote: > > Hello Folks, > > RE: gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster > > > > Has this been seen at other sites? > > > > Call Trace: > > [] :gfs:gfs_assert_i+0x67/0x92 > > [] :gfs:unlinked_scan_elements+0x99/0x180 > > [] :gfs:gfs_dreread+0x87/0xc6 > > [] :gfs:foreach_descriptor+0x229/0x305 > > [] :gfs:fill_super+0x0/0x642 > > [] :gfs:gfs_recover_dump+0xdd/0x14e > > [] :gfs:gfs_make_fs_rw+0xc0/0x11a > > [] :gfs:init_journal+0x279/0x34c > > [] :gfs:fill_super+0x48e/0x642 > > [] get_sb_bdev+0x10a/0x16c > > [] vfs_kern_mount+0x93/0x11a > > [] do_kern_mount+0x36/0x4d > > [] do_mount+0x6a9/0x719 > > [] enqueue_task+0x41/0x56 > > [] do_sock_read+0xcf/0x110 > > [] sock_aio_read+0x4f/0x5e > > [] do_sync_read+0xc7/0x104 > > [] zone_statistics+0x3e/0x6d > > [] __alloc_pages+0x78/0x308 > > [] sys_mount+0x8a/0xcd > > > > Sep 18 04:09:51 hpium2 syslogd 1.4.1: restart. > > Sep 18 04:09:51 hpium2 kernel: klogd 1.4.1, log source = /proc/kmsg started. > > > > Regards, > > James Hofmeister Hewlett Packard Linux Engineering Resolution Team > > > Hi James, letting us know below details can help 1) RHEL version 2) gfs* packages version 3) kernel version You can also attach cluster.conf for us. Also if you have valid support contract with Red Hat,please open case via https://access.redhat.com so we can work on this issue in more details where you can provide us all necessary information we need in order to debug this Thank you Kind regards, - -- Elvir Kuric,STSE / Red Hat / GSS EMEA / -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iF4EAREIAAYFAlJBVG4ACgkQ8YYZ36KGw0NCEQD/Vwoliq/cg+yoYfQ9l1EgxeZQ cH8Zli1fB6ZBsHKyEQoBAI6UJXau1IDcbHpwQm5t0co4+vCfrnOROsoTDH4f8HSD =7zJ8 -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.desport at ac-versailles.fr Tue Sep 24 09:29:07 2013 From: olivier.desport at ac-versailles.fr (Olivier Desport) Date: Tue, 24 Sep 2013 11:29:07 +0200 Subject: [Linux-cluster] slow NFS performance on GFS2 Message-ID: <52415B63.8060707@ac-versailles.fr> Hello, I've installed a two nodes GFS2 cluster on Debian 7. The nodes are connected to the datas by iSCSI and multipathing with a 10 Gb/s link. I can write a 1g file with dd at 500 Mbytes/s. I export with NFS (on a 10 Gb/s network) and I only can reach 220 Mbytes/s. I think that it's a little bit far from 500 Mbytes/s... Do you how to tune my settings to increase the speed for NFS ? 
GFS2 mount : /dev/vg-bigfiles/lv-bigfiles /export/bigfiles gfs2 _netdev,nodiratime,noatime 0 0 NFS export : /export/bigfiles 172.16.0.0/16(fsid=2,rw,async,no_root_squash,no_subtree_check) mount on NFS clients : nfs-server:/export/bigfiles /data/bigfiles nfs4 _netdev,rw,user,nodiratime,noatime,intr 0 0 From swhiteho at redhat.com Tue Sep 24 09:42:36 2013 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 24 Sep 2013 10:42:36 +0100 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <52415B63.8060707@ac-versailles.fr> References: <52415B63.8060707@ac-versailles.fr> Message-ID: <1380015756.2715.2.camel@menhir> Hi, On Tue, 2013-09-24 at 11:29 +0200, Olivier Desport wrote: > Hello, > > I've installed a two nodes GFS2 cluster on Debian 7. The nodes are > connected to the datas by iSCSI and multipathing with a 10 Gb/s link. I > can write a 1g file with dd at 500 Mbytes/s. I export with NFS (on a 10 > Gb/s network) and I only can reach 220 Mbytes/s. I think that it's a > little bit far from 500 Mbytes/s... > > Do you how to tune my settings to increase the speed for NFS ? > > GFS2 mount : > /dev/vg-bigfiles/lv-bigfiles /export/bigfiles gfs2 > _netdev,nodiratime,noatime 0 0 > > NFS export : > /export/bigfiles > 172.16.0.0/16(fsid=2,rw,async,no_root_squash,no_subtree_check) > > mount on NFS clients : > nfs-server:/export/bigfiles /data/bigfiles nfs4 > _netdev,rw,user,nodiratime,noatime,intr 0 0 > One possibility is to try changing rsize and wsize at the client end. Is the 10G network used for NFS the same network as the one used for the iSCSI? I assume that the NFS is set up active/passive on the two GFS2 nodes, so only one node is exporting the GFS2 fs via NFS in the normal case? Steve. From olivier.desport at ac-versailles.fr Tue Sep 24 10:44:44 2013 From: olivier.desport at ac-versailles.fr (Olivier Desport) Date: Tue, 24 Sep 2013 12:44:44 +0200 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <1380015756.2715.2.camel@menhir> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> Message-ID: <52416D1C.1040608@ac-versailles.fr> Le 24/09/2013 11:42, Steven Whitehouse a ?crit : > Hi, > > On Tue, 2013-09-24 at 11:29 +0200, Olivier Desport wrote: >> Hello, >> >> I've installed a two nodes GFS2 cluster on Debian 7. The nodes are >> connected to the datas by iSCSI and multipathing with a 10 Gb/s link. I >> can write a 1g file with dd at 500 Mbytes/s. I export with NFS (on a 10 >> Gb/s network) and I only can reach 220 Mbytes/s. I think that it's a >> little bit far from 500 Mbytes/s... >> >> Do you how to tune my settings to increase the speed for NFS ? >> >> GFS2 mount : >> /dev/vg-bigfiles/lv-bigfiles /export/bigfiles gfs2 >> _netdev,nodiratime,noatime 0 0 >> >> NFS export : >> /export/bigfiles >> 172.16.0.0/16(fsid=2,rw,async,no_root_squash,no_subtree_check) >> >> mount on NFS clients : >> nfs-server:/export/bigfiles /data/bigfiles nfs4 >> _netdev,rw,user,nodiratime,noatime,intr 0 0 >> > One possibility is to try changing rsize and wsize at the client end. Is > the 10G network used for NFS the same network as the one used for the > iSCSI? I've tried several values of rsize and wsize and it doesn't change anything. iSCSI and NFS are not on the same VLAN. I'm testing with servers which are on the same chassis (NFS/GFS servers are on blades and client is a KVM guest with virtio network card). iSCSI and NFS use separates network in the chassis but each blade has to share iSCSI (50% to 100% of 10 Gbits/s) and LAN (50% to 100%). 
iperf beetween server and client : 3.4 Gbits/s (435 Mbytes/s). > > I assume that the NFS is set up active/passive on the two GFS2 nodes, so > only one node is exporting the GFS2 fs via NFS in the normal case? > > Steve. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at brimer.org Tue Sep 24 12:00:51 2013 From: lists at brimer.org (Barry Brimer) Date: Tue, 24 Sep 2013 07:00:51 -0500 (CDT) Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <52416D1C.1040608@ac-versailles.fr> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> <52416D1C.1040608@ac-versailles.fr> Message-ID: >>> Hello, >>> >>> I've installed a two nodes GFS2 cluster on Debian 7. The nodes are >>> connected to the datas by iSCSI and multipathing with a 10 Gb/s link. I >>> can write a 1g file with dd at 500 Mbytes/s. I export with NFS (on a 10 >>> Gb/s network) and I only can reach 220 Mbytes/s. I think that it's a >>> little bit far from 500 Mbytes/s... >>> >>> Do you how to tune my settings to increase the speed for NFS ? >>> >>> GFS2 mount : >>> /dev/vg-bigfiles/lv-bigfiles /export/bigfiles gfs2 >>> _netdev,nodiratime,noatime 0 0 >>> >>> NFS export : >>> /export/bigfiles >>> 172.16.0.0/16(fsid=2,rw,async,no_root_squash,no_subtree_check) >>> >>> mount on NFS clients : >>> nfs-server:/export/bigfiles /data/bigfiles nfs4 >>> _netdev,rw,user,nodiratime,noatime,intr 0 0 Fedora 19 contains a program called nfsometer which might be helpful in NFS tuning. From ray at oneunified.net Tue Sep 24 12:03:33 2013 From: ray at oneunified.net (Raymond Burkholder) Date: Tue, 24 Sep 2013 09:03:33 -0300 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <52416D1C.1040608@ac-versailles.fr> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> <52416D1C.1040608@ac-versailles.fr> Message-ID: <074801ceb91e$1b1f8110$515e8330$@oneunified.net> >One possibility is to try changing rsize and wsize at the client end. Is >the 10G network used for NFS the same network as the one used for the >iSCSI? >I've tried several values of rsize and wsize and it doesn't change anything. iSCSI and NFS are not on the same VLAN. I'm testing with servers which are on the same chassis (NFS/GFS servers are on blades and client is a KVM guest with virtio network card). iSCSI and NFS use separates network in the chassis but each blade has to share iSCSI (50% to 100% of 10 Gbits/s) and LAN (50% to 100%). >iperf between server and client : 3.4 Gbits/s (435 Mbytes/s). You could also try using Jumbo Frames on the network. -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From olivier.desport at ac-versailles.fr Tue Sep 24 12:08:01 2013 From: olivier.desport at ac-versailles.fr (Olivier Desport) Date: Tue, 24 Sep 2013 14:08:01 +0200 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <074801ceb91e$1b1f8110$515e8330$@oneunified.net> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> <52416D1C.1040608@ac-versailles.fr> <074801ceb91e$1b1f8110$515e8330$@oneunified.net> Message-ID: <524180A1.2050402@ac-versailles.fr> Le 24/09/2013 14:03, Raymond Burkholder a ?crit : >> One possibility is to try changing rsize and wsize at the client end. Is >> the 10G network used for NFS the same network as the one used for the >> iSCSI? >> I've tried several values of rsize and wsize and it doesn't change anything. iSCSI and NFS are not on the same VLAN. 
I'm testing with servers which are on the same chassis (NFS/GFS servers are on blades and client is a KVM guest with virtio network card). iSCSI and NFS use separates network in the chassis but each blade has to share iSCSI (50% to 100% of 10 Gbits/s) and LAN (50% to 100%). >> iperf between server and client : 3.4 Gbits/s (435 Mbytes/s). > > You could also try using Jumbo Frames on the network. I've tried but the NFS speed is the same. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmg at redhat.com Tue Sep 24 12:57:08 2013 From: tmg at redhat.com (Thom Gardner) Date: Tue, 24 Sep 2013 08:57:08 -0400 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <524180A1.2050402@ac-versailles.fr> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> <52416D1C.1040608@ac-versailles.fr> <074801ceb91e$1b1f8110$515e8330$@oneunified.net> <524180A1.2050402@ac-versailles.fr> Message-ID: <20130924125708.GC3315@Hungry.redhat.com> OK, I don't know much about increasing NFS performance, but I do have some things for you to consider that may actually help anyway: In general we (the cluster support group at RedHat) have started recommending that you just not even use GFS or GFS2 for use as an exported FS from a cluster. The reason being that if you follow all the rules, these filesystems don't usually buy you anything. While a GFS or GFS2 filesystem is exported, you shouldn't have any other type of access to it, and then only from one node at a time. We recommend that you don't even mount on more than one node at a time. You certainly should not NFS export it from more than one node at a time. You should not even export it as both NFS and CIFS at the same time from the same node. If memory serves correctly, you shouldn't even use the FS for, say, a database on the same node as the one you're exporting it from (I may be remembering that last bit wrong, if it's important I can check, but I don't think it's important to this discussion). The point here is that because you can't use it from more than one node at a time when you are exporting it, there's just no point in using GFS or GFS2 at all. This is not a high performance filesystem. It was never intended to be. It's primary advantage is that it can be used on multiple nodes in a cluster at the same time. If your application (exporting it) takes that advantage away, you've lost the main (possibly only) advantage in using it. Steve Whitehouse actually touched on this when he asked if you had it set up active/passive, but I didn't see where you answered that question (I may have missed it, though). Now, if you still want to use GFS2, there are some things that I can help you with from the GFS side that can speed it up considerably (depending on the kind of access going on with it), which may actually turn out to be the cause of the problem that you're experiencing (interaction between NFS and GFS2), but I would first recommend that you use a different filesystem all together if your only purpose for it is NFS exporting. I'm sorry I don't have much advice to give for the NFS specific stuff, though. tg. On Tue, Sep 24, 2013 at 02:08:01PM +0200, Olivier Desport wrote: > Le 24/09/2013 14:03, Raymond Burkholder a e'crit : > > One possibility is to try changing rsize and wsize at the client end. Is > the 10G network used for NFS the same network as the one used for the > iSCSI? > I've tried several values of rsize and wsize and it doesn't > change anything. 
iSCSI and NFS are not on the same VLAN. I'm > testing with servers which are on the same chassis (NFS/GFS > servers are on blades and client is a KVM guest with virtio > network card). iSCSI and NFS use separates network in the > chassis but each blade has to share iSCSI (50% to 100% > of 10 Gbits/s) and LAN (50% to 100%). > > iperf between server and client : 3.4 Gbits/s (435 Mbytes/s). > > You could also try using Jumbo Frames on the network. > > I've tried but the NFS speed is the same. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From swhiteho at redhat.com Tue Sep 24 13:10:09 2013 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 24 Sep 2013 14:10:09 +0100 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <20130924125708.GC3315@Hungry.redhat.com> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> <52416D1C.1040608@ac-versailles.fr> <074801ceb91e$1b1f8110$515e8330$@oneunified.net> <524180A1.2050402@ac-versailles.fr> <20130924125708.GC3315@Hungry.redhat.com> Message-ID: <1380028209.2715.13.camel@menhir> Hi, On Tue, 2013-09-24 at 08:57 -0400, Thom Gardner wrote: > OK, I don't know much about increasing NFS performance, but I do have > some things for you to consider that may actually help anyway: > > In general we (the cluster support group at RedHat) have started > recommending that you just not even use GFS or GFS2 for use as an > exported FS from a cluster. The reason being that if you follow > all the rules, these filesystems don't usually buy you anything. > While a GFS or GFS2 filesystem is exported, you shouldn't have any > other type of access to it, and then only from one node at a time. Well thats not entirely true - you should get a faster failover time doing things this way. > We recommend that you don't even mount on more than one node at > a time. Thats news to me... there is no problem at all with mounting a GFS(2) on multiple nodes in this kind of configuration, and if there is no i/o from the second node, it will make no difference to overall performance either. > You certainly should not NFS export it from more than one > node at a time. You should not even export it as both NFS and CIFS > at the same time from the same node. If memory serves correctly, > you shouldn't even use the FS for, say, a database on the same node > as the one you're exporting it from (I may be remembering that last > bit wrong, if it's important I can check, but I don't think it's > important to this discussion). > > The point here is that because you can't use it from more than one > node at a time when you are exporting it, there's just no point in > using GFS or GFS2 at all. This is not a high performance filesystem. > It was never intended to be. It's primary advantage is that it can > be used on multiple nodes in a cluster at the same time. If your > application (exporting it) takes that advantage away, you've lost > the main (possibly only) advantage in using it. > > Steve Whitehouse actually touched on this when he asked if you had > it set up active/passive, but I didn't see where you answered that > question (I may have missed it, though). > Yes, I was just trying to get my head around the configuration in this case to make sure that it was ok and so far I've not spotted anything that is obviously likely to make the performance via NFS slow. 
However this is more of an NFS than a GFS2 question I think - GFS2 seems to be providing reasonable performance and it is the addition of NFS that is causing the performance issue. I didn't see an answer to the question either, but I'm assuming that means that it is not a problem. Also, as regards GFS2 being high performance or not, that largely depends on what the workload is, and for the test which was outlined there is no reason why GFS2 should not provide pretty good performance and that seems to be bourne out by the reported results. > Now, if you still want to use GFS2, there are some things that I can > help you with from the GFS side that can speed it up considerably > (depending on the kind of access going on with it), which may actually > turn out to be the cause of the problem that you're experiencing > (interaction between NFS and GFS2), but I would first recommend that > you use a different filesystem all together if your only purpose for > it is NFS exporting. > I don't agree. Within the restrictions that we've noted above there is nothing wrong with exporting GFS2 via NFS, Steve. > I'm sorry I don't have much advice to give for the NFS specific > stuff, though. > > tg. > > On Tue, Sep 24, 2013 at 02:08:01PM +0200, Olivier Desport wrote: > > Le 24/09/2013 14:03, Raymond Burkholder a e'crit : > > > > One possibility is to try changing rsize and wsize at the client end. Is > > the 10G network used for NFS the same network as the one used for the > > iSCSI? > > I've tried several values of rsize and wsize and it doesn't > > change anything. iSCSI and NFS are not on the same VLAN. I'm > > testing with servers which are on the same chassis (NFS/GFS > > servers are on blades and client is a KVM guest with virtio > > network card). iSCSI and NFS use separates network in the > > chassis but each blade has to share iSCSI (50% to 100% > > of 10 Gbits/s) and LAN (50% to 100%). > > > > iperf between server and client : 3.4 Gbits/s (435 Mbytes/s). > > > > You could also try using Jumbo Frames on the network. > > > > I've tried but the NFS speed is the same. > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > From olivier.desport at ac-versailles.fr Tue Sep 24 13:26:32 2013 From: olivier.desport at ac-versailles.fr (Olivier Desport) Date: Tue, 24 Sep 2013 15:26:32 +0200 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <20130924125708.GC3315@Hungry.redhat.com> References: <52415B63.8060707@ac-versailles.fr> <1380015756.2715.2.camel@menhir> <52416D1C.1040608@ac-versailles.fr> <074801ceb91e$1b1f8110$515e8330$@oneunified.net> <524180A1.2050402@ac-versailles.fr> <20130924125708.GC3315@Hungry.redhat.com> Message-ID: <52419308.2000808@ac-versailles.fr> Le 24/09/2013 14:57, Thom Gardner a ?crit : > > Steve Whitehouse actually touched on this when he asked if you had > it set up active/passive, but I didn't see where you answered that > question (I may have missed it, though). > > I've installed keepalived to have failover for NFS. All volumes are NFS exported by the two nodes but the clients use the same NFS server to mount the same volume. I use NFS because I will have to connect more than 30 clients to the SAN datas. Network team can't give me a large number of addresses in the iSCSI VLAN and I've read that 16 nodes should be the maximum number. -------------- next part -------------- An HTML attachment was scrubbed... 
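For readers following the failover part of the thread, a floating service address with keepalived, in the spirit of what is described above, can be sketched as below. The interface, VIP, router id and priorities are placeholders rather than the poster's actual settings, and moving an address does nothing for NFSv4 client and lock state, which still has to ride out the grace period on the node that takes over.

  # /etc/keepalived/keepalived.conf (same on both nodes apart from priority)
  vrrp_instance NFS_VIP {
      state BACKUP               # both nodes start as BACKUP; priority elects the master
      interface eth0
      virtual_router_id 51
      priority 100               # e.g. 90 on the second node
      nopreempt
      advert_int 1
      virtual_ipaddress {
          172.16.0.10/16 dev eth0
      }
  }

  # clients then mount the VIP instead of a node address, e.g. in fstab:
  172.16.0.10:/export/bigfiles /data/bigfiles nfs4 _netdev,rw,nodiratime,noatime 0 0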
URL: From james.hofmeister at hp.com Tue Sep 24 16:34:20 2013 From: james.hofmeister at hp.com (Hofmeister, James (HP ESSN BCS Linux ERT)) Date: Tue, 24 Sep 2013 16:34:20 +0000 Subject: [Linux-cluster] gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster In-Reply-To: <52415475.2070702@redhat.com> References: <5CBE4DF16DF0DE4A99CCC64ACC08A8791434BBAC@G5W2714.americas.hpqcorp.net> <52415475.2070702@redhat.com> Message-ID: <5CBE4DF16DF0DE4A99CCC64ACC08A8791435717C@G6W2500.americas.hpqcorp.net> Hello Elvir, I was not looking for a deep analysis of this problem, just a search for known issues... I have not found a duplicate in my Google and bugzilla searches. 1) RHEL version Red Hat Enterprise Linux Server release 5.7 (Tikanga) 2) gfs* packages version gfs2-utils-0.1.62-31.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT gfs-utils-0.1.20-10.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT kmod-gfs-0.1.34-15.el5.x86_64 Fri 20 Jan 2012 11:26:53 AM COT 3) kernel version Linux xxxxxx 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux 4) You can also attach cluster.conf for us. I will send in the cluster.conf when I open the support call. They are taking an error out of gfs not seen/reported at other sites: Call Trace: [] :gfs:gfs_assert_i+0x67/0x92 [] :gfs:unlinked_scan_elements+0x99/0x180 [] :gfs:gfs_dreread+0x87/0xc6 [] :gfs:foreach_descriptor+0x229/0x305 [] :gfs:fill_super+0x0/0x642 [] :gfs:gfs_recover_dump+0xdd/0x14e [] :gfs:gfs_make_fs_rw+0xc0/0x11a [] :gfs:init_journal+0x279/0x34c [] :gfs:fill_super+0x48e/0x642 [] get_sb_bdev+0x10a/0x16c [] vfs_kern_mount+0x93/0x11a [] do_kern_mount+0x36/0x4d [] do_mount+0x6a9/0x719 [] enqueue_task+0x41/0x56 [] do_sock_read+0xcf/0x110 [] sock_aio_read+0x4f/0x5e [] do_sync_read+0xc7/0x104 [] zone_statistics+0x3e/0x6d [] __alloc_pages+0x78/0x308 [] sys_mount+0x8a/0xcd Sep 18 04:09:51 hpium2 syslogd 1.4.1: restart. Sep 18 04:09:51 hpium2 kernel: klogd 1.4.1, log source = /proc/kmsg started. Regards, James Hofmeister Hewlett Packard Linux Engineering Resolution Team -------------- next part -------------- An HTML attachment was scrubbed... URL: From morpheus.ibis at gmail.com Wed Sep 25 14:25:57 2013 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Wed, 25 Sep 2013 16:25:57 +0200 Subject: [Linux-cluster] bug in GFS2? Message-ID: <2199931.kG8UE7uL2q@gesher> Hi I am trying to build a two-node cluster for samba, but I'm having some GFS2 issues. The nodes themselves run as virtual machines in KVM (on different hosts), use gentoo kernel 3.10.7 (not sure what exact version of vanilla it is based on), and I use the cluster-next stack in somewhat minimal configuration (corosync-2 with DLM-4, no pacemaker). 
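For context, a corosync-2 plus dlm-4 stack of that minimal shape looks roughly like the following. The node names, addresses, device path and mount point are placeholders; only the cluster name fs_clust and the filesystem name homes are taken from the log quoted below, so treat this as a sketch rather than the poster's actual files.

  # /etc/corosync/corosync.conf
  totem {
      version: 2
      cluster_name: fs_clust
      transport: udpu
  }
  nodelist {
      node {
          ring0_addr: fs1
          nodeid: 1
      }
      node {
          ring0_addr: fs2
          nodeid: 2
      }
  }
  quorum {
      provider: corosync_votequorum
      two_node: 1
  }

  # on each node, once corosync is up:
  dlm_controld                                              # usually via the distribution's init script
  mkfs.gfs2 -p lock_dlm -t fs_clust:homes -j 2 /dev/vg/lv   # run once, from one node only
  mount -t gfs2 /dev/vg/lv /srv/homes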
while testing my cluster (using smbtorture), everything works fine, but the moment I let users onto it, i get a kernel error that hangs the cluster (fencing is set up and working, but doesnt kick in for some reason) this is what I get in kernel log: Sep 25 07:10:12 fs2 kernel: [18024.888481] GFS2: fsid=fs_clust:homes.1: quota exceeded for user 104202 Sep 25 07:10:18 fs2 kernel: [18030.335727] GFS2: fsid=fs_clust:homes.1: quota exceeded for user 104202 Sep 25 07:10:23 fs2 kernel: [18035.994476] original: gfs2_inode_lookup+0x128/0x240 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.994482] pid: 25317 Sep 25 07:10:23 fs2 kernel: [18035.994484] lock type: 5 req lock state : 3 Sep 25 07:10:23 fs2 kernel: [18035.994491] new: gfs2_inode_lookup+0x128/0x240 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.994493] pid: 25317 Sep 25 07:10:23 fs2 kernel: [18035.994494] lock type: 5 req lock state : 3 Sep 25 07:10:23 fs2 kernel: [18035.994498] G: s:SH n:5/168b15e f:Iqob t:SH d:EX/0 a:0 v:0 r:4 m:50 Sep 25 07:10:23 fs2 kernel: [18035.994506] H: s:SH f:EH e:0 p:25317 [smbd] gfs2_inode_lookup+0x128/0x240 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.994549] general protection fault: 0000 [#1] SMP Sep 25 07:10:23 fs2 kernel: [18035.994840] Modules linked in: iptable_filter ip_tables x_tables gfs2 dm_mod dlm sctp libcrc32c ipv6 configfs virtio_net i6300esb Sep 25 07:10:23 fs2 kernel: [18035.995617] CPU: 2 PID: 25317 Comm: smbd Not tainted 3.10.7-gentoo #10 Sep 25 07:10:23 fs2 kernel: [18035.995910] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Sep 25 07:10:23 fs2 kernel: [18035.996191] task: ffff8800b2aa1b00 ti: ffff8800a4a02000 task.ti: ffff8800a4a02000 Sep 25 07:10:23 fs2 kernel: [18035.996546] RIP: 0010:[] [] pid_task+0xb/0x40 Sep 25 07:10:23 fs2 kernel: [18035.996999] RSP: 0018:ffff8800a4a03a10 EFLAGS: 00010206 Sep 25 07:10:23 fs2 kernel: [18035.997253] RAX: 13270cbeaaf4957b RBX: ffff8800988f7710 RCX: 0000000000000006 Sep 25 07:10:23 fs2 kernel: [18035.997592] RDX: 0000000000000007 RSI: 0000000000000000 RDI: 13270cbeaaf4957b Sep 25 07:10:23 fs2 kernel: [18035.997934] RBP: ffff8800a4b43ba0 R08: 000000000000000a R09: 0000000000000000 Sep 25 07:10:23 fs2 kernel: [18035.998019] R10: 0000000000000191 R11: 0000000000000190 R12: 0000000000000000 Sep 25 07:10:23 fs2 kernel: [18035.998019] R13: ffff8800a4b43bf0 R14: ffffffffa0133720 R15: ffff8800995bd988 Sep 25 07:10:23 fs2 kernel: [18035.998019] FS: 00007f1846316740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 Sep 25 07:10:23 fs2 kernel: [18035.998019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 25 07:10:23 fs2 kernel: [18035.998019] CR2: 000000000122aae8 CR3: 000000009880c000 CR4: 00000000000007a0 Sep 25 07:10:23 fs2 kernel: [18035.998019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 25 07:10:23 fs2 kernel: [18035.998019] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Sep 25 07:10:23 fs2 kernel: [18035.998019] Stack: Sep 25 07:10:23 fs2 kernel: [18035.998019] ffffffffa0111f07 ffff8800b2aa1e70 ffffffffa011ffd8 0000000000000000 Sep 25 07:10:23 fs2 kernel: [18035.998019] 0000000000000000 0000000000000000 ffff880000000004 0000000000000032 Sep 25 07:10:23 fs2 kernel: [18035.998019] ffff8800a4b43ba0 ffff8800a4b43bf0 00000000626f7149 ffff8800995bd988 Sep 25 07:10:23 fs2 kernel: [18035.998019] Call Trace: Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_dump_glock+0x1c7/0x360 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_inode_lookup+0x128/0x240 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? 
printk+0x4f/0x54 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? inode_init_always+0xed/0x1b0 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? _raw_spin_lock+0x5/0x10 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_glock_nq+0x30b/0x3e0 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_inode_lookup+0x130/0x240 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_dirent_search+0xe5/0x1c0 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_dir_search+0x4a/0x80 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_lookupi+0xf7/0x1f0 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_lookupi+0x1b9/0x1f0 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_lookup+0x21/0xa0 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? _raw_spin_lock+0x5/0x10 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? d_alloc+0x76/0x90 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lookup_dcache+0xa3/0xd0 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lookup_real+0x14/0x50 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? __lookup_hash+0x32/0x50 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lookup_slow+0x3c/0xa2 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? _raw_spin_lock+0x5/0x10 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? path_lookupat+0x23f/0x780 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_getxattr+0x79/0xa0 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? filename_lookup+0x2f/0xc0 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? getname_flags+0xbc/0x1a0 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? user_path_at_empty+0x5c/0xb0 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_holder_uninit+0x16/0x30 [gfs2] Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? cp_new_stat+0x10d/0x120 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? vfs_fstatat+0x3f/0x90 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? SYSC_newstat+0x12/0x30 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lg_local_lock+0x11/0x20 Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? system_call_fastpath+0x16/0x1b Sep 25 07:10:23 fs2 kernel: [18035.998019] Code: 31 f6 48 85 c0 74 0c 8b 50 04 48 c1 e2 05 48 8b 74 10 38 e9 28 ff ff ff 0f 1f 84 00 00 00 00 00 48 85 ff 74 23 89 f6 48 8d 04 f7 <48> 8b 40 08 48 85 c0 74 1c 48 8d 14 76 48 8d 14 d5 30 02 00 00 Sep 25 07:10:23 fs2 kernel: [18035.998019] RIP [] pid_task+0xb/0x40 Sep 25 07:10:23 fs2 kernel: [18035.998019] RSP Sep 25 07:10:23 fs2 kernel: [18036.033702] ---[ end trace e5751bbc7d3a8d7c ]--- simple inspecfion of the gfs2 code showed this is caused by attempting a recursive lock. two gfs2_inode_lookups are visible in the trace, not sure that is strictly relevant though. 
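The glock state printed in the G:/H: lines above can also be watched live through debugfs while the problem is being reproduced. A small sketch, assuming debugfs is available and using the lock table name from the log (fs_clust:homes):

  # mount debugfs if it is not already mounted
  mount -t debugfs none /sys/kernel/debug

  # full glock dump for this filesystem; the G:/H: lines use the same format as the oops
  cat /sys/kernel/debug/gfs2/fs_clust:homes/glocks

  # just the glock from the trace (type 5, number 0x168b15e) and its holders
  grep -A 3 'n:5/168b15e' /sys/kernel/debug/gfs2/fs_clust:homes/glocks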
this is followed by (probaby related) trace: Sep 25 07:10:24 fs2 kernel: [18036.162513] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070 Sep 25 07:10:24 fs2 kernel: [18036.164016] IP: [] gfs2_permission+0x56/0x110 [gfs2] Sep 25 07:10:24 fs2 kernel: [18036.164016] PGD 989a3067 PUD 9886a067 PMD 0 Sep 25 07:10:24 fs2 kernel: [18036.164016] Oops: 0000 [#2] SMP Sep 25 07:10:24 fs2 kernel: [18036.164016] Modules linked in: iptable_filter ip_tables x_tables gfs2 dm_mod dlm sctp libcrc32c ipv6 configfs virtio_net i6300esb Sep 25 07:10:24 fs2 kernel: [18036.164016] CPU: 1 PID: 25453 Comm: smbd Tainted: G D 3.10.7-gentoo #10 Sep 25 07:10:24 fs2 kernel: [18036.164016] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Sep 25 07:10:24 fs2 kernel: [18036.164016] task: ffff8800afca0d80 ti: ffff8800a4a02000 task.ti: ffff8800a4a02000 Sep 25 07:10:24 fs2 kernel: [18036.164016] RIP: 0010:[] [] gfs2_permission+0x56/0x110 [gfs2] Sep 25 07:10:24 fs2 kernel: [18036.164016] RSP: 0018:ffff8800a4a03c08 EFLAGS: 00010286 Sep 25 07:10:24 fs2 kernel: [18036.164016] RAX: ffffffff8145f245 RBX: 0000000000000040 RCX: 0000000000000000 Sep 25 07:10:24 fs2 kernel: [18036.164016] RDX: ffff8800b5668f00 RSI: 0000000000000001 RDI: ffff8800a4b97ddc Sep 25 07:10:24 fs2 kernel: [18036.164016] RBP: ffff880099486e60 R08: 0000000000000061 R09: 0000000000000000 Sep 25 07:10:24 fs2 kernel: [18036.164016] R10: ff48ad3954b34002 R11: d09e94939e979e85 R12: ffff8800a4b97ddc Sep 25 07:10:24 fs2 kernel: [18036.164016] R13: 0000000000000001 R14: ffff8800a4b97df8 R15: ffff8800afca0d80 Sep 25 07:10:24 fs2 kernel: [18036.164016] FS: 00007f1846316740(0000) GS:ffff8800bfa80000(0000) knlGS:0000000000000000 Sep 25 07:10:24 fs2 kernel: [18036.164016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 25 07:10:24 fs2 kernel: [18036.164016] CR2: 0000000000000070 CR3: 000000009880c000 CR4: 00000000000007a0 Sep 25 07:10:24 fs2 kernel: [18036.164016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 25 07:10:24 fs2 kernel: [18036.164016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Sep 25 07:10:24 fs2 kernel: [18036.164016] Stack: Sep 25 07:10:24 fs2 kernel: [18036.164016] ffff8800994e0c00 ffffffff81125a8b ffff8800a4a03c18 ffff8800a4a03c18 Sep 25 07:10:24 fs2 kernel: [18036.164016] 0000000000000000 ffff8800bbba8d20 0000000800000003 0000000200000000 Sep 25 07:10:24 fs2 kernel: [18036.164016] ffffffff8145f245 ffffffff8112ff5e ffff8800a4a03e08 0000000000000007 Sep 25 07:10:24 fs2 kernel: [18036.164016] Call Trace: Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? lookup_fast+0x1ab/0x2f0 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? _raw_spin_lock+0x5/0x10 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? dput+0x17e/0x220 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? link_path_walk+0x23a/0x8b0 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? path_init+0x30c/0x410 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? path_lookupat+0x52/0x780 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? filename_lookup+0x2f/0xc0 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? getname_flags+0xbc/0x1a0 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? user_path_at_empty+0x5c/0xb0 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? vfs_fstatat+0x3f/0x90 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? SYSC_newstat+0x12/0x30 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? SyS_read+0x50/0xa0 Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? 
system_call_fastpath+0x16/0x1b Sep 25 07:10:24 fs2 kernel: [18036.164016] Code: c6 50 65 48 8b 04 25 80 b7 00 00 48 8b 90 40 02 00 00 4c 39 f3 75 14 eb 1a 0f 1f 40 00 48 3b 53 18 74 12 48 8b 1b 49 39 de 74 08 <48> 8b 43 30 a8 40 75 ea 31 db 4c 89 e7 e8 e8 78 f0 e0 66 90 45 Sep 25 07:10:24 fs2 kernel: [18036.164016] RIP [] gfs2_permission+0x56/0x110 [gfs2] Sep 25 07:10:24 fs2 kernel: [18036.164016] RSP Sep 25 07:10:24 fs2 kernel: [18036.164016] CR2: 0000000000000070 Sep 25 07:10:24 fs2 kernel: [18036.218133] ---[ end trace e5751bbc7d3a8d7d ]--- afterwards the log is filled with "INFO: rcu_sched self-detected stall" and NMI-caused backtraces Is this a known-and-fixed bug? is there a way to prevent this? thanks Pavel Herrmann From swhiteho at redhat.com Wed Sep 25 14:38:54 2013 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 25 Sep 2013 15:38:54 +0100 Subject: [Linux-cluster] bug in GFS2? In-Reply-To: <2199931.kG8UE7uL2q@gesher> References: <2199931.kG8UE7uL2q@gesher> Message-ID: <1380119934.2656.10.camel@menhir> Hi, On Wed, 2013-09-25 at 16:25 +0200, Pavel Herrmann wrote: > Hi > > I am trying to build a two-node cluster for samba, but I'm having some GFS2 > issues. > > The nodes themselves run as virtual machines in KVM (on different hosts), use > gentoo kernel 3.10.7 (not sure what exact version of vanilla it is based on), > and I use the cluster-next stack in somewhat minimal configuration (corosync-2 > with DLM-4, no pacemaker). > > while testing my cluster (using smbtorture), everything works fine, but the > moment I let users onto it, i get a kernel error that hangs the cluster > (fencing is set up and working, but doesnt kick in for some reason) > I suspect that this has been fixed, but without knowing exactly what version of the kernel this is and what patches have been applied to the kernel, I'm afraid that I'm a bit in the dark. I don't think we've seen anything like this recently relating to type 5 glocks, Steve. 
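The version details being asked for can be collected in one pass; nothing here is specific to this cluster, and the /proc/config.gz line only works if the kernel was built with CONFIG_IKCONFIG_PROC.

  uname -r; cat /proc/version
  modinfo gfs2 | egrep 'filename|vermagic|srcversion'
  modinfo dlm  | egrep 'filename|vermagic|srcversion'
  zcat /proc/config.gz | egrep 'GFS2|DLM'     # only if CONFIG_IKCONFIG_PROC=y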
> this is what I get in kernel log: > > Sep 25 07:10:12 fs2 kernel: [18024.888481] GFS2: fsid=fs_clust:homes.1: quota exceeded for user 104202 > Sep 25 07:10:18 fs2 kernel: [18030.335727] GFS2: fsid=fs_clust:homes.1: quota exceeded for user 104202 > Sep 25 07:10:23 fs2 kernel: [18035.994476] original: gfs2_inode_lookup+0x128/0x240 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.994482] pid: 25317 > Sep 25 07:10:23 fs2 kernel: [18035.994484] lock type: 5 req lock state : 3 > Sep 25 07:10:23 fs2 kernel: [18035.994491] new: gfs2_inode_lookup+0x128/0x240 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.994493] pid: 25317 > Sep 25 07:10:23 fs2 kernel: [18035.994494] lock type: 5 req lock state : 3 > Sep 25 07:10:23 fs2 kernel: [18035.994498] G: s:SH n:5/168b15e f:Iqob t:SH d:EX/0 a:0 v:0 r:4 m:50 > Sep 25 07:10:23 fs2 kernel: [18035.994506] H: s:SH f:EH e:0 p:25317 [smbd] gfs2_inode_lookup+0x128/0x240 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.994549] general protection fault: 0000 [#1] SMP > Sep 25 07:10:23 fs2 kernel: [18035.994840] Modules linked in: iptable_filter ip_tables x_tables gfs2 dm_mod dlm sctp libcrc32c ipv6 configfs virtio_net i6300esb > Sep 25 07:10:23 fs2 kernel: [18035.995617] CPU: 2 PID: 25317 Comm: smbd Not tainted 3.10.7-gentoo #10 > Sep 25 07:10:23 fs2 kernel: [18035.995910] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > Sep 25 07:10:23 fs2 kernel: [18035.996191] task: ffff8800b2aa1b00 ti: ffff8800a4a02000 task.ti: ffff8800a4a02000 > Sep 25 07:10:23 fs2 kernel: [18035.996546] RIP: 0010:[] [] pid_task+0xb/0x40 > Sep 25 07:10:23 fs2 kernel: [18035.996999] RSP: 0018:ffff8800a4a03a10 EFLAGS: 00010206 > Sep 25 07:10:23 fs2 kernel: [18035.997253] RAX: 13270cbeaaf4957b RBX: ffff8800988f7710 RCX: 0000000000000006 > Sep 25 07:10:23 fs2 kernel: [18035.997592] RDX: 0000000000000007 RSI: 0000000000000000 RDI: 13270cbeaaf4957b > Sep 25 07:10:23 fs2 kernel: [18035.997934] RBP: ffff8800a4b43ba0 R08: 000000000000000a R09: 0000000000000000 > Sep 25 07:10:23 fs2 kernel: [18035.998019] R10: 0000000000000191 R11: 0000000000000190 R12: 0000000000000000 > Sep 25 07:10:23 fs2 kernel: [18035.998019] R13: ffff8800a4b43bf0 R14: ffffffffa0133720 R15: ffff8800995bd988 > Sep 25 07:10:23 fs2 kernel: [18035.998019] FS: 00007f1846316740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 > Sep 25 07:10:23 fs2 kernel: [18035.998019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > Sep 25 07:10:23 fs2 kernel: [18035.998019] CR2: 000000000122aae8 CR3: 000000009880c000 CR4: 00000000000007a0 > Sep 25 07:10:23 fs2 kernel: [18035.998019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > Sep 25 07:10:23 fs2 kernel: [18035.998019] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Sep 25 07:10:23 fs2 kernel: [18035.998019] Stack: > Sep 25 07:10:23 fs2 kernel: [18035.998019] ffffffffa0111f07 ffff8800b2aa1e70 ffffffffa011ffd8 0000000000000000 > Sep 25 07:10:23 fs2 kernel: [18035.998019] 0000000000000000 0000000000000000 ffff880000000004 0000000000000032 > Sep 25 07:10:23 fs2 kernel: [18035.998019] ffff8800a4b43ba0 ffff8800a4b43bf0 00000000626f7149 ffff8800995bd988 > Sep 25 07:10:23 fs2 kernel: [18035.998019] Call Trace: > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_dump_glock+0x1c7/0x360 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_inode_lookup+0x128/0x240 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? printk+0x4f/0x54 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? 
inode_init_always+0xed/0x1b0 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? _raw_spin_lock+0x5/0x10 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_glock_nq+0x30b/0x3e0 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_inode_lookup+0x130/0x240 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_dirent_search+0xe5/0x1c0 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_dir_search+0x4a/0x80 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_lookupi+0xf7/0x1f0 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_lookupi+0x1b9/0x1f0 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_lookup+0x21/0xa0 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? _raw_spin_lock+0x5/0x10 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? d_alloc+0x76/0x90 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lookup_dcache+0xa3/0xd0 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lookup_real+0x14/0x50 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? __lookup_hash+0x32/0x50 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lookup_slow+0x3c/0xa2 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? _raw_spin_lock+0x5/0x10 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? path_lookupat+0x23f/0x780 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_getxattr+0x79/0xa0 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? filename_lookup+0x2f/0xc0 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? getname_flags+0xbc/0x1a0 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? user_path_at_empty+0x5c/0xb0 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? gfs2_holder_uninit+0x16/0x30 [gfs2] > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? cp_new_stat+0x10d/0x120 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? vfs_fstatat+0x3f/0x90 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? SYSC_newstat+0x12/0x30 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? lg_local_lock+0x11/0x20 > Sep 25 07:10:23 fs2 kernel: [18035.998019] [] ? system_call_fastpath+0x16/0x1b > Sep 25 07:10:23 fs2 kernel: [18035.998019] Code: 31 f6 48 85 c0 74 0c 8b 50 04 48 c1 e2 05 48 8b 74 10 38 e9 28 ff ff ff 0f 1f 84 00 00 00 00 00 48 85 ff 74 23 89 f6 48 8d 04 f7 <48> 8b 40 08 48 85 c0 74 1c 48 8d 14 76 48 8d 14 d5 30 02 00 00 > Sep 25 07:10:23 fs2 kernel: [18035.998019] RIP [] pid_task+0xb/0x40 > Sep 25 07:10:23 fs2 kernel: [18035.998019] RSP > Sep 25 07:10:23 fs2 kernel: [18036.033702] ---[ end trace e5751bbc7d3a8d7c ]--- > > > simple inspecfion of the gfs2 code showed this is caused by attempting a > recursive lock. two gfs2_inode_lookups are visible in the trace, not sure > that is strictly relevant though. 
> > this is followed by (probaby related) trace: > > > Sep 25 07:10:24 fs2 kernel: [18036.162513] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070 > Sep 25 07:10:24 fs2 kernel: [18036.164016] IP: [] gfs2_permission+0x56/0x110 [gfs2] > Sep 25 07:10:24 fs2 kernel: [18036.164016] PGD 989a3067 PUD 9886a067 PMD 0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] Oops: 0000 [#2] SMP > Sep 25 07:10:24 fs2 kernel: [18036.164016] Modules linked in: iptable_filter ip_tables x_tables gfs2 dm_mod dlm sctp libcrc32c ipv6 configfs virtio_net i6300esb > Sep 25 07:10:24 fs2 kernel: [18036.164016] CPU: 1 PID: 25453 Comm: smbd Tainted: G D 3.10.7-gentoo #10 > Sep 25 07:10:24 fs2 kernel: [18036.164016] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > Sep 25 07:10:24 fs2 kernel: [18036.164016] task: ffff8800afca0d80 ti: ffff8800a4a02000 task.ti: ffff8800a4a02000 > Sep 25 07:10:24 fs2 kernel: [18036.164016] RIP: 0010:[] [] gfs2_permission+0x56/0x110 [gfs2] > Sep 25 07:10:24 fs2 kernel: [18036.164016] RSP: 0018:ffff8800a4a03c08 EFLAGS: 00010286 > Sep 25 07:10:24 fs2 kernel: [18036.164016] RAX: ffffffff8145f245 RBX: 0000000000000040 RCX: 0000000000000000 > Sep 25 07:10:24 fs2 kernel: [18036.164016] RDX: ffff8800b5668f00 RSI: 0000000000000001 RDI: ffff8800a4b97ddc > Sep 25 07:10:24 fs2 kernel: [18036.164016] RBP: ffff880099486e60 R08: 0000000000000061 R09: 0000000000000000 > Sep 25 07:10:24 fs2 kernel: [18036.164016] R10: ff48ad3954b34002 R11: d09e94939e979e85 R12: ffff8800a4b97ddc > Sep 25 07:10:24 fs2 kernel: [18036.164016] R13: 0000000000000001 R14: ffff8800a4b97df8 R15: ffff8800afca0d80 > Sep 25 07:10:24 fs2 kernel: [18036.164016] FS: 00007f1846316740(0000) GS:ffff8800bfa80000(0000) knlGS:0000000000000000 > Sep 25 07:10:24 fs2 kernel: [18036.164016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > Sep 25 07:10:24 fs2 kernel: [18036.164016] CR2: 0000000000000070 CR3: 000000009880c000 CR4: 00000000000007a0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > Sep 25 07:10:24 fs2 kernel: [18036.164016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Sep 25 07:10:24 fs2 kernel: [18036.164016] Stack: > Sep 25 07:10:24 fs2 kernel: [18036.164016] ffff8800994e0c00 ffffffff81125a8b ffff8800a4a03c18 ffff8800a4a03c18 > Sep 25 07:10:24 fs2 kernel: [18036.164016] 0000000000000000 ffff8800bbba8d20 0000000800000003 0000000200000000 > Sep 25 07:10:24 fs2 kernel: [18036.164016] ffffffff8145f245 ffffffff8112ff5e ffff8800a4a03e08 0000000000000007 > Sep 25 07:10:24 fs2 kernel: [18036.164016] Call Trace: > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? lookup_fast+0x1ab/0x2f0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? _raw_spin_lock+0x5/0x10 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? dput+0x17e/0x220 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? link_path_walk+0x23a/0x8b0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? path_init+0x30c/0x410 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? path_lookupat+0x52/0x780 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? filename_lookup+0x2f/0xc0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? getname_flags+0xbc/0x1a0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? user_path_at_empty+0x5c/0xb0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? vfs_fstatat+0x3f/0x90 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? SYSC_newstat+0x12/0x30 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? SyS_read+0x50/0xa0 > Sep 25 07:10:24 fs2 kernel: [18036.164016] [] ? 
system_call_fastpath+0x16/0x1b > Sep 25 07:10:24 fs2 kernel: [18036.164016] Code: c6 50 65 48 8b 04 25 80 b7 00 00 48 8b 90 40 02 00 00 4c 39 f3 75 14 eb 1a 0f 1f 40 00 48 3b 53 18 74 12 48 8b 1b 49 39 de 74 08 <48> 8b 43 30 a8 40 75 ea 31 db 4c 89 e7 e8 e8 78 f0 e0 66 90 45 > Sep 25 07:10:24 fs2 kernel: [18036.164016] RIP [] gfs2_permission+0x56/0x110 [gfs2] > Sep 25 07:10:24 fs2 kernel: [18036.164016] RSP > Sep 25 07:10:24 fs2 kernel: [18036.164016] CR2: 0000000000000070 > Sep 25 07:10:24 fs2 kernel: [18036.218133] ---[ end trace e5751bbc7d3a8d7d ]--- > > afterwards the log is filled with "INFO: rcu_sched self-detected stall" and > NMI-caused backtraces > > Is this a known-and-fixed bug? is there a way to prevent this? > > > thanks > Pavel Herrmann > From morpheus.ibis at gmail.com Wed Sep 25 15:29:08 2013 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Wed, 25 Sep 2013 17:29:08 +0200 Subject: [Linux-cluster] bug in GFS2? In-Reply-To: <1380119934.2656.10.camel@menhir> References: <2199931.kG8UE7uL2q@gesher> <1380119934.2656.10.camel@menhir> Message-ID: <2222256.xlCigChRg0@bloomfield> Hi On Wednesday 25 of September 2013 15:38:54 Steven Whitehouse wrote: > Hi, > > On Wed, 2013-09-25 at 16:25 +0200, Pavel Herrmann wrote: > > Hi > > > > I am trying to build a two-node cluster for samba, but I'm having some > > GFS2 > > issues. > > > > The nodes themselves run as virtual machines in KVM (on different hosts), > > use gentoo kernel 3.10.7 (not sure what exact version of vanilla it is > > based on), and I use the cluster-next stack in somewhat minimal > > configuration (corosync-2 with DLM-4, no pacemaker). > > > > while testing my cluster (using smbtorture), everything works fine, but > > the > > moment I let users onto it, i get a kernel error that hangs the cluster > > (fencing is set up and working, but doesnt kick in for some reason) > > I suspect that this has been fixed, but without knowing exactly what > version of the kernel this is and what patches have been applied to the > kernel, I'm afraid that I'm a bit in the dark. I don't think we've seen > anything like this recently relating to type 5 glocks, The kernel seems to be based on vanilla 3.10.7, with no additional patches that are related to DLM or GFS2 (full list on [1]). I could try with a newer kernel version, but since I need my users to reproduce the bug (and they are not too happy when things break), I would prefer not to do it just for the sake of having the latest version, if there were no possibly-related changes introduced thanks Pavel Herrmann [1] http://dev.gentoo.org/~mpagano/genpatches/patches-3.10-13.htm From bubble at hoster-ok.com Thu Sep 26 05:28:50 2013 From: bubble at hoster-ok.com (Vladislav Bogdanov) Date: Thu, 26 Sep 2013 08:28:50 +0300 Subject: [Linux-cluster] bug in GFS2? In-Reply-To: <2199931.kG8UE7uL2q@gesher> References: <2199931.kG8UE7uL2q@gesher> Message-ID: <5243C612.4060400@hoster-ok.com> 25.09.2013 17:25, Pavel Herrmann wrote: > Hi > > I am trying to build a two-node cluster for samba, but I'm having some GFS2 > issues. > > The nodes themselves run as virtual machines in KVM (on different hosts), use > gentoo kernel 3.10.7 (not sure what exact version of vanilla it is based on), > and I use the cluster-next stack in somewhat minimal configuration (corosync-2 > with DLM-4, no pacemaker). > Just a note. dlm-4 (and thus gfs) requires stonith-ng subsystem of pacemaker to be running, otherwise it is unable to query/perform fencing and many funny things may happen. 
I believe something was done in the pacemaker code to allow stonithd (daemon implementing stonith-ng) to be run independently of the rest of pacemaker. From morpheus.ibis at gmail.com Thu Sep 26 07:59:56 2013 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Thu, 26 Sep 2013 09:59:56 +0200 Subject: [Linux-cluster] bug in GFS2? In-Reply-To: <5243C612.4060400@hoster-ok.com> References: <2199931.kG8UE7uL2q@gesher> <5243C612.4060400@hoster-ok.com> Message-ID: <1673863.2lEfWF2HIg@bloomfield> Hi On Thursday 26 of September 2013 08:28:50 Vladislav Bogdanov wrote: > 25.09.2013 17:25, Pavel Herrmann wrote: > > Hi > > > > I am trying to build a two-node cluster for samba, but I'm having some > > GFS2 > > issues. > > > > The nodes themselves run as virtual machines in KVM (on different hosts), > > use gentoo kernel 3.10.7 (not sure what exact version of vanilla it is > > based on), and I use the cluster-next stack in somewhat minimal > > configuration (corosync-2 with DLM-4, no pacemaker). > > Just a note. > dlm-4 (and thus gfs) requires stonith-ng subsystem of pacemaker to be > running, otherwise it is unable to query/perform fencing and many funny > things may happen. I believe something was done in the pacemaker code to > allow stonithd (daemon implementing stonith-ng) to be run independently > of the rest of pacemaker. >From looking at a bit of fencing code, there was no (obvious) dependency on pacemaker. I do have fencing set up, using a custom script that connects to the other nodes qemu console and forcibly reboots it. In testing (that is stopping one of the nodes from said console) it worked perfectly, but in the case of this lockup it was not invoked (logging the date is of course part of the script). regards Pavel Herrmann From bubble at hoster-ok.com Thu Sep 26 08:20:11 2013 From: bubble at hoster-ok.com (Vladislav Bogdanov) Date: Thu, 26 Sep 2013 11:20:11 +0300 Subject: [Linux-cluster] bug in GFS2? In-Reply-To: <1673863.2lEfWF2HIg@bloomfield> References: <2199931.kG8UE7uL2q@gesher> <5243C612.4060400@hoster-ok.com> <1673863.2lEfWF2HIg@bloomfield> Message-ID: <5243EE3B.6090500@hoster-ok.com> 26.09.2013 10:59, Pavel Herrmann wrote: > Hi > > On Thursday 26 of September 2013 08:28:50 Vladislav Bogdanov wrote: >> 25.09.2013 17:25, Pavel Herrmann wrote: >>> Hi >>> >>> I am trying to build a two-node cluster for samba, but I'm having some >>> GFS2 >>> issues. >>> >>> The nodes themselves run as virtual machines in KVM (on different hosts), >>> use gentoo kernel 3.10.7 (not sure what exact version of vanilla it is >>> based on), and I use the cluster-next stack in somewhat minimal >>> configuration (corosync-2 with DLM-4, no pacemaker). >> >> Just a note. >> dlm-4 (and thus gfs) requires stonith-ng subsystem of pacemaker to be >> running, otherwise it is unable to query/perform fencing and many funny >> things may happen. I believe something was done in the pacemaker code to >> allow stonithd (daemon implementing stonith-ng) to be run independently >> of the rest of pacemaker. > > From looking at a bit of fencing code, there was no (obvious) dependency on > pacemaker. I do have fencing set up, using a custom script that connects to > the other nodes qemu console and forcibly reboots it. In testing (that is > stopping one of the nodes from said console) it worked perfectly, but in the > case of this lockup it was not invoked (logging the date is of course part of > the script). Ok, got it. 
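A custom agent of the kind described above is normally wired into dlm_controld through /etc/dlm/dlm.conf. The sketch below follows the device/connect format from dlm.conf(5), but the agent path, its arguments and the node ids are placeholders rather than the poster's real setup, and the dlm_tool calls are just quick sanity checks.

  # /etc/dlm/dlm.conf
  device  qemu  /usr/local/sbin/fence-qemu-console
  connect qemu  node=1 host=kvm-host-a domain=fs1
  connect qemu  node=2 host=kvm-host-b domain=fs2

  # while both nodes are healthy:
  dlm_tool status                  # membership and fencing state
  dlm_tool ls                      # lockspaces; the gfs2 mount should be listed
  dlm_tool dump | grep -i fence    # what dlm_controld last did about fencing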
I actually meant default dlm setup, it runs /usr/sbin/dlm_stonith which uses stonith_api_*_helper() functions defined inline in pacemaker's stonith-ng API, which in turn dlopen libstonithd.so.2 and use symbols from there. From bfields at fieldses.org Thu Sep 26 16:09:42 2013 From: bfields at fieldses.org (J. Bruce Fields) Date: Thu, 26 Sep 2013 12:09:42 -0400 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <52415B63.8060707@ac-versailles.fr> References: <52415B63.8060707@ac-versailles.fr> Message-ID: <20130926160942.GF704@fieldses.org> On Tue, Sep 24, 2013 at 11:29:07AM +0200, Olivier Desport wrote: > Hello, > > I've installed a two nodes GFS2 cluster on Debian 7. What kernel is that? --b. > The nodes are > connected to the datas by iSCSI and multipathing with a 10 Gb/s > link. I can write a 1g file with dd at 500 Mbytes/s. I export with > NFS (on a 10 Gb/s network) and I only can reach 220 Mbytes/s. I > think that it's a little bit far from 500 Mbytes/s... > > Do you how to tune my settings to increase the speed for NFS ? > > GFS2 mount : > /dev/vg-bigfiles/lv-bigfiles /export/bigfiles gfs2 > _netdev,nodiratime,noatime 0 0 > > NFS export : > /export/bigfiles > 172.16.0.0/16(fsid=2,rw,async,no_root_squash,no_subtree_check) > > mount on NFS clients : > nfs-server:/export/bigfiles /data/bigfiles nfs4 > _netdev,rw,user,nodiratime,noatime,intr 0 0 > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From rgoldwyn at gmail.com Thu Sep 26 17:24:49 2013 From: rgoldwyn at gmail.com (Goldwyn Rodrigues) Date: Thu, 26 Sep 2013 12:24:49 -0500 Subject: [Linux-cluster] [PATCH] Fix hang of gfs2 mount if older dlm_controld is used Message-ID: <20130926172449.GA7668@shrek.lan> This folds ops_results and error into one. This enables the error code to trickle all the way to the calling function and the gfs2 mount fails if older dlm_controld is used. 
Signed-off-by: Goldwyn Rodrigues --- diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 88556dc..8c8327a 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -409,7 +409,7 @@ static void threads_stop(void) static int new_lockspace(const char *name, const char *cluster, uint32_t flags, int lvblen, const struct dlm_lockspace_ops *ops, void *ops_arg, - int *ops_result, dlm_lockspace_t **lockspace) + dlm_lockspace_t **lockspace) { struct dlm_ls *ls; int i, size, error; @@ -431,11 +431,9 @@ static int new_lockspace(const char *name, const char *cluster, goto out; } - if (ops && ops_result) { - if (!dlm_config.ci_recover_callbacks) - *ops_result = -EOPNOTSUPP; - else - *ops_result = 0; + if (ops && (!dlm_config.ci_recover_callbacks)) { + error = -EOPNOTSUPP; + goto out; } if (dlm_config.ci_recover_callbacks && cluster && @@ -679,7 +677,7 @@ static int new_lockspace(const char *name, const char *cluster, int dlm_new_lockspace(const char *name, const char *cluster, uint32_t flags, int lvblen, const struct dlm_lockspace_ops *ops, void *ops_arg, - int *ops_result, dlm_lockspace_t **lockspace) + dlm_lockspace_t **lockspace) { int error = 0; @@ -690,7 +688,7 @@ int dlm_new_lockspace(const char *name, const char *cluster, goto out; error = new_lockspace(name, cluster, flags, lvblen, ops, ops_arg, - ops_result, lockspace); + lockspace); if (!error) ls_count++; if (error > 0) diff --git a/fs/dlm/user.c b/fs/dlm/user.c index 8121491..a29dd09 100644 --- a/fs/dlm/user.c +++ b/fs/dlm/user.c @@ -393,7 +393,7 @@ static int device_create_lockspace(struct dlm_lspace_params *params) return -EPERM; error = dlm_new_lockspace(params->name, NULL, params->flags, - DLM_USER_LVB_LEN, NULL, NULL, NULL, + DLM_USER_LVB_LEN, NULL, NULL, &lockspace); if (error) return error; diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c index c8423d6..2043544 100644 --- a/fs/gfs2/lock_dlm.c +++ b/fs/gfs2/lock_dlm.c @@ -1190,7 +1190,7 @@ static int gdlm_mount(struct gfs2_sbd *sdp, const char *table) char cluster[GFS2_LOCKNAME_LEN]; const char *fsname; uint32_t flags; - int error, ops_result; + int error; /* * initialize everything @@ -1232,24 +1232,13 @@ static int gdlm_mount(struct gfs2_sbd *sdp, const char *table) */ error = dlm_new_lockspace(fsname, cluster, flags, GDLM_LVB_SIZE, - &gdlm_lockspace_ops, sdp, &ops_result, + &gdlm_lockspace_ops, sdp, &ls->ls_dlm); if (error) { fs_err(sdp, "dlm_new_lockspace error %d\n", error); goto fail_free; } - if (ops_result < 0) { - /* - * dlm does not support ops callbacks, - * old dlm_controld/gfs_controld are used, try without ops. 
- */ - fs_info(sdp, "dlm lockspace ops not used\n"); - free_recover_size(ls); - set_bit(DFL_NO_DLM_OPS, &ls->ls_recover_flags); - return 0; - } - if (!test_bit(SDF_NOJOURNALID, &sdp->sd_flags)) { fs_err(sdp, "dlm lockspace ops disallow jid preset\n"); error = -EINVAL; diff --git a/fs/ocfs2/stack_user.c b/fs/ocfs2/stack_user.c index 286edf1..6546a6b 100644 --- a/fs/ocfs2/stack_user.c +++ b/fs/ocfs2/stack_user.c @@ -828,7 +828,7 @@ static int user_cluster_connect(struct ocfs2_cluster_connection *conn) } rc = dlm_new_lockspace(conn->cc_name, NULL, DLM_LSFL_FS, DLM_LVB_LEN, - NULL, NULL, NULL, &fsdlm); + NULL, NULL, &fsdlm); if (rc) { ocfs2_live_connection_drop(control); goto out; diff --git a/include/linux/dlm.h b/include/linux/dlm.h index d02da2c..9522b25 100644 --- a/include/linux/dlm.h +++ b/include/linux/dlm.h @@ -85,7 +85,7 @@ struct dlm_lockspace_ops { int dlm_new_lockspace(const char *name, const char *cluster, uint32_t flags, int lvblen, const struct dlm_lockspace_ops *ops, void *ops_arg, - int *ops_result, dlm_lockspace_t **lockspace); + dlm_lockspace_t **lockspace); /* * dlm_release_lockspace From teigland at redhat.com Thu Sep 26 18:05:28 2013 From: teigland at redhat.com (David Teigland) Date: Thu, 26 Sep 2013 14:05:28 -0400 Subject: [Linux-cluster] [PATCH] Fix hang of gfs2 mount if older dlm_controld is used In-Reply-To: <20130926172449.GA7668@shrek.lan> References: <20130926172449.GA7668@shrek.lan> Message-ID: <20130926180528.GA16351@redhat.com> On Thu, Sep 26, 2013 at 12:24:49PM -0500, Goldwyn Rodrigues wrote: > This folds ops_results and error into one. This enables the > error code to trickle all the way to the calling function and the gfs2 > mount fails if older dlm_controld is used. If it's not working then I'd prefer to fix it rather than abandoning the idea of making it work. Dave From olivier.desport at ac-versailles.fr Fri Sep 27 06:30:35 2013 From: olivier.desport at ac-versailles.fr (Olivier Desport) Date: Fri, 27 Sep 2013 08:30:35 +0200 Subject: [Linux-cluster] slow NFS performance on GFS2 In-Reply-To: <20130926160942.GF704@fieldses.org> References: <52415B63.8060707@ac-versailles.fr> <20130926160942.GF704@fieldses.org> Message-ID: <5245260B.2030201@ac-versailles.fr> Le 26/09/2013 18:09, J. Bruce Fields a ?crit : > On Tue, Sep 24, 2013 at 11:29:07AM +0200, Olivier Desport wrote: >> Hello, >> >> I've installed a two nodes GFS2 cluster on Debian 7. > What kernel is that? 3.2.0-4-amd64 > > --b. > >> The nodes are >> connected to the datas by iSCSI and multipathing with a 10 Gb/s >> link. I can write a 1g file with dd at 500 Mbytes/s. I export with >> NFS (on a 10 Gb/s network) and I only can reach 220 Mbytes/s. I >> think that it's a little bit far from 500 Mbytes/s... >> >> Do you how to tune my settings to increase the speed for NFS ? >> >> GFS2 mount : >> /dev/vg-bigfiles/lv-bigfiles /export/bigfiles gfs2 >> _netdev,nodiratime,noatime 0 0 >> >> NFS export : >> /export/bigfiles >> 172.16.0.0/16(fsid=2,rw,async,no_root_squash,no_subtree_check) >> >> mount on NFS clients : >> nfs-server:/export/bigfiles /data/bigfiles nfs4 >> _netdev,rw,user,nodiratime,noatime,intr 0 0 >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From swhiteho at redhat.com Fri Sep 27 08:38:20 2013 From: swhiteho at redhat.com (Steven Whitehouse) Date: Fri, 27 Sep 2013 09:38:20 +0100 Subject: [Linux-cluster] [PATCH] Fix hang of gfs2 mount if older dlm_controld is used In-Reply-To: <20130926180528.GA16351@redhat.com> References: <20130926172449.GA7668@shrek.lan> <20130926180528.GA16351@redhat.com> Message-ID: <1380271100.3407.3.camel@menhir> Hi, On Thu, 2013-09-26 at 14:05 -0400, David Teigland wrote: > On Thu, Sep 26, 2013 at 12:24:49PM -0500, Goldwyn Rodrigues wrote: > > This folds ops_results and error into one. This enables the > > error code to trickle all the way to the calling function and the gfs2 > > mount fails if older dlm_controld is used. > > If it's not working then I'd prefer to fix it rather than abandoning the > idea of making it work. > Dave > I agree. I use the older dlm_controld regularly and I've not seen any obvious issues. What is the problem in this case I wonder? Steve. From james.hofmeister at hp.com Fri Sep 27 20:30:52 2013 From: james.hofmeister at hp.com (Hofmeister, James (HP ESSN BCS Linux ERT)) Date: Fri, 27 Sep 2013 20:30:52 +0000 Subject: [Linux-cluster] gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster Message-ID: <5CBE4DF16DF0DE4A99CCC64ACC08A8791435CE34@G6W2500.americas.hpqcorp.net> I am not looking for a deep analysis of this problem, just a search for known issues... I have not found a duplicate in my Google and bugzilla searches. 1) RHEL version Red Hat Enterprise Linux Server release 5.7 (Tikanga) 2) gfs* packages version gfs2-utils-0.1.62-31.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT gfs-utils-0.1.20-10.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT kmod-gfs-0.1.34-15.el5.x86_64 Fri 20 Jan 2012 11:26:53 AM COT 3) kernel version Linux xxxxxx 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux 4) You can also attach cluster.conf for us. I will send in the cluster.conf when I open the support call. They are taking an error out of gfs not seen/reported at other sites: Call Trace: [] :gfs:gfs_assert_i+0x67/0x92 [] :gfs:unlinked_scan_elements+0x99/0x180 [] :gfs:gfs_dreread+0x87/0xc6 [] :gfs:foreach_descriptor+0x229/0x305 [] :gfs:fill_super+0x0/0x642 [] :gfs:gfs_recover_dump+0xdd/0x14e [] :gfs:gfs_make_fs_rw+0xc0/0x11a [] :gfs:init_journal+0x279/0x34c [] :gfs:fill_super+0x48e/0x642 [] get_sb_bdev+0x10a/0x16c [] vfs_kern_mount+0x93/0x11a [] do_kern_mount+0x36/0x4d [] do_mount+0x6a9/0x719 [] enqueue_task+0x41/0x56 [] do_sock_read+0xcf/0x110 [] sock_aio_read+0x4f/0x5e [] do_sync_read+0xc7/0x104 [] zone_statistics+0x3e/0x6d [] __alloc_pages+0x78/0x308 [] sys_mount+0x8a/0xcd Sep 18 04:09:51 hpium2 syslogd 1.4.1: restart. Sep 18 04:09:51 hpium2 kernel: klogd 1.4.1, log source = /proc/kmsg started. Regards, James Hofmeister Hewlett Packard Linux Engineering Resolution Team -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rgoldwyn at gmail.com Fri Sep 27 21:26:36 2013 From: rgoldwyn at gmail.com (Goldwyn Rodrigues) Date: Fri, 27 Sep 2013 16:26:36 -0500 Subject: [Linux-cluster] [PATCH] Fix hang of gfs2 mount if older dlm_controld is used In-Reply-To: <1380271100.3407.3.camel@menhir> References: <20130926172449.GA7668@shrek.lan> <20130926180528.GA16351@redhat.com> <1380271100.3407.3.camel@menhir> Message-ID: On Fri, Sep 27, 2013 at 3:38 AM, Steven Whitehouse wrote: > Hi, > > On Thu, 2013-09-26 at 14:05 -0400, David Teigland wrote: >> On Thu, Sep 26, 2013 at 12:24:49PM -0500, Goldwyn Rodrigues wrote: >> > This folds ops_results and error into one. This enables the >> > error code to trickle all the way to the calling function and the gfs2 >> > mount fails if older dlm_controld is used. >> >> If it's not working then I'd prefer to fix it rather than abandoning the >> idea of making it work. >> Dave >> > > I agree. I use the older dlm_controld regularly and I've not seen any > obvious issues. What is the problem in this case I wonder? > Sorry, I misread the code. The dlm_new_lockspace() still continues to create a new lockspace even if the operations are not provided. So, the fallback option has to be after that rather than calling the function again. -- Goldwyn From adam.scheblein at marquette.edu Sat Sep 28 00:13:51 2013 From: adam.scheblein at marquette.edu (Scheblein, Adam) Date: Sat, 28 Sep 2013 00:13:51 +0000 Subject: [Linux-cluster] gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster In-Reply-To: <5CBE4DF16DF0DE4A99CCC64ACC08A8791435CE34@G6W2500.americas.hpqcorp.net> References: <5CBE4DF16DF0DE4A99CCC64ACC08A8791435CE34@G6W2500.americas.hpqcorp.net> Message-ID: In the vmcore-dmesg from kdump do you have a line that starts with RIP? ex: <1>RIP [] gfs2_inplace_reserve+0xca/0x7e0 [gfs2] thanks, Adam On Sep 27, 2013, at 3:30 PM, Hofmeister, James (HP ESSN BCS Linux ERT) > wrote: I am not looking for a deep analysis of this problem, just a search for known issues? I have not found a duplicate in my Google and bugzilla searches. 1) RHEL version Red Hat Enterprise Linux Server release 5.7 (Tikanga) 2) gfs* packages version gfs2-utils-0.1.62-31.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT gfs-utils-0.1.20-10.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT kmod-gfs-0.1.34-15.el5.x86_64 Fri 20 Jan 2012 11:26:53 AM COT 3) kernel version Linux xxxxxx 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux 4) You can also attach cluster.conf for us. I will send in the cluster.conf when I open the support call. They are taking an error out of gfs not seen/reported at other sites: Call Trace: [] :gfs:gfs_assert_i+0x67/0x92 [] :gfs:unlinked_scan_elements+0x99/0x180 [] :gfs:gfs_dreread+0x87/0xc6 [] :gfs:foreach_descriptor+0x229/0x305 [] :gfs:fill_super+0x0/0x642 [] :gfs:gfs_recover_dump+0xdd/0x14e [] :gfs:gfs_make_fs_rw+0xc0/0x11a [] :gfs:init_journal+0x279/0x34c [] :gfs:fill_super+0x48e/0x642 [] get_sb_bdev+0x10a/0x16c [] vfs_kern_mount+0x93/0x11a [] do_kern_mount+0x36/0x4d [] do_mount+0x6a9/0x719 [] enqueue_task+0x41/0x56 [] do_sock_read+0xcf/0x110 [] sock_aio_read+0x4f/0x5e [] do_sync_read+0xc7/0x104 [] zone_statistics+0x3e/0x6d [] __alloc_pages+0x78/0x308 [] sys_mount+0x8a/0xcd Sep 18 04:09:51 hpium2 syslogd 1.4.1: restart. Sep 18 04:09:51 hpium2 kernel: klogd 1.4.1, log source = /proc/kmsg started. 
Regards, James Hofmeister Hewlett Packard Linux Engineering Resolution Team -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From swhiteho at redhat.com Mon Sep 30 09:47:40 2013 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 30 Sep 2013 10:47:40 +0100 Subject: [Linux-cluster] gfs:gfs_assert_i+0x67/0x92 seen when node joining cluster In-Reply-To: <5CBE4DF16DF0DE4A99CCC64ACC08A8791435CE34@G6W2500.americas.hpqcorp.net> References: <5CBE4DF16DF0DE4A99CCC64ACC08A8791435CE34@G6W2500.americas.hpqcorp.net> Message-ID: <1380534460.2738.3.camel@menhir> Hi, On Fri, 2013-09-27 at 20:30 +0000, Hofmeister, James (HP ESSN BCS Linux ERT) wrote: > I am not looking for a deep analysis of this problem, just a search > for known issues? I have not found a duplicate in my Google and > bugzilla searches. > The trace looks to me as if the unlinked inodes (hidden file) has become corrupt on disk for some reason and this has triggered an assert during mount. Does fsck.gfs not fix this? It isn't something that I recall seeing before, and even with a detailed analysis of the on disk filesystem it may not be possible to give an exact explanation of what has gone wrong, depending on what state the fs is currently in. I would certainly double check any fencing configuration in this case to make sure that it is set up correctly in case that is an issue, Steve. > > > 1) RHEL version > > Red Hat Enterprise Linux Server release 5.7 (Tikanga) > > > > 2) gfs* packages version > > gfs2-utils-0.1.62-31.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT > > gfs-utils-0.1.20-10.el5.x86_64 Fri 20 Jan 2012 11:25:40 AM COT > > kmod-gfs-0.1.34-15.el5.x86_64 Fri 20 Jan 2012 11:26:53 AM COT > > > 3) kernel version > Linux xxxxxx 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 > x86_64 > > x86_64 GNU/Linux > > > > 4) You can also attach cluster.conf for us. > > I will send in the cluster.conf when I open the support call. > > > > They are taking an error out of gfs not seen/reported at other sites: > > Call Trace: > > [] :gfs:gfs_assert_i+0x67/0x92 > > [] :gfs:unlinked_scan_elements+0x99/0x180 > > [] :gfs:gfs_dreread+0x87/0xc6 > > [] :gfs:foreach_descriptor+0x229/0x305 > > [] :gfs:fill_super+0x0/0x642 > > [] :gfs:gfs_recover_dump+0xdd/0x14e > > [] :gfs:gfs_make_fs_rw+0xc0/0x11a > > [] :gfs:init_journal+0x279/0x34c > > [] :gfs:fill_super+0x48e/0x642 > > [] get_sb_bdev+0x10a/0x16c > > [] vfs_kern_mount+0x93/0x11a > > [] do_kern_mount+0x36/0x4d > > [] do_mount+0x6a9/0x719 > > [] enqueue_task+0x41/0x56 > > [] do_sock_read+0xcf/0x110 > > [] sock_aio_read+0x4f/0x5e > > [] do_sync_read+0xc7/0x104 > > [] zone_statistics+0x3e/0x6d > > [] __alloc_pages+0x78/0x308 > > [] sys_mount+0x8a/0xcd > > > > Sep 18 04:09:51 hpium2 syslogd 1.4.1: restart. > > Sep 18 04:09:51 hpium2 kernel: klogd 1.4.1, log source = /proc/kmsg > started. > > > > Regards, > > James Hofmeister Hewlett Packard Linux Engineering Resolution > Team > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster
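For completeness, the check Steve refers to has to be run with the filesystem unmounted on every node. On a RHEL 5 system the tool ships in the gfs-utils package listed earlier in the thread; the device path below is only a placeholder.

  # unmount on all cluster nodes first (and stop any services using the filesystem)
  umount /mnt/gfs

  # read-only pass: report problems without changing anything
  gfs_fsck -n /dev/myvg/gfslv

  # repair pass once the report has been reviewed
  gfs_fsck -y /dev/myvg/gfslv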