From adrian.gibanel at btactic.com  Sun Jun  2 21:12:54 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Sun, 2 Jun 2013 23:12:54 +0200 (CEST)
Subject: [Linux-cluster] fence_ovh - Fence agent for OVH
In-Reply-To: <2056213.43.1370206273474.JavaMail.adrian@adrianworktop>
Message-ID: <3826574.639.1370207568347.JavaMail.adrian@adrianworktop>

As requested by digimer in the linux-ha IRC channel, here is fence_ovh. Getting it included by default in the official distribution of the cluster software is not a priority, but if you guide me on how to polish it I think I can improve it considerably and test it on real machines (as long as my machines are still test machines and not production ones).

1) What is fence_ovh 

fence_ovh is a Python-based fence agent for the large French datacentre provider OVH. You can find information about OVH at http://www.ovh.co.uk/ . To be clear, I am not part of the official OVH staff.

2) Features 
The script has two main functions: 

* Reboot into rescue mode (action=off) 
* Reboot into the hard disk (action=on;action=reboot) 
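
The mapping between fence actions and boot targets listed above can be sketched as a small lookup table. This is only an illustration: the names BOOT_TARGET and boot_target_for, and the target labels "rescue" and "harddisk", are assumptions made for the example and are not taken from the attached script.

```python
# Illustrative sketch only: names and target labels are hypothetical.
BOOT_TARGET = {
    "off": "rescue",       # "off" is emulated by rebooting into rescue mode
    "on": "harddisk",      # "on" reboots back into the installed system
    "reboot": "harddisk",  # "reboot" is also a reboot into the hard disk
}


def boot_target_for(action):
    """Return the boot target for a fence action, or raise on unknown input."""
    try:
        return BOOT_TARGET[action]
    except KeyError:
        raise ValueError("unsupported fence action: %r" % (action,))


if __name__ == "__main__":
    for action in ("off", "on", "reboot"):
        print(action, "->", boot_target_for(action))
```

Rejecting unknown actions explicitly keeps a typo in cluster.conf from silently turning into a reboot.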

3) Technical details 
So, as you might deduce, the classical fence mechanism of turning off the other node is not implemented by actually powering the machine off but by rebooting it into rescue mode.

Another detail worth mentioning is that the script checks whether the machine has rebooted correctly into rescue mode by means of an OVH API call that reports the date when the server rebooted. The OVH API is also used for the main function, which consists of rebooting the machine into rescue mode.
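
The verification step described above (wait until the reported reboot date is newer than the moment fencing was requested) could look roughly like the sketch below. It deliberately does not call the OVH API: get_last_reboot is a hypothetical callable standing in for whatever the API reports, and the timeout and interval defaults are arbitrary assumptions.

```python
import time
from datetime import datetime, timedelta


def wait_for_reboot(get_last_reboot, requested_at, timeout=300.0, interval=10.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the reported reboot date is later than requested_at.

    get_last_reboot is a hypothetical callable returning a datetime; a real
    agent would wrap the provider API here.
    Returns True on success, False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        if get_last_reboot() > requested_at:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated provider answers: two stale dates, then a fresh reboot date.
    requested = datetime(2013, 6, 2, 21, 0, 0)
    answers = iter([requested - timedelta(hours=1),
                    requested - timedelta(hours=1),
                    requested + timedelta(minutes=2)])
    print(wait_for_reboot(lambda: next(answers), requested, interval=0))
```

Returning False on timeout lets the caller report a failed fence instead of assuming the node is down.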

4) How to use it 

4.1) Make sure python-soappy package is installed (Debian/Ubuntu).
4.2) Save fence_ovh in /usr/sbin 
4.3) Run: ccs_update_schema so that new metadata is put into cluster.rng 
4.4) If needed validate your configuration: 
ccs_config_validate -v -f /etc/pve/cluster.conf.new 
4.5) Here's an example of how to use it in cluster.conf:

<?xml version="1.0"?>
<cluster name="ha-008-010" config_version="3">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" two_node="1" expected_votes="1">
</cman>

<fencedevices>
        <fencedevice agent="fence_ovh" name="fence008" email="admin at domain.com" ipaddr="ns123456" login="ab12345-ovh" passwd="MYSECRET" />
        <fencedevice agent="fence_ovh" name="fence010" email="admin at domain.com" ipaddr="ns789012" login="ab12345-ovh" passwd="MYSECRET" />
</fencedevices>

<clusternodes>
<clusternode name="server008" nodeid="1" votes="1">
  <fence>
    <method name="1">
      <device name="fence008" action="off"/>
    </method>
  </fence>
</clusternode>
<clusternode name="server010" nodeid="2" votes="1">
  <fence>
    <method name="1">
      <device name="fence010" action="off"/>
    </method>
  </fence>
</clusternode>
</clusternodes>


</cluster>
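
To make the relationship between the <clusternode> entries and the <fencedevices> section easier to see, here is a small sketch that resolves each node to the fence device it references. It uses a trimmed copy of the configuration above (email, login and passwd attributes left out), and the function name fence_plan is made up for the example; it is not part of any cluster tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example configuration; credentials intentionally omitted.
CLUSTER_CONF = """<cluster name="ha-008-010" config_version="3">
  <fencedevices>
    <fencedevice agent="fence_ovh" name="fence008" ipaddr="ns123456"/>
    <fencedevice agent="fence_ovh" name="fence010" ipaddr="ns789012"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="server008" nodeid="1" votes="1">
      <fence><method name="1"><device name="fence008" action="off"/></method></fence>
    </clusternode>
    <clusternode name="server010" nodeid="2" votes="1">
      <fence><method name="1"><device name="fence010" action="off"/></method></fence>
    </clusternode>
  </clusternodes>
</cluster>"""


def fence_plan(conf_text):
    """Map each node name to a list of (agent, ipaddr, action) tuples."""
    root = ET.fromstring(conf_text)
    devices = {d.get("name"): d.attrib for d in root.iter("fencedevice")}
    plan = {}
    for node in root.iter("clusternode"):
        steps = []
        for dev in node.iter("device"):
            fd = devices[dev.get("name")]  # KeyError means a dangling reference
            steps.append((fd["agent"], fd.get("ipaddr"), dev.get("action")))
        plan[node.get("name")] = steps
    return plan


if __name__ == "__main__":
    for node, steps in fence_plan(CLUSTER_CONF).items():
        print(node, steps)
```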



Finally, I attach to this email the first version of the ovh_fence script. It can be improved a lot; I've just realised that I left a mention of an .ini file in the metadata. I previously used that file to supply the user and password, whereas now they are read directly from the cluster.conf configuration, as with any fence agent.

The original Proxmox forum thread, from which I adapted the original secofor script: http://forum.proxmox.com/threads/11066-Proxmox-HA-Cluster-at-OVH-Fencing?p=75152#post75152

P.S.: Developing a fence agent was not easy because there is no documentation on it. I may send another email to this mailing list about this subject.

-- 
Adrián Gibanel
I.T. Manager 

+34 675 683 301 
www.btactic.com 



Before printing this message, think of the environment. The environment is everyone's concern.

NOTICE:
The content of this message and its attachments is confidential. If you are not the intended recipient, please be advised that using, disclosing and/or copying it without the corresponding authorization is prohibited. If you have received this message in error, please notify the sender immediately and destroy the message.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ovh_fence
Type: text/x-python
Size: 5740 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20130602/bd59611b/attachment.py>

From adrian.gibanel at btactic.com  Sun Jun  2 21:26:43 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Sun, 2 Jun 2013 23:26:43 +0200 (CEST)
Subject: [Linux-cluster] Fence agent howto needed
In-Reply-To: <257418.648.1370208010160.JavaMail.adrian@adrianworktop>
Message-ID: <11568770.732.1370208401096.JavaMail.adrian@adrianworktop>

As I mentioned in a previous message, I think a howto is missing, something like:

How to write your own fence agent
or
Fence agent howto

As I have written a fence agent myself, I think I can help explain how to use the included Python fencing libraries (when you program in Python, of course), which make parsing options from stdin much easier. I mean, it's easy, but it's far from obvious, even after reading the default Python fence_* scripts.
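
As an illustration of what "parsing options from stdin" involves, here is a hand-rolled sketch that reads key=value lines, one option per line, and skips blank lines and comments. It is not the bundled fencing library and does not reproduce its API; the function name parse_fence_options is an assumption for this example, and a real agent would pass sys.stdin instead of the sample stream.

```python
import io


def parse_fence_options(stream):
    """Parse key=value lines from a text stream into a dict.

    Blank lines and lines starting with '#' are skipped; lines without '='
    are ignored. Only the first '=' separates key from value.
    """
    options = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        options[key.strip()] = value.strip()
    return options


if __name__ == "__main__":
    # Sample input; a real agent would read from sys.stdin instead.
    sample = io.StringIO("# sample input\naction=off\nipaddr=ns123456\n")
    print(parse_fence_options(sample))
```

Splitting on the first '=' only means a password that itself contains '=' survives intact.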

Thank you. 

-- 

Adrián Gibanel
I.T. Manager 

+34 675 683 301 
www.btactic.com 




From adrian.gibanel at btactic.com  Wed Jun  5 14:52:55 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Wed, 5 Jun 2013 16:52:55 +0200 (CEST)
Subject: [Linux-cluster] fence_ovh - Fence agent for OVH
In-Reply-To: <51ACAD2C.1000902@redhat.com>
References: <3826574.639.1370207568347.JavaMail.adrian@adrianworktop>
	<51ACAD2C.1000902@redhat.com>
Message-ID: <19221023.357.1370443974551.JavaMail.adrian@adrianworktop>

----- Original message -----

> From: "Marek Grac" <mgrac at redhat.com>
> To: "Adrian Gibanel" <adrian.gibanel at btactic.com>
> Sent: Monday, 3 June 2013 16:50:20
> Subject: Re: [Linux-cluster] fence_ovh - Fence agent for OVH

> Hi Adrian,

> On 06/02/2013 11:12 PM, Adrian Gibanel wrote:
> > As requested by digimer in linux-ha irc channel here there is
> > fence_ovh. It's not a priority that it's included by default in
> > official distribution of cluster software but if you guide me on
> > how to polish it I think I can improve it a lot more and make
> > tests in real machines (as long as my machines are still test
> > machines and not production ones).
> Inclusion in upstream package should not be a serious issue but I have
> some comments.

Ok. 

> > 4) How to use it
> >
> > 4.1) Make sure python-soappy package is installed (Debian/Ubuntu).
> Do you think it would be possible to use package 'python-suds' ? We
> already use it for fence_vmware_soap and I would rather use only one
> library for same thing. Do not bother with strange code in
> fence_vmware_soap, VMWare implementation was strange so some hacks were
> needed, it should not be needed for you and transformation to suds
> should be quite simple.
I think it's just a matter of trying it. Not sure when I will have the time.

> > 4.2) Save fence_ovh in /usr/sbin
> > 4.3) Run: ccs_update_schema so that new metadata is put into
> > cluster.rng
> > 4.4) If needed validate your configuration:
> > ccs_config_validate -v -f /etc/pve/cluster.conf.new
> > 4.5) Here's an example of how to use it in cluster.conf:
> >
> > <?xml version="1.0"?>
> > <cluster name="ha-008-010" config_version="3">
> >
> > <cman keyfile="/var/lib/pve-cluster/corosync.authkey"
> > transport="udpu" two_node="1" expected_votes="1">
> > </cman>
> >
> > <fencedevices>
> > <fencedevice agent="fence_ovh" name="fence008"
> > email="admin at domain.com" ipaddr="ns123456" login="ab12345-ovh"
> > passwd="MYSECRET" />
> > <fencedevice agent="fence_ovh" name="fence010"
> > email="admin at domain.com" ipaddr="ns789012" login="ab12345-ovh"
> > passwd="MYSECRET" />
> > </fencedevices>
> >
> > <clusternodes>
> > <clusternode name="server008" nodeid="1" votes="1">
> > <fence>
> > <method name="1">
> > <device name="fence008" action="off"/>
> > </method>
> > </fence>
> > </clusternode>
> > <clusternode name="server010" nodeid="2" votes="1">
> > <fence>
> > <method name="1">
> > <device name="fence010" action="off"/>
> > </method>
> > </fence>
> > </clusternode>
> > </clusternodes>
> >
> >
> > </cluster>

> I like the idea of 'email', it is possible to do it on cluster level but
> as fence agents are used in different places that could help more
> people.

Well, I didn't find a default field for setting it, so I added one of my own. I think you can use the OVH API without supplying that email, but I prefer to be notified about fencing. So, rather than writing it in a file somewhere in the filesystem, I think it is more useful to put it in cluster.conf.

I'm not sure I understood what you meant. Were you thinking that email should perhaps be a cluster-wide setting instead of a per-fence-agent setting, but then decided that the latter was the better idea?

> > P.S.: It was not easy to develop a fence agent because there's no
> > documentation on it. I maybe arise another email in this same
> > mailing list about this subject.
> >
> There is a Fence Agent API document (a bit outdated), info about fence
> timing and the major document that you need is in my to-do list (should
> be completed before summer, but we had some more urgent deadlines).
Ok. I'll contribute back to it if I can when it's public.

Thank you!

> m,

-- 
Adrián Gibanel
I.T. Manager 

+34 675 683 301 
www.btactic.com 






From rossnick-lists at cybercat.ca  Thu Jun  6 16:37:06 2013
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Thu, 06 Jun 2013 12:37:06 -0400
Subject: [Linux-cluster] Using LVM / ext4 in cluster.conf more than one time
Message-ID: <51B0BAB2.6060009@cybercat.ca>

Hi !

We have 2 clusters of 8 nodes each. Since we began using RHCS about 2.5
years ago, we have mostly used GFS on a shared storage array for data storage.
In most cases, there is no need to use GFS; ext4 would be enough since
that filesystem is only used within one service.

Production services are enabled on one cluster at a time. A service
containing a webserver and data directories is running in cluster A. A
service also exists in cluster B that only uses the data directories.
Data is then synced from cluster A to cluster B manually for recovery in
case of disaster. So in each cluster, we have a copy of 2 services. Both
use the same data file system, but only one starts the webserver. I will
include a very simplified version of one cluster.conf to illustrate this
and the problem.

We began migrating some GFS file systems to ext4, as it offers better
performance while retaining the same HA features in the event of a node
failure. While doing so, we discovered this problem.

While an ext3 file system using HA-LVM in a clustered environment cannot
be mounted on 2 nodes (that's fine), we do need it to be defined 2 times
in the cluster.conf file.

So, in my example cluster.conf, you will see 2 services. Both use the
same lvm and fs, but one starts a script and the other doesn't. Only
one service can be active at a time in one cluster, and that's
fine for us, but only one of them will actually mount the file system.

In my cluster.conf example, I have 2 services. SandBox would be the
"production" service starting the script and listening on the proper IP.
The SandBoxRecovery service would be the service I run in my other
cluster to receive the synced data from the "production" service.

In this example, the SandBox service would start fine, but
SandBoxRecovery would only start the IPs, and not mount the FS.

With rg_test, I was able to see this warning : Warning: Max references
exceeded for resource name (type lvm)

and google led me to :

https://access.redhat.com/site/solutions/222453

Which pretty much explains it. You can see the comment on this solution,
which is exactly my question here :

So, how can I define a LVM and FS a second time in cluster.conf ?

------------ example cluster.conf ------------
<?xml version="1.0"?>
<cluster config_version="1199" name="CyberClusterAS">
  <cman/>
  <logging debug="off"/>
  <gfs_controld plock_ownership="1" plock_rate_limit="500"/>
  <clusternodes>
    <clusternode name="node201.lan.cybercat.priv" nodeid="1">
      <fence>
        <method name="1">
          <device name="node201-ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node202.lan.cybercat.priv" nodeid="2">
      <fence>
        <method name="1">
          <device name="node202-ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node203.lan.cybercat.priv" nodeid="3">
      <fence>
        <method name="1">
          <device name="node203-ipmi"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.113.151"
login="Admin" name="node201-ipmi" passwd="darkman."/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.113.152"
login="Admin" name="node202-ipmi" passwd="darkman."/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.113.153"
login="Admin" name="node203-ipmi" passwd="darkman."/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="cybercat" nofailback="1" ordered="0"
restricted="0">
        <failoverdomainnode name="node201.lan.cybercat.priv" priority=""/>
        <failoverdomainnode name="node202.lan.cybercat.priv" priority=""/>
        <failoverdomainnode name="node203.lan.cybercat.priv" priority=""/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <lvm lv_name="SandBox" name="VGa-SandBox" vg_name="VGa"/>
      <fs device="/dev/VGa/SandBox" force_fsck="0" force_unmount="1"
fsid="64052" fstype="ext4" mountpoint="/CyberCat/SandBox" name="SandBox"
options="" self_fence="0"/>
    </resources>
    <service autostart="0" domain="cybercat" exclusive="0" name="SandBox">
      <ip address="192.168.110.43" monitor_link="on" sleeptime="1">
        <ip address="192.168.112.43" monitor_link="on" sleeptime="1"/>
        <lvm ref="VGa-SandBox">
          <fs ref="SandBox">
            <script __independent_subtree="1"
file="/CyberCat/SandBox/scripts/startup" name="SandBox-script"/>
          </fs>
        </lvm>
      </ip>
    </service>
    <service autostart="0" domain="cybercat" exclusive="0"
name="SandBoxRecovery">
      <ip address="192.168.110.174" monitor_link="on" sleeptime="1">
        <ip address="192.168.112.174" monitor_link="on" sleeptime="1"/>
        <lvm ref="VGa-SandBox">
          <fs ref="SandBox"/>
        </lvm>
      </ip>
    </service>
  </rm>
</cluster>
------------ / example cluster.conf ------------
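
To spot which shared resources are referenced from more than one place, the ref= attributes inside the <service> blocks can simply be counted. The sketch below does that on a trimmed copy of the example above. It is only a diagnostic aid under stated assumptions: it does not reproduce how rgmanager itself counts references, and the function name count_resource_refs is made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the example: only the <rm> section, scripts omitted.
CLUSTER_CONF = """<cluster config_version="1199" name="CyberClusterAS">
  <rm>
    <resources>
      <lvm lv_name="SandBox" name="VGa-SandBox" vg_name="VGa"/>
      <fs device="/dev/VGa/SandBox" fstype="ext4" name="SandBox"/>
    </resources>
    <service name="SandBox">
      <ip address="192.168.110.43">
        <lvm ref="VGa-SandBox"><fs ref="SandBox"/></lvm>
      </ip>
    </service>
    <service name="SandBoxRecovery">
      <ip address="192.168.110.174">
        <lvm ref="VGa-SandBox"><fs ref="SandBox"/></lvm>
      </ip>
    </service>
  </rm>
</cluster>"""


def count_resource_refs(conf_text):
    """Count (resource type, ref name) pairs used inside all services."""
    root = ET.fromstring(conf_text)
    refs = Counter()
    for service in root.findall("./rm/service"):
        for elem in service.iter():
            ref = elem.get("ref")
            if ref is not None:
                refs[(elem.tag, ref)] += 1
    return refs


if __name__ == "__main__":
    for (rtype, name), count in sorted(count_resource_refs(CLUSTER_CONF).items()):
        print(rtype, name, count)
```

Any pair with a count above 1 is a candidate for the "Max references exceeded" warning reported by rg_test.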

PS: I also opened a ticket with GSS and am waiting for their answer...



From urgrue at bulbous.org  Thu Jun  6 18:17:45 2013
From: urgrue at bulbous.org (urgrue)
Date: Thu, 06 Jun 2013 20:17:45 +0200
Subject: [Linux-cluster] fenced_domain_info error -1
Message-ID: <51B0D249.6010003@bulbous.org>

I seem to get rgmanager stuck/unresponsive/unkillable rather regularly 
after a fence attempt has failed. fence_ack_manual rarely brings it back 
into shape and I have to resort to rebooting the node(s).
Googling the error messages usually gets me nowhere, e.g. the one below 
(fenced_domain_info error -1) yields exactly one hit, the source code. I 
feel special.

Can anyone point me in the right direction on how to debug this further?

This is RHEL 6.3 and a simple 2-node cluster + quorum disk and shared 
SAN storage with HA-LVM.

Here's a recent log snippet if it helps:
Jun 06 10:07:23 dlm_controld cluster node 1 removed seq 100812
Jun 06 10:07:23 dlm_controld del_configfs_node rmdir 
"/sys/kernel/config/dlm/cluster/comms/1"
Jun 06 10:07:23 dlm_controld dlm:controld conf 1 0 1 memb 2 join left 1
Jun 06 10:07:23 dlm_controld dlm:ls:rgmanager conf 1 0 1 memb 2 join left 1
Jun 06 10:07:23 dlm_controld rgmanager add_change cg 4 remove nodeid 1 
reason 3
Jun 06 10:07:23 dlm_controld rgmanager add_change cg 4 counts member 1 
joined 0 remove 1 failed 1
Jun 06 10:07:23 dlm_controld rgmanager stop_kernel cg 4
Jun 06 10:07:23 dlm_controld write "0" to 
"/sys/kernel/dlm/rgmanager/control"
Jun 06 10:07:23 dlm_controld rgmanager check_fencing 1 wait add 
1370252270 fail 1370506043 last 1370252214
Jun 06 10:07:24 dlm_controld cluster node 1 added seq 100816
Jun 06 10:07:24 dlm_controld set_configfs_node 1 10.112.32.22 local 0
Jun 06 10:07:24 dlm_controld dlm:controld conf 2 1 0 memb 1 2 join 1 left
Jun 06 10:07:24 dlm_controld cpg_mcast_joined retried 2 protocol
Jun 06 10:07:24 dlm_controld dlm:ls:rgmanager conf 2 1 0 memb 1 2 join 1 
left
Jun 06 10:07:24 dlm_controld rgmanager add_change cg 5 joined nodeid 1
Jun 06 10:07:24 dlm_controld rgmanager add_change cg 5 counts member 2 
joined 1 remove 0 failed 0
Jun 06 10:07:49 dlm_controld cluster node 1 removed seq 100820
Jun 06 10:07:49 dlm_controld del_configfs_node rmdir 
"/sys/kernel/config/dlm/cluster/comms/1"
Jun 06 10:07:49 dlm_controld dlm:controld conf 1 0 1 memb 2 join left 1
Jun 06 10:07:49 dlm_controld dlm:ls:rgmanager conf 1 0 1 memb 2 join left 1
Jun 06 10:07:49 dlm_controld rgmanager add_change cg 6 remove nodeid 1 
reason 3
Jun 06 10:07:49 dlm_controld rgmanager add_change cg 6 counts member 1 
joined 0 remove 1 failed 1
Jun 06 10:07:49 dlm_controld rgmanager check_fencing 1 wait add 
1370252270 fail 1370506069 last 1370252214
Jun 06 11:55:03 dlm_controld cluster node 1 added seq 100828
Jun 06 11:55:03 dlm_controld set_configfs_node 1 10.112.32.22 local 0
Jun 06 11:55:10 dlm_controld cluster node 1 removed seq 100832
Jun 06 11:55:10 dlm_controld del_configfs_node rmdir 
"/sys/kernel/config/dlm/cluster/comms/1"
Jun 06 12:19:10 dlm_controld dlm_controld 3.0.12.1 started
Jun 06 12:19:11 dlm_controld found /dev/misc/dlm-control minor 54
Jun 06 12:19:11 dlm_controld found /dev/misc/dlm-monitor minor 53
Jun 06 12:19:11 dlm_controld found /dev/misc/dlm_plock minor 52
Jun 06 12:19:11 dlm_controld /dev/misc/dlm-monitor fd 12
Jun 06 12:19:11 dlm_controld /sys/kernel/config/dlm/cluster/comms: 
opendir failed: 2
Jun 06 12:19:11 dlm_controld /sys/kernel/config/dlm/cluster/spaces: 
opendir failed: 2
Jun 06 12:19:11 dlm_controld cluster node 2 added seq 100836
Jun 06 12:19:11 dlm_controld set_configfs_node 2 10.113.32.25 local 1
Jun 06 12:19:11 dlm_controld totem/rrp_mode = 'none'
Jun 06 12:19:11 dlm_controld set protocol 0
Jun 06 12:19:11 dlm_controld group_mode 3 compat 0
Jun 06 12:19:11 dlm_controld setup_cpg_daemon 15
Jun 06 12:19:11 dlm_controld dlm:controld conf 1 1 0 memb 2 join 2 left
Jun 06 12:19:11 dlm_controld set_protocol member_count 1 propose daemon 
1.1.1 kernel 1.1.1
Jun 06 12:19:11 dlm_controld run protocol from nodeid 2
Jun 06 12:19:11 dlm_controld daemon run 1.1.1 max 1.1.1 kernel run 1.1.1 
max 1.1.1
Jun 06 12:19:11 dlm_controld plocks 17
Jun 06 12:19:11 dlm_controld plock cpg message size: 104 bytes
Jun 06 12:19:12 dlm_controld client connection 5 fd 18
Jun 06 12:21:15 dlm_controld uevent: add@/kernel/dlm/rgmanager
Jun 06 12:21:15 dlm_controld kernel: add@ rgmanager
Jun 06 12:21:15 dlm_controld uevent: online@/kernel/dlm/rgmanager
Jun 06 12:21:15 dlm_controld kernel: online@ rgmanager
Jun 06 12:21:15 dlm_controld dlm:ls:rgmanager conf 1 1 0 memb 2 join 2 left
Jun 06 12:21:15 dlm_controld rgmanager add_change cg 1 joined nodeid 2
Jun 06 12:21:15 dlm_controld rgmanager add_change cg 1 we joined
Jun 06 12:21:15 dlm_controld rgmanager add_change cg 1 counts member 1 
joined 1 remove 0 failed 0
Jun 06 12:28:40 dlm_controld uevent: remove@/kernel/dlm/rgmanager
Jun 06 12:28:40 dlm_controld kernel: remove@ rgmanager
Jun 06 12:28:57 dlm_controld uevent: add@/kernel/dlm/rgmanager
Jun 06 12:28:57 dlm_controld kernel: add@ rgmanager
Jun 06 12:28:57 dlm_controld uevent: online@/kernel/dlm/rgmanager
Jun 06 12:28:57 dlm_controld kernel: online@ rgmanager
Jun 06 12:28:57 dlm_controld process_uevent online@ error -17 errno 2
Jun 06 12:32:06 dlm_controld uevent: remove@/kernel/dlm/rgmanager
Jun 06 12:32:06 dlm_controld kernel: remove@ rgmanager
Jun 06 12:32:24 dlm_controld connection 5 read error -1
Jun 06 12:33:24 dlm_controld dlm_controld 3.0.12.1 started
Jun 06 12:33:24 dlm_controld found /dev/misc/dlm-control minor 54
Jun 06 12:33:24 dlm_controld found /dev/misc/dlm-monitor minor 53
Jun 06 12:33:24 dlm_controld found /dev/misc/dlm_plock minor 52
Jun 06 12:33:24 dlm_controld /dev/misc/dlm-monitor fd 13
Jun 06 12:33:24 dlm_controld clear_configfs_nodes rmdir 
"/sys/kernel/config/dlm/cluster/comms/2"
Jun 06 12:33:24 dlm_controld cluster node 2 added seq 100836
Jun 06 12:33:24 dlm_controld set_configfs_node 2 10.113.32.25 local 1
Jun 06 12:33:24 dlm_controld totem/rrp_mode = 'none'
Jun 06 12:33:24 dlm_controld set protocol 0
Jun 06 12:33:24 dlm_controld group_mode 3 compat 0
Jun 06 12:33:24 dlm_controld setup_cpg_daemon 15
Jun 06 12:33:24 dlm_controld dlm:controld conf 1 1 0 memb 2 join 2 left
Jun 06 12:33:24 dlm_controld set_protocol member_count 1 propose daemon 
1.1.1 kernel 1.1.1
Jun 06 12:33:24 dlm_controld run protocol from nodeid 2
Jun 06 12:33:24 dlm_controld daemon run 1.1.1 max 1.1.1 kernel run 1.1.1 
max 1.1.1
Jun 06 12:33:24 dlm_controld plocks 17
Jun 06 12:33:24 dlm_controld plock cpg message size: 104 bytes
Jun 06 12:33:25 dlm_controld client connection 5 fd 18
Jun 06 12:33:46 dlm_controld uevent: add@/kernel/dlm/rgmanager
Jun 06 12:33:46 dlm_controld kernel: add@ rgmanager
Jun 06 12:33:46 dlm_controld uevent: online@/kernel/dlm/rgmanager
Jun 06 12:33:46 dlm_controld kernel: online@ rgmanager
Jun 06 12:33:46 dlm_controld dlm:ls:rgmanager conf 1 1 0 memb 2 join 2 left
Jun 06 12:33:46 dlm_controld rgmanager add_change cg 1 joined nodeid 2
Jun 06 12:33:46 dlm_controld rgmanager add_change cg 1 we joined
Jun 06 12:33:46 dlm_controld rgmanager add_change cg 1 counts member 1 
joined 1 remove 0 failed 0
Jun 06 12:55:46 dlm_controld fenced_domain_info error -1
<same message repeats every second>
Jun 06 12:58:08 dlm_controld cluster node 1 added seq 100840
Jun 06 12:58:08 dlm_controld set_configfs_node 1 10.112.32.22 local 0
Jun 06 12:58:08 dlm_controld fenced_domain_info error -1
<same message repeats every second>
Jun 06 12:58:25 dlm_controld cluster node 1 removed seq 100844
Jun 06 12:58:25 dlm_controld del_configfs_node rmdir 
"/sys/kernel/config/dlm/cluster/comms/1"
Jun 06 12:58:25 dlm_controld fenced_domain_info error -1
<same message repeats every second>
Jun 06 12:58:54 dlm_controld uevent: remove@/kernel/dlm/rgmanager
Jun 06 12:58:54 dlm_controld kernel: remove@ rgmanager
Jun 06 12:58:54 dlm_controld fenced_domain_info error -1
<same message repeats every second>
Jun 06 12:59:14 dlm_controld uevent: add@/kernel/dlm/rgmanager
Jun 06 12:59:14 dlm_controld kernel: add@ rgmanager
Jun 06 12:59:14 dlm_controld fenced_domain_info error -1
Jun 06 12:59:14 dlm_controld uevent: online@/kernel/dlm/rgmanager
Jun 06 12:59:14 dlm_controld kernel: online@ rgmanager
Jun 06 12:59:14 dlm_controld process_uevent online@ error -17 errno 111
Jun 06 12:59:14 dlm_controld fenced_domain_info error -1
<same message repeats every second>
Jun 06 13:00:30 dlm_controld uevent: remove@/kernel/dlm/rgmanager
Jun 06 13:00:30 dlm_controld kernel: remove@ rgmanager
Jun 06 13:00:30 dlm_controld fenced_domain_info error -1
<same message repeats every second>
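
One way to get an overview of a log like the one above is to extract only the membership changes ("cluster node N added/removed seq M"), which shows how often node 1 joins and leaves. The sketch below assumes the exact line layout seen in this snippet; the regular expression and the function name membership_events are illustrative and not part of any cluster tool.

```python
import re

# Matches lines such as:
#   Jun 06 10:07:23 dlm_controld cluster node 1 removed seq 100812
EVENT_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) dlm_controld "
    r"cluster node (?P<node>\d+) (?P<what>added|removed) seq (?P<seq>\d+)$"
)


def membership_events(lines):
    """Return (timestamp, nodeid, 'added'|'removed', seq) for matching lines."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            events.append((match.group("stamp"), int(match.group("node")),
                           match.group("what"), int(match.group("seq"))))
    return events


if __name__ == "__main__":
    sample = [
        "Jun 06 10:07:23 dlm_controld cluster node 1 removed seq 100812",
        "Jun 06 10:07:23 dlm_controld rgmanager stop_kernel cg 4",
        "Jun 06 10:07:24 dlm_controld cluster node 1 added seq 100816",
        "Jun 06 12:55:46 dlm_controld fenced_domain_info error -1",
    ]
    for event in membership_events(sample):
        print(event)
```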



From ali.bendriss at gmail.com  Wed Jun 12 14:13:02 2013
From: ali.bendriss at gmail.com (Ali Bendriss)
Date: Wed, 12 Jun 2013 16:13:02 +0200
Subject: [Linux-cluster] kernel BUG at fs/gfs2/glock.c
Message-ID: <1753598.gZm6zVSfWa@zapp>

Hello,

I have enabled quota on a gfs2 file system through fstab (quota=on) on a
cluster of 2 identical nodes,
but I have not used it yet.
Now I see the kernel message in the attachment, on one node only.

thanks


software version:
linux  3.9.3
cluster 3.2.0
gfs2-utils 3.1.5

--
Ali
-------------- next part --------------
Jun 12 04:47:37 minnie kernel: [34757.502162] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.502167] pid: 10880
Jun 12 04:47:37 minnie kernel: [34757.502169] lock type: 8 req lock state : 1
Jun 12 04:47:37 minnie kernel: [34757.502176] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.502177] pid: 10880
Jun 12 04:47:37 minnie kernel: [34757.502179] lock type: 8 req lock state : 1
Jun 12 04:47:37 minnie kernel: [34757.502184]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:6 m:200
Jun 12 04:47:37 minnie kernel: [34757.502192]   H: s:EX f:cH e:0 p:10880 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.502221] ------------[ cut here ]------------
Jun 12 04:47:37 minnie kernel: [34757.507116] kernel BUG at fs/gfs2/glock.c:1029!
Jun 12 04:47:37 minnie kernel: [34757.511904] invalid opcode: 0000 [#1] SMP 
Jun 12 04:47:37 minnie kernel: [34757.516283] Modules linked in: ip_vs_lc ip_vs nf_conntrack gfs2 dlm sctp ipv6 8021q garp stp llc bonding pcmcia pcmcia_core quota_v2 quota_tree lp ppdev parport_pc parport fuse hid_generic coretemp acpi_cpufreq mperf freq_table igb mgag200 ttm processor kvm_intel thermal_sys kvm drm_kms_helper drm agpgart syscopyarea sysfillrect sysimgblt crc32c_intel ptp evdev gpio_ich psmouse serio_raw ioatdma microcode lpc_ich pps_core hwmon i2c_algo_bit i2c_i801 i2c_core dca button usbhid hid usb_storage ehci_pci uhci_hcd lpfc scsi_transport_fc ehci_hcd xfs exportfs [last unloaded: pcmcia_core]
Jun 12 04:47:37 minnie kernel: [34757.572445] CPU 3 
Jun 12 04:47:37 minnie kernel: [34757.574400] Pid: 10880, comm: gfs2_quotad Not tainted 3.9.3 #1 Supermicro X8DTU/X8DTU
Jun 12 04:47:37 minnie kernel: [34757.582900] RIP: 0010:[<ffffffffa0e0a81d>]  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.592380] RSP: 0018:ffff8809fc91bc58  EFLAGS: 00010292
Jun 12 04:47:37 minnie kernel: [34757.598219] RAX: 0000000000000000 RBX: ffff8809ef57e680 RCX: 0000000000000006
Jun 12 04:47:37 minnie kernel: [34757.605809] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffff880a3fc6ce90
Jun 12 04:47:37 minnie kernel: [34757.613376] RBP: ffff8809fc91bc98 R08: 000000000000000a R09: 0000000000000573
Jun 12 04:47:37 minnie kernel: [34757.621028] R10: 0000000000000000 R11: 0000000000000572 R12: ffff8809ed9cb548
Jun 12 04:47:37 minnie kernel: [34757.628638] R13: ffff8809ef57e6c0 R14: ffff8809ed9cb548 R15: ffff8809ed9cb598
Jun 12 04:47:37 minnie kernel: [34757.636240] FS:  0000000000000000(0000) GS:ffff880a3fc60000(0000) knlGS:0000000000000000
Jun 12 04:47:37 minnie kernel: [34757.644853] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 12 04:47:37 minnie kernel: [34757.651038] CR2: 000000000069de98 CR3: 000000000180c000 CR4: 00000000000007e0
Jun 12 04:47:37 minnie kernel: [34757.658646] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 12 04:47:37 minnie kernel: [34757.666290] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 12 04:47:37 minnie kernel: [34757.673903] Process gfs2_quotad (pid: 10880, threadinfo ffff8809fc91a000, task ffff8809fdee1500)
Jun 12 04:47:37 minnie kernel: [34757.683262] Stack:
Jun 12 04:47:37 minnie kernel: [34757.685474]  ffff880a0046e000 0000000000000000 ffff8809fc91bc98 ffff8809ef57e6c0
Jun 12 04:47:37 minnie kernel: [34757.693472]  0000000000000001 0000000000000002 000000000000000f ffff8809f3dd5408
Jun 12 04:47:37 minnie kernel: [34757.701427]  ffff8809fc91bd78 ffffffffa0e1a931 ffff8809fc91bd78 ffffffffa0e1a5a6
Jun 12 04:47:37 minnie kernel: [34757.709435] Call Trace:
Jun 12 04:47:37 minnie kernel: [34757.712031]  [<ffffffffa0e1a931>] do_sync+0x191/0x4c0 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.718043]  [<ffffffffa0e1a5a6>] ? bh_get+0x156/0x1f0 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.724177]  [<ffffffffa0e1acfd>] gfs2_quota_sync+0x9d/0x320 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.730908]  [<ffffffffa0e1af8e>] gfs2_quota_sync_timeo+0xe/0x10 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.737923]  [<ffffffffa0e1a310>] quotad_check_timeo.part.15+0x30/0x70 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.745563]  [<ffffffffa0e1c917>] gfs2_quotad+0x217/0x2a0 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.751975]  [<ffffffff8107b4e0>] ? finish_wait+0x80/0x80
Jun 12 04:47:37 minnie kernel: [34757.757714]  [<ffffffffa0e1c700>] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.764864]  [<ffffffff8107ada0>] kthread+0xc0/0xd0
Jun 12 04:47:37 minnie kernel: [34757.770074]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:47:37 minnie kernel: [34757.777003]  [<ffffffff8156dddc>] ret_from_fork+0x7c/0xb0
Jun 12 04:47:37 minnie kernel: [34757.782766]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:47:37 minnie kernel: [34757.789784] Code: a0 31 c0 e8 b4 a7 75 e0 49 8b 45 10 41 8b 55 20 48 c7 c7 38 fc e2 a0 8b 70 28 31 c0 e8 9b a7 75 e0 4c 89 f6 31 ff e8 53 e7 ff ff <0f> 0b 90 48 83 7d c8 00 48 8b 45 c8 48 0f 44 c3 48 89 45 c8 e9 
Jun 12 04:47:37 minnie kernel: [34757.811621] RIP  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:47:37 minnie kernel: [34757.818651]  RSP <ffff8809fc91bc58>
Jun 12 04:47:37 minnie kernel: [34757.905533] ---[ end trace 09eb442b9da063fd ]---
Jun 12 04:47:38 minnie kernel: [34758.348114] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:38 minnie kernel: [34758.368158] pid: 10936
Jun 12 04:47:38 minnie kernel: [34758.385824] lock type: 8 req lock state : 1
Jun 12 04:47:38 minnie kernel: [34758.405285] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:38 minnie kernel: [34758.424603] pid: 10936
Jun 12 04:47:38 minnie kernel: [34758.441967] lock type: 8 req lock state : 1
Jun 12 04:47:38 minnie kernel: [34758.460859]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:6 m:200
Jun 12 04:47:38 minnie kernel: [34758.481289]   H: s:EX f:cH e:0 p:10936 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:38 minnie kernel: [34758.502961] ------------[ cut here ]------------
Jun 12 04:47:38 minnie kernel: [34758.521349] kernel BUG at fs/gfs2/glock.c:1029!
Jun 12 04:47:38 minnie kernel: [34758.539256] invalid opcode: 0000 [#2] SMP 
Jun 12 04:47:38 minnie kernel: [34758.556705] Modules linked in: ip_vs_lc ip_vs nf_conntrack gfs2 dlm sctp ipv6 8021q garp stp llc bonding pcmcia pcmcia_core quota_v2 quota_tree lp ppdev parport_pc parport fuse hid_generic coretemp acpi_cpufreq mperf freq_table igb mgag200 ttm processor kvm_intel thermal_sys kvm drm_kms_helper drm agpgart syscopyarea sysfillrect sysimgblt crc32c_intel ptp evdev gpio_ich psmouse serio_raw ioatdma microcode lpc_ich pps_core hwmon i2c_algo_bit i2c_i801 i2c_core dca button usbhid hid usb_storage ehci_pci uhci_hcd lpfc scsi_transport_fc ehci_hcd xfs exportfs [last unloaded: pcmcia_core]
Jun 12 04:47:38 minnie kernel: [34758.683832] CPU 4 
Jun 12 04:47:38 minnie kernel: [34758.685906] Pid: 10936, comm: gfs2_quotad Tainted: G      D      3.9.3 #1 Supermicro X8DTU/X8DTU
Jun 12 04:47:38 minnie kernel: [34758.724366] RIP: 0010:[<ffffffffa0e0a81d>]  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:47:38 minnie kernel: [34758.748669] RSP: 0018:ffff8809f952fc58  EFLAGS: 00010292
Jun 12 04:47:38 minnie kernel: [34758.769201] RAX: 0000000000000000 RBX: ffff8809fa889980 RCX: ffff880a3fcee988
Jun 12 04:47:38 minnie kernel: [34758.791528] RDX: 0000000000000000 RSI: ffff880a3fcece98 RDI: ffff880a3fcece90
Jun 12 04:47:38 minnie kernel: [34758.813724] RBP: ffff8809f952fc98 R08: 0000000000000000 R09: 00000000000005a3
Jun 12 04:47:38 minnie kernel: [34758.836186] R10: 0000000000000007 R11: ffff880a0f607170 R12: ffff8809ebeb4c08
Jun 12 04:47:38 minnie kernel: [34758.858737] R13: ffff8809fa8899c0 R14: ffff8809ebeb4c08 R15: ffff8809ebeb4c58
Jun 12 04:47:38 minnie kernel: [34758.881277] FS:  0000000000000000(0000) GS:ffff880a3fc80000(0000) knlGS:0000000000000000
Jun 12 04:47:38 minnie kernel: [34758.904625] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 12 04:47:39 minnie kernel: [34758.925474] CR2: 00007f8df994ea08 CR3: 000000000180c000 CR4: 00000000000007e0
Jun 12 04:47:39 minnie kernel: [34758.947856] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 12 04:47:39 minnie kernel: [34758.970137] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 12 04:47:39 minnie kernel: [34758.992228] Process gfs2_quotad (pid: 10936, threadinfo ffff8809f952e000, task ffff880a0f521500)
Jun 12 04:47:39 minnie kernel: [34759.016240] Stack:
Jun 12 04:47:39 minnie kernel: [34759.033215]  ffff880a0f660000 0000000000000000 ffff8809f952fc98 ffff8809fa8899c0
Jun 12 04:47:39 minnie kernel: [34759.056514]  0000000000000001 0000000000000002 000000000000000f ffff880a0d890e08
Jun 12 04:47:39 minnie kernel: [34759.079728]  ffff8809f952fd78 ffffffffa0e1a931 ffff8809f952fd78 ffffffffa0e1a5a6
Jun 12 04:47:39 minnie kernel: [34759.103152] Call Trace:
Jun 12 04:47:39 minnie kernel: [34759.120923]  [<ffffffffa0e1a931>] do_sync+0x191/0x4c0 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.142139]  [<ffffffffa0e1a5a6>] ? bh_get+0x156/0x1f0 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.163395]  [<ffffffffa0e1acfd>] gfs2_quota_sync+0x9d/0x320 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.185256]  [<ffffffffa0e1af8e>] gfs2_quota_sync_timeo+0xe/0x10 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.207389]  [<ffffffffa0e1a310>] quotad_check_timeo.part.15+0x30/0x70 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.230160]  [<ffffffffa0e1c917>] gfs2_quotad+0x217/0x2a0 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.251687]  [<ffffffff8107b4e0>] ? finish_wait+0x80/0x80
Jun 12 04:47:39 minnie kernel: [34759.272191]  [<ffffffffa0e1c700>] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.293901]  [<ffffffff8107ada0>] kthread+0xc0/0xd0
Jun 12 04:47:39 minnie kernel: [34759.313531]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:47:39 minnie kernel: [34759.334846]  [<ffffffff8156dddc>] ret_from_fork+0x7c/0xb0
Jun 12 04:47:39 minnie kernel: [34759.354999]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:47:39 minnie kernel: [34759.376416] Code: a0 31 c0 e8 b4 a7 75 e0 49 8b 45 10 41 8b 55 20 48 c7 c7 38 fc e2 a0 8b 70 28 31 c0 e8 9b a7 75 e0 4c 89 f6 31 ff e8 53 e7 ff ff <0f> 0b 90 48 83 7d c8 00 48 8b 45 c8 48 0f 44 c3 48 89 45 c8 e9 
Jun 12 04:47:39 minnie kernel: [34759.429402] RIP  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:47:39 minnie kernel: [34759.451801]  RSP <ffff8809f952fc58>
Jun 12 04:47:39 minnie kernel: [34759.470905] ---[ end trace 09eb442b9da063fe ]---
Jun 12 04:47:43 minnie kernel: [34763.834104] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:43 minnie kernel: [34763.854551] pid: 10994
Jun 12 04:47:43 minnie kernel: [34763.872380] lock type: 8 req lock state : 1
Jun 12 04:47:43 minnie kernel: [34763.891922] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:44 minnie kernel: [34763.911585] pid: 10994
Jun 12 04:47:44 minnie kernel: [34763.929116] lock type: 8 req lock state : 1
Jun 12 04:47:44 minnie kernel: [34763.948192]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:6 m:200
Jun 12 04:47:44 minnie kernel: [34763.968758]   H: s:EX f:cH e:0 p:10994 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:47:44 minnie kernel: [34763.990602] ------------[ cut here ]------------
Jun 12 04:47:44 minnie kernel: [34764.009517] kernel BUG at fs/gfs2/glock.c:1029!
Jun 12 04:47:44 minnie kernel: [34764.028004] invalid opcode: 0000 [#3] SMP 
Jun 12 04:47:44 minnie kernel: [34764.046117] Modules linked in: ip_vs_lc ip_vs nf_conntrack gfs2 dlm sctp ipv6 8021q garp stp llc bonding pcmcia pcmcia_core quota_v2 quota_tree lp ppdev parport_pc parport fuse hid_generic coretemp acpi_cpufreq mperf freq_table igb mgag200 ttm processor kvm_intel thermal_sys kvm drm_kms_helper drm agpgart syscopyarea sysfillrect sysimgblt crc32c_intel ptp evdev gpio_ich psmouse serio_raw ioatdma microcode lpc_ich pps_core hwmon i2c_algo_bit i2c_i801 i2c_core dca button usbhid hid usb_storage ehci_pci uhci_hcd lpfc scsi_transport_fc ehci_hcd xfs exportfs [last unloaded: pcmcia_core]
Jun 12 04:47:44 minnie kernel: [34764.182952] CPU 0 
Jun 12 04:47:44 minnie kernel: [34764.185244] Pid: 10994, comm: gfs2_quotad Tainted: G      D      3.9.3 #1 Supermicro X8DTU/X8DTU
Jun 12 04:47:44 minnie kernel: [34764.224360] RIP: 0010:[<ffffffffa0e0a81d>]  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.249387] RSP: 0018:ffff8809f72fdc58  EFLAGS: 00010292
Jun 12 04:47:44 minnie kernel: [34764.270287] RAX: 0000000000000000 RBX: ffff880a0e982600 RCX: ffff880a3fc8e988
Jun 12 04:47:44 minnie kernel: [34764.293012] RDX: 0000000000000000 RSI: ffff880a3fc8ce98 RDI: ffff880a3fc8ce90
Jun 12 04:47:44 minnie kernel: [34764.315541] RBP: ffff8809f72fdc98 R08: 0000000000000000 R09: 00000000000005d3
Jun 12 04:47:44 minnie kernel: [34764.338284] R10: 0000000000000007 R11: ffff880a0f607170 R12: ffff8809ebeb41b8
Jun 12 04:47:44 minnie kernel: [34764.361226] R13: ffff880a0e982640 R14: ffff8809ebeb41b8 R15: ffff8809ebeb4208
Jun 12 04:47:44 minnie kernel: [34764.384067] FS:  0000000000000000(0000) GS:ffff880a3fc00000(0000) knlGS:0000000000000000
Jun 12 04:47:44 minnie kernel: [34764.407838] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 12 04:47:44 minnie kernel: [34764.429064] CR2: 00007f0081b62f70 CR3: 000000000180c000 CR4: 00000000000007f0
Jun 12 04:47:44 minnie kernel: [34764.451855] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 12 04:47:44 minnie kernel: [34764.474412] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 12 04:47:44 minnie kernel: [34764.496728] Process gfs2_quotad (pid: 10994, threadinfo ffff8809f72fc000, task ffff880a010e6200)
Jun 12 04:47:44 minnie kernel: [34764.520853] Stack:
Jun 12 04:47:44 minnie kernel: [34764.538179]  ffff8809fbff8000 0000000000000000 ffff8809f72fdc98 ffff880a0e982640
Jun 12 04:47:44 minnie kernel: [34764.563138]  0000000000000001 0000000000000002 000000000000000f ffff880a101dbc08
Jun 12 04:47:44 minnie kernel: [34764.588288]  ffff8809f72fdd78 ffffffffa0e1a931 ffff8809f72fdd78 ffffffffa0e1a5a6
Jun 12 04:47:44 minnie kernel: [34764.613498] Call Trace:
Jun 12 04:47:44 minnie kernel: [34764.631749]  [<ffffffffa0e1a931>] do_sync+0x191/0x4c0 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.653247]  [<ffffffffa0e1a5a6>] ? bh_get+0x156/0x1f0 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.674949]  [<ffffffffa0e1acfd>] gfs2_quota_sync+0x9d/0x320 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.697072]  [<ffffffffa0e1af8e>] gfs2_quota_sync_timeo+0xe/0x10 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.719473]  [<ffffffffa0e1a310>] quotad_check_timeo.part.15+0x30/0x70 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.742500]  [<ffffffffa0e1c917>] gfs2_quotad+0x217/0x2a0 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.764452]  [<ffffffff8107b4e0>] ? finish_wait+0x80/0x80
Jun 12 04:47:44 minnie kernel: [34764.785344]  [<ffffffffa0e1c700>] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
Jun 12 04:47:44 minnie kernel: [34764.807353]  [<ffffffff8107ada0>] kthread+0xc0/0xd0
Jun 12 04:47:44 minnie kernel: [34764.827248]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:47:44 minnie kernel: [34764.848868]  [<ffffffff8156dddc>] ret_from_fork+0x7c/0xb0
Jun 12 04:47:44 minnie kernel: [34764.869423]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:47:45 minnie kernel: [34764.891319] Code: a0 31 c0 e8 b4 a7 75 e0 49 8b 45 10 41 8b 55 20 48 c7 c7 38 fc e2 a0 8b 70 28 31 c0 e8 9b a7 75 e0 4c 89 f6 31 ff e8 53 e7 ff ff <0f> 0b 90 48 83 7d c8 00 48 8b 45 c8 48 0f 44 c3 48 89 45 c8 e9 
Jun 12 04:47:45 minnie kernel: [34764.959315] RIP  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:47:45 minnie kernel: [34764.982284]  RSP <ffff8809f72fdc58>
Jun 12 04:47:45 minnie kernel: [34765.001639] ---[ end trace 09eb442b9da063ff ]---
Jun 12 04:48:37 minnie kernel: [34817.232590] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:37 minnie kernel: [34817.252950] pid: 10869
Jun 12 04:48:37 minnie kernel: [34817.270771] lock type: 8 req lock state : 1
Jun 12 04:48:37 minnie kernel: [34817.290301] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:37 minnie kernel: [34817.309807] pid: 10869
Jun 12 04:48:37 minnie kernel: [34817.327318] lock type: 8 req lock state : 1
Jun 12 04:48:37 minnie kernel: [34817.346466]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:6 m:200
Jun 12 04:48:37 minnie kernel: [34817.367038]   H: s:EX f:cH e:0 p:10869 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:37 minnie kernel: [34817.388844] ------------[ cut here ]------------
Jun 12 04:48:37 minnie kernel: [34817.407514] kernel BUG at fs/gfs2/glock.c:1029!
Jun 12 04:48:37 minnie kernel: [34817.425714] invalid opcode: 0000 [#4] SMP 
Jun 12 04:48:37 minnie kernel: [34817.443283] Modules linked in: ip_vs_lc ip_vs nf_conntrack gfs2 dlm sctp ipv6 8021q garp stp llc bonding pcmcia pcmcia_core quota_v2 quota_tree lp ppdev parport_pc parport fuse hid_generic coretemp acpi_cpufreq mperf freq_table igb mgag200 ttm processor kvm_intel thermal_sys kvm drm_kms_helper drm agpgart syscopyarea sysfillrect sysimgblt crc32c_intel ptp evdev gpio_ich psmouse serio_raw ioatdma microcode lpc_ich pps_core hwmon i2c_algo_bit i2c_i801 i2c_core dca button usbhid hid usb_storage ehci_pci uhci_hcd lpfc scsi_transport_fc ehci_hcd xfs exportfs [last unloaded: pcmcia_core]
Jun 12 04:48:37 minnie kernel: [34817.569931] CPU 5 
Jun 12 04:48:37 minnie kernel: [34817.571907] Pid: 10869, comm: gfs2_quotad Tainted: G      D      3.9.3 #1 Supermicro X8DTU/X8DTU
Jun 12 04:48:37 minnie kernel: [34817.610684] RIP: 0010:[<ffffffffa0e0a81d>]  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:48:37 minnie kernel: [34817.635088] RSP: 0018:ffff8809fddf3c58  EFLAGS: 00010292
Jun 12 04:48:37 minnie kernel: [34817.655692] RAX: 0000000000000000 RBX: ffff880a03bad480 RCX: ffff880a3fcae988
Jun 12 04:48:37 minnie kernel: [34817.678227] RDX: 0000000000000000 RSI: ffff880a3fcace98 RDI: ffff880a3fcace90
Jun 12 04:48:37 minnie kernel: [34817.700723] RBP: ffff8809fddf3c98 R08: 0000000000000000 R09: 0000000000000603
Jun 12 04:48:37 minnie kernel: [34817.723198] R10: 0000000000000007 R11: ffff880a0f607170 R12: ffff8809ebeb6af8
Jun 12 04:48:37 minnie kernel: [34817.745789] R13: ffff880a03bad4c0 R14: ffff8809ebeb6af8 R15: ffff8809ebeb6b48
Jun 12 04:48:37 minnie kernel: [34817.768169] FS:  0000000000000000(0000) GS:ffff880a3fca0000(0000) knlGS:0000000000000000
Jun 12 04:48:37 minnie kernel: [34817.791555] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 12 04:48:37 minnie kernel: [34817.812393] CR2: 00000000006852a0 CR3: 0000000a08504000 CR4: 00000000000007e0
Jun 12 04:48:37 minnie kernel: [34817.834824] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 12 04:48:37 minnie kernel: [34817.857053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 12 04:48:37 minnie kernel: [34817.879067] Process gfs2_quotad (pid: 10869, threadinfo ffff8809fddf2000, task ffff880a089ed400)
Jun 12 04:48:37 minnie kernel: [34817.902944] Stack:
Jun 12 04:48:38 minnie kernel: [34817.919865]  ffff880a0046f000 0000000000000000 ffff8809fddf3c98 ffff880a03bad4c0
Jun 12 04:48:38 minnie kernel: [34817.943007]  0000000000000001 0000000000000002 000000000000000f ffff880a0e71f008
Jun 12 04:48:38 minnie kernel: [34817.966153]  ffff8809fddf3d78 ffffffffa0e1a931 ffff8809fddf3d78 ffffffffa0e1a5a6
Jun 12 04:48:38 minnie kernel: [34817.989436] Call Trace:
Jun 12 04:48:38 minnie kernel: [34818.007206]  [<ffffffffa0e1a931>] do_sync+0x191/0x4c0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.028423]  [<ffffffffa0e1a5a6>] ? bh_get+0x156/0x1f0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.049748]  [<ffffffffa0e1acfd>] gfs2_quota_sync+0x9d/0x320 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.071465]  [<ffffffffa0e1af8e>] gfs2_quota_sync_timeo+0xe/0x10 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.093522]  [<ffffffffa0e1a310>] quotad_check_timeo.part.15+0x30/0x70 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.116140]  [<ffffffffa0e1c917>] gfs2_quotad+0x217/0x2a0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.137605]  [<ffffffff8107b4e0>] ? finish_wait+0x80/0x80
Jun 12 04:48:38 minnie kernel: [34818.158147]  [<ffffffffa0e1c700>] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.179833]  [<ffffffff8107ada0>] kthread+0xc0/0xd0
Jun 12 04:48:38 minnie kernel: [34818.180567] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.180568] pid: 10925
Jun 12 04:48:38 minnie kernel: [34818.180569] lock type: 8 req lock state : 1
Jun 12 04:48:38 minnie kernel: [34818.180574] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.180574] pid: 10925
Jun 12 04:48:38 minnie kernel: [34818.180575] lock type: 8 req lock state : 1
Jun 12 04:48:38 minnie kernel: [34818.180578]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:6 m:200
Jun 12 04:48:38 minnie kernel: [34818.180584]   H: s:EX f:cH e:0 p:10925 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.350334]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:48:38 minnie kernel: [34818.371705]  [<ffffffff8156dddc>] ret_from_fork+0x7c/0xb0
Jun 12 04:48:38 minnie kernel: [34818.391831]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:48:38 minnie kernel: [34818.412966] Code: a0 31 c0 e8 b4 a7 75 e0 49 8b 45 10 41 8b 55 20 48 c7 c7 38 fc e2 a0 8b 70 28 31 c0 e8 9b a7 75 e0 4c 89 f6 31 ff e8 53 e7 ff ff <0f> 0b 90 48 83 7d c8 00 48 8b 45 c8 48 0f 44 c3 48 89 45 c8 e9 
Jun 12 04:48:38 minnie kernel: [34818.466043] RIP  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.487862]  RSP <ffff8809fddf3c58>
Jun 12 04:48:38 minnie kernel: [34818.505921] ---[ end trace 09eb442b9da06400 ]---
Jun 12 04:48:38 minnie kernel: [34818.505928] ------------[ cut here ]------------
Jun 12 04:48:38 minnie kernel: [34818.505929] kernel BUG at fs/gfs2/glock.c:1029!
Jun 12 04:48:38 minnie kernel: [34818.505931] invalid opcode: 0000 [#5] SMP 
Jun 12 04:48:38 minnie kernel: [34818.505969] Modules linked in: ip_vs_lc ip_vs nf_conntrack gfs2 dlm sctp ipv6 8021q garp stp llc bonding pcmcia pcmcia_core quota_v2 quota_tree lp ppdev parport_pc parport fuse hid_generic coretemp acpi_cpufreq mperf freq_table igb mgag200 ttm processor kvm_intel thermal_sys kvm drm_kms_helper drm agpgart syscopyarea sysfillrect sysimgblt crc32c_intel ptp evdev gpio_ich psmouse serio_raw ioatdma microcode lpc_ich pps_core hwmon i2c_algo_bit i2c_i801 i2c_core dca button usbhid hid usb_storage ehci_pci uhci_hcd lpfc scsi_transport_fc ehci_hcd xfs exportfs [last unloaded: pcmcia_core]
Jun 12 04:48:38 minnie kernel: [34818.505973] CPU 7 
Jun 12 04:48:38 minnie kernel: [34818.505974] Pid: 10925, comm: gfs2_quotad Tainted: G      D      3.9.3 #1 Supermicro X8DTU/X8DTU
Jun 12 04:48:38 minnie kernel: [34818.505980] RIP: 0010:[<ffffffffa0e0a81d>]  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.505982] RSP: 0018:ffff8809faa39c58  EFLAGS: 00010292
Jun 12 04:48:38 minnie kernel: [34818.505983] RAX: 0000000000000000 RBX: ffff8809fa889500 RCX: 00000000000000fa
Jun 12 04:48:38 minnie kernel: [34818.505984] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffffffff8181e6f0
Jun 12 04:48:38 minnie kernel: [34818.505985] RBP: ffff8809faa39c98 R08: 000000000000062c R09: ffffffff819eb12f
Jun 12 04:48:38 minnie kernel: [34818.505985] R10: 0000000000000043 R11: 0000000000040000 R12: ffff8809ebeb60a8
Jun 12 04:48:38 minnie kernel: [34818.505986] R13: ffff8809fa889540 R14: ffff8809ebeb60a8 R15: ffff8809ebeb60f8
Jun 12 04:48:38 minnie kernel: [34818.505988] FS:  0000000000000000(0000) GS:ffff880a3fce0000(0000) knlGS:0000000000000000
Jun 12 04:48:38 minnie kernel: [34818.505989] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 12 04:48:38 minnie kernel: [34818.505990] CR2: 000000000060e9a0 CR3: 000000000180c000 CR4: 00000000000007e0
Jun 12 04:48:38 minnie kernel: [34818.505990] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 12 04:48:38 minnie kernel: [34818.505991] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 12 04:48:38 minnie kernel: [34818.505992] Process gfs2_quotad (pid: 10925, threadinfo ffff8809faa38000, task ffff880a0f523100)
Jun 12 04:48:38 minnie kernel: [34818.505993] Stack:
Jun 12 04:48:38 minnie kernel: [34818.505995]  ffff8809fbff9000 0000000000000000 ffff8809faa39c98 ffff8809fa889540
Jun 12 04:48:38 minnie kernel: [34818.506007]  0000000000000001 0000000000000002 000000000000000f ffff880a0d890608
Jun 12 04:48:38 minnie kernel: [34818.506018]  ffff8809faa39d78 ffffffffa0e1a931 ffff8809faa39d78 ffffffffa0e1a5a6
Jun 12 04:48:38 minnie kernel: [34818.506021] Call Trace:
Jun 12 04:48:38 minnie kernel: [34818.506030]  [<ffffffffa0e1a931>] do_sync+0x191/0x4c0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506040]  [<ffffffffa0e1a5a6>] ? bh_get+0x156/0x1f0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506049]  [<ffffffffa0e1acfd>] gfs2_quota_sync+0x9d/0x320 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506058]  [<ffffffffa0e1af8e>] gfs2_quota_sync_timeo+0xe/0x10 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506067]  [<ffffffffa0e1a310>] quotad_check_timeo.part.15+0x30/0x70 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506076]  [<ffffffffa0e1c917>] gfs2_quotad+0x217/0x2a0 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506083]  [<ffffffff8107b4e0>] ? finish_wait+0x80/0x80
Jun 12 04:48:38 minnie kernel: [34818.506091]  [<ffffffffa0e1c700>] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506097]  [<ffffffff8107ada0>] kthread+0xc0/0xd0
Jun 12 04:48:38 minnie kernel: [34818.506102]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:48:38 minnie kernel: [34818.506104]  [<ffffffff8156dddc>] ret_from_fork+0x7c/0xb0
Jun 12 04:48:38 minnie kernel: [34818.506106]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:48:38 minnie kernel: [34818.506121] Code: a0 31 c0 e8 b4 a7 75 e0 49 8b 45 10 41 8b 55 20 48 c7 c7 38 fc e2 a0 8b 70 28 31 c0 e8 9b a7 75 e0 4c 89 f6 31 ff e8 53 e7 ff ff <0f> 0b 90 48 83 7d c8 00 48 8b 45 c8 48 0f 44 c3 48 89 45 c8 e9 
Jun 12 04:48:38 minnie kernel: [34818.506126] RIP  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:48:38 minnie kernel: [34818.506126]  RSP <ffff8809faa39c58>
Jun 12 04:48:38 minnie kernel: [34818.506139] ---[ end trace 09eb442b9da06401 ]---
Jun 12 04:48:43 minnie kernel: [34823.838421] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838444] original: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838445] pid: 10972
Jun 12 04:48:43 minnie kernel: [34823.838446] lock type: 8 req lock state : 1
Jun 12 04:48:43 minnie kernel: [34823.838452] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838453] pid: 10972
Jun 12 04:48:43 minnie kernel: [34823.838453] lock type: 8 req lock state : 1
Jun 12 04:48:43 minnie kernel: [34823.838458]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:7 m:200
Jun 12 04:48:43 minnie kernel: [34823.838465]   H: s:EX f:cH e:0 p:10972 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838495] ------------[ cut here ]------------
Jun 12 04:48:43 minnie kernel: [34823.838496] kernel BUG at fs/gfs2/glock.c:1029!
Jun 12 04:48:43 minnie kernel: [34823.838498] invalid opcode: 0000 [#6] SMP 
Jun 12 04:48:43 minnie kernel: [34823.838531] Modules linked in: ip_vs_lc ip_vs nf_conntrack gfs2 dlm sctp ipv6 8021q garp stp llc bonding pcmcia pcmcia_core quota_v2 quota_tree lp ppdev parport_pc parport fuse hid_generic coretemp acpi_cpufreq mperf freq_table igb mgag200 ttm processor kvm_intel thermal_sys kvm drm_kms_helper drm agpgart syscopyarea sysfillrect sysimgblt crc32c_intel ptp evdev gpio_ich psmouse serio_raw ioatdma microcode lpc_ich pps_core hwmon i2c_algo_bit i2c_i801 i2c_core dca button usbhid hid usb_storage ehci_pci uhci_hcd lpfc scsi_transport_fc ehci_hcd xfs exportfs [last unloaded: pcmcia_core]
Jun 12 04:48:43 minnie kernel: [34823.838536] CPU 6 
Jun 12 04:48:43 minnie kernel: [34823.838537] Pid: 10972, comm: gfs2_quotad Tainted: G      D      3.9.3 #1 Supermicro X8DTU/X8DTU
Jun 12 04:48:43 minnie kernel: [34823.838544] RIP: 0010:[<ffffffffa0e0a81d>]  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838545] RSP: 0018:ffff880a0873bc58  EFLAGS: 00010292
Jun 12 04:48:43 minnie kernel: [34823.838547] RAX: 0000000000000000 RBX: ffff8809e62a6cc0 RCX: 0000000000000096
Jun 12 04:48:43 minnie kernel: [34823.838548] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffffffff8181e6f0
Jun 12 04:48:43 minnie kernel: [34823.838549] RBP: ffff880a0873bc98 R08: 0000000000000664 R09: ffffffff819ec2f7
Jun 12 04:48:43 minnie kernel: [34823.838550] R10: 0000000000000043 R11: 0000000000040000 R12: ffff8809e4ea06e0
Jun 12 04:48:43 minnie kernel: [34823.838551] R13: ffff8809e62a6d00 R14: ffff8809e4ea06e0 R15: ffff8809e4ea0730
Jun 12 04:48:43 minnie kernel: [34823.838555] FS:  0000000000000000(0000) GS:ffff880a3fcc0000(0000) knlGS:0000000000000000
Jun 12 04:48:43 minnie kernel: [34823.838556] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 12 04:48:43 minnie kernel: [34823.838557] CR2: 00007fff1a717a20 CR3: 000000000180c000 CR4: 00000000000007e0
Jun 12 04:48:43 minnie kernel: [34823.838558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 12 04:48:43 minnie kernel: [34823.838559] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 12 04:48:43 minnie kernel: [34823.838560] Process gfs2_quotad (pid: 10972, threadinfo ffff880a0873a000, task ffff880a0f4bb800)
Jun 12 04:48:43 minnie kernel: [34823.838561] Stack:
Jun 12 04:48:43 minnie kernel: [34823.838562]  ffff8809fbffb000 0000000000000000 ffff880a0873bc98 ffff8809e62a6d00
Jun 12 04:48:43 minnie kernel: [34823.838564]  0000000000000001 0000000000000003 000000000000000f ffff880a0f0f8008
Jun 12 04:48:43 minnie kernel: [34823.838565]  ffff880a0873bd78 ffffffffa0e1a931 ffff880a0873bd78 ffffffffa0e1a5a6
Jun 12 04:48:43 minnie kernel: [34823.838566] Call Trace:
Jun 12 04:48:43 minnie kernel: [34823.838573]  [<ffffffffa0e1a931>] do_sync+0x191/0x4c0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838578]  [<ffffffffa0e1a5a6>] ? bh_get+0x156/0x1f0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838584]  [<ffffffffa0e1acfd>] gfs2_quota_sync+0x9d/0x320 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838589]  [<ffffffffa0e1af8e>] gfs2_quota_sync_timeo+0xe/0x10 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838593]  [<ffffffffa0e1a310>] quotad_check_timeo.part.15+0x30/0x70 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838598]  [<ffffffffa0e1c917>] gfs2_quotad+0x217/0x2a0 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838604]  [<ffffffff8107b4e0>] ? finish_wait+0x80/0x80
Jun 12 04:48:43 minnie kernel: [34823.838609]  [<ffffffffa0e1c700>] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838611]  [<ffffffff8107ada0>] kthread+0xc0/0xd0
Jun 12 04:48:43 minnie kernel: [34823.838613]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:48:43 minnie kernel: [34823.838617]  [<ffffffff8156dddc>] ret_from_fork+0x7c/0xb0
Jun 12 04:48:43 minnie kernel: [34823.838619]  [<ffffffff8107ace0>] ? kthread_create_on_node+0x130/0x130
Jun 12 04:48:43 minnie kernel: [34823.838634] Code: a0 31 c0 e8 b4 a7 75 e0 49 8b 45 10 41 8b 55 20 48 c7 c7 38 fc e2 a0 8b 70 28 31 c0 e8 9b a7 75 e0 4c 89 f6 31 ff e8 53 e7 ff ff <0f> 0b 90 48 83 7d c8 00 48 8b 45 c8 48 0f 44 c3 48 89 45 c8 e9 
Jun 12 04:48:43 minnie kernel: [34823.838639] RIP  [<ffffffffa0e0a81d>] gfs2_glock_nq+0x33d/0x400 [gfs2]
Jun 12 04:48:43 minnie kernel: [34823.838639]  RSP <ffff880a0873bc58>
Jun 12 04:48:43 minnie kernel: [34823.838657] ---[ end trace 09eb442b9da06402 ]---
Jun 12 04:48:45 minnie kernel: [34824.941638] pid: 10983
Jun 12 04:48:45 minnie kernel: [34824.958808] lock type: 8 req lock state : 1
Jun 12 04:48:45 minnie kernel: [34824.977806] new: do_sync+0x189/0x4c0 [gfs2]
Jun 12 04:48:45 minnie kernel: [34824.996645] pid: 10983
Jun 12 04:48:45 minnie kernel: [34825.013475] lock type: 8 req lock state : 1
Jun 12 04:48:45 minnie kernel: [34825.032266]  G:  s:EX n:8/0 f:Iqb t:EX d:EX/0 a:0 v:0 r:6 m:200
Jun 12 04:48:45 minnie kernel: [34825.052556]   H: s:EX f:cH e:0 p:10983 [gfs2_quotad] do_sync+0x189/0x4c0 [gfs2]

From emi2fast at gmail.com  Mon Jun 17 15:42:56 2013
From: emi2fast at gmail.com (emmanuel segura)
Date: Mon, 17 Jun 2013 17:42:56 +0200
Subject: [Linux-cluster] kernel BUG at fs/gfs2/glock.c
In-Reply-To: <1753598.gZm6zVSfWa@zapp>
References: <1753598.gZm6zVSfWa@zapp>
Message-ID: <CAE7pJ3ANkjAUDh_Hrj9CMTd=rqaA0qYskV-5T10aeizmSLzmuQ@mail.gmail.com>

Hello Ali

Did you try asking on the kernel mailing list?

Thanks


2013/6/12 Ali Bendriss <ali.bendriss at gmail.com>

> Hello,
>
> I have enabled quota on a gfs2 file system through fstab (quota=on) on a
> cluster of 2 identical nodes, but I have not used it yet.
>
> Now I can see the kernel message in the attachment, on one node only.
>
> Thanks
>
> Software versions:
>
> linux 3.9.3
> cluster 3.2.0
> gfs2-utils 3.1.5
>
> --
> Ali
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



-- 
this is my life and I live it for as long as God wills

From ali.bendriss at gmail.com  Mon Jun 17 16:15:43 2013
From: ali.bendriss at gmail.com (Ali Bendriss)
Date: Mon, 17 Jun 2013 18:15:43 +0200
Subject: [Linux-cluster] kernel BUG at fs/gfs2/glock.c
In-Reply-To: <CAE7pJ3ANkjAUDh_Hrj9CMTd=rqaA0qYskV-5T10aeizmSLzmuQ@mail.gmail.com>
References: <1753598.gZm6zVSfWa@zapp>
	<CAE7pJ3ANkjAUDh_Hrj9CMTd=rqaA0qYskV-5T10aeizmSLzmuQ@mail.gmail.com>
Message-ID: <7392982.MCCGyWiZNM@zapp>

On Monday, June 17, 2013 05:42:56 PM emmanuel segura wrote:

Hello Ali, did you try asking on the kernel mailing list? Thanks

Thanks for your email.
Just after sending to the list I spotted my mistake:
the cluster 3.2 dlm library had been built against the wrong Linux kernel.
It seems that during compilation, cluster looks for the kernel source in
/usr/src/linux (which on my system was a symlink to another Linux version).

So I cleaned up /usr/src to link to the right kernel, recompiled the
cluster package, and all is fine now.
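
For anyone hitting the same thing, here is a minimal sketch of a check one could run before rebuilding. It is only an illustration, not part of the cluster package: the function names are made up, and it assumes the tree behind /usr/src/linux has been configured, so that include/config/kernel.release exists there. If that file is missing, the check simply reports no match.

```python
import os
import platform


def kernel_source_release(src_link="/usr/src/linux"):
    """Return the release string of the tree src_link resolves to, or None.

    Assumption: the tree has been configured/built, so that
    include/config/kernel.release is present.
    """
    tree = os.path.realpath(src_link)  # follow the symlink to the real tree
    release_file = os.path.join(tree, "include", "config", "kernel.release")
    try:
        with open(release_file) as fh:
            return fh.read().strip()
    except OSError:
        return None  # unconfigured tree or dangling symlink


def source_matches_running(src_link="/usr/src/linux", running=None):
    """True if the source tree's release equals the running kernel's release."""
    if running is None:
        running = platform.release()  # same string as `uname -r`
    return kernel_source_release(src_link) == running


if __name__ == "__main__":
    print("running kernel :", platform.release())
    print("source tree    :", os.path.realpath("/usr/src/linux"))
    print("source release :", kernel_source_release())
    print("match          :", source_matches_running())
```

If the last line prints False, the symlink probably points at a different kernel version than the one that is booted, which is the situation described above.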



From hsiddiqi at gmail.com  Mon Jun 17 19:05:00 2013
From: hsiddiqi at gmail.com (Hammad Siddiqi)
Date: Tue, 18 Jun 2013 00:05:00 +0500
Subject: [Linux-cluster] cman error in corosync
Message-ID: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>

Dear Fabio,

Thanks for the update. Is there a fix for this bug? I would really
appreciate it if a patch or update could be provided.

Thank you,

Hammad Siddiqi

From fdinitto at redhat.com  Mon Jun 17 20:10:45 2013
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 17 Jun 2013 22:10:45 +0200
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
Message-ID: <51BF6D45.4040303@redhat.com>

On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
> Dear Fabio,
> 
> Thanks for the update. Is there a fix for this bug? I would really
> appreciate it if a patch or update could be provided.
> 
> Thank you,
> 
> Hammad Siddiqi
> 
> 

The fix should be upstream already.

Chrissie, do you know if it's been included?

Fabio



From taiyebjadliwala1986 at gmail.com  Tue Jun 18 00:44:22 2013
From: taiyebjadliwala1986 at gmail.com (taiyeb jadliwala)
Date: Tue, 18 Jun 2013 06:14:22 +0530
Subject: [Linux-cluster] question on storage
Message-ID: <CAG0WgGrMdEMpiXTZwwvF=dTd1Q-jMRZAWD2_KUaC-XfHHSDpTA@mail.gmail.com>

Hello Experts,

I have a simple query.

Let's say I have two computers connected to a fiber channel switch. How
can I check the connectivity between them? For example, if the same two
computers were connected over Ethernet, I could use TCP/IP and the ping
command to test connectivity. Similarly, how can I check the connectivity
between two computers connected over fiber channel?

From ricks at alldigital.com  Tue Jun 18 01:25:48 2013
From: ricks at alldigital.com (Rick Stevens)
Date: Mon, 17 Jun 2013 18:25:48 -0700
Subject: [Linux-cluster] question on storage
In-Reply-To: <CAG0WgGrMdEMpiXTZwwvF=dTd1Q-jMRZAWD2_KUaC-XfHHSDpTA@mail.gmail.com>
References: <CAG0WgGrMdEMpiXTZwwvF=dTd1Q-jMRZAWD2_KUaC-XfHHSDpTA@mail.gmail.com>
Message-ID: <51BFB71C.8050107@alldigital.com>

On 06/17/2013 05:44 PM, taiyeb jadliwala issued this missive:
> Hello Experts,
>
> I have a simple query.
>
> Let say i am having two computer which are connected on fiber channel
> switch. How can i check the connectivity between these two computer. For
> eg if the same two computer are connected on ethernet i can use tcp/ip
> protocol and i can ping command to test the connectivity similarly how
> can i check the connectivity between two computers when connected using
> fiber channel.

Assuming the fiber channel switch's mapping is flat (every port can see
every other port), you could query the HBA on one server to see if it
sees the WWNs of the other server. You'd probably need one of the tools
that come with your HBAs to do this (e.g. "scli" for QLogic cards).
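
As a rough illustration of the idea above: on a Linux host, the kernel's FC
transport class normally exposes the local HBA ports and the remote ports they
have discovered under sysfs, so a first check is possible without vendor
tools. The following is a minimal Python sketch, assuming a Linux system whose
HBA driver registers with /sys/class/fc_host and /sys/class/fc_remote_ports.
The function names and the sysfs_root parameter are made up for this example.
Whether another initiator shows up as a remote port at all depends on the
switch zoning and the HBA driver.

```python
import glob
import os


def read_wwns(class_dir):
    """Return {entry_name: port_name} for every entry under a sysfs class dir."""
    wwns = {}
    for path in sorted(glob.glob(os.path.join(class_dir, "*", "port_name"))):
        entry = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            wwns[entry] = handle.read().strip().lower()
    return wwns


def can_see(peer_wwpn, sysfs_root="/sys/class"):
    """True if any local HBA currently lists peer_wwpn as a remote port."""
    remote = read_wwns(os.path.join(sysfs_root, "fc_remote_ports"))
    # Accept "10:00:..." as well as "0x1000..." spellings of the WWPN.
    wanted = peer_wwpn.strip().lower().replace(":", "")
    if not wanted.startswith("0x"):
        wanted = "0x" + wanted
    return wanted in remote.values()


if __name__ == "__main__":
    # Local WWPNs: run this on each server to learn what to look for.
    for host, wwpn in read_wwns("/sys/class/fc_host").items():
        print("local ", host, wwpn)
    # Remote WWPNs this server's HBAs have discovered through the fabric.
    for rport, wwpn in read_wwns("/sys/class/fc_remote_ports").items():
        print("remote", rport, wwpn)
```

Run it on both servers, note each one's local WWPNs, and then call can_see()
on one server with a WWPN of the other.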
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ricks at alldigital.com -
- AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
-                                                                    -
-    Admitting you have a problem is the first step toward getting   -
-    medicated for it.      -- Jim Evarts (http://www.TopFive.com)   -
----------------------------------------------------------------------



From ccaulfie at redhat.com  Tue Jun 18 08:12:31 2013
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 18 Jun 2013 09:12:31 +0100
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <51BF6D45.4040303@redhat.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com>
Message-ID: <51C0166F.4050208@redhat.com>

On 17/06/13 21:10, Fabio M. Di Nitto wrote:
> On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>> Dear Fabio,
>>
>> Thanks for the update. Is there any fix for this bug. Would really
>> appreciate if some patch or update is provided.
>>
>> Thank you,
>>
>> Hammad Siddiqi
>>
>>
>
> The fix should be upstream already.
>
> Chrissie do you know if it's been included?
>

I don't have the start of this thread in my email client. Which error?

Chrissie



From fdinitto at redhat.com  Tue Jun 18 08:16:56 2013
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Tue, 18 Jun 2013 10:16:56 +0200
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <51C0166F.4050208@redhat.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com> <51C0166F.4050208@redhat.com>
Message-ID: <51C01778.6070304@redhat.com>

On 6/18/2013 10:12 AM, Christine Caulfield wrote:
> On 17/06/13 21:10, Fabio M. Di Nitto wrote:
>> On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>>> Dear Fabio,
>>>
>>> Thanks for the update. Is there any fix for this bug. Would really
>>> appreciate if some patch or update is provided.
>>>
>>> Thank you,
>>>
>>> Hammad Siddiqi
>>>
>>>
>>
>> The fix should be upstream already.
>>
>> Chrissie do you know if it's been included?
>>
> 
> I don't have the start of this thread in my email client. Which error?

https://www.redhat.com/archives/linux-cluster/2013-April/msg00009.html

Cheers
Fabio



From ccaulfie at redhat.com  Tue Jun 18 08:23:02 2013
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 18 Jun 2013 09:23:02 +0100
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <51C01778.6070304@redhat.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com> <51C0166F.4050208@redhat.com>
	<51C01778.6070304@redhat.com>
Message-ID: <51C018E6.2070900@redhat.com>

On 18/06/13 09:16, Fabio M. Di Nitto wrote:
> On 6/18/2013 10:12 AM, Christine Caulfield wrote:
>> On 17/06/13 21:10, Fabio M. Di Nitto wrote:
>>> On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>>>> Dear Fabio,
>>>>
>>>> Thanks for the update. Is there any fix for this bug. Would really
>>>> appreciate if some patch or update is provided.
>>>>
>>>> Thank you,
>>>>
>>>> Hammad Siddiqi
>>>>
>>>>
>>>
>>> The fix should be upstream already.
>>>
>>> Chrissie do you know if it's been included?
>>>
>>
>> I don't have the start of this thread in my email client. Which error?
>
> https://www.redhat.com/archives/linux-cluster/2013-April/msg00009.html
>
>

That's a bug in modclusterd, not cman. It's been fixed upstream in ricci 
but I'm not sure if it's out in public packages yet:

https://bugzilla.redhat.com/show_bug.cgi?id=951470

Chrissie



From hsiddiqi at gmail.com  Tue Jun 18 13:38:46 2013
From: hsiddiqi at gmail.com (Hammad Siddiqi)
Date: Tue, 18 Jun 2013 18:38:46 +0500
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <51C018E6.2070900@redhat.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com> <51C0166F.4050208@redhat.com>
	<51C01778.6070304@redhat.com> <51C018E6.2070900@redhat.com>
Message-ID: <CABE8nt_cr=E5GZ2O2rwAZ5tD3LuUSE+LgdG9xYRGuKPbSyx2Fw@mail.gmail.com>

Thanks for sharing this info. Would you please let me know when the update
will be available in public packages, and in which versions?

Regards,

Hammad Siddiqi


On Tue, Jun 18, 2013 at 1:23 PM, Christine Caulfield <ccaulfie at redhat.com> wrote:

> On 18/06/13 09:16, Fabio M. Di Nitto wrote:
>
>> On 6/18/2013 10:12 AM, Christine Caulfield wrote:
>>
>>> On 17/06/13 21:10, Fabio M. Di Nitto wrote:
>>>
>>>> On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>>>>
>>>>> Dear Fabio,
>>>>>
>>>>> Thanks for the update. Is there any fix for this bug. Would really
>>>>> appreciate if some patch or update is provided.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Hammad Siddiqi
>>>>>
>>>>>
>>>>>
>>>> The fix should be upstream already.
>>>>
>>>> Chrissie do you know if it's been included?
>>>>
>>>>
>>> I don't have the start of this thread in my email client. Which error?
>>>
>>
>> https://www.redhat.com/archives/linux-cluster/2013-April/msg00009.html
>>
>>
>>
> That's a bug in modclusterd, not cman. It's been fixed upstream in ricci
> but I'm not sure if it's out in public packages yet:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=951470
>
> Chrissie
>

From hsiddiqi at gmail.com  Tue Jun 18 14:45:08 2013
From: hsiddiqi at gmail.com (Hammad Siddiqi)
Date: Tue, 18 Jun 2013 19:45:08 +0500
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <CABE8nt_cr=E5GZ2O2rwAZ5tD3LuUSE+LgdG9xYRGuKPbSyx2Fw@mail.gmail.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com> <51C0166F.4050208@redhat.com>
	<51C01778.6070304@redhat.com> <51C018E6.2070900@redhat.com>
	<CABE8nt_cr=E5GZ2O2rwAZ5tD3LuUSE+LgdG9xYRGuKPbSyx2Fw@mail.gmail.com>
Message-ID: <CABE8nt_tbY1C4A1sbSnVH0gLoYTYnjrsaGjaRcM_z=9VxbPNmg@mail.gmail.com>

Also, I tried visiting the Bugzilla link, but my account does not have
sufficient privileges to view the details of this bug. It would be really
great if you could explain what the bug actually is.

Thanks a lot for your prompt replies.


Hammad Siddiqi


On Tue, Jun 18, 2013 at 6:38 PM, Hammad Siddiqi <hsiddiqi at gmail.com> wrote:

> Thanks for sharing this info. Would you please let know when the update
> will be available in public packages and which versions.
>
> Regards,
>
> Hammad Siddiqi
>
>
> On Tue, Jun 18, 2013 at 1:23 PM, Christine Caulfield <ccaulfie at redhat.com> wrote:
>
>> On 18/06/13 09:16, Fabio M. Di Nitto wrote:
>>
>>> On 6/18/2013 10:12 AM, Christine Caulfield wrote:
>>>
>>>> On 17/06/13 21:10, Fabio M. Di Nitto wrote:
>>>>
>>>>> On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>>>>>
>>>>>> Dear Fabio,
>>>>>>
>>>>>> Thanks for the update. Is there any fix for this bug. Would really
>>>>>> appreciate if some patch or update is provided.
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Hammad Siddiqi
>>>>>>
>>>>>>
>>>>>>
>>>>> The fix should be upstream already.
>>>>>
>>>>> Chrissie do you know if it's been included?
>>>>>
>>>>>
>>>> I don't have the start of this thread in my email client. Which error?
>>>>
>>>
>>> https://www.redhat.com/archives/linux-cluster/2013-April/msg00009.html
>>>
>>>
>>>
>> That's a bug in modclusterd, not cman. It's been fixed upstream in ricci
>> but I'm not sure if it's out in public packages yet:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=951470
>>
>> Chrissie
>>

From ccaulfie at redhat.com  Tue Jun 18 15:22:39 2013
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 18 Jun 2013 16:22:39 +0100
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <CABE8nt_tbY1C4A1sbSnVH0gLoYTYnjrsaGjaRcM_z=9VxbPNmg@mail.gmail.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com> <51C0166F.4050208@redhat.com>
	<51C01778.6070304@redhat.com> <51C018E6.2070900@redhat.com>
	<CABE8nt_cr=E5GZ2O2rwAZ5tD3LuUSE+LgdG9xYRGuKPbSyx2Fw@mail.gmail.com>
	<CABE8nt_tbY1C4A1sbSnVH0gLoYTYnjrsaGjaRcM_z=9VxbPNmg@mail.gmail.com>
Message-ID: <51C07B3F.9050004@redhat.com>

Oh sorry.

The bug is simply an error in the way that modclusterd calls into cman:
it passes an invalid parameter, so cman prints the message.
The bug has no repercussions apart from that message, as modclusterd has
a fallback command to get the information it needs. TBH the bugzilla
says nothing more interesting than that, apart from the patch needed to
fix it :)

I don't have any information about the release of a fixed package, sorry.

Chrissie

On 18/06/13 15:45, Hammad Siddiqi wrote:
> Also, I had tried visiting the bugzilla link but my account does not
> have sufficient privileges to get detailed info for this bug. It would
> be really great if you enlighten what the bug really is.
>
> Thanks a lot for your prompt replies
>
>
> Hammad Siddiqi
>
>
> On Tue, Jun 18, 2013 at 6:38 PM, Hammad Siddiqi <hsiddiqi at gmail.com> wrote:
>
>     Thanks for sharing this info. Would you please let know when the
>     update will be available in public packages and which versions.
>
>     Regards,
>
>     Hammad Siddiqi
>
>
>     On Tue, Jun 18, 2013 at 1:23 PM, Christine Caulfield
>     <ccaulfie at redhat.com> wrote:
>
>         On 18/06/13 09:16, Fabio M. Di Nitto wrote:
>
>             On 6/18/2013 10:12 AM, Christine Caulfield wrote:
>
>                 On 17/06/13 21:10, Fabio M. Di Nitto wrote:
>
>                     On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>
>                         Dear Fabio,
>
>                         Thanks for the update. Is there any fix for this
>                         bug. Would really
>                         appreciate if some patch or update is provided.
>
>                         Thank you,
>
>                         Hammad Siddiqi
>
>
>
>                     The fix should be upstream already.
>
>                     Chrissie do you know if it's been included?
>
>
>                 I don't have the start of this thread in my email
>                 client. Which error?
>
>
>             https://www.redhat.com/archives/linux-cluster/2013-April/msg00009.html
>
>
>
>         That's a bug in modclusterd, not cman. It's been fixed upstream
>         in ricci but I'm not sure if it's out in public packages yet:
>
>         https://bugzilla.redhat.com/show_bug.cgi?id=951470
>
>         Chrissie



From hsiddiqi at gmail.com  Tue Jun 18 15:32:33 2013
From: hsiddiqi at gmail.com (Hammad Siddiqi)
Date: Tue, 18 Jun 2013 20:32:33 +0500
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <51C07B3F.9050004@redhat.com>
References: <CABE8nt88Nk0T4g1e4-1RZfo1FbE-V4HSOZE2Jv2Gu1ed9d1tPw@mail.gmail.com>
	<51BF6D45.4040303@redhat.com> <51C0166F.4050208@redhat.com>
	<51C01778.6070304@redhat.com> <51C018E6.2070900@redhat.com>
	<CABE8nt_cr=E5GZ2O2rwAZ5tD3LuUSE+LgdG9xYRGuKPbSyx2Fw@mail.gmail.com>
	<CABE8nt_tbY1C4A1sbSnVH0gLoYTYnjrsaGjaRcM_z=9VxbPNmg@mail.gmail.com>
	<51C07B3F.9050004@redhat.com>
Message-ID: <CABE8nt_Ujdo6XzANbHFYt5rJ1wKQ_OiSsyYHnQUQWWFZmwdRCA@mail.gmail.com>

Dear Chrissie,

Thanks for the update. I will be waiting for the patch for this.

Regards,

Hammad Siddiqi


On Tue, Jun 18, 2013 at 8:22 PM, Christine Caulfield <ccaulfie at redhat.com> wrote:

> Oh sorry.
>
> The bug is simply an error in the way that modclusterd calls into cman -
> it passes an invalid parameter so cman prints the message.
> The bug has no repercussions apart from that message as modclusterd has a
> fallback command to get the information it needs. TBH the bugzilla says
> nothing more interesting that that, apart from the patch needed to fix it :)
>
> I don't have any information about the release of a fixed package, sorry.
>
> Chrissie
>
>
> On 18/06/13 15:45, Hammad Siddiqi wrote:
>
>> Also, I had tried visiting the bugzilla link but my account does not
>> have sufficient privileges to get detailed info for this bug. It would
>> be really great if you enlighten what the bug really is.
>>
>> Thanks a lot for your prompt replies
>>
>>
>> Hammad Siddiqi
>>
>>
>> On Tue, Jun 18, 2013 at 6:38 PM, Hammad Siddiqi <hsiddiqi at gmail.com> wrote:
>>
>>     Thanks for sharing this info. Would you please let know when the
>>     update will be available in public packages and which versions.
>>
>>     Regards,
>>
>>     Hammad Siddiqi
>>
>>
>>     On Tue, Jun 18, 2013 at 1:23 PM, Christine Caulfield
>>     <ccaulfie at redhat.com> wrote:
>>
>>         On 18/06/13 09:16, Fabio M. Di Nitto wrote:
>>
>>             On 6/18/2013 10:12 AM, Christine Caulfield wrote:
>>
>>                 On 17/06/13 21:10, Fabio M. Di Nitto wrote:
>>
>>                     On 06/17/2013 09:05 PM, Hammad Siddiqi wrote:
>>
>>                         Dear Fabio,
>>
>>                         Thanks for the update. Is there any fix for this
>>                         bug. Would really
>>                         appreciate if some patch or update is provided.
>>
>>                         Thank you,
>>
>>                         Hammad Siddiqi
>>
>>
>>
>>                     The fix should be upstream already.
>>
>>                     Chrissie do you know if it's been included?
>>
>>
>>                 I don't have the start of this thread in my email
>>                 client. Which error?
>>
>>
>>             https://www.redhat.com/archives/linux-cluster/2013-April/msg00009.html
>>
>>
>>
>>         That's a bug in modclusterd, not cman. It's been fixed upstream
>>         in ricci but I'm not sure if it's out in public packages yet:
>>
>>         https://bugzilla.redhat.com/show_bug.cgi?id=951470
>>
>>         Chrissie

From lists at alteeve.ca  Fri Jun 21 01:46:34 2013
From: lists at alteeve.ca (Digimer)
Date: Thu, 20 Jun 2013 21:46:34 -0400
Subject: [Linux-cluster] Status of fencing KVM guests across multiple hosts
Message-ID: <51C3B07A.9090400@alteeve.ca>

Hi all,

   I want to update the Guest Fencing docs on 
http://clusterlabs.org/wiki/Guest_Fencing and write a little tutorial as 
well. The CL page says:

For Guests Running on Multiple Hosts

Not yet supported, check back soon.

Rough commands:

   I wanted to know where this was today, as I think that's a little out 
of date now. Has support been added? If not, does it work at a technical 
level and simply isn't supported by Red Hat?

   Cheers

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



From bubble at hoster-ok.com  Fri Jun 21 06:34:06 2013
From: bubble at hoster-ok.com (Vladislav Bogdanov)
Date: Fri, 21 Jun 2013 09:34:06 +0300
Subject: [Linux-cluster] Status of fencing KVM guests across multiple
 hosts
In-Reply-To: <51C3B07A.9090400@alteeve.ca>
References: <51C3B07A.9090400@alteeve.ca>
Message-ID: <51C3F3DE.4020008@hoster-ok.com>

21.06.2013 04:46, Digimer wrote:
> Hi all,
> 
>   I want to update the Guest Fencing docs on
> http://clusterlabs.org/wiki/Guest_Fencing and write a little tutorial as
> well. The CL page says;
> 
> For Guests Running on Multiple Hosts
> 
> Not yet supported, check back soon.
> 
> Rough commands:
> 
>   I wanted to know where this was today, as I think that's a little out
> of date now. Has support been added? If not, does it work at a technical
> level and simply isn't supported by Red Hat?

It works out of the box on the latest Fedora releases, but was removed from
EL6.4. Package names have changed a bit (qpid -> qmf), but the idea is still
the same.

http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018662.html

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.4_Technical_Notes/matahari.html

> 
>   Cheers
> 



From lists at alteeve.ca  Fri Jun 21 19:22:42 2013
From: lists at alteeve.ca (Digimer)
Date: Fri, 21 Jun 2013 15:22:42 -0400
Subject: [Linux-cluster] Status of fencing KVM guests across multiple
 hosts
In-Reply-To: <51C3F3DE.4020008@hoster-ok.com>
References: <51C3B07A.9090400@alteeve.ca> <51C3F3DE.4020008@hoster-ok.com>
Message-ID: <51C4A802.7080704@alteeve.ca>

On 06/21/2013 02:34 AM, Vladislav Bogdanov wrote:
> 21.06.2013 04:46, Digimer wrote:
>> Hi all,
>>
>>    I want to update the Guest Fencing docs on
>> http://clusterlabs.org/wiki/Guest_Fencing and write a little tutorial as
>> well. The CL page says;
>>
>> For Guests Running on Multiple Hosts
>>
>> Not yet supported, check back soon.
>>
>> Rough commands:
>>
>>    I wanted to know where this was today, as I think that's a little out
>> of date now. Has support been added? If not, does it work at a technical
>> level and simply isn't supported by Red Hat?
>
> It works for latest fedoras out-of-the-box, but was removed from EL6.4.
> Package names changed a bit (qpid -> qmf), but idea still the same.
>
> http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018662.html
>
> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.4_Technical_Notes/matahari.html
>
>>
>>    Cheers

Thanks, Vladislav!

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



From sghosh at redhat.com  Fri Jun 21 20:43:44 2013
From: sghosh at redhat.com (Subhendu Ghosh)
Date: Fri, 21 Jun 2013 16:43:44 -0400
Subject: [Linux-cluster] Status of fencing KVM guests across multiple
 hosts
In-Reply-To: <51C3F3DE.4020008@hoster-ok.com>
References: <51C3B07A.9090400@alteeve.ca> <51C3F3DE.4020008@hoster-ok.com>
Message-ID: <51C4BB00.9080600@redhat.com>

On 06/21/2013 02:34 AM, Vladislav Bogdanov wrote:
> 21.06.2013 04:46, Digimer wrote:
>> Hi all,
>>
>>   I want to update the Guest Fencing docs on
>> http://clusterlabs.org/wiki/Guest_Fencing and write a little tutorial as
>> well. The CL page says;
>>
>> For Guests Running on Multiple Hosts
>>
>> Not yet supported, check back soon.
>>
>> Rough commands:
>>
>>   I wanted to know where this was today, as I think that's a little out
>> of date now. Has support been added? If not, does it work at a technical
>> level and simply isn't supported by Red Hat?
> 
> It works for latest fedoras out-of-the-box, but was removed from EL6.4.
> Package names changed a bit (qpid -> qmf), but idea still the same.
> 
> http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018662.html
> 
> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.4_Technical_Notes/matahari.html
> 
>>
>>   Cheers
>>
> 

Note: if you have guest migration capability and use it, then your fencing
operation should coordinate with the guest migration/location control.
RHEV/oVirt or vSphere are the typical control points. Otherwise you have
dueling mechanisms for controlling the KVM guests.

cheers
Subhendu



From fdinitto at redhat.com  Sat Jun 22 04:28:47 2013
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Sat, 22 Jun 2013 06:28:47 +0200
Subject: [Linux-cluster] Status of fencing KVM guests across multiple
 hosts
In-Reply-To: <51C4BB00.9080600@redhat.com>
References: <51C3B07A.9090400@alteeve.ca> <51C3F3DE.4020008@hoster-ok.com>
	<51C4BB00.9080600@redhat.com>
Message-ID: <51C527FF.8020308@redhat.com>

On 06/21/2013 10:43 PM, Subhendu Ghosh wrote:
> On 06/21/2013 02:34 AM, Vladislav Bogdanov wrote:
>> 21.06.2013 04:46, Digimer wrote:
>>> Hi all,
>>>
>>>   I want to update the Guest Fencing docs on
>>> http://clusterlabs.org/wiki/Guest_Fencing and write a little tutorial as
>>> well. The CL page says;
>>>
>>> For Guests Running on Multiple Hosts
>>>
>>> Not yet supported, check back soon.
>>>
>>> Rough commands:
>>>
>>>   I wanted to know where this was today, as I think that's a little out
>>> of date now. Has support been added? If not, does it work at a technical
>>> level and simply isn't supported by Red Hat?
>>
>> It works for latest fedoras out-of-the-box, but was removed from EL6.4.
>> Package names changed a bit (qpid -> qmf), but idea still the same.
>>
>> http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018662.html
>>
>> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.4_Technical_Notes/matahari.html
>>
>>>
>>>   Cheers
>>>
>>
> 
> Note: if you have guest migration capability and use it, then your fencing
> operation
> should coordinate with the guest migration/location control. RHEV/oVirt or vSphere
> are the typical control points. Otherwise you have dueling mechanisms for
> controlling
> the KVM guests.

We don't really support live-migration of virtualized cluster nodes.
There is a technical barrier that makes it almost impossible to achieve.
We do have some best practices and a process for doing it, though.

Fabio



From adrian.gibanel at btactic.com  Wed Jun 26 17:55:13 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Wed, 26 Jun 2013 19:55:13 +0200 (CEST)
Subject: [Linux-cluster] fence_ovh - Fence agent for OVH (Proxmox 3)
In-Reply-To: <3826574.639.1370207568347.JavaMail.adrian@adrianworktop>
Message-ID: <498927.2707.1372269310026.JavaMail.adrian@adrianworktop>

  I've improved my earlier fence_ovh script so that it works on Proxmox 3 and uses the suds library, as was suggested to me on the linux-cluster mailing list.

1) What is fence_ovh 

fence_ovh is a fence agent written in Python for the large French datacentre provider OVH. You can find information about OVH at http://www.ovh.co.uk/. I also want to make clear that I'm not part of the official OVH staff.

2) Features 
The script has two main functions: 

* Reboot into rescue mode (action=off) 
* Reboot from the hard disk (action=on or action=reboot)

3) Technical details
As you might deduce, the classical fence mechanism of turning off the other node is not implemented by actually powering the machine off, but by rebooting it into rescue mode.

Another point worth mentioning is that the script checks that the machine has rebooted correctly into rescue mode, using an OVH API call that reports the date when the server last rebooted. The OVH API is also used for the main function, which is rebooting the machine into rescue mode.
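
The check described above can be sketched roughly as follows. This is only an illustration of the polling idea, not the code from the attached script: request_rescue_reboot and get_last_reboot are hypothetical callables standing in for the OVH API calls, and the timeout and poll interval are arbitrary assumptions.

```python
import time
from datetime import datetime, timedelta, timezone


def utc_now():
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def fence_off(request_rescue_reboot, get_last_reboot,
              timeout=timedelta(minutes=5), poll_seconds=10,
              now=utc_now, sleep=time.sleep):
    """Reboot a node into rescue mode and confirm that the reboot happened.

    Returns True once the reported reboot date is at or after the moment the
    fence request was made, and False if the timeout expires first.
    """
    requested_at = now()
    request_rescue_reboot()          # hypothetical wrapper around the API call
    deadline = requested_at + timeout
    while now() < deadline:
        last_reboot = get_last_reboot()   # hypothetical: returns a datetime or None
        if last_reboot is not None and last_reboot >= requested_at:
            return True              # node restarted after our request
        sleep(poll_seconds)
    return False                     # no fresh reboot date seen: report failure
```

Comparing the reported reboot date with the local clock assumes that both clocks are reasonably in sync. A fence agent should report failure when the confirmation never arrives, so that the cluster does not assume the node is safely down.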

4) How to use it 

4.1) Make sure the python-suds package is installed (Debian/Ubuntu).
4.2) Save fence_ovh in /usr/sbin.
4.3) Run ccs_update_schema so that the new metadata is added to cluster.rng.
4.4) If needed, validate your configuration:
ccs_config_validate -v -f /etc/pve/cluster.conf.new
4.5) Here's an example of how to use it in cluster.conf:

<?xml version="1.0"?>
<cluster name="ha-008-010" config_version="3">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"
      two_node="1" expected_votes="1">
</cman>

<fencedevices>
  <fencedevice agent="fence_ovh"
               name="fence008" email="myadmin@domain.com"
               ipaddr="ns1234" login="ab12345-ovh" passwd="MYSECRET" />
  <fencedevice agent="fence_ovh"
               name="fence010" email="myadmin@domain.com"
               ipaddr="ns5678" login="ab12345-ovh" passwd="MYSECRET" />
</fencedevices>

<clusternodes>
  <clusternode name="nodeA.your.domain" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="fence008" action="off"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="nodeB.your.domain" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="fence010" action="off"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>



</cluster>
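
Before running ccs_config_validate, a quick sanity check of a configuration like the one above can be sketched in Python. This is a hypothetical helper, not part of the cluster tools: it only checks that every device referenced under a clusternode is declared as a fencedevice and that each node has at least one fence device. It does not replace schema validation.

```python
import xml.etree.ElementTree as ET


def check_fence_references(conf_xml):
    """Return a list of problems found in a cluster.conf document (as text)."""
    root = ET.fromstring(conf_xml)
    # Names declared in the <fencedevices> section.
    defined = {dev.get("name") for dev in root.iter("fencedevice")}
    problems = []
    for node in root.iter("clusternode"):
        devices = node.findall("./fence/method/device")
        if not devices:
            problems.append("%s: no fence device configured" % node.get("name"))
        for dev in devices:
            if dev.get("name") not in defined:
                problems.append("%s: unknown fence device %r"
                                % (node.get("name"), dev.get("name")))
    return problems
```

For example, reading /etc/pve/cluster.conf.new into a string and passing it to check_fence_references() should return an empty list for the configuration shown above.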



Finally, I have attached the first version of the fence_ovh script for Proxmox 3 to this email.

The original Proxmox forum thread from which I adapted the original secofor script: http://forum.proxmox.com/threads/11066-Proxmox-HA-Cluster-at-OVH-Fencing?p=75152#post75152

-- 
Adrián Gibanel
I.T. Manager 

+34 675 683 301 
www.btactic.com 



-------------- next part --------------
A non-text attachment was scrubbed...
Name: fence_ovh
Type: text/x-python
Size: 6224 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20130626/8c7aa707/attachment.py>