From jsosic at srce.hr Wed Apr 3 17:46:23 2013 From: jsosic at srce.hr (Jakov Sosic) Date: Wed, 03 Apr 2013 19:46:23 +0200 Subject: [Linux-cluster] quorum In-Reply-To: References: Message-ID: <515C6AEF.1010806@srce.hr> On 03/22/2013 02:42 AM, Prathyush wrote: > Hi, > > I have a 3 node cluster with 1 vote each, and decided to shut down 2 of the > nodes for maintenance. > When a proper power off / init 0 is given to two of the nodes, I never lose > the quorum. > When I force a power off (switching off the power / pulling > the power cable), quorum is lost and services fail on the third node. > Why is this happening... any idea? That's exactly how it should be. When you have a 3 node cluster and the first node goes down, it's irrelevant whether it went down by a standard shutdown or by pulling the cord - the other two nodes still have enough votes to hold quorum. But now you're on just 2 out of 3 votes and things get a little more complicated. If you shut down one of the remaining two nodes by the standard shutdown procedure, the cman service will announce that the node is leaving the cluster (ccs_tool leave). The other node will then know that it's the only one left, and although it won't have quorum, it will continue to provide services. It has no reason not to - the cluster was disassembled cleanly. But if you pull the power out of one of the two remaining nodes, simulating its crash, it's correct for the remaining node to stop providing services. It simply doesn't know what's happening to the other node; the other node could think the same thing, and you would get a split-brain scenario. One thing that could solve your problem is the introduction of a quorum disk holding 2 votes: in the latter scenario, the node that survives would take the SCSI reservation on the quorum disk and would thus hold 3 out of 5 votes. What kind of fencing are you using? Is it IPMI?
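For illustration, a quorum disk like the one described above would be declared in cluster.conf roughly as follows; the label and timing values are made-up examples, not taken from this thread:

   <cman expected_votes="5"/>
   <quorumd interval="1" tko="10" votes="2" label="rhcs_qdisk"/>

With three 1-vote nodes plus a 2-vote quorum disk, expected_votes becomes 5, so a single surviving node that still owns the quorum disk holds 3 of 5 votes and keeps quorum.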
From jsosic at srce.hr Wed Apr 3 18:07:16 2013 From: jsosic at srce.hr (Jakov Sosic) Date: Wed, 03 Apr 2013 20:07:16 +0200 Subject: [Linux-cluster] Can anyone help tackling issues with custom resource agent for RHCS? In-Reply-To: References: <5127B285.40303@alteeve.ca> Message-ID: <515C6FD4.9010203@srce.hr> On 02/24/2013 11:49 AM, Ralph.Grothe at itdz-berlin.de wrote: > Hallo digimer, > > I already knew this link and have read the FAQs and other stuff > there. > > Unfortunately, many features such as dependencies between cluster > services, which our customers demand from us to be enabled in > their clusters (and which they have been accustomed to in their > former clusters (e.g. Veritas) that are to be migrated to > RHCS), are hardly documented anywhere. > > But when I posted my query I was mistaken. > My ifxdb agent isn't dysfunctional. It really works. > But what it still lacks is that clurgmgrd doesn't log its actions, > despite the fact that I used the mentioned ocf_log function (I also > check in my agent if that function is defined at run time, and if > not I re-source /usr/share/cluster/ocf-shellfuncs), and > although it logs every step whenever I run it against the disabled > service through the rg_test utility. > I have no explanation why clurgmgrd is so taciturn when it comes > to logging output from my ifxdb agent. > > I think that I have enabled logging up to debug level. > > [root at altair:/usr/share/cluster] > # grep rm /etc/cluster/cluster.conf > central_processing="1"> > > [root at altair:/usr/share/cluster] > # grep local6 /etc/syslog.conf > local6.* > /var/log/clurgmgrd.log You can test your agents by running them from the command line. So, for example, this is one of my resources: and this is the service that uses it: As you can see, I have two custom RAs: mdraid and pgsql91. When I tested the 'mdraid' agent, I changed my service to look like this: After enabling the service, I would test it from the CLI by running: # OCF_RESKEY_config_file="/etc/mdadm-extern.conf" \ OCF_RESKEY_name="extern" \ OCF_RESKEY_ssh_check="1" \ bash /usr/share/cluster/mdraid.sh status and I get the following output (for example): mdraid: Improper setup detected [mdraid.sh] mdraid: Improper setup detected * device "extern" is active on node "database02-xc" [mdraid.sh] * device "extern" is active on node "database02-xc" and in /var/log/messages: Apr 3 20:04:50 database01 rgmanager[28145]: [mdraid.sh] mdraid: Improper setup detected Apr 3 20:04:50 database01 rgmanager[28167]: [mdraid.sh] * device "extern" is active on node "database02-xc" So, I guess what you should do is try to run your agent and get it to log to stdout and to /var/log/messages. Maybe you are not using the ocf_log function properly? Can you maybe share your agent with us, so somebody can test it in their environment? Hope this post helps ;)
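To illustrate the ocf_log point above, here is a minimal, hypothetical skeleton of how an agent under /usr/share/cluster usually sources the shared shell functions and logs through them; the agent name and messages are invented, this is not the ifxdb agent from this thread:

   #!/bin/bash
   #
   # Minimal, hypothetical agent skeleton - illustration only.
   # ocf-shellfuncs is what provides ocf_log.
   . $(dirname $0)/ocf-shellfuncs

   case "$1" in
   status|monitor)
           ocf_log debug "myagent: status check starting"
           # ... the real status check would go here ...
           ocf_log info "myagent: instance \"$OCF_RESKEY_name\" is running"
           exit 0
           ;;
   *)
           ocf_log err "myagent: action '$1' not handled in this sketch"
           exit 1
           ;;
   esac

Run by hand or through rg_test, messages logged this way show up on stdout; when clurgmgrd runs the agent they should end up in syslog, subject to the configured rgmanager log level.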
From jsosic at srce.hr Wed Apr 3 18:11:02 2013 From: jsosic at srce.hr (Jakov Sosic) Date: Wed, 03 Apr 2013 20:11:02 +0200 Subject: [Linux-cluster] Cluster Shut Down Procedures In-Reply-To: <9F9C54F90C94584AAADAD6877B09C42D0F592F75@HDXDSP51.us.lmco.com> References: <9F9C54F90C94584AAADAD6877B09C42D0F592F75@HDXDSP51.us.lmco.com> Message-ID: <515C70B6.9050904@srce.hr> On 01/24/2013 07:43 PM, Krampach, Stephen wrote: > I hate to ask simple questions however, I've been perusing > > books and blogs for two hours and have no definitive procedure; > > > > We are having a power outage. What is the procedure to completely > > shut down and power off a Red Hat 6.3 cluster? I would personally stop all the HA services first (clusvcadm disable), and after that shut the nodes down one by one. If you have auto-poweron in the BIOS, I would also recommend disabling rgmanager and cman for the next startup, to avoid multiple power on/offs in case of an unstable power supply (chkconfig rgmanager off && chkconfig cman off). After the power is restored and the power supply is stable, I would just turn everything back on.
From jsosic at srce.hr Wed Apr 3 18:13:36 2013 From: jsosic at srce.hr (Jakov Sosic) Date: Wed, 03 Apr 2013 20:13:36 +0200 Subject: [Linux-cluster] rgmanager log level Message-ID: <515C7150.1010704@srce.hr> Hi, after I change the rgmanager log level and push the new cluster.conf, do I have to restart rgmanager on all nodes for them to apply the new settings, or not? ty
From emi2fast at gmail.com Wed Apr 3 21:44:02 2013 From: emi2fast at gmail.com (emmanuel segura) Date: Wed, 3 Apr 2013 23:44:02 +0200 Subject: [Linux-cluster] rgmanager log level In-Reply-To: <515C7150.1010704@srce.hr> References: <515C7150.1010704@srce.hr> Message-ID: Increment your cluster config version and run ccs_tool update /etc/cluster/cluster.conf; I think that should be enough. 2013/4/3 Jakov Sosic > [...] -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL:
From robinskthomas at gmail.com Fri Apr 5 07:47:16 2013 From: robinskthomas at gmail.com (Robins Kthomas) Date: Fri, 5 Apr 2013 13:17:16 +0530 Subject: [Linux-cluster] Problem With Starting CMAN Message-ID: When I start the cman service using the command 'service cman start', a problem occurs: *ccsd is not running* Please give me a solution. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From prathyush.r at gmail.com Sat Apr 13 18:26:55 2013 From: prathyush.r at gmail.com (Prathyush) Date: Sat, 13 Apr 2013 23:56:55 +0530 Subject: [Linux-cluster] quorum In-Reply-To: <515C6AEF.1010806@srce.hr> References: <515C6AEF.1010806@srce.hr> Message-ID: Hi Jakov, I got your points, thanks a lot. Regards Prathyush On Wed, Apr 3, 2013 at 11:16 PM, Jakov Sosic wrote: > [...] -- Regards, Prathyush
From prathyush.r at gmail.com Sat Apr 13 18:29:48 2013 From: prathyush.r at gmail.com (Prathyush) Date: Sat, 13 Apr 2013 23:59:48 +0530 Subject: [Linux-cluster] rgmanager log level In-Reply-To: References: <515C7150.1010704@srce.hr> Message-ID: Hi, A log level change doesn't need a service restart; just update the version and propagate. On Thu, Apr 4, 2013 at 3:14 AM, emmanuel segura wrote: > [...] -- Regards, Prathyush
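As a concrete example of "update the version and propagate": edit the logging settings in /etc/cluster/cluster.conf, raise config_version, then push the file out with ccs_tool so every node picks up the new configuration without an rgmanager restart. The cluster name and version number below are hypothetical:

   # grep config_version /etc/cluster/cluster.conf
   <cluster name="mycluster" config_version="43">
   # ccs_tool update /etc/cluster/cluster.conf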
From prathyush.r at gmail.com Sat Apr 13 18:31:39 2013 From: prathyush.r at gmail.com (Prathyush) Date: Sun, 14 Apr 2013 00:01:39 +0530 Subject: [Linux-cluster] Problem With Starting CMAN In-Reply-To: References: Message-ID: Hi, It depends on lots of things; please give more details. On Fri, Apr 5, 2013 at 1:17 PM, Robins Kthomas wrote: > [...] -- Regards, Prathyush
From delpheye at gmail.com Wed Apr 17 17:09:43 2013 From: delpheye at gmail.com (M) Date: Wed, 17 Apr 2013 12:09:43 -0500 Subject: [Linux-cluster] cman error in corosync Message-ID: I have a 4 node cluster that's running correctly aside from frequent fencing across all nodes. Even after turning up logging, I'm not able to find anything that stands out. However, the following keeps presenting itself in corosync.log and I don't know what it's referring to. Apr 17 04:18:05 corosync [CMAN ] memb: cmd_get_node failed: id=0, name='?' Originally, I thought it was complaining that in cluster.conf nodeid starts at 1 instead of 0, but a quick test and a temporarily broken cluster ruled that out. So my question is, what is this error message talking about? It occurs every 5 seconds, so it seems to me that cman is missing something it's looking for, and I'd like to eliminate it. Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL:
From fdinitto at redhat.com Wed Apr 17 17:43:21 2013 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Wed, 17 Apr 2013 19:43:21 +0200 Subject: [Linux-cluster] cman error in corosync In-Reply-To: References: Message-ID: <516EDF39.3040409@redhat.com> On 4/17/2013 7:09 PM, M wrote: > [...] It's a bug in modclusterd. It was found last week and we are in the process of fixing it. Fabio
From office at 5hosting.com Wed Apr 17 19:02:16 2013 From: office at 5hosting.com (5hosting Team) Date: Wed, 17 Apr 2013 21:02:16 +0200 Subject: [Linux-cluster] GFS2 crashes - sys_rename Message-ID: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> Hey guys, We run a 40 node webcluster (only apache, php processes) and the nodes keep on crashing with a kernel panic. For me it looks like the rename of a file/directory isn't working.
I found someone posting the same a few days ago and it should be fixed in kernel 2.6.32-358.2.1.el6, but that?s the kernel we?re running. And we just used fsck yesterday night to check for problems with the file system. So something doesn?t seem right. Here are 3 crashlogs from 3 different nodes: Apr 17 20:20:16 001 kernel: Modules linked in: gfs2 dlm configfs sg sd_mod crc_t10dif ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ahci video output e1000e dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] Apr 17 20:20:16 001 kernel: Apr 17 20:20:16 001 kernel: Pid: 2915, comm: php-cgi Not tainted 2.6.32-358.2.1.el6.x86_64 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM Apr 17 20:20:16 001 kernel: RIP: 0010:[] [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:20:16 001 kernel: RSP: 0018:ffff880417e8ba58 EFLAGS: 00010283 Apr 17 20:20:16 001 kernel: RAX: ffff8804185a3da8 RBX: 0000000000000003 RCX: 000000000db41094 Apr 17 20:20:16 001 kernel: RDX: 000000000db41094 RSI: 000000000db21756 RDI: ffff8804187ef440 Apr 17 20:20:16 001 kernel: RBP: ffff880417e8bb18 R08: 0000000000000000 R09: 0000000000000000 Apr 17 20:20:16 001 kernel: R10: 0000000000001000 R11: 0000000000000000 R12: ffff8804187ef000 Apr 17 20:20:16 001 kernel: R13: 0000000000000000 R14: ffff88041519c3e0 R15: ffff880417e8bb78 Apr 17 20:20:16 001 kernel: FS: 00007f07791ff7c0(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 Apr 17 20:20:16 001 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 17 20:20:16 001 kernel: CR2: 0000000000000060 CR3: 0000000411e94000 CR4: 00000000001407f0 Apr 17 20:20:16 001 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 17 20:20:16 001 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 17 20:20:16 001 kernel: Process php-cgi (pid: 2915, threadinfo ffff880417e8a000, task ffff8804125cc040) Apr 17 20:20:16 001 kernel: Stack: Apr 17 20:20:16 001 kernel: ffff8804163dbab0 ffffffffa04012d0 ffff880417e8ba78 ffff8804163dd0c0 Apr 17 20:20:16 001 kernel: ffff8804163dbab0 00000115a04012d0 ffff880417e8ba98 ffff8804187ef000 Apr 17 20:20:16 001 kernel: ffff880417e8baf8 ffffffffa0402931 ffff8804185a3da8 0000000000000000 Apr 17 20:20:16 001 kernel: Call Trace: Apr 17 20:20:16 001 kernel: [] ? gfs2_dirent_find_space+0x0/0x50 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_dirent_search+0x191/0x1a0 [gfs2] Apr 17 20:20:16 001 kernel: [] gfs2_rename+0x6b1/0x8c0 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_rename+0x128/0x8c0 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_rename+0x146/0x8c0 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_rename+0x16c/0x8c0 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_glock_put+0x3f/0x180 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_holder_uninit+0x23/0x40 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_glock_dq_uninit+0x1e/0x30 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_permission+0x9c/0x100 [gfs2] Apr 17 20:20:16 001 kernel: [] ? gfs2_rename+0xd5/0x8c0 [gfs2] Apr 17 20:20:16 001 kernel: [] vfs_rename+0x3ab/0x440 Apr 17 20:20:16 001 kernel: [] sys_renameat+0x1da/0x240 Apr 17 20:20:16 001 kernel: [] ? 
_atomic_dec_and_lock+0x55/0x80 Apr 17 20:20:16 001 kernel: [] ? cp_new_stat+0xe4/0x100 Apr 17 20:20:16 001 kernel: [] ? sys_newstat+0x36/0x50 Apr 17 20:20:16 001 kernel: [] ? audit_syscall_entry+0x1d7/0x200 Apr 17 20:20:16 001 kernel: [] sys_rename+0x1b/0x20 Apr 17 20:20:16 001 kernel: [] system_call_fastpath+0x16/0x1b Apr 17 20:20:16 001 kernel: Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 e9 01 fb ff ff 48 Apr 17 20:20:16 001 kernel: RIP [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:20:16 001 kernel: RSP Apr 17 20:20:16 001 kernel: CR2: 0000000000000060 Apr 17 20:20:16 001 kernel: ---[ end trace 0647d0d2004566f6 ]--- Apr 17 20:21:00 002 kernel: Modules linked in: gfs2 dlm configfs sg sd_mod crc_t10dif ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ahci video output e1000e dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] Apr 17 20:21:00 002 kernel: Apr 17 20:21:00 002 kernel: Pid: 2839, comm: php-cgi Not tainted 2.6.32-358.2.1.el6.x86_64 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM Apr 17 20:21:00 002 kernel: RIP: 0010:[] [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:21:00 002 kernel: RSP: 0000:ffff8803f518ba58 EFLAGS: 00010283 Apr 17 20:21:00 002 kernel: RAX: ffff88041447bda8 RBX: 0000000000000003 RCX: 000000000db41094 Apr 17 20:21:00 002 kernel: RDX: 000000000db41094 RSI: 000000000db21756 RDI: ffff880414cd5440 Apr 17 20:21:00 002 kernel: RBP: ffff8803f518bb18 R08: 0000000000000000 R09: 0000000000000000 Apr 17 20:21:00 002 kernel: R10: 0000000000001000 R11: 0000000000000000 R12: ffff880414cd5000 Apr 17 20:21:00 002 kernel: R13: 0000000000000000 R14: ffff8803f9e918c0 R15: ffff8803f518bb78 Apr 17 20:21:00 002 kernel: FS: 00007f6a7e8a27c0(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000 Apr 17 20:21:00 002 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 17 20:21:00 002 kernel: CR2: 0000000000000060 CR3: 00000003f6313000 CR4: 00000000001407e0 Apr 17 20:21:00 002 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 17 20:21:00 002 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 17 20:21:00 002 kernel: Process php-cgi (pid: 2839, threadinfo ffff8803f518a000, task ffff8803f5189540) Apr 17 20:21:00 002 kernel: Stack: Apr 17 20:21:00 002 kernel: ffff880411bfbcb0 ffffffffa04012d0 ffff8803f518ba78 ffff88041518eb60 Apr 17 20:21:00 002 kernel: ffff880411bfbcb0 00000115a04012d0 ffff8803f518ba98 ffff880414cd5000 Apr 17 20:21:00 002 kernel: ffff8803f518baf8 ffffffffa0402931 ffff88041447bda8 0000000000000000 Apr 17 20:21:00 002 kernel: Call Trace: Apr 17 20:21:00 002 kernel: [] ? gfs2_dirent_find_space+0x0/0x50 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_dirent_search+0x191/0x1a0 [gfs2] Apr 17 20:21:00 002 kernel: [] gfs2_rename+0x6b1/0x8c0 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_rename+0x128/0x8c0 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_rename+0x146/0x8c0 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_rename+0x16c/0x8c0 [gfs2] Apr 17 20:21:00 002 kernel: [] ? 
gfs2_glock_put+0x3f/0x180 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_holder_uninit+0x23/0x40 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_glock_dq_uninit+0x1e/0x30 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_permission+0x9c/0x100 [gfs2] Apr 17 20:21:00 002 kernel: [] ? gfs2_rename+0xd5/0x8c0 [gfs2] Apr 17 20:21:00 002 kernel: [] vfs_rename+0x3ab/0x440 Apr 17 20:21:00 002 kernel: [] sys_renameat+0x1da/0x240 Apr 17 20:21:00 002 kernel: [] ? _atomic_dec_and_lock+0x55/0x80 Apr 17 20:21:00 002 kernel: [] ? cp_new_stat+0xe4/0x100 Apr 17 20:21:00 002 kernel: [] ? sys_newstat+0x36/0x50 Apr 17 20:21:00 002 kernel: [] ? audit_syscall_entry+0x1d7/0x200 Apr 17 20:21:00 002 kernel: [] sys_rename+0x1b/0x20 Apr 17 20:21:00 002 kernel: [] system_call_fastpath+0x16/0x1b Apr 17 20:21:00 002 kernel: Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 e9 01 fb ff ff 48 Apr 17 20:21:00 002 kernel: RIP [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:21:00 002 kernel: RSP Apr 17 20:21:00 002 kernel: CR2: 0000000000000060 Apr 17 20:21:00 002 kernel: ---[ end trace 1425fd0e2954015a ]--- Apr 17 20:12:49 003 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000060 Apr 17 20:12:49 003 kernel: IP: [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:12:49 003 kernel: PGD 3d96fc067 PUD 3d2c0a067 PMD 0 Apr 17 20:12:49 003 kernel: Oops: 0002 [#1] SMP Apr 17 20:12:49 003 kernel: last sysfs file: /sys/kernel/dlm/b1/control Apr 17 20:12:49 003 kernel: CPU 1 Apr 17 20:12:49 003 kernel: Modules linked in: gfs2 dlm configfs sg sd_mod crc_t10dif ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ahci video output e1000e dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] Apr 17 20:12:49 003 kernel: Apr 17 20:12:49 003 kernel: Pid: 3386, comm: php-cgi Not tainted 2.6.32-358.2.1.el6.x86_64 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM Apr 17 20:12:49 003 kernel: RIP: 0010:[] [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:12:49 003 kernel: RSP: 0018:ffff8803d1c27a58 EFLAGS: 00010283 Apr 17 20:12:49 003 kernel: RAX: ffff880416771da8 RBX: 0000000000000003 RCX: 000000000db41094 Apr 17 20:12:49 003 kernel: RDX: 000000000db41094 RSI: 000000000db21756 RDI: ffff88041277b440 Apr 17 20:12:49 003 kernel: RBP: ffff8803d1c27b18 R08: 0000000000000000 R09: 0000000000000000 Apr 17 20:12:49 003 kernel: R10: 0000000000001000 R11: 0000000000000000 R12: ffff88041277b000 Apr 17 20:12:49 003 kernel: R13: 0000000000000000 R14: ffff8803a9c181c0 R15: ffff8803d1c27b78 Apr 17 20:12:49 003 kernel: FS: 00007fd494d017c0(0000) GS:ffff880028240000(0000) knlGS:0000000000000000 Apr 17 20:12:49 003 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 17 20:12:49 003 kernel: CR2: 0000000000000060 CR3: 00000003d170c000 CR4: 00000000001407e0 Apr 17 20:12:49 003 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 17 20:12:49 003 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 17 20:12:49 003 kernel: Process php-cgi (pid: 3386, threadinfo ffff8803d1c26000, 
task ffff8803d1450080) Apr 17 20:12:49 003 kernel: Stack: Apr 17 20:12:49 003 kernel: ffff8803a9c2e270 ffffffffa03fe2d0 ffff8803d1c27a78 ffff8803cee73800 Apr 17 20:12:49 003 kernel: ffff8803a9c2e270 00000115a03fe2d0 ffff8803d1c27a98 ffff88041277b000 Apr 17 20:12:49 003 kernel: ffff8803d1c27af8 ffffffffa03ff931 ffff880416771da8 0000000000000000 Apr 17 20:12:49 003 kernel: Call Trace: Apr 17 20:12:49 003 kernel: [] ? gfs2_dirent_find_space+0x0/0x50 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_dirent_search+0x191/0x1a0 [gfs2] Apr 17 20:12:49 003 kernel: [] gfs2_rename+0x6b1/0x8c0 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_rename+0x128/0x8c0 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_rename+0x146/0x8c0 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_rename+0x16c/0x8c0 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_glock_put+0x3f/0x180 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_holder_uninit+0x23/0x40 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_glock_dq_uninit+0x1e/0x30 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_permission+0x9c/0x100 [gfs2] Apr 17 20:12:49 003 kernel: [] ? gfs2_rename+0xd5/0x8c0 [gfs2] Apr 17 20:12:49 003 kernel: [] vfs_rename+0x3ab/0x440 Apr 17 20:12:49 003 kernel: [] sys_renameat+0x1da/0x240 Apr 17 20:12:49 003 kernel: [] ? _atomic_dec_and_lock+0x55/0x80 Apr 17 20:12:49 003 kernel: [] ? cp_new_stat+0xe4/0x100 Apr 17 20:12:49 003 kernel: [] ? sys_newstat+0x36/0x50 Apr 17 20:12:49 003 kernel: [] ? audit_syscall_entry+0x1d7/0x200 Apr 17 20:12:49 003 kernel: [] sys_rename+0x1b/0x20 Apr 17 20:12:49 003 kernel: [] system_call_fastpath+0x16/0x1b Apr 17 20:12:49 003 kernel: Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 e9 01 fb ff ff 48 Apr 17 20:12:49 003 kernel: RIP [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] Apr 17 20:12:49 003 kernel: RSP Apr 17 20:12:49 003 kernel: CR2: 0000000000000060 Apr 17 20:12:49 003 kernel: ---[ end trace 06b117dc4fff0890 ]--- The call trace looks for me kinda the same on all nodes and after we rebooted ALL 40 nodes, the ?bug? seems to be gone and the system is running fine right now. (it?s running 20 minutes now without rebooting, before that we had a reboot every half minute) Do you know anything about that ? how can we fix it? It?s a webcluster and such crashes aren?t good. It should be online 24/7 but right now it doesn?t look that good. Thanks in advance, J?rgen -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 6079 bytes Desc: not available URL: From rpeterso at redhat.com Wed Apr 17 19:28:04 2013 From: rpeterso at redhat.com (Bob Peterson) Date: Wed, 17 Apr 2013 15:28:04 -0400 (EDT) Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> Message-ID: <514887138.17961011.1366226884345.JavaMail.root@redhat.com> ----- Original Message ----- | Hey guys, | | | | We run a 40 node webcluster (only apache, php processes) and the nodes keep | on crashing with a kernel panic. For me it looks like the rename of a | file/directory aint working. I found someone posting the same a few days ago | and it should be fixed in kernel 2.6.32-358.2.1.el6, but that?s the kernel | we?re running. 
And we just used fsck yesterday night | to check for problems | with the file system. So something doesn't seem right. | | | | Here are 3 crashlogs from 3 different nodes: Hi, This is bugzilla bug #924847. We have a patch, but it has not found its way into a kernel yet; that is in process. Regards, Bob Peterson Red Hat File Systems
From laurence.schuler at nasa.gov Wed Apr 17 19:34:39 2013 From: laurence.schuler at nasa.gov (laurence.schuler) Date: Wed, 17 Apr 2013 15:34:39 -0400 Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> Message-ID: <516EF94F.5080107@nasa.gov> There's a similar bug about this same crash in the 358 kernel. It's a different bug. I rolled back to the previous kernel for now; Red Hat should have a fix soon. --larry On 04/17/2013 03:02 PM, 5hosting Team wrote: > [...] -- Laurence Schuler (Larry) Laurence.Schuler at nasa.gov Systems Support ADNET Systems, Inc Scientific Visualization Studio http://svs.gsfc.nasa.gov NASA/Goddard Space Flight Center, Code 606.4 phone: 1-301-286-1799 Greenbelt, MD 20771 fax: 1-301-286-1634 Note: I am not a government employee and have no authority to obligate any federal, state or local government to perform any action or payment. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From scooter at cgl.ucsf.edu Wed Apr 17 19:34:57 2013 From: scooter at cgl.ucsf.edu (Scooter Morris) Date: Wed, 17 Apr 2013 12:34:57 -0700 Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> Message-ID: <516EF961.1030202@cgl.ucsf.edu> There is a fix for that. Request a patched kernel for bugzilla bug# 92299 from your RedHat support folks. We had the same problem and the patched kernel resolved it. -- scooter On 04/17/2013 12:02 PM, 5hosting Team wrote: > [...]
URL: From laurence.schuler at nasa.gov Wed Apr 17 19:38:21 2013 From: laurence.schuler at nasa.gov (laurence.schuler) Date: Wed, 17 Apr 2013 15:38:21 -0400 Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> Message-ID: <516EFA2D.9020400@nasa.gov> See: https://access.redhat.com/site/solutions/333883 On 04/17/2013 03:02 PM, 5hosting Team wrote: > > Hey guys, > > > > We run a 40 node webcluster (only apache, php processes) and the nodes > keep on crashing with a kernel panic. For me it looks like the rename > of a file/directory aint working. I found someone posting the same a > few days ago and it should be fixed in kernel 2.6.32-358.2.1.el6, but > that?s the kernel we?re running. And we just used fsck yesterday night > to check for problems with the file system. So something doesn?t seem > right. > > > > Here are 3 crashlogs from 3 different nodes: > > Apr 17 20:20:16 001 kernel: Modules linked in: gfs2 dlm configfs sg > sd_mod crc_t10dif ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core > ib_addr iscsi_tcp iptable_filter ip_tables ip6t_REJECT > nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter > ip6_tables serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support > shpchp ahci video output e1000e dm_mirror dm_region_hash dm_log dm_mod > nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio > ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx > iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: > scsi_wait_scan] > > Apr 17 20:20:16 001 kernel: > > Apr 17 20:20:16 001 kernel: Pid: 2915, comm: php-cgi Not tainted > 2.6.32-358.2.1.el6.x86_64 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM > > Apr 17 20:20:16 001 kernel: RIP: 0010:[] > [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:20:16 001 kernel: RSP: 0018:ffff880417e8ba58 EFLAGS: 00010283 > > Apr 17 20:20:16 001 kernel: RAX: ffff8804185a3da8 RBX: > 0000000000000003 RCX: 000000000db41094 > > Apr 17 20:20:16 001 kernel: RDX: 000000000db41094 RSI: > 000000000db21756 RDI: ffff8804187ef440 > > Apr 17 20:20:16 001 kernel: RBP: ffff880417e8bb18 R08: > 0000000000000000 R09: 0000000000000000 > > Apr 17 20:20:16 001 kernel: R10: 0000000000001000 R11: > 0000000000000000 R12: ffff8804187ef000 > > Apr 17 20:20:16 001 kernel: R13: 0000000000000000 R14: > ffff88041519c3e0 R15: ffff880417e8bb78 > > Apr 17 20:20:16 001 kernel: FS: 00007f07791ff7c0(0000) > GS:ffff880028200000(0000) knlGS:0000000000000000 > > Apr 17 20:20:16 001 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: > 0000000080050033 > > Apr 17 20:20:16 001 kernel: CR2: 0000000000000060 CR3: > 0000000411e94000 CR4: 00000000001407f0 > > Apr 17 20:20:16 001 kernel: DR0: 0000000000000000 DR1: > 0000000000000000 DR2: 0000000000000000 > > Apr 17 20:20:16 001 kernel: DR3: 0000000000000000 DR6: > 00000000ffff0ff0 DR7: 0000000000000400 > > Apr 17 20:20:16 001 kernel: Process php-cgi (pid: 2915, threadinfo > ffff880417e8a000, task ffff8804125cc040) > > Apr 17 20:20:16 001 kernel: Stack: > > Apr 17 20:20:16 001 kernel: ffff8804163dbab0 ffffffffa04012d0 > ffff880417e8ba78 ffff8804163dd0c0 > > Apr 17 20:20:16 001 kernel: ffff8804163dbab0 00000115a04012d0 > ffff880417e8ba98 ffff8804187ef000 > > Apr 17 20:20:16 001 kernel: ffff880417e8baf8 ffffffffa0402931 > ffff8804185a3da8 0000000000000000 > > Apr 17 20:20:16 001 kernel: Call Trace: > > Apr 17 20:20:16 001 kernel: [] ? 
> gfs2_dirent_find_space+0x0/0x50 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_dirent_search+0x191/0x1a0 [gfs2] > > Apr 17 20:20:16 001 kernel: [] > gfs2_rename+0x6b1/0x8c0 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_rename+0x128/0x8c0 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_rename+0x146/0x8c0 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_rename+0x16c/0x8c0 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_glock_put+0x3f/0x180 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_holder_uninit+0x23/0x40 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_glock_dq_uninit+0x1e/0x30 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_permission+0x9c/0x100 [gfs2] > > Apr 17 20:20:16 001 kernel: [] ? > gfs2_rename+0xd5/0x8c0 [gfs2] > > Apr 17 20:20:16 001 kernel: [] vfs_rename+0x3ab/0x440 > > Apr 17 20:20:16 001 kernel: [] sys_renameat+0x1da/0x240 > > Apr 17 20:20:16 001 kernel: [] ? > _atomic_dec_and_lock+0x55/0x80 > > Apr 17 20:20:16 001 kernel: [] ? cp_new_stat+0xe4/0x100 > > Apr 17 20:20:16 001 kernel: [] ? sys_newstat+0x36/0x50 > > Apr 17 20:20:16 001 kernel: [] ? > audit_syscall_entry+0x1d7/0x200 > > Apr 17 20:20:16 001 kernel: [] sys_rename+0x1b/0x20 > > Apr 17 20:20:16 001 kernel: [] > system_call_fastpath+0x16/0x1b > > Apr 17 20:20:16 001 kernel: Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 > 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 > 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 > e9 01 fb ff ff 48 > > Apr 17 20:20:16 001 kernel: RIP [] > gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:20:16 001 kernel: RSP > > Apr 17 20:20:16 001 kernel: CR2: 0000000000000060 > > Apr 17 20:20:16 001 kernel: ---[ end trace 0647d0d2004566f6 ]--- > > > > > > Apr 17 20:21:00 002 kernel: Modules linked in: gfs2 dlm configfs sg > sd_mod crc_t10dif ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core > ib_addr iscsi_tcp iptable_filter ip_tables ip6t_REJECT > nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter > ip6_tables serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support > shpchp ahci video output e1000e dm_mirror dm_region_hash dm_log dm_mod > nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio > ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx > iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: > scsi_wait_scan] > > Apr 17 20:21:00 002 kernel: > > Apr 17 20:21:00 002 kernel: Pid: 2839, comm: php-cgi Not tainted > 2.6.32-358.2.1.el6.x86_64 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM > > Apr 17 20:21:00 002 kernel: RIP: 0010:[] > [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:21:00 002 kernel: RSP: 0000:ffff8803f518ba58 EFLAGS: 00010283 > > Apr 17 20:21:00 002 kernel: RAX: ffff88041447bda8 RBX: > 0000000000000003 RCX: 000000000db41094 > > Apr 17 20:21:00 002 kernel: RDX: 000000000db41094 RSI: > 000000000db21756 RDI: ffff880414cd5440 > > Apr 17 20:21:00 002 kernel: RBP: ffff8803f518bb18 R08: > 0000000000000000 R09: 0000000000000000 > > Apr 17 20:21:00 002 kernel: R10: 0000000000001000 R11: > 0000000000000000 R12: ffff880414cd5000 > > Apr 17 20:21:00 002 kernel: R13: 0000000000000000 R14: > ffff8803f9e918c0 R15: ffff8803f518bb78 > > Apr 17 20:21:00 002 kernel: FS: 00007f6a7e8a27c0(0000) > GS:ffff8800282c0000(0000) knlGS:0000000000000000 > > Apr 17 20:21:00 002 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: > 0000000080050033 > > Apr 17 20:21:00 002 kernel: CR2: 0000000000000060 CR3: > 00000003f6313000 CR4: 00000000001407e0 > > Apr 17 20:21:00 002 kernel: DR0: 
0000000000000000 DR1: > 0000000000000000 DR2: 0000000000000000 > > Apr 17 20:21:00 002 kernel: DR3: 0000000000000000 DR6: > 00000000ffff0ff0 DR7: 0000000000000400 > > Apr 17 20:21:00 002 kernel: Process php-cgi (pid: 2839, threadinfo > ffff8803f518a000, task ffff8803f5189540) > > Apr 17 20:21:00 002 kernel: Stack: > > Apr 17 20:21:00 002 kernel: ffff880411bfbcb0 ffffffffa04012d0 > ffff8803f518ba78 ffff88041518eb60 > > Apr 17 20:21:00 002 kernel: ffff880411bfbcb0 00000115a04012d0 > ffff8803f518ba98 ffff880414cd5000 > > Apr 17 20:21:00 002 kernel: ffff8803f518baf8 ffffffffa0402931 > ffff88041447bda8 0000000000000000 > > Apr 17 20:21:00 002 kernel: Call Trace: > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_dirent_find_space+0x0/0x50 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_dirent_search+0x191/0x1a0 [gfs2] > > Apr 17 20:21:00 002 kernel: [] > gfs2_rename+0x6b1/0x8c0 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_rename+0x128/0x8c0 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_rename+0x146/0x8c0 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_rename+0x16c/0x8c0 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_glock_put+0x3f/0x180 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_holder_uninit+0x23/0x40 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_glock_dq_uninit+0x1e/0x30 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_permission+0x9c/0x100 [gfs2] > > Apr 17 20:21:00 002 kernel: [] ? > gfs2_rename+0xd5/0x8c0 [gfs2] > > Apr 17 20:21:00 002 kernel: [] vfs_rename+0x3ab/0x440 > > Apr 17 20:21:00 002 kernel: [] sys_renameat+0x1da/0x240 > > Apr 17 20:21:00 002 kernel: [] ? > _atomic_dec_and_lock+0x55/0x80 > > Apr 17 20:21:00 002 kernel: [] ? cp_new_stat+0xe4/0x100 > > Apr 17 20:21:00 002 kernel: [] ? sys_newstat+0x36/0x50 > > Apr 17 20:21:00 002 kernel: [] ? 
> audit_syscall_entry+0x1d7/0x200 > > Apr 17 20:21:00 002 kernel: [] sys_rename+0x1b/0x20 > > Apr 17 20:21:00 002 kernel: [] > system_call_fastpath+0x16/0x1b > > Apr 17 20:21:00 002 kernel: Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 > 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 > 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 > e9 01 fb ff ff 48 > > Apr 17 20:21:00 002 kernel: RIP [] > gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:21:00 002 kernel: RSP > > Apr 17 20:21:00 002 kernel: CR2: 0000000000000060 > > Apr 17 20:21:00 002 kernel: ---[ end trace 1425fd0e2954015a ]--- > > > > > > Apr 17 20:12:49 003 kernel: BUG: unable to handle kernel NULL pointer > dereference at 0000000000000060 > > Apr 17 20:12:49 003 kernel: IP: [] > gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:12:49 003 kernel: PGD 3d96fc067 PUD 3d2c0a067 PMD 0 > > Apr 17 20:12:49 003 kernel: Oops: 0002 [#1] SMP > > Apr 17 20:12:49 003 kernel: last sysfs file: /sys/kernel/dlm/b1/control > > Apr 17 20:12:49 003 kernel: CPU 1 > > Apr 17 20:12:49 003 kernel: Modules linked in: gfs2 dlm configfs sg > sd_mod crc_t10dif ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core > ib_addr iscsi_tcp iptable_filter ip_tables ip6t_REJECT > nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter > ip6_tables serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support > shpchp ahci video output e1000e dm_mirror dm_region_hash dm_log dm_mod > nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio > ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx > iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: > scsi_wait_scan] > > Apr 17 20:12:49 003 kernel: > > Apr 17 20:12:49 003 kernel: Pid: 3386, comm: php-cgi Not tainted > 2.6.32-358.2.1.el6.x86_64 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM > > Apr 17 20:12:49 003 kernel: RIP: 0010:[] > [] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:12:49 003 kernel: RSP: 0018:ffff8803d1c27a58 EFLAGS: 00010283 > > Apr 17 20:12:49 003 kernel: RAX: ffff880416771da8 RBX: > 0000000000000003 RCX: 000000000db41094 > > Apr 17 20:12:49 003 kernel: RDX: 000000000db41094 RSI: > 000000000db21756 RDI: ffff88041277b440 > > Apr 17 20:12:49 003 kernel: RBP: ffff8803d1c27b18 R08: > 0000000000000000 R09: 0000000000000000 > > Apr 17 20:12:49 003 kernel: R10: 0000000000001000 R11: > 0000000000000000 R12: ffff88041277b000 > > Apr 17 20:12:49 003 kernel: R13: 0000000000000000 R14: > ffff8803a9c181c0 R15: ffff8803d1c27b78 > > Apr 17 20:12:49 003 kernel: FS: 00007fd494d017c0(0000) > GS:ffff880028240000(0000) knlGS:0000000000000000 > > Apr 17 20:12:49 003 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: > 0000000080050033 > > Apr 17 20:12:49 003 kernel: CR2: 0000000000000060 CR3: > 00000003d170c000 CR4: 00000000001407e0 > > Apr 17 20:12:49 003 kernel: DR0: 0000000000000000 DR1: > 0000000000000000 DR2: 0000000000000000 > > Apr 17 20:12:49 003 kernel: DR3: 0000000000000000 DR6: > 00000000ffff0ff0 DR7: 0000000000000400 > > Apr 17 20:12:49 003 kernel: Process php-cgi (pid: 3386, threadinfo > ffff8803d1c26000, task ffff8803d1450080) > > Apr 17 20:12:49 003 kernel: Stack: > > Apr 17 20:12:49 003 kernel: ffff8803a9c2e270 ffffffffa03fe2d0 > ffff8803d1c27a78 ffff8803cee73800 > > Apr 17 20:12:49 003 kernel: ffff8803a9c2e270 00000115a03fe2d0 > ffff8803d1c27a98 ffff88041277b000 > > Apr 17 20:12:49 003 kernel: ffff8803d1c27af8 ffffffffa03ff931 > ffff880416771da8 0000000000000000 > > Apr 17 20:12:49 003 kernel: Call Trace: > > Apr 17 20:12:49 003 
kernel: [] ? > gfs2_dirent_find_space+0x0/0x50 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_dirent_search+0x191/0x1a0 [gfs2] > > Apr 17 20:12:49 003 kernel: [] > gfs2_rename+0x6b1/0x8c0 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_rename+0x128/0x8c0 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_rename+0x146/0x8c0 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_rename+0x16c/0x8c0 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_glock_put+0x3f/0x180 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_holder_uninit+0x23/0x40 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_glock_dq_uninit+0x1e/0x30 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_permission+0x9c/0x100 [gfs2] > > Apr 17 20:12:49 003 kernel: [] ? > gfs2_rename+0xd5/0x8c0 [gfs2] > > Apr 17 20:12:49 003 kernel: [] vfs_rename+0x3ab/0x440 > > Apr 17 20:12:49 003 kernel: [] sys_renameat+0x1da/0x240 > > Apr 17 20:12:49 003 kernel: [] ? > _atomic_dec_and_lock+0x55/0x80 > > Apr 17 20:12:49 003 kernel: [] ? cp_new_stat+0xe4/0x100 > > Apr 17 20:12:49 003 kernel: [] ? sys_newstat+0x36/0x50 > > Apr 17 20:12:49 003 kernel: [] ? > audit_syscall_entry+0x1d7/0x200 > > Apr 17 20:12:49 003 kernel: [] sys_rename+0x1b/0x20 > > Apr 17 20:12:49 003 kernel: [] > system_call_fastpath+0x16/0x1b > > Apr 17 20:12:49 003 kernel: Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 > 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 > 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 > e9 01 fb ff ff 48 > > Apr 17 20:12:49 003 kernel: RIP [] > gfs2_inplace_reserve+0x54f/0x7e0 [gfs2] > > Apr 17 20:12:49 003 kernel: RSP > > Apr 17 20:12:49 003 kernel: CR2: 0000000000000060 > > Apr 17 20:12:49 003 kernel: ---[ end trace 06b117dc4fff0890 ]--- > > > > > > > > The call trace looks for me kinda the same on all nodes and after we > rebooted ALL 40 nodes, the ?bug? seems to be gone and the system is > running fine right now. (it?s running 20 minutes now without > rebooting, before that we had a reboot every half minute) > > > > Do you know anything about that ? how can we fix it? > > It?s a webcluster and such crashes aren?t good. It should be online > 24/7 but right now it doesn?t look that good. > > > > Thanks in advance, J?rgen > -- Laurence Schuler (Larry) Laurence.Schuler at nasa.gov Systems Support ADNET Systems, Inc Scientific Visualization Studio http://svs.gsfc.nasa.gov NASA/Goddard Space Flight Center, Code 606.4 phone: 1-301-286-1799 Greenbelt, MD 20771 fax: 1-301-286-1634 Note: I am not a government employee and have no authority to obligate any federal, state or local government to perform any action or payment. -------------- next part -------------- An HTML attachment was scrubbed... URL: From office at 5hosting.com Wed Apr 17 19:46:19 2013 From: office at 5hosting.com (5hosting Team) Date: Wed, 17 Apr 2013 21:46:19 +0200 Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <514887138.17961011.1366226884345.JavaMail.root@redhat.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> <514887138.17961011.1366226884345.JavaMail.root@redhat.com> Message-ID: <036901ce3ba4$3c0726a0$b41573e0$@5hosting.com> Hi Bob, thanks for your message - do you know when an updated kernel version will be rolled out? Thanks in advance, J?rgen -----Urspr?ngliche Nachricht----- Von: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] Im Auftrag von Bob Peterson Gesendet: Mittwoch, 17. 
April 2013 21:28 An: linux clustering Betreff: Re: [Linux-cluster] GFS2 crashes - sys_rename ----- Original Message ----- | Hey guys, | | We run a 40 node webcluster (only apache, php processes) and the nodes | keep on crashing with a kernel panic. For me it looks like the rename | of a file/directory ain't working. I found someone posting the same a | few days ago and it should be fixed in kernel 2.6.32-358.2.1.el6, but | that's the kernel we're running. And we just used fsck yesterday night | to check for problems with the file system. So something doesn't seem right. | | Here are 3 crashlogs from 3 different nodes: Hi, This is bugzilla bug #924847. We have a patch, but the patch has not found its way to a kernel yet; it's in process. Regards, Bob Peterson Red Hat File Systems -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 6079 bytes Desc: not available URL:
From office at 5hosting.com Wed Apr 17 19:48:58 2013 From: office at 5hosting.com (5hosting Team) Date: Wed, 17 Apr 2013 21:48:58 +0200 Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <516EF94F.5080107@nasa.gov> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> <516EF94F.5080107@nasa.gov> Message-ID: <038d01ce3ba4$9a82c5e0$cf8851a0$@5hosting.com> Hey larry, to what version did you roll back? Did you have to fsck the cluster or did it work out of the box? Is your cluster stable right now? Thanks in advance, Jürgen Von: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] Im Auftrag von laurence.schuler Gesendet: Mittwoch, 17. April 2013 21:35 An: linux-cluster at redhat.com Betreff: Re: [Linux-cluster] GFS2 crashes - sys_rename There's a similar bug about this same crash in the 358 kernel. It's a different bug. I rolled back to the previous for now, Redhat should have a fix soon. --larry On 04/17/2013 03:02 PM, 5hosting Team wrote: Hey guys, We run a 40 node webcluster (only apache, php processes) and the nodes keep on crashing with a kernel panic. For me it looks like the rename of a file/directory ain't working. I found someone posting the same a few days ago and it should be fixed in kernel 2.6.32-358.2.1.el6, but that's the kernel we're running. And we just used fsck yesterday night to check for problems with the file system.
The call trace looks for me kinda the same on all nodes and after we rebooted ALL 40 nodes, the "bug" seems to be gone and the system is running fine right now. (it's running 20 minutes now without rebooting, before that we had a reboot every half minute) Do you know anything about that - how can we fix it? It's a webcluster and such crashes aren't good. It should be online 24/7 but right now it doesn't look that good. Thanks in advance, Jürgen -- Laurence Schuler (Larry) Laurence.Schuler at nasa.gov Systems Support ADNET Systems, Inc Scientific Visualization Studio http://svs.gsfc.nasa.gov NASA/Goddard Space Flight Center, Code 606.4 phone: 1-301-286-1799 Greenbelt, MD 20771 fax: 1-301-286-1634 Note: I am not a government employee and have no authority to obligate any federal, state or local government to perform any action or payment.
-------------- next part -------------- An HTML attachment was scrubbed... URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 6079 bytes Desc: not available URL:
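A minimal sketch of what "rolling back to the previous kernel" can look like on a RHEL 6 cluster node, for anyone following this thread. The older kernel version (2.6.32-358.el6 here), the grub entry order and the service/node names are only examples - the previous kernel must still be installed on the node, and any HA services should be relocated off it before the reboot:

# uname -r
2.6.32-358.2.1.el6.x86_64
# rpm -q kernel
kernel-2.6.32-358.el6.x86_64
kernel-2.6.32-358.2.1.el6.x86_64
# grep -E '^(default|title)' /boot/grub/grub.conf
default=0
title Red Hat Enterprise Linux Server (2.6.32-358.2.1.el6.x86_64)
title Red Hat Enterprise Linux Server (2.6.32-358.el6.x86_64)
# sed -i 's/^default=0/default=1/' /boot/grub/grub.conf
# clusvcadm -r <service> -m <other-node>
# reboot

Once a kernel containing the gfs2 fix is released, pointing "default" back at the newer entry (or simply running yum update kernel and rebooting) moves the node forward again.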
From white.heron at yahoo.com Thu Apr 18 12:31:45 2013 From: white.heron at yahoo.com (YB Tan Sri Dato Sri' Adli a.k.a Dell) Date: Thu, 18 Apr 2013 05:31:45 -0700 (PDT) Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <036901ce3ba4$3c0726a0$b41573e0$@5hosting.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> <514887138.17961011.1366226884345.JavaMail.root@redhat.com> <036901ce3ba4$3c0726a0$b41573e0$@5hosting.com> Message-ID: <1366288305.15347.YahooMailNeo@web163506.mail.gq1.yahoo.com>
Regards, YB Tan Sri Dato' Sri Adli a.k.a Dell my.linkedin.com/pub/yb-tan-sri-dato-sri-adli-a-k-a-dell/44/64b/464/ H/p number: (017) 362 3661 ________________________________ From: 5hosting Team To: 'linux clustering' Sent: Thursday, April 18, 2013 3:46 AM Subject: Re: [Linux-cluster] GFS2 crashes - sys_rename -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From white.heron at yahoo.com Thu Apr 18 12:33:07 2013 From: white.heron at yahoo.com (YB Tan Sri Dato Sri' Adli a.k.a Dell) Date: Thu, 18 Apr 2013 05:33:07 -0700 (PDT) Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <038d01ce3ba4$9a82c5e0$cf8851a0$@5hosting.com> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> <516EF94F.5080107@nasa.gov> <038d01ce3ba4$9a82c5e0$cf8851a0$@5hosting.com> Message-ID: <1366288387.5087.YahooMailNeo@web163504.mail.gq1.yahoo.com>
Regards, YB Tan Sri Dato' Sri Adli a.k.a Dell my.linkedin.com/pub/yb-tan-sri-dato-sri-adli-a-k-a-dell/44/64b/464/ H/p number: (017) 362 3661 ________________________________ From: 5hosting Team To: 'linux clustering' Sent: Thursday, April 18, 2013 3:48 AM Subject: Re: [Linux-cluster] GFS2 crashes - sys_rename -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- An HTML attachment was scrubbed... URL:
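On the fsck question that keeps coming up in this thread: fsck.gfs2 only gives a trustworthy answer when the filesystem is unmounted on every node of the cluster. A rough sketch of a check of the shared volume - the service name, mount point and device path are made-up placeholders:

On every node, stop whatever mounts the filesystem and make sure it is really unmounted:
# clusvcadm -d <gfs2-service>
# umount /mnt/web
Then, from one node only:
# fsck.gfs2 -n /dev/vg_shared/lv_web    (read-only pass, only reports problems)
# fsck.gfs2 -y /dev/vg_shared/lv_web    (repair pass)
# clusvcadm -e <gfs2-service>

Given Bob's reply that this panic is a code bug (bz 924847) rather than on-disk corruption, fsck may well come back clean, but it does rule out damage from the repeated hard resets.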
From white.heron at yahoo.com Thu Apr 18 12:33:37 2013 From: white.heron at yahoo.com (YB Tan Sri Dato Sri' Adli a.k.a Dell) Date: Thu, 18 Apr 2013 05:33:37 -0700 (PDT) Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <516EF961.1030202@cgl.ucsf.edu> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> <516EF961.1030202@cgl.ucsf.edu> Message-ID: <1366288417.665.YahooMailNeo@web163501.mail.gq1.yahoo.com>
Regards, YB Tan Sri Dato' Sri Adli a.k.a Dell my.linkedin.com/pub/yb-tan-sri-dato-sri-adli-a-k-a-dell/44/64b/464/ H/p number: (017) 362 3661 ________________________________ From: Scooter Morris To: linux clustering Sent: Thursday, April 18, 2013 3:34 AM Subject: Re: [Linux-cluster] GFS2 crashes - sys_rename There is a fix for that. Request a patched kernel for bugzilla bug# 92299 from your RedHat support folks. We had the same problem and the patched kernel resolved it. -- scooter On 04/17/2013 12:02 PM, 5hosting Team wrote: >Hey guys, >We run a 40 node webcluster (only apache, php processes) and the nodes keep on crashing with a kernel panic. For me it looks like the rename of a file/directory ain't working. I found someone posting the same a few days ago and it should be fixed in kernel 2.6.32-358.2.1.el6, but that's the kernel we're running. And we just used fsck yesterday night to check for problems with the file system. So something doesn't seem right. -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From white.heron at yahoo.com Thu Apr 18 12:34:06 2013 From: white.heron at yahoo.com (YB Tan Sri Dato Sri' Adli a.k.a Dell) Date: Thu, 18 Apr 2013 05:34:06 -0700 (PDT) Subject: [Linux-cluster] GFS2 crashes - sys_rename In-Reply-To: <516EF94F.5080107@nasa.gov> References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com> <516EF94F.5080107@nasa.gov> Message-ID: <1366288446.16588.YahooMailNeo@web163506.mail.gq1.yahoo.com>
Regards, YB Tan Sri Dato' Sri Adli a.k.a Dell my.linkedin.com/pub/yb-tan-sri-dato-sri-adli-a-k-a-dell/44/64b/464/ H/p number: (017) 362 3661 ________________________________ From: laurence.schuler To: linux-cluster at redhat.com Sent: Thursday, April 18, 2013 3:34 AM Subject: Re: [Linux-cluster] GFS2 crashes - sys_rename There's a similar bug about this same crash in the 358 kernel. It's a different bug. I rolled back to the previous for now, Redhat should have a fix soon. --larry On 04/17/2013 03:02 PM, 5hosting Team wrote: >Hey guys, >We run a 40 node webcluster (only apache, php processes) and the nodes keep on crashing with a kernel panic. For me it looks like the rename of a file/directory ain't working. I found someone posting the same a few days ago and it should be fixed in kernel 2.6.32-358.2.1.el6, but that's the kernel we're running. And we just used fsck yesterday night to check for problems with the file system. So something doesn't seem right.
>The call trace looks for me kinda the same on all nodes and after we rebooted ALL 40 nodes, the "bug" seems to be gone and the system is running fine right now. (it's running 20 minutes now without rebooting, before that we had a reboot every half minute)
>
>Do you know anything about that ? how can we fix it?
>It's a webcluster and such crashes aren't good. It should be online 24/7 but right now it doesn't look that good.
>
>Thanks in advance, Jürgen
-- Laurence Schuler (Larry) Laurence.Schuler at nasa.gov Systems Support ADNET Systems, Inc Scientific Visualization Studio http://svs.gsfc.nasa.gov NASA/Goddard Space Flight Center, Code 606.4 phone: 1-301-286-1799 Greenbelt, MD 20771 fax: 1-301-286-1634 Note: I am not a government employee and have no authority to obligate any federal, state or local government to perform any action or payment.
-- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- An HTML attachment was scrubbed... URL:
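For anyone else hitting this panic, the "rolled back to the previous" approach mentioned above comes down to booting the older installed kernel on EL6; a minimal sketch, where the version strings are examples only and will differ per system:

# rpm -q kernel
kernel-2.6.32-279.22.1.el6.x86_64
kernel-2.6.32-358.2.1.el6.x86_64
# grubby --set-default=/boot/vmlinuz-2.6.32-279.22.1.el6.x86_64
# reboot

The same effect can be had by pointing the default= entry in /boot/grub/grub.conf at the older kernel.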
From white.heron at yahoo.com Thu Apr 18 12:35:00 2013
From: white.heron at yahoo.com (YB Tan Sri Dato Sri' Adli a.k.a Dell)
Date: Thu, 18 Apr 2013 05:35:00 -0700 (PDT)
Subject: [Linux-cluster] GFS2 crashes - sys_rename
In-Reply-To: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com>
References: <032d01ce3b9e$1462bd90$3d2838b0$@5hosting.com>
Message-ID: <1366288500.93281.YahooMailNeo@web163503.mail.gq1.yahoo.com>
Regards, YB Tan Sri Dato' Sri Adli a.k.a Dell my.linkedin.com/pub/yb-tan-sri-dato-sri-adli-a-k-a-dell/44/64b/464/ H/p number: (017) 362 3661
________________________________
From: 5hosting Team To: linux-cluster at redhat.com Sent: Thursday, April 18, 2013 3:02 AM Subject: [Linux-cluster] GFS2 crashes - sys_rename
-- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From white.heron at yahoo.com Thu Apr 18 12:35:36 2013
From: white.heron at yahoo.com (YB Tan Sri Dato Sri' Adli a.k.a Dell)
Date: Thu, 18 Apr 2013 05:35:36 -0700 (PDT)
Subject: [Linux-cluster] cman error in corosync
In-Reply-To: <516EDF39.3040409@redhat.com>
References: <516EDF39.3040409@redhat.com>
Message-ID: <1366288536.99156.YahooMailNeo@web163504.mail.gq1.yahoo.com>
Regards, YB Tan Sri Dato' Sri Adli a.k.a Dell my.linkedin.com/pub/yb-tan-sri-dato-sri-adli-a-k-a-dell/44/64b/464/ H/p number: (017) 362 3661
________________________________
From: Fabio M. Di Nitto To: linux-cluster at redhat.com Sent: Thursday, April 18, 2013 1:43 AM Subject: Re: [Linux-cluster] cman error in corosync
On 4/17/2013 7:09 PM, M wrote:
> I have a 4 node cluster that's running correctly aside frequent fencing
> across all nodes. Even after turning up logging, I'm not able to find
> anything that stands out. However, the following keep presenting itself
> in corosync.log and I don't know to what it's referring.
>
> Apr 17 04:18:05 corosync [CMAN ] memb: cmd_get_node failed: id=0, name='?'
>
> Originally, I thought it was complaining that in cluster.conf nodeid
> starts at 1 instead of 0, but a quick test and a temporarily broken
> cluster ruled that out.
>
> So my question is, what is this error message talking about? It occurs
> every 5 seconds so it seems to me that cman is missing something it's
> looking for and I'd like to eliminate it.
It's a bug in modclusterd. It was found last week and we are in the process to fix it. Fabio
-- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From devin.bougie at cornell.edu Thu Apr 18 18:53:24 2013 From: devin.bougie at cornell.edu (Devin A.
Bougie) Date: Thu, 18 Apr 2013 18:53:24 +0000 Subject: [Linux-cluster] HA-LVM with KVM Message-ID: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> Is HA-LVM supported with KVM virtual machines using raw disks on logical volumes? For example, we have a VM defined in our EL6 cluster that has /dev/vgift1/pc56 as the source for its raw virtio disk. With the following line in cluster.conf, live migration works fine as long as the logical volume is active on every cluster member. If we move the vm within a service, we gain HA-LVM but lose live migration: And of course, the following fails: Any suggestions for configuring HA-LVM with a KVM VM as described above would be greatly appreciated. Please let me know if there is any more information I can provide. Many thanks, Devin From lists at alteeve.ca Thu Apr 18 21:54:54 2013 From: lists at alteeve.ca (Digimer) Date: Thu, 18 Apr 2013 17:54:54 -0400 Subject: [Linux-cluster] HA-LVM with KVM In-Reply-To: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> References: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> Message-ID: <51706BAE.6090603@alteeve.ca> On 04/18/2013 02:53 PM, Devin A. Bougie wrote: > Is HA-LVM supported with KVM virtual machines using raw disks on logical volumes? > > For example, we have a VM defined in our EL6 cluster that has /dev/vgift1/pc56 as the source for its raw virtio disk. With the following line in cluster.conf, live migration works fine as long as the logical volume is active on every cluster member. > > > > If we move the vm within a service, we gain HA-LVM but lose live migration: > > > > > > > And of course, the following fails: > > > > > > Any suggestions for configuring HA-LVM with a KVM VM as described above would be greatly appreciated. Please let me know if there is any more information I can provide. > > Many thanks, > Devin > I use live migration of KVM VMs backed by dedicated LVs per VM. I do this by using clustered LVM (clvmd) backed by DRBD, but backed by a SAN is just fine, too. Have you tried this? I've not use "HA LVM" and am not sure if that's a description or a name. :) -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From devin.bougie at cornell.edu Fri Apr 19 14:15:08 2013 From: devin.bougie at cornell.edu (Devin A. Bougie) Date: Fri, 19 Apr 2013 14:15:08 +0000 Subject: [Linux-cluster] HA-LVM with KVM In-Reply-To: <51706BAE.6090603@alteeve.ca> References: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> <51706BAE.6090603@alteeve.ca> Message-ID: Hi Digimer, On Apr 18, 2013, at 5:54 PM, Digimer wrote: > I use live migration of KVM VMs backed by dedicated LVs per VM. I do this by using clustered LVM (clvmd) backed by DRBD, but backed by a SAN is just fine, too. Have you tried this? I've not use "HA LVM" and am not sure if that's a description or a name. :) Sorry for the ambiguity! By "HA-LVM," I meant an active-passive LVM configuration where the logical volume is only active on one cluster member at a time. This has been setup and works well for services as documented here: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/ap-ha-halvm-CA.html I haven't figured out yet how to do that with a vm instead of a service. In other words, we want to make sure an LV used by an individual KVM VM is only active on one cluster member at a time. Thanks for your reply, Devin > On 04/18/2013 02:53 PM, Devin A. 
Bougie wrote: >> Is HA-LVM supported with KVM virtual machines using raw disks on logical volumes? >> >> For example, we have a VM defined in our EL6 cluster that has /dev/vgift1/pc56 as the source for its raw virtio disk. With the following line in cluster.conf, live migration works fine as long as the logical volume is active on every cluster member. >> >> >> >> If we move the vm within a service, we gain HA-LVM but lose live migration: >> >> >> >> >> >> >> And of course, the following fails: >> >> >> >> >> >> Any suggestions for configuring HA-LVM with a KVM VM as described above would be greatly appreciated. Please let me know if there is any more information I can provide. >> >> Many thanks, >> Devin From stefan at lsd.co.za Fri Apr 19 14:41:29 2013 From: stefan at lsd.co.za (stefan at lsd.co.za) Date: Fri, 19 Apr 2013 16:41:29 +0200 Subject: [Linux-cluster] HA-LVM with KVM In-Reply-To: References: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> <51706BAE.6090603@alteeve.ca> Message-ID: Heys, I've built a Red Hat with KVM using HA-LVM. The real reason for doing so is the customer didn't want to purchase the resilient storage addon - which includes clvm, but only high availability. I would say the right way is with clvm. It seemed to work well for the most part - just can add some complexity by needing to have the right tags to mount if a person doesn't know that it is being used. Additionally clvm manages modifying lv's on both sides in a clustered fashion, i think there may be a risk with LVM-HA if doing so manually. hth Stefan -- Stefan Lesicnik Linux System Dynamics mail: stefan at lsd.co.za tel : +27 (0)86 111 6094 cell: +27 (0)84 951 9321 gpg : http://www.lsd.co.za/files/keys/stefan.gpg.pub On Fri, Apr 19, 2013 at 4:15 PM, Devin A. Bougie wrote: > Hi Digimer, > > On Apr 18, 2013, at 5:54 PM, Digimer wrote: > > I use live migration of KVM VMs backed by dedicated LVs per VM. I do > this by using clustered LVM (clvmd) backed by DRBD, but backed by a SAN is > just fine, too. Have you tried this? I've not use "HA LVM" and am not sure > if that's a description or a name. :) > > Sorry for the ambiguity! By "HA-LVM," I meant an active-passive LVM > configuration where the logical volume is only active on one cluster member > at a time. This has been setup and works well for services as documented > here: > > https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/ap-ha-halvm-CA.html > > I haven't figured out yet how to do that with a vm instead of a service. > In other words, we want to make sure an LV used by an individual KVM VM is > only active on one cluster member at a time. > > Thanks for your reply, > Devin > > > > On 04/18/2013 02:53 PM, Devin A. Bougie wrote: > >> Is HA-LVM supported with KVM virtual machines using raw disks on > logical volumes? > >> > >> For example, we have a VM defined in our EL6 cluster that has > /dev/vgift1/pc56 as the source for its raw virtio disk. With the following > line in cluster.conf, live migration works fine as long as the logical > volume is active on every cluster member. 
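The "right tags" Stefan mentions are the HA-LVM activation tags from the Red Hat document linked just above; a rough sketch of that variant, with placeholder volume group and host names rather than anyone's real configuration:

In /etc/lvm/lvm.conf on each node, allow activation of the root VG plus anything tagged with the local node name:

    volume_list = [ "vg_root", "@node01.example.com" ]

then rebuild the initramfs so the restriction also applies at boot:

# dracut -f /boot/initramfs-$(uname -r).img $(uname -r)

The rgmanager lvm resource agent then adds and removes the node-name tag as the service (and its LV) moves between members, which is the step that becomes risky if done by hand, as Stefan notes.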
> >> > >> path="/gfs/cluster/vm_defs" recovery="relocate"/> > >> > >> If we move the vm within a service, we gain HA-LVM but lose live > migration: > >> > >> > >> > >> path="/gfs/cluster/vm_defs" recovery="relocate"/> > >> > >> > >> And of course, the following fails: > >> > >> path="/gfs/cluster/vm_defs" recovery="relocate"> > >> > >> > >> > >> Any suggestions for configuring HA-LVM with a KVM VM as described above > would be greatly appreciated. Please let me know if there is any more > information I can provide. > >> > >> Many thanks, > >> Devin > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at lsd.co.za Fri Apr 19 14:46:01 2013 From: stefan at lsd.co.za (Stefan Lesicnik) Date: Fri, 19 Apr 2013 16:46:01 +0200 (CEST) Subject: [Linux-cluster] 2 node technology (qdisk, sbd, sfex...) In-Reply-To: <1282216281.1439929.1366382649096.JavaMail.root@lsd.co.za> Message-ID: <252550845.1440983.1366382761388.JavaMail.root@lsd.co.za> Hi, At one point on IRC it was mentioned that in a 2 node ha cluster using a qdisk was potentially more hassle than it was worth. Is this still the case? Are there any preferred technologies in a 2 node ha cluster? I see SBD and SFEX are mentioned? Any comments on which is the right one to use in most cases? Thanks Stefan -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Fri Apr 19 16:10:06 2013 From: lists at alteeve.ca (Digimer) Date: Fri, 19 Apr 2013 12:10:06 -0400 Subject: [Linux-cluster] HA-LVM with KVM In-Reply-To: References: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> <51706BAE.6090603@alteeve.ca> Message-ID: <51716C5E.7020403@alteeve.ca> If it helps, I've documented how I achieve this here; https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial If you want, please share your full config and what, if any, error messages and log entries you have and I (or others) might be able to provide specific advise. digimer On 04/19/2013 10:15 AM, Devin A. Bougie wrote: > Hi Digimer, > > On Apr 18, 2013, at 5:54 PM, Digimer wrote: >> I use live migration of KVM VMs backed by dedicated LVs per VM. I do this by using clustered LVM (clvmd) backed by DRBD, but backed by a SAN is just fine, too. Have you tried this? I've not use "HA LVM" and am not sure if that's a description or a name. :) > > Sorry for the ambiguity! By "HA-LVM," I meant an active-passive LVM configuration where the logical volume is only active on one cluster member at a time. This has been setup and works well for services as documented here: > https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/ap-ha-halvm-CA.html > > I haven't figured out yet how to do that with a vm instead of a service. In other words, we want to make sure an LV used by an individual KVM VM is only active on one cluster member at a time. > > Thanks for your reply, > Devin > > >> On 04/18/2013 02:53 PM, Devin A. Bougie wrote: >>> Is HA-LVM supported with KVM virtual machines using raw disks on logical volumes? >>> >>> For example, we have a VM defined in our EL6 cluster that has /dev/vgift1/pc56 as the source for its raw virtio disk. With the following line in cluster.conf, live migration works fine as long as the logical volume is active on every cluster member. 
>>> >>> >>> >>> If we move the vm within a service, we gain HA-LVM but lose live migration: >>> >>> >>> >>> >>> >>> >>> And of course, the following fails: >>> >>> >>> >>> >>> >>> Any suggestions for configuring HA-LVM with a KVM VM as described above would be greatly appreciated. Please let me know if there is any more information I can provide. >>> >>> Many thanks, >>> Devin > > -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From Michael.Richmond at sandisk.com Mon Apr 22 18:36:41 2013 From: Michael.Richmond at sandisk.com (Michael Richmond) Date: Mon, 22 Apr 2013 18:36:41 +0000 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components Message-ID: Hello, I am researching the new cluster stack that is scheduled to be delivered in Fedora 19. Does anyone on this list have a sense for the timeframe for this new stack to be rolled into a RHEL release? (I assume the earliest would be RHEL 7.) On the Windows platform, Microsoft Cluster Services provides a cluster-wide registry service that is basically a cluster-wide key:value store with atomic updates and support to store the registry on shared disk. The storage on shared disk allows access and use of the registry in cases where nodes are frequently joining and leaving the cluster. Are there any component(s) that can be used to provide a similar registry in the Linux cluster stack? (The current RHEL 6 stack, and/or the new Fedora 19 stack.) Thanks in advance for your information, Michael Richmond michael richmond | principal software engineer | flashsoft, sandisk | +1.408.425.6731 ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Mon Apr 22 18:59:58 2013 From: lists at alteeve.ca (Digimer) Date: Mon, 22 Apr 2013 14:59:58 -0400 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: References: Message-ID: <517588AE.5090703@alteeve.ca> On 04/22/2013 02:36 PM, Michael Richmond wrote: > Hello, > I am researching the new cluster stack that is scheduled to be delivered > in Fedora 19. Does anyone on this list have a sense for the timeframe > for this new stack to be rolled into a RHEL release? (I assume the > earliest would be RHEL 7.) > > On the Windows platform, Microsoft Cluster Services provides a > cluster-wide registry service that is basically a cluster-wide key:value > store with atomic updates and support to store the registry on shared > disk. The storage on shared disk allows access and use of the registry > in cases where nodes are frequently joining and leaving the cluster. > > Are there any component(s) that can be used to provide a similar > registry in the Linux cluster stack? (The current RHEL 6 stack, and/or > the new Fedora 19 stack.) 
> > Thanks in advance for your information, > Michael Richmond Hi Michael, First up, Red Hat's policy of what is coming is "we'll announce on release day". So anything else is a guess. As it is, Pacemaker is in tech-preview in RHEL 6, and the best guess is that it will be the official resource manager in RHEL 7, but it's just that, a guess. As for the registry question; I am not entirely sure what it is you are asking here (sorry, not familiar with windows). I can say that pacemaker uses something called the CIB (cluster information base) which is an XML file containing the cluster's configuration and state. It can be updated from any node and the changes will push to the other nodes immediately. Does this answer your question? The current RHEL 6 cluster is corosync + cman + rgmanager. It also uses an XML config and it can be updated from any node and push out to the other nodes. Perhaps a better way to help would be to ask what, exactly, you want to build your cluster for? Cheers -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From andrew at beekhof.net Mon Apr 22 23:37:26 2013 From: andrew at beekhof.net (Andrew Beekhof) Date: Tue, 23 Apr 2013 09:37:26 +1000 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: <517588AE.5090703@alteeve.ca> References: <517588AE.5090703@alteeve.ca> Message-ID: On 23/04/2013, at 4:59 AM, Digimer wrote: > On 04/22/2013 02:36 PM, Michael Richmond wrote: >> Hello, >> I am researching the new cluster stack that is scheduled to be delivered >> in Fedora 19. Does anyone on this list have a sense for the timeframe >> for this new stack to be rolled into a RHEL release? (I assume the >> earliest would be RHEL 7.) >> >> On the Windows platform, Microsoft Cluster Services provides a >> cluster-wide registry service that is basically a cluster-wide key:value >> store with atomic updates and support to store the registry on shared >> disk. The storage on shared disk allows access and use of the registry >> in cases where nodes are frequently joining and leaving the cluster. >> >> Are there any component(s) that can be used to provide a similar >> registry in the Linux cluster stack? (The current RHEL 6 stack, and/or >> the new Fedora 19 stack.) >> >> Thanks in advance for your information, >> Michael Richmond > > Hi Michael, > > First up, Red Hat's policy of what is coming is "we'll announce on release day". So anything else is a guess. As it is, Pacemaker is in tech-preview in RHEL 6, and the best guess is that it will be the official resource manager in RHEL 7, but it's just that, a guess. I believe we're officially allowed to say that it is our _intention_ that Pacemaker will be the one and only supported stack in RHEL7. > > As for the registry question; I am not entirely sure what it is you are asking here (sorry, not familiar with windows). I can say that pacemaker uses something called the CIB (cluster information base) which is an XML file containing the cluster's configuration and state. It can be updated from any node and the changes will push to the other nodes immediately. How many of these attributes are you planning to have? You can throw a few in there, but I'd not use it for 100's or 1000's of them - its mainly designed to store the resource/service configuration. > Does this answer your question? > > The current RHEL 6 cluster is corosync + cman + rgmanager. 
It also uses an XML config and it can be updated from any node and push out to the other nodes. > > Perhaps a better way to help would be to ask what, exactly, you want to build your cluster for? > > Cheers > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without access to education? > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From Ralf.Aumueller at informatik.uni-stuttgart.de Tue Apr 23 10:11:45 2013 From: Ralf.Aumueller at informatik.uni-stuttgart.de (Ralf Aumueller) Date: Tue, 23 Apr 2013 12:11:45 +0200 Subject: [Linux-cluster] Migrate VMs between two clusters Message-ID: <51765E61.8070407@informatik.uni-stuttgart.de> Hello, we have two 2-node clusters running on CentOS 6.4. Only service managed by clusters are KVM virtual machines. Both clusters have access to an NFS-server on which the config and storage files of the VMs reside. Now I want to move a VM from cluster1 to cluster2 with the following approach: On cluster1: Edit cluster.conf and remove vm1 config cman_tool version -r virsh migrate --live vm1 qemu+ssh://HOST1_CL2/system tcp:HOST1_CL2 (After live migration finished) On cluster2: Edit cluster.conf and add vm1 config cman_tool version -r clusvcadm -e vm:vm1 (I have autostart="0" so I need to enable the VM. Cluster-SW finds running vm1 and clustat is OK). I tried it and it worked -- but is it save? Any comments (expect to ask why we have two 2-node clusters :-) ) ? Best regard, Ralf From lists at alteeve.ca Tue Apr 23 15:42:35 2013 From: lists at alteeve.ca (Digimer) Date: Tue, 23 Apr 2013 11:42:35 -0400 Subject: [Linux-cluster] Migrate VMs between two clusters In-Reply-To: <51765E61.8070407@informatik.uni-stuttgart.de> References: <51765E61.8070407@informatik.uni-stuttgart.de> Message-ID: <5176ABEB.60206@alteeve.ca> On 04/23/2013 06:11 AM, Ralf Aumueller wrote: > Hello, > > we have two 2-node clusters running on CentOS 6.4. Only service managed by > clusters are KVM virtual machines. Both clusters have access to an NFS-server on > which the config and storage files of the VMs reside. > Now I want to move a VM from cluster1 to cluster2 with the following approach: > > On cluster1: > Edit cluster.conf and remove vm1 config > cman_tool version -r > virsh migrate --live vm1 qemu+ssh://HOST1_CL2/system tcp:HOST1_CL2 > > (After live migration finished) > > On cluster2: > Edit cluster.conf and add vm1 config > cman_tool version -r > clusvcadm -e vm:vm1 > (I have autostart="0" so I need to enable the VM. Cluster-SW finds running vm1 > and clustat is OK). > > I tried it and it worked -- but is it save? Any comments (expect to ask why we > have two 2-node clusters :-) ) ? > > Best regard, > Ralf Did you try; clusvcadm -M vm:foo -m target.node That is how I live-migrate VMs. No need to remove it from the cluster and then re-add it. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From keith.schincke at gmail.com Tue Apr 23 16:05:19 2013 From: keith.schincke at gmail.com (Keith Schincke) Date: Tue, 23 Apr 2013 11:05:19 -0500 Subject: [Linux-cluster] Migrate VMs between two clusters In-Reply-To: <5176ABEB.60206@alteeve.ca> References: <51765E61.8070407@informatik.uni-stuttgart.de> <5176ABEB.60206@alteeve.ca> Message-ID: Once your VM is under cluster control, you should use the cluster commands to start, stop or migrate it. 
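For the archive, the cluster commands referred to here are the rgmanager ones shown by Digimer above; roughly, with the VM service and node names as placeholders:

# clustat
(shows which member currently owns vm:vm1)
# clusvcadm -M vm:vm1 -m node02.example.com
(live-migrates the cluster-managed VM to another member)
# clusvcadm -d vm:vm1
(cleanly disables it, e.g. before maintenance)

Because rgmanager performs the migration itself, it always knows which member owns the VM, which is what avoids the double-start corruption described next.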
Using a combination of Cluster and virsh commands can result in a corrupt VM disk image as you can end up with the VM running on multiple nodes at the same time. I messed this up once while building on of my clusters. See the below links: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/s2-rgmanager-virt-features.html https://fedorahosted.org/cluster/wiki/VirtualMachineBehaviors Keith On Tue, Apr 23, 2013 at 10:42 AM, Digimer wrote: > On 04/23/2013 06:11 AM, Ralf Aumueller wrote: > >> Hello, >> >> we have two 2-node clusters running on CentOS 6.4. Only service managed by >> clusters are KVM virtual machines. Both clusters have access to an >> NFS-server on >> which the config and storage files of the VMs reside. >> Now I want to move a VM from cluster1 to cluster2 with the following >> approach: >> >> On cluster1: >> Edit cluster.conf and remove vm1 config >> cman_tool version -r >> virsh migrate --live vm1 qemu+ssh://HOST1_CL2/system tcp:HOST1_CL2 >> >> (After live migration finished) >> >> On cluster2: >> Edit cluster.conf and add vm1 config >> cman_tool version -r >> clusvcadm -e vm:vm1 >> (I have autostart="0" so I need to enable the VM. Cluster-SW finds >> running vm1 >> and clustat is OK). >> >> I tried it and it worked -- but is it save? Any comments (expect to ask >> why we >> have two 2-node clusters :-) ) ? >> >> Best regard, >> Ralf >> > > Did you try; > > clusvcadm -M vm:foo -m target.node > > That is how I live-migrate VMs. No need to remove it from the cluster and > then re-add it. > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without > access to education? > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/**mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Michael.Richmond at sandisk.com Tue Apr 23 18:07:13 2013 From: Michael.Richmond at sandisk.com (Michael Richmond) Date: Tue, 23 Apr 2013 18:07:13 +0000 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: Message-ID: Andrew and Digimer, Thank you for taking the time to respond, you have collaborated some of what I've been putting together as the likely direction. I am working on adapting some cluster-aware storage features for use in a Linux cluster environment. With this kind of project it is useful to try and predict where the Linux community is heading so that I can focus my development work on what will be the "current" cluster stack around my anticipated release dates. Any predictions are simply educated guesses that may prove to be wrong, but are useful with regard to developing plans. From my reading of various web pages and piecing things together I found that RHEL 7 is intended to be based on Fedora 18, so I assume that the new Pacemaker stack has a good chance of being rolled out in RHEL 7.1/7.2, or even possibly 7.0. Hearing that there is official word that the intention is for Pacemaker to be the official cluster stack helps me put my development plans together. The project I am working on is focused on two-node clusters. But I also need a persistent, cluster-wide data store to hold a small amount of state (less than 1KB). This data store is what I refer to as a cluster-registry. The state data records the last-known operational state for the storage feature. 
This last-known state helps drive recovery operations for the storage feature during node bring-up. This project is specifically aimed at integrating generic functionality into the Linux cluster stack. I have been thinking about using the cluster configuration file for this storage, which I assume is the CIB referenced by Andrew. But I can imagine cases where the CIB file may lose updates if it does not utilize shared storage media. My understanding is that the CIB file is stored on each node using local disk storage. For example, consider a two-node cluster that is configured with a quorum disk on shared storage media. If at a given point in time NodeA is up and NodeB is down, NodeA can form a quorum and start cluster services (including HA applications). Assume that NodeA updates the CIB to record some state update. Now suppose NodeB starts booting but, before NodeB joins the cluster, NodeA crashes. At this point, the updated CIB only resides on NodeA and cannot be accessed by NodeB, even if NodeB can access the quorum disk and form a quorum. Effectively, NodeB cannot be aware of the update from NodeA, which will result in an implicit roll-back of any updates performed by NodeA. With a two-node cluster, there are two options for resolving this: * prevent any update to the cluster registry/CIB unless all nodes are part of the cluster. (This is not practical since it undermines some of the reasons for building clusters.) * store the cluster registry on shared storage so that there is one source of truth. It is possible that the nature of the data stored in the CIB is resilient to the example scenario that I describe. In this case, maybe the CIB is not an appropriate data store for my cluster registry data. In this case I am either looking for an appropriate Linux component to use for my cluster registry, or I will build a custom data store that provides atomic update semantics on shared storage. Any thoughts and/or pointers would be appreciated. Thanks, Michael Richmond -- michael richmond | principal software engineer | flashsoft, sandisk | +1.408.425.6731 On 22/4/13 4:37 PM, "Andrew Beekhof" wrote: > >On 23/04/2013, at 4:59 AM, Digimer wrote: > >> On 04/22/2013 02:36 PM, Michael Richmond wrote: >>> Hello, >>> I am researching the new cluster stack that is scheduled to be >>>delivered >>> in Fedora 19. Does anyone on this list have a sense for the timeframe >>> for this new stack to be rolled into a RHEL release? (I assume the >>> earliest would be RHEL 7.) >>> >>> On the Windows platform, Microsoft Cluster Services provides a >>> cluster-wide registry service that is basically a cluster-wide >>>key:value >>> store with atomic updates and support to store the registry on shared >>> disk. The storage on shared disk allows access and use of the registry >>> in cases where nodes are frequently joining and leaving the cluster. >>> >>> Are there any component(s) that can be used to provide a similar >>> registry in the Linux cluster stack? (The current RHEL 6 stack, and/or >>> the new Fedora 19 stack.) >>> >>> Thanks in advance for your information, >>> Michael Richmond >> >> Hi Michael, >> >> First up, Red Hat's policy of what is coming is "we'll announce on >>release day". So anything else is a guess. As it is, Pacemaker is in >>tech-preview in RHEL 6, and the best guess is that it will be the >>official resource manager in RHEL 7, but it's just that, a guess. > >I believe we're officially allowed to say that it is our _intention_ that >Pacemaker will be the one and only supported stack in RHEL7.
> >> >> As for the registry question; I am not entirely sure what it is you >>are asking here (sorry, not familiar with windows). I can say that >>pacemaker uses something called the CIB (cluster information base) which >>is an XML file containing the cluster's configuration and state. It can >>be updated from any node and the changes will push to the other nodes >>immediately. > >How many of these attributes are you planning to have? >You can throw a few in there, but I'd not use it for 100's or 1000's of >them - its mainly designed to store the resource/service configuration. > > >> Does this answer your question? >> >> The current RHEL 6 cluster is corosync + cman + rgmanager. It also >>uses an XML config and it can be updated from any node and push out to >>the other nodes. >> >> Perhaps a better way to help would be to ask what, exactly, you want >>to build your cluster for? >> >> Cheers >> >> -- >> Digimer >> Papers and Projects: https://alteeve.ca/w/ >> What if the cure for cancer is trapped in the mind of a person without >>access to education? >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). From devin.bougie at cornell.edu Tue Apr 23 22:20:45 2013 From: devin.bougie at cornell.edu (Devin A. Bougie) Date: Tue, 23 Apr 2013 22:20:45 +0000 Subject: [Linux-cluster] HA-LVM with KVM In-Reply-To: References: <6DDAD357-C80B-45DE-88ED-B0F0E4700196@cornell.edu> <51706BAE.6090603@alteeve.ca> Message-ID: <3E8EC214-1312-415F-A5B1-EDCF2927AED6@cornell.edu> Just to close out this thread, sanlock should be used to ensure a VM is only running on one node at a time: http://libvirt.org/locking.html https://access.redhat.com/site/solutions/186853 Devin On Apr 19, 2013, at 10:15 AM, Devin Bougie wrote: > On Apr 18, 2013, at 5:54 PM, Digimer wrote: >> I use live migration of KVM VMs backed by dedicated LVs per VM. I do this by using clustered LVM (clvmd) backed by DRBD, but backed by a SAN is just fine, too. Have you tried this? I've not use "HA LVM" and am not sure if that's a description or a name. :) > > Sorry for the ambiguity! By "HA-LVM," I meant an active-passive LVM configuration where the logical volume is only active on one cluster member at a time. This has been setup and works well for services as documented here: > https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/ap-ha-halvm-CA.html > > I haven't figured out yet how to do that with a vm instead of a service. In other words, we want to make sure an LV used by an individual KVM VM is only active on one cluster member at a time. > > Thanks for your reply, > Devin > > >> On 04/18/2013 02:53 PM, Devin A. Bougie wrote: >>> Is HA-LVM supported with KVM virtual machines using raw disks on logical volumes? 
>>> >>> For example, we have a VM defined in our EL6 cluster that has /dev/vgift1/pc56 as the source for its raw virtio disk. With the following line in cluster.conf, live migration works fine as long as the logical volume is active on every cluster member. >>> >>> >>> >>> If we move the vm within a service, we gain HA-LVM but lose live migration: >>> >>> >>> >>> >>> >>> >>> And of course, the following fails: >>> >>> >>> >>> >>> >>> Any suggestions for configuring HA-LVM with a KVM VM as described above would be greatly appreciated. Please let me know if there is any more information I can provide. >>> >>> Many thanks, >>> Devin > From lists at alteeve.ca Wed Apr 24 01:07:16 2013 From: lists at alteeve.ca (Digimer) Date: Tue, 23 Apr 2013 21:07:16 -0400 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: References: Message-ID: <51773044.3000104@alteeve.ca> First up, before I begin, I am looking to pacemaker for the future as well and do not yet use it. So please take whatever I say about pacemaker with a grain of sand. Andrew, on the other hand, is the author and anything he says can be taken as authoritative on the topic. On the future; I also have a 2-node project/product that I am working to update in time for the release of RHEL 7. Speaking entirely for myself, I can tell you that I am planning to use Pacemaker from RHEL 7.0. As a Red hat outsider, I can only speak as a member of the community, but I have every reason to believe that the pacemaker resource manager will be the one used from 7.0 and forward. As for the CIB, yes, it's a local XML file stored on each node. Synchronization occurs via updates pushed over corosync to nodes active in the cluster. As I understand it, when a node that had been offline connects to the cluster, it receives any updates to the CIB. Dealing with 2-node clusters, setting aside qdisk which has an uncertain future I believe, you can not use quorum. For this reason, it is possible for a node to boot up, fail to reach it's peer and think it's the only one running. It will start your HA services and voila, two nodes offering the same services at the same time in an uncoordinated manner. This is bad and it is called a "split-brain". The way to avoid split-brains in 2-node clusters is to use fence devices, aka stonith devices (exact same thing by two different names). This is _always_ wise to use, but in 2-node clusters, it is critical. So imagine back to your scenario; If a node came up and tried to connect to it's peer but failed to do so, before proceeding, it would fence (usually forcibly power off) the other node. Only after doing so would it start the HA services. In this way, both nodes can never be offering the same HA service at the same time. The risk here though is a "fence loop". If you set the cluster to start on boot and if there is a break in the connection, you can have an initial state where, upon the break in the network, both try to fence the other. The faster node wins, forcing the other node off and resuming to operate on it's own. This is fine and exactly what you want. However, now the fenced node powers back up, starts it's cluster stack, fails to reach it's peer and fences it. It finishes starting, offers the HA services and goes on it's way ... until the other node boots back up. :) Personally, I avoid this by _not_ starting the cluster stack on boot. 
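(On a RHEL 6 / CentOS 6 node that is just something like:

# chkconfig cman off
# chkconfig rgmanager off

and then starting the stack by hand with "service cman start" followed by "service rgmanager start" when you are ready.)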
My reasoning is that, if a node fails and gets rebooted, I want to check it over myself before I let it back into the cluster (I get alert emails when something like this happens). It's not a risk from an HA perspective because it's services would have recovered on the surviving peer long before it reboots anyway. This also has the added benefit of avoiding a fence loop, no matter what happens. Cheers digimer On 04/23/2013 02:07 PM, Michael Richmond wrote: > Andrew and Digimer, > Thank you for taking the time to respond, you have collaborated some of > what I've been putting together as the likely direction. > > I am working on adapting some cluster-aware storage features for use in a > Linux cluster environment. With this kind of project it is useful to try > and predict where the Linux community is heading so that I can focus my > development work on what will be the "current" cluster stack around my > anticipated release dates. Any predictions are simply educated guesses > that may prove to be wrong, but are useful with regard to developing > plans. From my reading of various web pages and piecing things together I > found that RHEL 7 is intended to be based on Fedora 18, so I assume that > the new Pacemaker stack has a good chance of being rolled out in RHEL > 7.1/7.2, or even possibly 7.0. > > Hearing that there is official word that the intention is for Pacemaker to > be the official cluster stack helps me put my development plans together. > > > The project I am working on is focused on two-node clusters. But I also > need a persistent, cluster-wide data store to hold a small amount of state > (less than 1KB). This data store is what I refer to as a cluster-registry. > The state data records the last-known operational state for the storage > feature. This last-known state helps drive recovery operations for the > storage feature during node bring-up. This project is specifically aimed > at integrating generic functionality into the Linux cluster stack. > > I have been thinking about using the cluster configuration file for this > storage which I assume is the CIB referenced by Andrew. But I can imagine > cases where the CIB file may loose updates if it does not utilize shared > storage media. My understanding is that the CIB file is stored on each > node using local disk storage. > > For example, consider a two-node cluster that is configured with a quorum > disk on shared storage media. If at a given point in time NodeB is up and > NodeB is down. NodeA can form quorate and start cluster services > (including HA applications). Assume that NodeA updates the CIB to record > some state update. If NodeB starts booting but before NodeB joins the > cluster, NodeA crashes. At this point, the updated CIB only resides on > NodeA and cannot be accessed by NodeB even if NodeB can access the quorum > disk as form quorate. Effectively, NodeB cannot be aware of the update > from NodeA which will result in an implicit roll-back of any updates > performed by NodeA. > > With a two-node cluster, there are two options for resolving this: > * prevent any update to the cluster registry/CIB unless all nodes are part > of the cluster. (This is not practical since it undermines some of the > reasons for building clusters.) > * store the cluster registry on shared storage so that there is one source > of truth. > > It is possible that the nature of the data stored in the CIB is resilient > to the example scenario that I describe. 
In this case, maybe the CIB is > not an appropriate data store for my cluster registry data. In this case I > am either looking for an appropriate Linux component to use for my cluster > registry, or I will build a custom data store that provides atomic update > semantics on shared storage. > > Any thoughts and/or pointers would be appreciated. > > Thanks, > Michael Richmond > > -- > michael richmond | principal software engineer | flashsoft, sandisk | > +1.408.425.6731 > > > > > On 22/4/13 4:37 PM, "Andrew Beekhof" wrote: > >> >> On 23/04/2013, at 4:59 AM, Digimer wrote: >> >>> On 04/22/2013 02:36 PM, Michael Richmond wrote: >>>> Hello, >>>> I am researching the new cluster stack that is scheduled to be >>>> delivered >>>> in Fedora 19. Does anyone on this list have a sense for the timeframe >>>> for this new stack to be rolled into a RHEL release? (I assume the >>>> earliest would be RHEL 7.) >>>> >>>> On the Windows platform, Microsoft Cluster Services provides a >>>> cluster-wide registry service that is basically a cluster-wide >>>> key:value >>>> store with atomic updates and support to store the registry on shared >>>> disk. The storage on shared disk allows access and use of the registry >>>> in cases where nodes are frequently joining and leaving the cluster. >>>> >>>> Are there any component(s) that can be used to provide a similar >>>> registry in the Linux cluster stack? (The current RHEL 6 stack, and/or >>>> the new Fedora 19 stack.) >>>> >>>> Thanks in advance for your information, >>>> Michael Richmond >>> >>> Hi Michael, >>> >>> First up, Red Hat's policy of what is coming is "we'll announce on >>> release day". So anything else is a guess. As it is, Pacemaker is in >>> tech-preview in RHEL 6, and the best guess is that it will be the >>> official resource manager in RHEL 7, but it's just that, a guess. >> >> I believe we're officially allowed to say that it is our _intention_ that >> Pacemaker will be the one and only supported stack in RHEL7. >> >>> >>> As for the registry question; I am not entirely sure what it is you >>> are asking here (sorry, not familiar with windows). I can say that >>> pacemaker uses something called the CIB (cluster information base) which >>> is an XML file containing the cluster's configuration and state. It can >>> be updated from any node and the changes will push to the other nodes >>> immediately. >> >> How many of these attributes are you planning to have? >> You can throw a few in there, but I'd not use it for 100's or 1000's of >> them - its mainly designed to store the resource/service configuration. >> >> >>> Does this answer your question? >>> >>> The current RHEL 6 cluster is corosync + cman + rgmanager. It also >>> uses an XML config and it can be updated from any node and push out to >>> the other nodes. >>> >>> Perhaps a better way to help would be to ask what, exactly, you want >>> to build your cluster for? >>> >>> Cheers >>> >>> -- >>> Digimer >>> Papers and Projects: https://alteeve.ca/w/ >>> What if the cure for cancer is trapped in the mind of a person without >>> access to education? >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > ________________________________ > > PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. 
If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). > > -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From Ralf.Aumueller at informatik.uni-stuttgart.de Wed Apr 24 07:02:53 2013 From: Ralf.Aumueller at informatik.uni-stuttgart.de (Ralf Aumueller) Date: Wed, 24 Apr 2013 09:02:53 +0200 Subject: [Linux-cluster] Migrate VMs between two clusters In-Reply-To: References: <51765E61.8070407@informatik.uni-stuttgart.de> <5176ABEB.60206@alteeve.ca> Message-ID: <5177839D.3070106@informatik.uni-stuttgart.de> Hello, probably my first description was not clear enough. We have two "two-node-clusters" and I want to migrate a VM from cluster1 to cluster2 (Not from node1 to node2). Best regards, Ralf From raju.rajsand at gmail.com Wed Apr 24 09:59:38 2013 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Wed, 24 Apr 2013 15:29:38 +0530 Subject: [Linux-cluster] Migrate VMs between two clusters In-Reply-To: <5177839D.3070106@informatik.uni-stuttgart.de> References: <51765E61.8070407@informatik.uni-stuttgart.de> <5176ABEB.60206@alteeve.ca> <5177839D.3070106@informatik.uni-stuttgart.de> Message-ID: Greetings, On Wed, Apr 24, 2013 at 12:32 PM, Ralf Aumueller < Ralf.Aumueller at informatik.uni-stuttgart.de> wrote: > Hello, > > probably my first description was not clear enough. We have two > "two-node-clusters" and I want to migrate a VM from cluster1 to cluster2 > (Not > from node1 to node2). > I am afraid it may not be possible -- Regards, Rajagopal -------------- next part -------------- An HTML attachment was scrubbed... URL: From list at fajar.net Wed Apr 24 10:12:29 2013 From: list at fajar.net (Fajar A. Nugraha) Date: Wed, 24 Apr 2013 17:12:29 +0700 Subject: [Linux-cluster] Migrate VMs between two clusters In-Reply-To: References: <51765E61.8070407@informatik.uni-stuttgart.de> <5176ABEB.60206@alteeve.ca> <5177839D.3070106@informatik.uni-stuttgart.de> Message-ID: On Wed, Apr 24, 2013 at 4:59 PM, Rajagopal Swaminathan wrote: > Greetings, > > On Wed, Apr 24, 2013 at 12:32 PM, Ralf Aumueller > wrote: >> >> Hello, >> >> probably my first description was not clear enough. We have two >> "two-node-clusters" and I want to migrate a VM from cluster1 to cluster2 >> (Not >> from node1 to node2). > > > I am afraid it may not be possible Sure it is. As long as: - those clusters have access to same storage, on the same path (Ralf already said they share an nfs server, so it should be OK) - those clusters have access to the same network, with the same bridge name. - cluster software (or to be exact, the script used to monitor whether the VM is up or not) is smart enough to detect the VM is already up, and NOT try to start it again. -- Fajar From Michael.Richmond at sandisk.com Wed Apr 24 17:54:55 2013 From: Michael.Richmond at sandisk.com (Michael Richmond) Date: Wed, 24 Apr 2013 17:54:55 +0000 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: <51773044.3000104@alteeve.ca> Message-ID: Hi Digimer, Thanks for your detailed comments. 
What you have described with regard to fencing is common practice for two-node clusters, and is something I have implemented in a few proprietary cluster implementations that I have worked on. However, fencing does not completely solve the split-brain problem in two-node clusters. There is still the potential for both NodeA and NodeB to decide to fence at the same time. In this case, each node performs the fencing operation to fence the other node, with the result that both nodes get fenced. To avoid this, most clustering systems can be optionally configured with a shared resource (usually a shared LUN) that is used to weight the decision about which node gets fenced. Additionally, the shared LUN can be used as a coarse communication mechanism to aid the election of a winning node. As I'm sure you are aware, a quorum disk is typically used to determine which partition has access to the larger/more important portion of the cluster resources, and therefore which nodes must be fenced because they are in a separate network partition. Since you mention that qdiskd has an uncertain future, it would appear that the pacemaker-based stack has a potential functionality gap with regard to two-node clusters. That is, unless some other approach is taken to resolve network partitions. From what I understand, the CIB is at risk of an unintended roll-back of a write in the case where a two-node cluster has its nodes up at differing times. For example, assuming the following timeline:

Time 0: Node A up, Node B up (CIB contains "CIB0")
Time 1: Node A up, Node B down
Time 2: Node A writes an update to the CIB, Node B booting (not yet joined the cluster) (CIB contains "CIB1")
Time 3: Node A down, Node B up (CIB contains "CIB0")

After Time 3, Node B is operating with a CIB that contains "CIB0" and has no way of seeing the CIB contents "CIB1" written by Node A. In effect, the write by Node A was rolled back when Node A went down. Thanks again for your input. Is there any description available about how to configure the pacemaker/corosync stack on RHEL6.4? Regards, Michael Richmond michael richmond | principal software engineer | flashsoft, sandisk | +1.408.425.6731 On 23/4/13 6:07 PM, "Digimer" wrote: >First up, before I begin, I am looking to pacemaker for the future as >well and do not yet use it. So please take whatever I say about >pacemaker with a grain of sand. Andrew, on the other hand, is the author >and anything he says can be taken as authoritative on the topic. > >On the future; > >I also have a 2-node project/product that I am working to update in time >for the release of RHEL 7. Speaking entirely for myself, I can tell you >that I am planning to use Pacemaker from RHEL 7.0. As a Red hat >outsider, I can only speak as a member of the community, but I have >every reason to believe that the pacemaker resource manager will be the >one used from 7.0 and forward. > >As for the CIB, yes, it's a local XML file stored on each node. >Synchronization occurs via updates pushed over corosync to nodes active >in the cluster. As I understand it, when a node that had been offline >connects to the cluster, it receives any updates to the CIB. > >Dealing with 2-node clusters, setting aside qdisk which has an uncertain >future I believe, you can not use quorum. For this reason, it is >possible for a node to boot up, fail to reach it's peer and think it's >the only one running. It will start your HA services and voila, two >nodes offering the same services at the same time in an uncoordinated >manner. This is bad and it is called a "split-brain".
> >The way to avoid split-brains in 2-node clusters is to use fence >devices, aka stonith devices (exact same thing by two different names). >This is _always_ wise to use, but in 2-node clusters, it is critical. > >So imagine back to your scenario; > >If a node came up and tried to connect to it's peer but failed to do so, >before proceeding, it would fence (usually forcibly power off) the other >node. Only after doing so would it start the HA services. In this way, >both nodes can never be offering the same HA service at the same time. > >The risk here though is a "fence loop". If you set the cluster to start >on boot and if there is a break in the connection, you can have an >initial state where, upon the break in the network, both try to fence >the other. The faster node wins, forcing the other node off and resuming >to operate on it's own. This is fine and exactly what you want. However, >now the fenced node powers back up, starts it's cluster stack, fails to >reach it's peer and fences it. It finishes starting, offers the HA >services and goes on it's way ... until the other node boots back up. :) > >Personally, I avoid this by _not_ starting the cluster stack on boot. My >reasoning is that, if a node fails and gets rebooted, I want to check it >over myself before I let it back into the cluster (I get alert emails >when something like this happens). It's not a risk from an HA >perspective because it's services would have recovered on the surviving >peer long before it reboots anyway. This also has the added benefit of >avoiding a fence loop, no matter what happens. > >Cheers > >digimer > >On 04/23/2013 02:07 PM, Michael Richmond wrote: >> Andrew and Digimer, >> Thank you for taking the time to respond, you have collaborated some of >> what I've been putting together as the likely direction. >> >> I am working on adapting some cluster-aware storage features for use in >>a >> Linux cluster environment. With this kind of project it is useful to try >> and predict where the Linux community is heading so that I can focus my >> development work on what will be the "current" cluster stack around my >> anticipated release dates. Any predictions are simply educated guesses >> that may prove to be wrong, but are useful with regard to developing >> plans. From my reading of various web pages and piecing things together >>I >> found that RHEL 7 is intended to be based on Fedora 18, so I assume that >> the new Pacemaker stack has a good chance of being rolled out in RHEL >> 7.1/7.2, or even possibly 7.0. >> >> Hearing that there is official word that the intention is for Pacemaker >>to >> be the official cluster stack helps me put my development plans >>together. >> >> >> The project I am working on is focused on two-node clusters. But I also >> need a persistent, cluster-wide data store to hold a small amount of >>state >> (less than 1KB). This data store is what I refer to as a >>cluster-registry. >> The state data records the last-known operational state for the storage >> feature. This last-known state helps drive recovery operations for the >> storage feature during node bring-up. This project is specifically aimed >> at integrating generic functionality into the Linux cluster stack. >> >> I have been thinking about using the cluster configuration file for this >> storage which I assume is the CIB referenced by Andrew. But I can >>imagine >> cases where the CIB file may loose updates if it does not utilize shared >> storage media. 
My understanding is that the CIB file is stored on each >> node using local disk storage. >> >> For example, consider a two-node cluster that is configured with a >>quorum >> disk on shared storage media. If at a given point in time NodeB is up >>and >> NodeB is down. NodeA can form quorate and start cluster services >> (including HA applications). Assume that NodeA updates the CIB to record >> some state update. If NodeB starts booting but before NodeB joins the >> cluster, NodeA crashes. At this point, the updated CIB only resides on >> NodeA and cannot be accessed by NodeB even if NodeB can access the >>quorum >> disk as form quorate. Effectively, NodeB cannot be aware of the update >> from NodeA which will result in an implicit roll-back of any updates >> performed by NodeA. >> >> With a two-node cluster, there are two options for resolving this: >> * prevent any update to the cluster registry/CIB unless all nodes are >>part >> of the cluster. (This is not practical since it undermines some of the >> reasons for building clusters.) >> * store the cluster registry on shared storage so that there is one >>source >> of truth. >> >> It is possible that the nature of the data stored in the CIB is >>resilient >> to the example scenario that I describe. In this case, maybe the CIB is >> not an appropriate data store for my cluster registry data. In this >>case I >> am either looking for an appropriate Linux component to use for my >>cluster >> registry, or I will build a custom data store that provides atomic >>update >> semantics on shared storage. >> >> Any thoughts and/or pointers would be appreciated. >> >> Thanks, >> Michael Richmond >> >> -- >> michael richmond | principal software engineer | flashsoft, sandisk | >> +1.408.425.6731 >> >> >> >> >> On 22/4/13 4:37 PM, "Andrew Beekhof" wrote: >> >>> >>> On 23/04/2013, at 4:59 AM, Digimer wrote: >>> >>>> On 04/22/2013 02:36 PM, Michael Richmond wrote: >>>>> Hello, >>>>> I am researching the new cluster stack that is scheduled to be >>>>> delivered >>>>> in Fedora 19. Does anyone on this list have a sense for the timeframe >>>>> for this new stack to be rolled into a RHEL release? (I assume the >>>>> earliest would be RHEL 7.) >>>>> >>>>> On the Windows platform, Microsoft Cluster Services provides a >>>>> cluster-wide registry service that is basically a cluster-wide >>>>> key:value >>>>> store with atomic updates and support to store the registry on shared >>>>> disk. The storage on shared disk allows access and use of the >>>>>registry >>>>> in cases where nodes are frequently joining and leaving the cluster. >>>>> >>>>> Are there any component(s) that can be used to provide a similar >>>>> registry in the Linux cluster stack? (The current RHEL 6 stack, >>>>>and/or >>>>> the new Fedora 19 stack.) >>>>> >>>>> Thanks in advance for your information, >>>>> Michael Richmond >>>> >>>> Hi Michael, >>>> >>>> First up, Red Hat's policy of what is coming is "we'll announce on >>>> release day". So anything else is a guess. As it is, Pacemaker is in >>>> tech-preview in RHEL 6, and the best guess is that it will be the >>>> official resource manager in RHEL 7, but it's just that, a guess. >>> >>> I believe we're officially allowed to say that it is our _intention_ >>>that >>> Pacemaker will be the one and only supported stack in RHEL7. >>> >>>> >>>> As for the registry question; I am not entirely sure what it is you >>>> are asking here (sorry, not familiar with windows). 
I can say that >>>> pacemaker uses something called the CIB (cluster information base) >>>>which >>>> is an XML file containing the cluster's configuration and state. It >>>>can >>>> be updated from any node and the changes will push to the other nodes >>>> immediately. >>> >>> How many of these attributes are you planning to have? >>> You can throw a few in there, but I'd not use it for 100's or 1000's of >>> them - its mainly designed to store the resource/service configuration. >>> >>> >>>> Does this answer your question? >>>> >>>> The current RHEL 6 cluster is corosync + cman + rgmanager. It also >>>> uses an XML config and it can be updated from any node and push out to >>>> the other nodes. >>>> >>>> Perhaps a better way to help would be to ask what, exactly, you want >>>> to build your cluster for? >>>> >>>> Cheers >>>> >>>> -- >>>> Digimer >>>> Papers and Projects: https://alteeve.ca/w/ >>>> What if the cure for cancer is trapped in the mind of a person without >>>> access to education? >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> ________________________________ >> >> PLEASE NOTE: The information contained in this electronic mail message >>is intended only for the use of the designated recipient(s) named above. >>If the reader of this message is not the intended recipient, you are >>hereby notified that you have received this message in error and that >>any review, dissemination, distribution, or copying of this message is >>strictly prohibited. If you have received this communication in error, >>please notify the sender by telephone or e-mail (as shown above) >>immediately and destroy any and all copies of this message in your >>possession (whether hard copies or electronically stored copies). >> >> > > >-- >Digimer >Papers and Projects: https://alteeve.ca/w/ >What if the cure for cancer is trapped in the mind of a person without >access to education? ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). From lists at alteeve.ca Wed Apr 24 18:21:45 2013 From: lists at alteeve.ca (Digimer) Date: Wed, 24 Apr 2013 14:21:45 -0400 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: References: Message-ID: <517822B9.60106@alteeve.ca> Hi, The way I deal with avoiding dual-fence is to put a delay into one of the nodes. For example, I can specify that if Node 1 is to be fenced, Node 2 will pause for X seconds (usually 15 in my setups). This way, if both nodes try to fence the other at the same time, Node 1 will have killed Node 2 long before 2's 15 second timer expired. However, if Node 1 really was dead, Node 2 would still fence 1 and then recover, albeit with a 15 second delay in recovery. Simple and effective. :) I'm not sure if there is a specific RHEL 6.4 + pacemaker tutorial up yet, but keep an eye on clusterlabs. 
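(For anyone who wants to see what that delay trick looks like in cluster.conf: the node and device names below are made up and the matching <fencedevice> definitions are omitted, only the delay="15" attribute is the point. It goes on the fence entry of the node that should win the race, roughly:

  <clusternode name="node1" nodeid="1">
    <fence><method name="ipmi"><device name="ipmi_node1" action="reboot" delay="15"/></method></fence>
  </clusternode>
  <clusternode name="node2" nodeid="2">
    <fence><method name="ipmi"><device name="ipmi_node2" action="reboot"/></method></fence>
  </clusternode>

With that, a request to fence node1 is held back for 15 seconds while a request to fence node2 goes out immediately, so if both nodes call for a fence at the same moment, node1 wins.)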
I *think* Andrew is working on that. If not, I plan to go back to working on my tutorial when I return to the office in May. However, that will still be *many* months before it's done. digimer On 04/24/2013 01:54 PM, Michael Richmond wrote: > Hi Digimer, > Thanks for your detailed comments. > > What you have described with regard to fencing is common practice for two > node clusters that I have implemented in a few proprietary cluster > implementations that I have worked on. However, fencing is does not > completely solve the split-brain problem in two-node clusters. There is > still the potential for both NodeA and NodeB to decide to fence at the > same time. In this case, each node performs the fencing operation to fence > the other node with the result that both nodes get fenced. > > To avoid this, most clustering systems can be optionally configured with a > shared resource (usually a shared LUN) that is used to weight the decision > about which node gets fenced. Additionally, the shared LUN can be used as > a coarse communication mechanism to aid the election of a winning node. As > I'm sure you are aware, a quorum disk is typically used to determine which > partition has access to the larger/important portion of the cluster > resources to determine the nodes that must be fenced because they are in a > separate network partition. > > Since you mention that qdiskd has an uncertain future, it would appear > that the pacemaker-based stack has a potential functionality gap with > regard to two-node clusters. That is, unless some other approach is taken > to resolve network partitions. > > From what I understand, the CIB is at risk for unintended roll-back of a > write in the case where a two-node cluster has nodes up at differing > times. For example, assuming time > > Time 0 Node A up Node B up (CIB contains "CIB0") > Time 1 Node A up Node B down > Time 2 Node A writes update to CIB Node B booting (not joined cluster) > (CIB contains "CIB1") > Time 3 Node A down Node B up (CIB contains "CIB0") > > After Time 3, Node B is operating with a CIB that contains "CIB0" and has > no way of seeing the CIB contents "CIB1" written by Node A. In effect, the > write by Node A was rolled-back when Node A went down. > > Thanks again for your input. > > Is there any description available about how to configure the > pacemaker/chorosync stack on RHEL6.4? > > Regards, > Michael Richmond > > michael richmond | principal software engineer | flashsoft, sandisk | > +1.408.425.6731 > > > > > On 23/4/13 6:07 PM, "Digimer" wrote: > >> First up, before I begin, I am looking to pacemaker for the future as >> well and do not yet use it. So please take whatever I say about >> pacemaker with a grain of sand. Andrew, on the other hand, is the author >> and anything he says can be taken as authoritative on the topic. >> >> On the future; >> >> I also have a 2-node project/product that I am working to update in time >> for the release of RHEL 7. Speaking entirely for myself, I can tell you >> that I am planning to use Pacemaker from RHEL 7.0. As a Red hat >> outsider, I can only speak as a member of the community, but I have >> every reason to believe that the pacemaker resource manager will be the >> one used from 7.0 and forward. >> >> As for the CIB, yes, it's a local XML file stored on each node. >> Synchronization occurs via updates pushed over corosync to nodes active >> in the cluster. As I understand it, when a node that had been offline >> connects to the cluster, it receives any updates to the CIB. 
>> >> Dealing with 2-node clusters, setting aside qdisk which has an uncertain >> future I believe, you can not use quorum. For this reason, it is >> possible for a node to boot up, fail to reach it's peer and think it's >> the only one running. It will start your HA services and voila, two >> nodes offering the same services at the same time in an uncoordinated >> manner. This is bad and it is called a "split-brain". >> >> The way to avoid split-brains in 2-node clusters is to use fence >> devices, aka stonith devices (exact same thing by two different names). >> This is _always_ wise to use, but in 2-node clusters, it is critical. >> >> So imagine back to your scenario; >> >> If a node came up and tried to connect to it's peer but failed to do so, >> before proceeding, it would fence (usually forcibly power off) the other >> node. Only after doing so would it start the HA services. In this way, >> both nodes can never be offering the same HA service at the same time. >> >> The risk here though is a "fence loop". If you set the cluster to start >> on boot and if there is a break in the connection, you can have an >> initial state where, upon the break in the network, both try to fence >> the other. The faster node wins, forcing the other node off and resuming >> to operate on it's own. This is fine and exactly what you want. However, >> now the fenced node powers back up, starts it's cluster stack, fails to >> reach it's peer and fences it. It finishes starting, offers the HA >> services and goes on it's way ... until the other node boots back up. :) >> >> Personally, I avoid this by _not_ starting the cluster stack on boot. My >> reasoning is that, if a node fails and gets rebooted, I want to check it >> over myself before I let it back into the cluster (I get alert emails >> when something like this happens). It's not a risk from an HA >> perspective because it's services would have recovered on the surviving >> peer long before it reboots anyway. This also has the added benefit of >> avoiding a fence loop, no matter what happens. >> >> Cheers >> >> digimer >> >> On 04/23/2013 02:07 PM, Michael Richmond wrote: >>> Andrew and Digimer, >>> Thank you for taking the time to respond, you have collaborated some of >>> what I've been putting together as the likely direction. >>> >>> I am working on adapting some cluster-aware storage features for use in >>> a >>> Linux cluster environment. With this kind of project it is useful to try >>> and predict where the Linux community is heading so that I can focus my >>> development work on what will be the "current" cluster stack around my >>> anticipated release dates. Any predictions are simply educated guesses >>> that may prove to be wrong, but are useful with regard to developing >>> plans. From my reading of various web pages and piecing things together >>> I >>> found that RHEL 7 is intended to be based on Fedora 18, so I assume that >>> the new Pacemaker stack has a good chance of being rolled out in RHEL >>> 7.1/7.2, or even possibly 7.0. >>> >>> Hearing that there is official word that the intention is for Pacemaker >>> to >>> be the official cluster stack helps me put my development plans >>> together. >>> >>> >>> The project I am working on is focused on two-node clusters. But I also >>> need a persistent, cluster-wide data store to hold a small amount of >>> state >>> (less than 1KB). This data store is what I refer to as a >>> cluster-registry. >>> The state data records the last-known operational state for the storage >>> feature. 
This last-known state helps drive recovery operations for the >>> storage feature during node bring-up. This project is specifically aimed >>> at integrating generic functionality into the Linux cluster stack. >>> >>> I have been thinking about using the cluster configuration file for this >>> storage which I assume is the CIB referenced by Andrew. But I can >>> imagine >>> cases where the CIB file may loose updates if it does not utilize shared >>> storage media. My understanding is that the CIB file is stored on each >>> node using local disk storage. >>> >>> For example, consider a two-node cluster that is configured with a >>> quorum >>> disk on shared storage media. If at a given point in time NodeB is up >>> and >>> NodeB is down. NodeA can form quorate and start cluster services >>> (including HA applications). Assume that NodeA updates the CIB to record >>> some state update. If NodeB starts booting but before NodeB joins the >>> cluster, NodeA crashes. At this point, the updated CIB only resides on >>> NodeA and cannot be accessed by NodeB even if NodeB can access the >>> quorum >>> disk as form quorate. Effectively, NodeB cannot be aware of the update >>> from NodeA which will result in an implicit roll-back of any updates >>> performed by NodeA. >>> >>> With a two-node cluster, there are two options for resolving this: >>> * prevent any update to the cluster registry/CIB unless all nodes are >>> part >>> of the cluster. (This is not practical since it undermines some of the >>> reasons for building clusters.) >>> * store the cluster registry on shared storage so that there is one >>> source >>> of truth. >>> >>> It is possible that the nature of the data stored in the CIB is >>> resilient >>> to the example scenario that I describe. In this case, maybe the CIB is >>> not an appropriate data store for my cluster registry data. In this >>> case I >>> am either looking for an appropriate Linux component to use for my >>> cluster >>> registry, or I will build a custom data store that provides atomic >>> update >>> semantics on shared storage. >>> >>> Any thoughts and/or pointers would be appreciated. >>> >>> Thanks, >>> Michael Richmond >>> >>> -- >>> michael richmond | principal software engineer | flashsoft, sandisk | >>> +1.408.425.6731 >>> >>> >>> >>> >>> On 22/4/13 4:37 PM, "Andrew Beekhof" wrote: >>> >>>> >>>> On 23/04/2013, at 4:59 AM, Digimer wrote: >>>> >>>>> On 04/22/2013 02:36 PM, Michael Richmond wrote: >>>>>> Hello, >>>>>> I am researching the new cluster stack that is scheduled to be >>>>>> delivered >>>>>> in Fedora 19. Does anyone on this list have a sense for the timeframe >>>>>> for this new stack to be rolled into a RHEL release? (I assume the >>>>>> earliest would be RHEL 7.) >>>>>> >>>>>> On the Windows platform, Microsoft Cluster Services provides a >>>>>> cluster-wide registry service that is basically a cluster-wide >>>>>> key:value >>>>>> store with atomic updates and support to store the registry on shared >>>>>> disk. The storage on shared disk allows access and use of the >>>>>> registry >>>>>> in cases where nodes are frequently joining and leaving the cluster. >>>>>> >>>>>> Are there any component(s) that can be used to provide a similar >>>>>> registry in the Linux cluster stack? (The current RHEL 6 stack, >>>>>> and/or >>>>>> the new Fedora 19 stack.) >>>>>> >>>>>> Thanks in advance for your information, >>>>>> Michael Richmond >>>>> >>>>> Hi Michael, >>>>> >>>>> First up, Red Hat's policy of what is coming is "we'll announce on >>>>> release day". 
So anything else is a guess. As it is, Pacemaker is in >>>>> tech-preview in RHEL 6, and the best guess is that it will be the >>>>> official resource manager in RHEL 7, but it's just that, a guess. >>>> >>>> I believe we're officially allowed to say that it is our _intention_ >>>> that >>>> Pacemaker will be the one and only supported stack in RHEL7. >>>> >>>>> >>>>> As for the registry question; I am not entirely sure what it is you >>>>> are asking here (sorry, not familiar with windows). I can say that >>>>> pacemaker uses something called the CIB (cluster information base) >>>>> which >>>>> is an XML file containing the cluster's configuration and state. It >>>>> can >>>>> be updated from any node and the changes will push to the other nodes >>>>> immediately. >>>> >>>> How many of these attributes are you planning to have? >>>> You can throw a few in there, but I'd not use it for 100's or 1000's of >>>> them - its mainly designed to store the resource/service configuration. >>>> >>>> >>>>> Does this answer your question? >>>>> >>>>> The current RHEL 6 cluster is corosync + cman + rgmanager. It also >>>>> uses an XML config and it can be updated from any node and push out to >>>>> the other nodes. >>>>> >>>>> Perhaps a better way to help would be to ask what, exactly, you want >>>>> to build your cluster for? >>>>> >>>>> Cheers >>>>> >>>>> -- >>>>> Digimer >>>>> Papers and Projects: https://alteeve.ca/w/ >>>>> What if the cure for cancer is trapped in the mind of a person without >>>>> access to education? >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> ________________________________ >>> >>> PLEASE NOTE: The information contained in this electronic mail message >>> is intended only for the use of the designated recipient(s) named above. >>> If the reader of this message is not the intended recipient, you are >>> hereby notified that you have received this message in error and that >>> any review, dissemination, distribution, or copying of this message is >>> strictly prohibited. If you have received this communication in error, >>> please notify the sender by telephone or e-mail (as shown above) >>> immediately and destroy any and all copies of this message in your >>> possession (whether hard copies or electronically stored copies). >>> >>> >> >> >> -- >> Digimer >> Papers and Projects: https://alteeve.ca/w/ >> What if the cure for cancer is trapped in the mind of a person without >> access to education? > > > ________________________________ > > PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). > -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? 
From JMaxwell at pbp1.com Thu Apr 25 19:21:40 2013 From: JMaxwell at pbp1.com (Maxwell, Jamison [HDS]) Date: Thu, 25 Apr 2013 15:21:40 -0400 Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components In-Reply-To: <517822B9.60106@alteeve.ca> References: <517822B9.60106@alteeve.ca> Message-ID: <8D3E04735E183443BE64F98A192E58160312FE4CAA@GHDMBX04.hsi.hughessupply.com> Genius! Jamison Maxwell Sr. Systems Administrator -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer Sent: Wednesday, April 24, 2013 2:22 PM To: Michael Richmond Cc: linux clustering Subject: Re: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components Hi, The way I deal with avoiding dual-fence is to put a delay into one of the nodes. For example, I can specify that if Node 1 is to be fenced, Node 2 will pause for X seconds (usually 15 in my setups). This way, if both nodes try to fence the other at the same time, Node 1 will have killed Node 2 long before 2's 15 second timer expired. However, if Node 1 really was dead, Node 2 would still fence 1 and then recover, albeit with a 15 second delay in recovery. Simple and effective. :) I'm not sure if there is a specific RHEL 6.4 + pacemaker tutorial up yet, but keep an eye on clusterlabs. I *think* Andrew is working on that. If not, I plan to go back to working on my tutorial when I return to the office in May. However, that will still be *many* months before it's done. digimer On 04/24/2013 01:54 PM, Michael Richmond wrote: > Hi Digimer, > Thanks for your detailed comments. > > What you have described with regard to fencing is common practice for > two node clusters that I have implemented in a few proprietary cluster > implementations that I have worked on. However, fencing is does not > completely solve the split-brain problem in two-node clusters. There > is still the potential for both NodeA and NodeB to decide to fence at > the same time. In this case, each node performs the fencing operation > to fence the other node with the result that both nodes get fenced. > > To avoid this, most clustering systems can be optionally configured > with a shared resource (usually a shared LUN) that is used to weight > the decision about which node gets fenced. Additionally, the shared > LUN can be used as a coarse communication mechanism to aid the > election of a winning node. As I'm sure you are aware, a quorum disk > is typically used to determine which partition has access to the > larger/important portion of the cluster resources to determine the > nodes that must be fenced because they are in a separate network partition. > > Since you mention that qdiskd has an uncertain future, it would appear > that the pacemaker-based stack has a potential functionality gap with > regard to two-node clusters. That is, unless some other approach is > taken to resolve network partitions. > > From what I understand, the CIB is at risk for unintended roll-back > of a write in the case where a two-node cluster has nodes up at > differing times. For example, assuming time > > Time 0 Node A up Node B up (CIB contains "CIB0") > Time 1 Node A up Node B down > Time 2 Node A writes update to CIB Node B booting (not joined cluster) > (CIB contains "CIB1") > Time 3 Node A down Node B up (CIB contains "CIB0") > > After Time 3, Node B is operating with a CIB that contains "CIB0" and > has no way of seeing the CIB contents "CIB1" written by Node A. 
In > effect, the write by Node A was rolled-back when Node A went down. > > Thanks again for your input. > > Is there any description available about how to configure the > pacemaker/chorosync stack on RHEL6.4? > > Regards, > Michael Richmond > > michael richmond | principal software engineer | flashsoft, sandisk | > +1.408.425.6731 > > > > > On 23/4/13 6:07 PM, "Digimer" wrote: > >> First up, before I begin, I am looking to pacemaker for the future as >> well and do not yet use it. So please take whatever I say about >> pacemaker with a grain of sand. Andrew, on the other hand, is the >> author and anything he says can be taken as authoritative on the topic. >> >> On the future; >> >> I also have a 2-node project/product that I am working to update in >> time for the release of RHEL 7. Speaking entirely for myself, I can >> tell you that I am planning to use Pacemaker from RHEL 7.0. As a Red >> hat outsider, I can only speak as a member of the community, but I >> have every reason to believe that the pacemaker resource manager will >> be the one used from 7.0 and forward. >> >> As for the CIB, yes, it's a local XML file stored on each node. >> Synchronization occurs via updates pushed over corosync to nodes >> active in the cluster. As I understand it, when a node that had been >> offline connects to the cluster, it receives any updates to the CIB. >> >> Dealing with 2-node clusters, setting aside qdisk which has an >> uncertain future I believe, you can not use quorum. For this reason, >> it is possible for a node to boot up, fail to reach it's peer and >> think it's the only one running. It will start your HA services and >> voila, two nodes offering the same services at the same time in an >> uncoordinated manner. This is bad and it is called a "split-brain". >> >> The way to avoid split-brains in 2-node clusters is to use fence >> devices, aka stonith devices (exact same thing by two different names). >> This is _always_ wise to use, but in 2-node clusters, it is critical. >> >> So imagine back to your scenario; >> >> If a node came up and tried to connect to it's peer but failed to do >> so, before proceeding, it would fence (usually forcibly power off) >> the other node. Only after doing so would it start the HA services. >> In this way, both nodes can never be offering the same HA service at the same time. >> >> The risk here though is a "fence loop". If you set the cluster to >> start on boot and if there is a break in the connection, you can have >> an initial state where, upon the break in the network, both try to >> fence the other. The faster node wins, forcing the other node off and >> resuming to operate on it's own. This is fine and exactly what you >> want. However, now the fenced node powers back up, starts it's >> cluster stack, fails to reach it's peer and fences it. It finishes >> starting, offers the HA services and goes on it's way ... until the >> other node boots back up. :) >> >> Personally, I avoid this by _not_ starting the cluster stack on boot. >> My reasoning is that, if a node fails and gets rebooted, I want to >> check it over myself before I let it back into the cluster (I get >> alert emails when something like this happens). It's not a risk from >> an HA perspective because it's services would have recovered on the >> surviving peer long before it reboots anyway. This also has the added >> benefit of avoiding a fence loop, no matter what happens. 
>> >> Cheers >> >> digimer >> >> On 04/23/2013 02:07 PM, Michael Richmond wrote: >>> Andrew and Digimer, >>> Thank you for taking the time to respond, you have collaborated some >>> of what I've been putting together as the likely direction. >>> >>> I am working on adapting some cluster-aware storage features for use >>> in a Linux cluster environment. With this kind of project it is >>> useful to try and predict where the Linux community is heading so >>> that I can focus my development work on what will be the "current" >>> cluster stack around my anticipated release dates. Any predictions >>> are simply educated guesses that may prove to be wrong, but are >>> useful with regard to developing plans. From my reading of various >>> web pages and piecing things together I found that RHEL 7 is >>> intended to be based on Fedora 18, so I assume that the new >>> Pacemaker stack has a good chance of being rolled out in RHEL >>> 7.1/7.2, or even possibly 7.0. >>> >>> Hearing that there is official word that the intention is for >>> Pacemaker to be the official cluster stack helps me put my >>> development plans together. >>> >>> >>> The project I am working on is focused on two-node clusters. But I >>> also need a persistent, cluster-wide data store to hold a small >>> amount of state (less than 1KB). This data store is what I refer to >>> as a cluster-registry. >>> The state data records the last-known operational state for the >>> storage feature. This last-known state helps drive recovery >>> operations for the storage feature during node bring-up. This >>> project is specifically aimed at integrating generic functionality into the Linux cluster stack. >>> >>> I have been thinking about using the cluster configuration file for >>> this storage which I assume is the CIB referenced by Andrew. But I >>> can imagine cases where the CIB file may loose updates if it does >>> not utilize shared storage media. My understanding is that the CIB >>> file is stored on each node using local disk storage. >>> >>> For example, consider a two-node cluster that is configured with a >>> quorum disk on shared storage media. If at a given point in time >>> NodeB is up and NodeB is down. NodeA can form quorate and start >>> cluster services (including HA applications). Assume that NodeA >>> updates the CIB to record some state update. If NodeB starts booting >>> but before NodeB joins the cluster, NodeA crashes. At this point, >>> the updated CIB only resides on NodeA and cannot be accessed by >>> NodeB even if NodeB can access the quorum disk as form quorate. >>> Effectively, NodeB cannot be aware of the update from NodeA which >>> will result in an implicit roll-back of any updates performed by >>> NodeA. >>> >>> With a two-node cluster, there are two options for resolving this: >>> * prevent any update to the cluster registry/CIB unless all nodes >>> are part of the cluster. (This is not practical since it undermines >>> some of the reasons for building clusters.) >>> * store the cluster registry on shared storage so that there is one >>> source of truth. >>> >>> It is possible that the nature of the data stored in the CIB is >>> resilient to the example scenario that I describe. In this case, >>> maybe the CIB is not an appropriate data store for my cluster >>> registry data. In this case I am either looking for an appropriate >>> Linux component to use for my cluster registry, or I will build a >>> custom data store that provides atomic update semantics on shared >>> storage. 
>>> >>> Any thoughts and/or pointers would be appreciated. >>> >>> Thanks, >>> Michael Richmond >>> >>> -- >>> michael richmond | principal software engineer | flashsoft, sandisk >>> | >>> +1.408.425.6731 >>> >>> >>> >>> >>> On 22/4/13 4:37 PM, "Andrew Beekhof" wrote: >>> >>>> >>>> On 23/04/2013, at 4:59 AM, Digimer wrote: >>>> >>>>> On 04/22/2013 02:36 PM, Michael Richmond wrote: >>>>>> Hello, >>>>>> I am researching the new cluster stack that is scheduled to be >>>>>> delivered in Fedora 19. Does anyone on this list have a sense for >>>>>> the timeframe for this new stack to be rolled into a RHEL >>>>>> release? (I assume the earliest would be RHEL 7.) >>>>>> >>>>>> On the Windows platform, Microsoft Cluster Services provides a >>>>>> cluster-wide registry service that is basically a cluster-wide >>>>>> key:value store with atomic updates and support to store the >>>>>> registry on shared disk. The storage on shared disk allows >>>>>> access and use of the registry in cases where nodes are >>>>>> frequently joining and leaving the cluster. >>>>>> >>>>>> Are there any component(s) that can be used to provide a similar >>>>>> registry in the Linux cluster stack? (The current RHEL 6 stack, >>>>>> and/or the new Fedora 19 stack.) >>>>>> >>>>>> Thanks in advance for your information, Michael Richmond >>>>> >>>>> Hi Michael, >>>>> >>>>> First up, Red Hat's policy of what is coming is "we'll announce >>>>> on release day". So anything else is a guess. As it is, Pacemaker >>>>> is in tech-preview in RHEL 6, and the best guess is that it will >>>>> be the official resource manager in RHEL 7, but it's just that, a guess. >>>> >>>> I believe we're officially allowed to say that it is our >>>> _intention_ that Pacemaker will be the one and only supported stack >>>> in RHEL7. >>>> >>>>> >>>>> As for the registry question; I am not entirely sure what it is >>>>> you are asking here (sorry, not familiar with windows). I can say >>>>> that pacemaker uses something called the CIB (cluster information >>>>> base) which is an XML file containing the cluster's configuration >>>>> and state. It can be updated from any node and the changes will >>>>> push to the other nodes immediately. >>>> >>>> How many of these attributes are you planning to have? >>>> You can throw a few in there, but I'd not use it for 100's or >>>> 1000's of them - its mainly designed to store the resource/service configuration. >>>> >>>> >>>>> Does this answer your question? >>>>> >>>>> The current RHEL 6 cluster is corosync + cman + rgmanager. It >>>>> also uses an XML config and it can be updated from any node and >>>>> push out to the other nodes. >>>>> >>>>> Perhaps a better way to help would be to ask what, exactly, you >>>>> want to build your cluster for? >>>>> >>>>> Cheers >>>>> >>>>> -- >>>>> Digimer >>>>> Papers and Projects: https://alteeve.ca/w/ What if the cure for >>>>> cancer is trapped in the mind of a person without access to >>>>> education? >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> ________________________________ >>> >>> PLEASE NOTE: The information contained in this electronic mail >>> message is intended only for the use of the designated recipient(s) named above. 
>>> If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without access to education?
>
> ________________________________
>
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

From andrew at beekhof.net Thu Apr 25 23:28:20 2013
From: andrew at beekhof.net (Andrew Beekhof)
Date: Fri, 26 Apr 2013 09:28:20 +1000
Subject: [Linux-cluster] Fedora 19 cluster stack and Cluster registry components
In-Reply-To: <517822B9.60106@alteeve.ca>
References: <517822B9.60106@alteeve.ca>
Message-ID: <7128ECEE-1149-4F87-879D-BA2D70B234BF@beekhof.net>

On 25/04/2013, at 4:21 AM, Digimer wrote:

> Hi,
>
> The way I deal with avoiding dual-fence is to put a delay into one of the nodes. For example, I can specify that if Node 1 is to be fenced, Node 2 will pause for X seconds (usually 15 in my setups). This way, if both nodes try to fence the other at the same time, Node 1 will have killed Node 2 long before 2's 15 second timer expires. However, if Node 1 really was dead, Node 2 would still fence 1 and then recover, albeit with a 15 second delay in recovery. Simple and effective. :)
>
> I'm not sure if there is a specific RHEL 6.4 + pacemaker tutorial up yet, but keep an eye on clusterlabs. I *think* Andrew is working on that.

There is a rhel-6 quickstart, and the "Clusters from Scratch" document that includes cman is also applicable.

> If not, I plan to go back to working on my tutorial when I return to the office in May. However, that will still be *many* months before it's done.
>
> digimer
>
> On 04/24/2013 01:54 PM, Michael Richmond wrote:
>> Hi Digimer,
>> Thanks for your detailed comments.
>>
>> What you have described with regard to fencing is common practice for two-node clusters that I have implemented in a few proprietary cluster implementations that I have worked on. However, fencing does not completely solve the split-brain problem in two-node clusters. There is still the potential for both NodeA and NodeB to decide to fence at the same time. In this case, each node performs the fencing operation to fence the other node with the result that both nodes get fenced.
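(For illustration, the fence-delay trick Digimer describes above looks roughly like this in a RHEL 6 cluster.conf; every node name, device name, address and password here is made up. The only significant detail is that the device used to fence node01 carries delay="15", so in a simultaneous fence race node01 kills node02 before node02's delayed attempt fires.)

  <clusternode name="node01" nodeid="1">
    <fence>
      <method name="ipmi">
        <!-- anyone fencing node01 waits 15s first, so node01 wins a mutual fence race -->
        <device name="ipmi_node01" action="reboot" delay="15"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node02" nodeid="2">
    <fence>
      <method name="ipmi">
        <device name="ipmi_node02" action="reboot"/>
      </method>
    </fence>
  </clusternode>
  <fencedevices>
    <fencedevice name="ipmi_node01" agent="fence_ipmilan" ipaddr="10.20.0.1" login="admin" passwd="secret"/>
    <fencedevice name="ipmi_node02" agent="fence_ipmilan" ipaddr="10.20.0.2" login="admin" passwd="secret"/>
  </fencedevices>

(In a Pacemaker cluster the usual equivalent is a delay parameter on one node's stonith resource.)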
>>
>> To avoid this, most clustering systems can be optionally configured with a shared resource (usually a shared LUN) that is used to weight the decision about which node gets fenced. Additionally, the shared LUN can be used as a coarse communication mechanism to aid the election of a winning node. As I'm sure you are aware, a quorum disk is typically used to determine which partition has access to the larger/important portion of the cluster resources, to determine the nodes that must be fenced because they are in a separate network partition.
>>
>> Since you mention that qdiskd has an uncertain future, it would appear that the pacemaker-based stack has a potential functionality gap with regard to two-node clusters. That is, unless some other approach is taken to resolve network partitions.
>>
>> From what I understand, the CIB is at risk for unintended roll-back of a write in the case where a two-node cluster has nodes up at differing times. For example, assuming time:
>>
>> Time 0   Node A up                       Node B up                              (CIB contains "CIB0")
>> Time 1   Node A up                       Node B down
>> Time 2   Node A writes update to CIB     Node B booting (not joined cluster)    (CIB contains "CIB1")
>> Time 3   Node A down                     Node B up                              (CIB contains "CIB0")
>>
>> After Time 3, Node B is operating with a CIB that contains "CIB0" and has no way of seeing the CIB contents "CIB1" written by Node A. In effect, the write by Node A was rolled-back when Node A went down.
>>
>> Thanks again for your input.
>>
>> Is there any description available about how to configure the pacemaker/corosync stack on RHEL6.4?
>>
>> Regards,
>> Michael Richmond
>>
>> michael richmond | principal software engineer | flashsoft, sandisk | +1.408.425.6731
>>
>> On 23/4/13 6:07 PM, "Digimer" wrote:
>>
>>> First up, before I begin, I am looking to pacemaker for the future as well and do not yet use it. So please take whatever I say about pacemaker with a grain of salt. Andrew, on the other hand, is the author and anything he says can be taken as authoritative on the topic.
>>>
>>> On the future;
>>>
>>> I also have a 2-node project/product that I am working to update in time for the release of RHEL 7. Speaking entirely for myself, I can tell you that I am planning to use Pacemaker from RHEL 7.0. As a Red Hat outsider, I can only speak as a member of the community, but I have every reason to believe that the pacemaker resource manager will be the one used from 7.0 and forward.
>>>
>>> As for the CIB, yes, it's a local XML file stored on each node. Synchronization occurs via updates pushed over corosync to nodes active in the cluster. As I understand it, when a node that had been offline connects to the cluster, it receives any updates to the CIB.
>>>
>>> Dealing with 2-node clusters, setting aside qdisk which has an uncertain future I believe, you cannot use quorum. For this reason, it is possible for a node to boot up, fail to reach its peer and think it's the only one running. It will start your HA services and voila, two nodes offering the same services at the same time in an uncoordinated manner. This is bad and it is called a "split-brain".
>>>
>>> The way to avoid split-brains in 2-node clusters is to use fence devices, aka stonith devices (exact same thing by two different names). This is _always_ wise to use, but in 2-node clusters, it is critical.
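(For reference, the shared-LUN tie-breaker Michael describes is what qdiskd provides on the cman stack. A minimal cluster.conf sketch, with an invented label and gateway address, might look like the following; the heuristic stops a node that has lost the network from claiming the quorum disk's vote, so only one side of a partition stays quorate.)

  <cman expected_votes="3" two_node="0"/>
  <quorumd label="qdisk" interval="1" tko="10" votes="1">
    <heuristic program="ping -c1 -w1 10.20.0.254" score="1" interval="2" tko="3"/>
  </quorumd>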
>>> >>> So imagine back to your scenario; >>> >>> If a node came up and tried to connect to it's peer but failed to do so, >>> before proceeding, it would fence (usually forcibly power off) the other >>> node. Only after doing so would it start the HA services. In this way, >>> both nodes can never be offering the same HA service at the same time. >>> >>> The risk here though is a "fence loop". If you set the cluster to start >>> on boot and if there is a break in the connection, you can have an >>> initial state where, upon the break in the network, both try to fence >>> the other. The faster node wins, forcing the other node off and resuming >>> to operate on it's own. This is fine and exactly what you want. However, >>> now the fenced node powers back up, starts it's cluster stack, fails to >>> reach it's peer and fences it. It finishes starting, offers the HA >>> services and goes on it's way ... until the other node boots back up. :) >>> >>> Personally, I avoid this by _not_ starting the cluster stack on boot. My >>> reasoning is that, if a node fails and gets rebooted, I want to check it >>> over myself before I let it back into the cluster (I get alert emails >>> when something like this happens). It's not a risk from an HA >>> perspective because it's services would have recovered on the surviving >>> peer long before it reboots anyway. This also has the added benefit of >>> avoiding a fence loop, no matter what happens. >>> >>> Cheers >>> >>> digimer >>> >>> On 04/23/2013 02:07 PM, Michael Richmond wrote: >>>> Andrew and Digimer, >>>> Thank you for taking the time to respond, you have collaborated some of >>>> what I've been putting together as the likely direction. >>>> >>>> I am working on adapting some cluster-aware storage features for use in >>>> a >>>> Linux cluster environment. With this kind of project it is useful to try >>>> and predict where the Linux community is heading so that I can focus my >>>> development work on what will be the "current" cluster stack around my >>>> anticipated release dates. Any predictions are simply educated guesses >>>> that may prove to be wrong, but are useful with regard to developing >>>> plans. From my reading of various web pages and piecing things together >>>> I >>>> found that RHEL 7 is intended to be based on Fedora 18, so I assume that >>>> the new Pacemaker stack has a good chance of being rolled out in RHEL >>>> 7.1/7.2, or even possibly 7.0. >>>> >>>> Hearing that there is official word that the intention is for Pacemaker >>>> to >>>> be the official cluster stack helps me put my development plans >>>> together. >>>> >>>> >>>> The project I am working on is focused on two-node clusters. But I also >>>> need a persistent, cluster-wide data store to hold a small amount of >>>> state >>>> (less than 1KB). This data store is what I refer to as a >>>> cluster-registry. >>>> The state data records the last-known operational state for the storage >>>> feature. This last-known state helps drive recovery operations for the >>>> storage feature during node bring-up. This project is specifically aimed >>>> at integrating generic functionality into the Linux cluster stack. >>>> >>>> I have been thinking about using the cluster configuration file for this >>>> storage which I assume is the CIB referenced by Andrew. But I can >>>> imagine >>>> cases where the CIB file may loose updates if it does not utilize shared >>>> storage media. My understanding is that the CIB file is stored on each >>>> node using local disk storage. 
>>>> >>>> For example, consider a two-node cluster that is configured with a >>>> quorum >>>> disk on shared storage media. If at a given point in time NodeB is up >>>> and >>>> NodeB is down. NodeA can form quorate and start cluster services >>>> (including HA applications). Assume that NodeA updates the CIB to record >>>> some state update. If NodeB starts booting but before NodeB joins the >>>> cluster, NodeA crashes. At this point, the updated CIB only resides on >>>> NodeA and cannot be accessed by NodeB even if NodeB can access the >>>> quorum >>>> disk as form quorate. Effectively, NodeB cannot be aware of the update >>>> from NodeA which will result in an implicit roll-back of any updates >>>> performed by NodeA. >>>> >>>> With a two-node cluster, there are two options for resolving this: >>>> * prevent any update to the cluster registry/CIB unless all nodes are >>>> part >>>> of the cluster. (This is not practical since it undermines some of the >>>> reasons for building clusters.) >>>> * store the cluster registry on shared storage so that there is one >>>> source >>>> of truth. >>>> >>>> It is possible that the nature of the data stored in the CIB is >>>> resilient >>>> to the example scenario that I describe. In this case, maybe the CIB is >>>> not an appropriate data store for my cluster registry data. In this >>>> case I >>>> am either looking for an appropriate Linux component to use for my >>>> cluster >>>> registry, or I will build a custom data store that provides atomic >>>> update >>>> semantics on shared storage. >>>> >>>> Any thoughts and/or pointers would be appreciated. >>>> >>>> Thanks, >>>> Michael Richmond >>>> >>>> -- >>>> michael richmond | principal software engineer | flashsoft, sandisk | >>>> +1.408.425.6731 >>>> >>>> >>>> >>>> >>>> On 22/4/13 4:37 PM, "Andrew Beekhof" wrote: >>>> >>>>> >>>>> On 23/04/2013, at 4:59 AM, Digimer wrote: >>>>> >>>>>> On 04/22/2013 02:36 PM, Michael Richmond wrote: >>>>>>> Hello, >>>>>>> I am researching the new cluster stack that is scheduled to be >>>>>>> delivered >>>>>>> in Fedora 19. Does anyone on this list have a sense for the timeframe >>>>>>> for this new stack to be rolled into a RHEL release? (I assume the >>>>>>> earliest would be RHEL 7.) >>>>>>> >>>>>>> On the Windows platform, Microsoft Cluster Services provides a >>>>>>> cluster-wide registry service that is basically a cluster-wide >>>>>>> key:value >>>>>>> store with atomic updates and support to store the registry on shared >>>>>>> disk. The storage on shared disk allows access and use of the >>>>>>> registry >>>>>>> in cases where nodes are frequently joining and leaving the cluster. >>>>>>> >>>>>>> Are there any component(s) that can be used to provide a similar >>>>>>> registry in the Linux cluster stack? (The current RHEL 6 stack, >>>>>>> and/or >>>>>>> the new Fedora 19 stack.) >>>>>>> >>>>>>> Thanks in advance for your information, >>>>>>> Michael Richmond >>>>>> >>>>>> Hi Michael, >>>>>> >>>>>> First up, Red Hat's policy of what is coming is "we'll announce on >>>>>> release day". So anything else is a guess. As it is, Pacemaker is in >>>>>> tech-preview in RHEL 6, and the best guess is that it will be the >>>>>> official resource manager in RHEL 7, but it's just that, a guess. >>>>> >>>>> I believe we're officially allowed to say that it is our _intention_ >>>>> that >>>>> Pacemaker will be the one and only supported stack in RHEL7. 
>>>>> >>>>>> >>>>>> As for the registry question; I am not entirely sure what it is you >>>>>> are asking here (sorry, not familiar with windows). I can say that >>>>>> pacemaker uses something called the CIB (cluster information base) >>>>>> which >>>>>> is an XML file containing the cluster's configuration and state. It >>>>>> can >>>>>> be updated from any node and the changes will push to the other nodes >>>>>> immediately. >>>>> >>>>> How many of these attributes are you planning to have? >>>>> You can throw a few in there, but I'd not use it for 100's or 1000's of >>>>> them - its mainly designed to store the resource/service configuration. >>>>> >>>>> >>>>>> Does this answer your question? >>>>>> >>>>>> The current RHEL 6 cluster is corosync + cman + rgmanager. It also >>>>>> uses an XML config and it can be updated from any node and push out to >>>>>> the other nodes. >>>>>> >>>>>> Perhaps a better way to help would be to ask what, exactly, you want >>>>>> to build your cluster for? >>>>>> >>>>>> Cheers >>>>>> >>>>>> -- >>>>>> Digimer >>>>>> Papers and Projects: https://alteeve.ca/w/ >>>>>> What if the cure for cancer is trapped in the mind of a person without >>>>>> access to education? >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> ________________________________ >>>> >>>> PLEASE NOTE: The information contained in this electronic mail message >>>> is intended only for the use of the designated recipient(s) named above. >>>> If the reader of this message is not the intended recipient, you are >>>> hereby notified that you have received this message in error and that >>>> any review, dissemination, distribution, or copying of this message is >>>> strictly prohibited. If you have received this communication in error, >>>> please notify the sender by telephone or e-mail (as shown above) >>>> immediately and destroy any and all copies of this message in your >>>> possession (whether hard copies or electronically stored copies). >>>> >>>> >>> >>> >>> -- >>> Digimer >>> Papers and Projects: https://alteeve.ca/w/ >>> What if the cure for cancer is trapped in the mind of a person without >>> access to education? >> >> >> ________________________________ >> >> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). >> > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without access to education? From Ralf.Aumueller at informatik.uni-stuttgart.de Fri Apr 26 13:58:00 2013 From: Ralf.Aumueller at informatik.uni-stuttgart.de (=?ISO-8859-1?Q?Ralf_Aum=FCller?=) Date: Fri, 26 Apr 2013 15:58:00 +0200 Subject: [Linux-cluster] Problem with second ring config Message-ID: <517A87E8.30703@informatik.uni-stuttgart.de> Hello, we have a two node cluster running CentOS 6.4 (fully patched:corosync-1.4.1-15, cman-3.0.12.1-49). 
When I configure a second ring (passive mode) for cluster-interconnect I get the following messages in corosync.log (around every 2 minutes):

...
Apr 26 15:34:54 corosync [TOTEM ] Marking ringid 1 interface 192.168.216.24 FAULTY
Apr 26 15:34:55 corosync [TOTEM ] Automatically recovered ring 1
Apr 26 15:36:52 corosync [TOTEM ] Marking ringid 1 interface 192.168.216.24 FAULTY
Apr 26 15:36:53 corosync [TOTEM ] Automatically recovered ring 1
Apr 26 15:38:50 corosync [TOTEM ] Marking ringid 1 interface 192.168.216.24 FAULTY
Apr 26 15:38:51 corosync [TOTEM ] Automatically recovered ring 1
...

It seems related to bug https://bugzilla.redhat.com/show_bug.cgi?id=850757 (but this one should be fixed in corosync-1.4.1-15).

Also "corosync-cfgtool -s" lists both rings as active:

> corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 10.0.0.5
        status  = ring 0 active with no faults
RING ID 1
        id      = 192.168.216.22
        status  = ring 1 active with no faults

> corosync-objctl | grep rrp
cluster.totem.rrp_mode=passive
totem.rrp_mode=passive

When I change the config to active () I don't get these messages.

Any comments?

Thanks and best regards,

Ralf

From jfriesse at redhat.com Mon Apr 29 10:39:20 2013
From: jfriesse at redhat.com (Jan Friesse)
Date: Mon, 29 Apr 2013 12:39:20 +0200
Subject: [Linux-cluster] Problem with second ring config
In-Reply-To: <517A87E8.30703@informatik.uni-stuttgart.de>
References: <517A87E8.30703@informatik.uni-stuttgart.de>
Message-ID: <517E4DD8.7070400@redhat.com>

Ralf, can you please try to describe the load you are generating on the machine? I mean, can you please try to provide as much information as possible so I can eventually try to reproduce the problem?

Thanks,
Honza

Ralf Aumüller wrote:
> Hello,
>
> we have a two node cluster running CentOS 6.4 (fully patched: corosync-1.4.1-15, cman-3.0.12.1-49).
> When I configure a second ring (passive mode) for cluster-interconnect I get the following messages in corosync.log (around every 2 minutes):
>
> ...
> Apr 26 15:34:54 corosync [TOTEM ] Marking ringid 1 interface 192.168.216.24 FAULTY
> Apr 26 15:34:55 corosync [TOTEM ] Automatically recovered ring 1
> Apr 26 15:36:52 corosync [TOTEM ] Marking ringid 1 interface 192.168.216.24 FAULTY
> Apr 26 15:36:53 corosync [TOTEM ] Automatically recovered ring 1
> Apr 26 15:38:50 corosync [TOTEM ] Marking ringid 1 interface 192.168.216.24 FAULTY
> Apr 26 15:38:51 corosync [TOTEM ] Automatically recovered ring 1
> ...
>
> It seems related to bug https://bugzilla.redhat.com/show_bug.cgi?id=850757 (but this one should be fixed in corosync-1.4.1-15).
>
> Also "corosync-cfgtool -s" lists both rings as active:
>
>> corosync-cfgtool -s
> Printing ring status.
> Local node ID 1
> RING ID 0
>         id      = 10.0.0.5
>         status  = ring 0 active with no faults
> RING ID 1
>         id      = 192.168.216.22
>         status  = ring 1 active with no faults
>
>> corosync-objctl | grep rrp
> cluster.totem.rrp_mode=passive
> totem.rrp_mode=passive
>
> When I change the config to active () I don't get these messages.
>
> Any comments?
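(For anyone reproducing this setup on a cman-based CentOS 6 / RHEL 6 cluster: the second ring is normally declared by giving each node an altname on the second network, plus an rrp_mode setting, roughly as sketched below. The node and interface names are placeholders, and the "Redundant Ring Protocol" section of the Red Hat Cluster Administration guide is the authoritative reference for the exact syntax.)

  <clusternode name="node01" nodeid="1">
    <altname name="node01-ring1"/>
    <!-- fence configuration omitted -->
  </clusternode>
  <clusternode name="node02" nodeid="2">
    <altname name="node02-ring1"/>
    <!-- fence configuration omitted -->
  </clusternode>
  <totem rrp_mode="passive"/>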
>
> Thanks and best regards,
> Ralf
>

From rhurst at bidmc.harvard.edu Tue Apr 30 12:44:53 2013
From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu)
Date: Tue, 30 Apr 2013 08:44:53 -0400
Subject: [Linux-cluster] GFS2 on RHEV managed guests
Message-ID: <50168EC934B8D64AA8D8DD37F840F3DE7E7F43B1CF@EVS2CCR.its.caregroup.org>

A couple of years ago, I staged a test environment using RHEL 5u1 with a few KVM guests that were provisioned with a direct LUN for use with Cluster Suite and resilient storage (GFS2). For whatever reason (on reflection, I may have overlooked the hypervisor's virtio default setting for cache), the GFS2 filesystem would eventually "break" and leave it fencing guests. We had always run our clusters (dev-test-prod) on physical hosts before and since, so cluster configuration and operational understanding is not any issue.

We now have RHEV-M in place to begin a whole new provisioning process on newer RHEL 6 hypervisors with RHEL 6 guests. My question (or fear) before embarking into this space is how resilient is resilient storage (GFS2) on KVM guests now? Are there any pitfalls to avoid out there?

Robert Hurst, Caché Systems Manager
Beth Israel Deaconess Medical Center
1135 Tremont Street, REN-7
Boston, Massachusetts 02120-2140
617-754-8754 · Fax: 617-754-8730 · Cell: 401-787-3154
Any technology distinguishable from magic is insufficiently advanced.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From swhiteho at redhat.com Tue Apr 30 13:30:21 2013
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Tue, 30 Apr 2013 14:30:21 +0100
Subject: [Linux-cluster] GFS2 on RHEV managed guests
In-Reply-To: <50168EC934B8D64AA8D8DD37F840F3DE7E7F43B1CF@EVS2CCR.its.caregroup.org>
References: <50168EC934B8D64AA8D8DD37F840F3DE7E7F43B1CF@EVS2CCR.its.caregroup.org>
Message-ID: <1367328621.2748.15.camel@menhir>

Hi,

On Tue, 2013-04-30 at 08:44 -0400, rhurst at bidmc.harvard.edu wrote:
> A couple of years ago, I staged a test environment using RHEL 5u1 with a few KVM guests that were provisioned with a direct LUN for use with Cluster Suite and resilient storage (GFS2). For whatever reason (on reflection, I may have overlooked the hypervisor's virtio default setting for cache), the GFS2 filesystem would eventually "break" and leave it fencing guests. We had always run our clusters (dev-test-prod) on physical hosts before and since, so cluster configuration and operational understanding is not any issue.
>
> We now have RHEV-M in place to begin a whole new provisioning process on newer RHEL 6 hypervisors with RHEL 6 guests. My question (or fear) before embarking into this space is how resilient is resilient storage (GFS2) on KVM guests now? Are there any pitfalls to avoid out there?
>
The issue is less likely to be related to KVM and more likely to be related to the workload that you intend to run within the guests. Provided you are able to use a supported fencing method, then there should be no real difference to running on bare metal in terms of what you can expect from GFS2. The requirements are still the same in that you'll need a shared block device that can be accessed symmetrically from all nodes, virtual or otherwise,

Steve.

> Robert Hurst, Caché Systems Manager
> Beth Israel Deaconess Medical Center
> 1135 Tremont Street, REN-7
> Boston, Massachusetts 02120-2140
> 617-754-8754 · Fax: 617-754-8730 · Cell: 401-787-3154
> Any technology distinguishable from magic is insufficiently advanced.
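(On the virtio cache point Robert raises: when a LUN is shared between KVM guests for GFS2, the usual guidance is raw format, cache='none' and the shareable flag on the guest disk, so the hypervisor does not buffer writes that the other nodes need to see. A rough libvirt sketch, with a made-up device path:)

  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source dev='/dev/mapper/shared_gfs2_lun'/>
    <target dev='vdb' bus='virtio'/>
    <shareable/>
  </disk>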
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

From Ralf.Aumueller at informatik.uni-stuttgart.de Tue Apr 30 13:32:40 2013
From: Ralf.Aumueller at informatik.uni-stuttgart.de (Ralf Aumüller)
Date: Tue, 30 Apr 2013 15:32:40 +0200
Subject: [Linux-cluster] Problem with second ring config (SOLVED)
In-Reply-To: <517E4DD8.7070400@redhat.com>
References: <517A87E8.30703@informatik.uni-stuttgart.de> <517E4DD8.7070400@redhat.com>
Message-ID: <517FC7F8.1000104@informatik.uni-stuttgart.de>

Hello,

it was my fault. I had iptables running with the wrong configuration. I allowed input traffic (output traffic is not filtered) like:

iptables -I INPUT -s XXX.XXX.XXX.0/24 -j ACCEPT

After adding a LOG target to iptables I found dropped IGMP packets with SRC 0.0.0.0 and DST 224.0.0.1. After adding the following rule (found in the "Cluster Administration" doc) the error messages are gone:

iptables -I INPUT -p igmp -j ACCEPT

Thanks again.

Ralf
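(Related note: alongside the IGMP rule above, the same "Cluster Administration" guide also lists the corosync/cman UDP ports 5404 and 5405 as ports that must be open between cluster nodes. A rough equivalent ruleset, with the interconnect subnet as a placeholder to adjust for your own network:)

  iptables -I INPUT -p igmp -j ACCEPT
  iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
  iptables -I INPUT -p udp -s 192.168.216.0/24 --dport 5404:5405 -j ACCEPT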