[Linux-cluster] RHEL 5.5 Crash in gfs2

Scooter Morris scooter at cgl.ucsf.edu
Tue Apr 6 15:36:40 UTC 2010


No, we had already seen this crash on the older kernel and were hoping
that the new kernel (which has a number of gfs2 fixes in it) would
correct it.  Apparently, it didn't.  The bugzilla entry is #579801.
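
In case it helps anyone compare this trace with the one in bug 437803,
here is a minimal Python sketch that strips the timestamps and ^M debris
from a console capture and lists the frames.  The names parse_frames,
FRAME and STAMP are made up for illustration, and it assumes each log
record sits on a single line, as it does in raw console output.

```python
import re

# One frame looks like:
#   [<ffffffff887dd88c>] :gfs2:gfs2_remove_from_journal+0x11f/0x131
# The ":module:" part is absent for symbols built into the kernel.
FRAME = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(?::(\w+):)?(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
)
# Timestamp prefix added by the console logger, e.g. "[2010-04-04 10:28:50]"
STAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]")


def parse_frames(text):
    """Return (module, function, offset) tuples for every frame in text."""
    frames = []
    for line in text.splitlines():
        # Drop a literal "^M" (as pasted into mail) and any real carriage return.
        line = line.replace("^M", "").rstrip("\r")
        line = STAMP.sub("", line)
        for _addr, module, func, off, _size in FRAME.findall(line):
            frames.append((module or "kernel", func, int(off, 16)))
    return frames


if __name__ == "__main__":
    import sys
    for module, func, off in parse_frames(sys.stdin.read()):
        print(f"{module:8s} {func}+{off:#x}")
```

Feeding it the console log on stdin prints one frame per line, which is
easier to diff against another trace than the raw oops.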

-- scooter

On 04/06/2010 08:13 AM, Paras pradhan wrote:
> Curious to know.
>
>
> Is this an issue with this particular kernel version (i.e. kernel
> 2.6.18-194.el5), or is it part of the new gfs packages too?  Does
> rebooting into the older kernel work in this case?
>
> Paras.
>
>
> On Tue, Apr 6, 2010 at 4:44 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
>
>     Hi,
>
>     Can you open a bugzilla about this? Thanks,
>
>     Steve.
>
>     On Sun, 2010-04-04 at 13:58 -0700, Scooter Morris wrote:
>     > Hi all,
>     >      We recently upgraded to 5.5 (kernel 2.6.18-194.el5) to get some
>     > of the gfs2 fixes on a 3 node cluster, but crashed two days later
>     > with the following stack trace:
>     >
>     > [2010-04-04 10:28:48]Unable to handle kernel NULL pointer dereference at 0000000000000078 RIP:
>     > [2010-04-04 10:28:48] [<ffffffff887dc3d3>] :gfs2:revoke_lo_add+0x1a/0x32
>     > [2010-04-04 10:28:48]PGD 7d4297067 PUD 13a24c067 PMD 0
>     > [2010-04-04 10:28:48]Oops: 0002 [1] SMP
>     > [2010-04-04 10:28:48]last sysfs file: /devices/pci0000:00/0000:00:01.0/0000:03:00.0/0000:04:01.0/0000:07:00.0/0000:08:00.0/irq
>     > [2010-04-04 10:28:48]CPU 8
>     > [2010-04-04 10:28:48]Modules linked in: ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp l2cap bluetooth lock_dlm gfs2 dlm configfs lockd sunrpc ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink xt_tcpudp ipt_REJECT iptable_filter ip_tables arpt_mangle arptable_filter arp_tables x_tables ib_iser libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi ib_srp rds ib_sdp ib_ipoib ipoib_helper ipv6 xfrm_nalgo crypto_api rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ib_sa ib_mad ib_core dm_round_robin dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg ide_cd bnx2 cdrom hpilo serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc ata_piix libata shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
>     > [2010-04-04 10:28:49]Pid: 795, comm: kswapd0 Not tainted 2.6.18-194.el5 #1
>     > [2010-04-04 10:28:49]RIP: 0010:[<ffffffff887dc3d3>] [<ffffffff887dc3d3>] :gfs2:revoke_lo_add+0x1a/0x32
>     > [2010-04-04 10:28:49]RSP: 0018:ffff81082efcdae8  EFLAGS: 00010282
>     > [2010-04-04 10:28:49]RAX: 0000000000000000 RBX: ffff810256e037f0 RCX: ffff8100207fd180
>     > [2010-04-04 10:28:49]RDX: ffff81051abdf630 RSI: ffff810819032720 RDI: ffff810819032000
>     > [2010-04-04 10:28:49]RBP: ffff81051abdf610 R08: ffff81011cb31b06 R09: ffff81082efcdb20
>     > [2010-04-04 10:28:49]R10: ffff8101135d8330 R11: ffffffff887dc3b9 R12: ffff810819032000
>     > [2010-04-04 10:28:49]R13: 0000000000000000 R14: ffff810256e037f0 R15: ffff810819032000
>     > [2010-04-04 10:28:50]FS:  0000000000000000(0000) GS:ffff81011cb319c0(0000) knlGS:0000000000000000
>     > [2010-04-04 10:28:50]CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>     > [2010-04-04 10:28:50]CR2: 0000000000000078 CR3: 000000024ed62000 CR4: 00000000000006e0
>     > [2010-04-04 10:28:50]Process kswapd0 (pid: 795, threadinfo ffff81082efcc000, task ffff81082f5d17a0)
>     > [2010-04-04 10:28:50]Stack:  ffffffff887dd88c 000000002efcde10 ffff810256e037f0 ffff81011c7fadd8
>     > [2010-04-04 10:28:50] 0000000000000000 0000000000000000 ffffffff887deaf6 000000000000000e
>     > [2010-04-04 10:28:50] ffff81011c7fadd8 00000000000000b0 ffff81082efcdcf0 ffff810819032000
>     > [2010-04-04 10:28:50]Call Trace:
>     > [2010-04-04 10:28:50] [<ffffffff887dd88c>] :gfs2:gfs2_remove_from_journal+0x11f/0x131
>     > [2010-04-04 10:28:50] [<ffffffff887deaf6>] :gfs2:gfs2_invalidatepage+0xea/0x151
>     > [2010-04-04 10:28:50] [<ffffffff887de739>] :gfs2:gfs2_writepage_common+0x95/0xb1
>     > [2010-04-04 10:28:50] [<ffffffff887ded63>] :gfs2:gfs2_jdata_writepage+0x56/0x98
>     > [2010-04-04 10:28:50] [<ffffffff800cbf1b>] shrink_inactive_list+0x3fd/0x8d8
>     > [2010-04-04 10:28:50] [<ffffffff800484ad>] __pagevec_release+0x19/0x22
>     > [2010-04-04 10:28:51] [<ffffffff800cb9fd>] shrink_active_list+0x4b4/0x4c4
>     > [2010-04-04 10:28:51] [<ffffffff8001314a>] shrink_zone+0x127/0x18d
>     > [2010-04-04 10:28:51] [<ffffffff800581a7>] kswapd+0x323/0x46c
>     > [2010-04-04 10:28:51] [<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
>     > [2010-04-04 10:28:51] [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
>     > [2010-04-04 10:28:51] [<ffffffff80057e84>] kswapd+0x0/0x46c
>     > [2010-04-04 10:28:51] [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
>     > [2010-04-04 10:28:51] [<ffffffff80032bdc>] kthread+0xfe/0x132
>     > [2010-04-04 10:28:51] [<ffffffff8009e81a>] request_module+0x0/0x14d
>     > [2010-04-04 10:28:51] [<ffffffff8005efb1>] child_rip+0xa/0x11
>     > [2010-04-04 10:28:51] [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
>     > [2010-04-04 10:28:51] [<ffffffff80032ade>] kthread+0x0/0x132
>     > [2010-04-04 10:28:51] [<ffffffff8005efa7>] child_rip+0x0/0x11
>     >
>     > This looks exactly like bug 437803, but that was closed early last
>     > year.  Does anyone have any ideas what might be going on?  I
>     > double-checked, and we definitely do not have the old kmod-gfs2
>     > installed.
>     >
>     > -- scooter
>     >
>     > --
>     > Linux-cluster mailing list
>     > Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
>     > https://www.redhat.com/mailman/listinfo/linux-cluster
