[Linux-cluster] gfs2 blocking tasks

Bart Verwilst lists at verwilst.be
Tue Aug 21 14:29:40 UTC 2012


Hi Steven,

I installed kernel 3.3 from 
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/ on 1 of the 
hosts.

Next, I mounted my gfs2 volume on one node, created a new empty file in 
it, unmounted the volume, and then mounted it on all 3 nodes.

Then I simply rm'ed the file on each node (starting with the machines 
running the stock 3.2 kernel).

On the two 3.2 machines, rm triggered an oops and the file wasn't gone. 
I then rm'ed it on the 3.3 machine, which went perfectly: the gfs2 
mount was still reachable afterwards, no oops. Both other nodes hung 
when trying to ls into the mount.
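
For reference, the test boils down to roughly these commands (the 
device path and mount point below are placeholders, not my actual 
paths):

  # on one node only
  mount -t gfs2 /dev/mapper/iscsi_cluster_gfs2 /mnt/gfs2
  touch /mnt/gfs2/testfile
  umount /mnt/gfs2

  # then mount the volume on all 3 nodes and, on each node in turn:
  rm /mnt/gfs2/testfile   # oopses on the 3.2 nodes, went fine on 3.3
  ls /mnt/gfs2            # hangs on the 3.2 nodes afterwards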

I tried booting the kernel from 
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.2.28-precise/ to try 
to narrow down the scope, but that machine doesn't seem to be coming up 
anymore, so I'll have remote hands look at it asap.

But at least this kind of narrows down the scope?
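
For what it's worth, assuming the commit hash from the link you posted 
earlier, and a kernel git clone with the -stable tags fetched, 
something like this should show which mainline releases already 
contain that fix, and whether anything gfs2-related went into the 
3.2.y series:

  # mainline tags that contain the fix:
  git tag --contains 718b97bd6b03445be53098e3c8f896aeebc304aa
  # gfs2 changes in the 3.2 stable series (backports get new hashes):
  git log --oneline v3.2..v3.2.28 -- fs/gfs2/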

Kind regards,

Bart

Steven Whitehouse wrote on 21.08.2012 14:37:
> Hi,
>
> On Tue, 2012-08-21 at 14:23 +0200, Bart Verwilst wrote:
>> Hi Steven,
>>
>> The kernel of Ubuntu 12.04 LTS is based on 3.2.0, while the patch you
>> mentioned seems to be for a newer(?) version.
>> What should I do, offer an altered version of this patch for inclusion
>> into Ubuntu's 3.2.0 kernel, or is it a little less straightforward than
>> this? :)
>>
>> Kind regards,
>>
>> Bart
>>
> Well, since you've not got that patch in your existing kernel, there
> are really two issues here: firstly, to verify that this patch really
> is a fix for the problem, and secondly, to figure out what needs to be
> done wrt the Ubuntu distro kernels. One solution may be to post it for
> the upstream -stable kernel, as I think most distros will then pick
> this up.
>
> Are you able to build a new Ubuntu kernel with that patch in it? That
> would be one way to test it. Another way, which doesn't require
> building kernels, is this:
>
> The problem occurs when the resource groups are not up to date, and
> various actions taken on the filesystem will ensure that they are.
> Mounting a filesystem and immediately running an unlink of a file
> which is known to exist on the filesystem (before performing any other
> action) should trigger that bug if it is present.
>
> It may not be that particular bug, but it looks the most likely of any
> recent patch to that bit of code. Are you able to try a more recent
> Ubuntu kernel? If you could try one based on a more recent upstream
> release which has that patch in it, then that might also help narrow
> down the problem,
>
> Steve.
>
>> Steven Whitehouse wrote on 21.08.2012 13:27:
>> > Hi,
>> >
>> > On Tue, 2012-08-21 at 13:03 +0200, Bart Verwilst wrote:
>> >> Hi Steven,
>> >>
>> >> There is no drbd in the mix (which is why I changed the title of the
>> >> bug report now). I'm only using plain iSCSI. The original poster had
>> >> it with drbd :)
>> >>
>> >> Kind regards,
>> >>
>> >> Bart
>> >>
>> > Ah, I see, sorry.. I misunderstood the report. I wonder whether your
>> > distro kernel has this patch:
>> >
>> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=718b97bd6b03445be53098e3c8f896aeebc304aa
>> >
>> > That's the most likely thing that I can see that has been fixed
>> > recently,
>> >
>> > Steve.
>> >
>> >
>> >> Steven Whitehouse wrote on 21.08.2012 12:59:
>> >> > Hi,
>> >> >
>> >> > On Tue, 2012-08-21 at 12:39 +0200, Bart Verwilst wrote:
>> >> >> Hi Steven
>> >> >>
>> >> >> Shared storage is iSCSI,
>> >> >>
>> >> >> <totem rrp_mode="none" secauth="off" token="20000"/>
>> >> >> <quorumd tko="4" interval="2"
>> >> >> device="/dev/mapper/iscsi_cluster_quorum"></quorumd>
>> >> >>
>> >> >> Actually I know why this is happening now, and can reproduce it
>> >> >> 100% of the time. I've added my findings as a comment to this bug
>> >> >> from somebody having the same problem:
>> >> >>
>> >> >> https://bugs.launchpad.net/ubuntu/+source/gfs2-utils/+bug/1020207
>> >> >>
>> >> >> Create a file on one node's gfs2 mount, rm it on the other ->
>> >> >> hanging mountpoint + kernel oops.
>> >> >>
>> >> >> Happy that I'm finally getting somewhere with this :P
>> >> >>
>> >> >> Anything I can do to help, Steven?
>> >> >>
>> >> >> Kind regards,
>> >> >>
>> >> >> Bart Verwilst
>> >> >>
>> >> > Can you reproduce this without drbd in the mix? That should remove
>> >> > one complication and make this easier to track down.
>> >> >
>> >> > I'll take a look and see what that dereference is likely to be in
>> >> > the meantime,
>> >> >
>> >> > Steve.
>> >> >
>> >> >> Steven Whitehouse wrote on 21.08.2012 12:17:
>> >> >> > Hi,
>> >> >> >
>> >> >> > On Tue, 2012-08-21 at 12:08 +0200, Bart Verwilst wrote:
>> >> >> >> As yet another reply to my own post, I found this on the node
>> >> >> >> where it hangs (this time it's vm01, and /var/lib/libvirt/sanlock
>> >> >> >> that's hanging):
>> >> >> >>
>> >> >> >>
>> >> >> >> [ 1219.640653] GFS2: fsid=: Trying to join cluster "lock_dlm", "kvm:sanlock"
>> >> >> >> [ 1219.660035] GFS2: fsid=kvm:sanlock.0: Joined cluster. Now mounting FS...
>> >> >> >> [ 1219.720108] GFS2: fsid=kvm:sanlock.0: jid=0, already locked for use
>> >> >> >> [ 1219.720113] GFS2: fsid=kvm:sanlock.0: jid=0: Looking at journal...
>> >> >> >> [ 1219.772606] GFS2: fsid=kvm:sanlock.0: jid=0: Acquiring the transaction lock...
>> >> >> >> [ 1219.772659] GFS2: fsid=kvm:sanlock.0: jid=0: Replaying journal...
>> >> >> >> [ 1219.772675] GFS2: fsid=kvm:sanlock.0: jid=0: Replayed 0 of 0 blocks
>> >> >> >> [ 1219.772679] GFS2: fsid=kvm:sanlock.0: jid=0: Found 1 revoke tags
>> >> >> >> [ 1219.782611] init: libvirt-bin main process ended, respawning
>> >> >> >> [ 1219.784161] GFS2: fsid=kvm:sanlock.0: jid=0: Journal replayed in 1s
>> >> >> >> [ 1219.784268] GFS2: fsid=kvm:sanlock.0: jid=0: Done
>> >> >> >> [ 1219.784329] GFS2: fsid=kvm:sanlock.0: jid=1: Trying to acquire journal lock...
>> >> >> >> [ 1219.788349] GFS2: fsid=kvm:sanlock.0: jid=1: Looking at journal...
>> >> >> >> [ 1219.886047] GFS2: fsid=kvm:sanlock.0: jid=1: Done
>> >> >> >> [ 1219.886110] GFS2: fsid=kvm:sanlock.0: jid=2: Trying to acquire journal lock...
>> >> >> >> [ 1219.891121] GFS2: fsid=kvm:sanlock.0: jid=2: Looking at journal...
>> >> >> >> [ 1219.943994] init: ttyS1 main process (20318) terminated with status 1
>> >> >> >> [ 1219.944037] init: ttyS1 main process ended, respawning
>> >> >> >> [ 1219.967054] init: ttyS0 main process (20320) terminated with status 1
>> >> >> >> [ 1219.967100] init: ttyS0 main process ended, respawning
>> >> >> >> [ 1219.972037] ttyS0: LSR safety check engaged!
>> >> >> >> [ 1220.226027] GFS2: fsid=kvm:sanlock.0: jid=2: Acquiring the transaction lock...
>> >> >> >> [ 1220.226160] GFS2: fsid=kvm:sanlock.0: jid=2: Replaying journal...
>> >> >> >> [ 1220.311801] GFS2: fsid=kvm:sanlock.0: jid=2: Replayed 3 of 106 blocks
>> >> >> >> [ 1220.311805] GFS2: fsid=kvm:sanlock.0: jid=2: Found 4487 revoke tags
>> >> >> >> [ 1220.322148] GFS2: fsid=kvm:sanlock.0: jid=2: Journal replayed in 1s
>> >> >> >> [ 1220.322253] GFS2: fsid=kvm:sanlock.0: jid=2: Done
>> >> >> >
>> >> >> > So that looks like it successfully recovered the journals for
>> >> >> > nodes one and two. How many nodes are in the cluster? What is the
>> >> >> > fencing/quorum setup being used?
>> >> >> >
>> >> >> >> [ 1221.457120] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
>> >> >> >
>> >> >> > So this is a dereference of something which is 24 bytes into
>> >> >> > some structure or other. Certainly something which should not
>> >> >> > happen, so we need to take a look at that.
>> >> >> >
>> >> >> > Was this a one off, or something that you can reproduce?
>> >> >> >
>> >> >> > Steve.
>> >> >> >
>> >> >> >
>> >> >> >> [ 1221.457508] IP: [<ffffffffa04f800a>] gfs2_unlink+0x8a/0x220 [gfs2]
>> >> >> >> [ 1221.457958] PGD 2170f2067 PUD 20c0ff067 PMD 0
>> >> >> >> [ 1221.458197] Oops: 0000 [#1] SMP
>> >> >> >> [ 1221.458374] CPU 0
>> >> >> >> [ 1221.458460] Modules linked in: gfs2 dlm ipmi_si mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables kvm_intel kvm ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs dm_round_robin ib_iser bridge rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bonding nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp joydev dm_multipath dcdbas mac_hid i7core_edac edac_core acpi_power_meter lp parport usbhid hid bnx2 mpt2sas scsi_transport_sas e1000e raid_class scsi_dh_rdac usb_storage [last unloaded: ipmi_si]
>> >> >> >> [ 1221.463058]
>> >> >> >> [ 1221.463146] Pid: 20611, comm: libvirtd Not tainted 3.2.0-26-generic #41-Ubuntu Dell Inc. PowerEdge R310/05XKKK
>> >> >> >> [ 1221.463626] RIP: 0010:[<ffffffffa04f800a>] [<ffffffffa04f800a>] gfs2_unlink+0x8a/0x220 [gfs2]
>> >> >> >> [ 1221.464100] RSP: 0018:ffff88020cfe1d28  EFLAGS: 00010296
>> >> >> >> [ 1221.464344] RAX: 0000000000000000 RBX: ffff88021ef58000 RCX: ffff88020cfe1d40
>> >> >> >> [ 1221.464662] RDX: 0000000000000000 RSI: 00000000000183f3 RDI: ffff88022efa2440
>> >> >> >> [ 1221.464979] RBP: ffff88020cfe1e38 R08: 4000000000000000 R09: 0000000000000000
>> >> >> >> [ 1221.465297] R10: fdd50265775f720a R11: 0000000000000003 R12: ffff88021ef50000
>> >> >> >> [ 1221.465615] R13: ffff88020cfe1d80 R14: ffff8802178980c0 R15: ffff88022efa2000
>> >> >> >> [ 1221.466115] FS:  00007f4d2c0e7700(0000) GS:ffff880237200000(0000) knlGS:0000000000000000
>> >> >> >> [ 1221.466487] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> >> >> [ 1221.466748] CR2: 0000000000000018 CR3: 00000002175e4000 CR4: 00000000000006f0
>> >> >> >> [ 1221.467066] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> >> >> >> [ 1221.467384] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >> >> >> [ 1221.467702] Process libvirtd (pid: 20611, threadinfo ffff88020cfe0000, task ffff8802241cdbc0)
>> >> >> >> [ 1221.468091] Stack:
>> >> >> >> [ 1221.468199]  0000000000000003 ffff88022f108048 ffff88020cfe1d58 ffff88020cfe1d40
>> >> >> >> [ 1221.468581]  ffff88020cfe1d40 ffff88022f108000 ffff88022354aa00 0000000000000001
>> >> >> >> [ 1221.468963]  0000000000000000 0000000000000000 ffffffffa04f7fda ffff88020cfe1d80
>> >> >> >> [ 1221.469346] Call Trace:
>> >> >> >> [ 1221.469486]  [<ffffffffa04f7fda>] ? gfs2_unlink+0x5a/0x220 [gfs2]
>> >> >> >> [ 1221.469955]  [<ffffffffa04f7ff4>] ? gfs2_unlink+0x74/0x220 [gfs2]
>> >> >> >> [ 1221.470236]  [<ffffffff8129cb2c>] ? security_inode_permission+0x1c/0x30
>> >> >> >> [ 1221.470536]  [<ffffffff81184e70>] vfs_unlink.part.26+0x80/0xf0
>> >> >> >> [ 1221.470802]  [<ffffffff81184f1c>] vfs_unlink+0x3c/0x60
>> >> >> >> [ 1221.471040]  [<ffffffff8118758a>] do_unlinkat+0x1aa/0x1d0
>> >> >> >> [ 1221.471290]  [<ffffffff81177fc0>] ? vfs_write+0x110/0x180
>> >> >> >> [ 1221.471538]  [<ffffffff811782a7>] ? sys_write+0x67/0x90
>> >> >> >> [ 1221.471780]  [<ffffffff81188106>] sys_unlink+0x16/0x20
>> >> >> >> [ 1221.472019]  [<ffffffff81661fc2>] system_call_fastpath+0x16/0x1b
>> >> >> >> [ 1221.472290] Code: 00 00 49 83 c5 40 31 d2 4c 89 e9 be 01 00 00 00 e8 fc 1e ff ff 48 8b b3 28 02 00 00 4c 89 ff e8 ad 7e 00 00 48 8d 8d 08 ff ff ff <48> 8b 78 18 31 d2 be 01 00 00 00 48 83 e9 80 e8 d2 1e ff ff 48
>> >> >> >> [ 1221.473936] RIP  [<ffffffffa04f800a>] gfs2_unlink+0x8a/0x220 [gfs2]
>> >> >> >> [ 1221.474240]  RSP <ffff88020cfe1d28>
>> >> >> >> [ 1221.474408] CR2: 0000000000000018
>> >> >> >> [ 1221.474959] ---[ end trace f7df780fd45600a8 ]---
>> >> >> >>
>> >> >> >>
>> >> >> >> Bart Verwilst wrote on 20.08.2012 09:50:
>> >> >> >> > Nothing out of the ordinary, should have mentioned that!
>> >> >> >> >
>> >> >> >> > <snip>
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: read 20 bytes from fd 17
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: client command is 7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: About to process command
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command to process is 7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: get_all_members: allocated new buffer (retsize=1024)
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: get_all_members: retlen = 1760
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command return code is 4
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: Returning command data. length = 1760
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: sending reply 40000007 to fd 17
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: read 20 bytes from fd 17
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: client command is 800000b7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: About to process command
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command to process is 800000b7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command return code is 0
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: Returning command data. length = 0
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: sending reply c00000b7 to fd 17
>> >> >> >> > Aug 19 00:08:01 vm02-test corosync[7394]:   [CMAN  ] daemon: read 20 bytes from fd 17
>> >> >> >> > Aug 19 00:08:01 vm02-test corosync[7394]:   [CMAN  ] daemon: client command is 7
>> >> >> >> > Aug 19 00:08:01 vm02-test corosync[7394]:   [CMAN  ] daemon: About to process command
>> >> >> >> > </snip>
>> >> >> >> >
>> >> >> >> > Digimer wrote on 20.08.2012 00:01:
>> >> >> >> >> On 08/19/2012 05:34 PM, Bart Verwilst wrote:
>> >> >> >> >>> Aug 19 00:10:01 vm02-test kernel: [282120.240067] INFO: task kworker/1:0:3117 blocked for more than 120 seconds.
>> >> >> >> >>> Aug 19 00:10:01 vm02-test kernel: [282120.240182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> >> >> message.
>> >> >> >> >>
>> >> >> >> >> Nothing around Aug 19 00:08:00 ?
>> >> >> >>
>> >> >> >> --
>> >> >> >> Linux-cluster mailing list
>> >> >> >> Linux-cluster at redhat.com
>> >> >> >> https://www.redhat.com/mailman/listinfo/linux-cluster
>> >> >>
>> >>
>>



