[Linux-cluster] gfs2 blocking tasks

Bart Verwilst lists at verwilst.be
Tue Aug 21 20:35:40 UTC 2012


Hi Steven,

I've tested with kernel 3.3
( http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/ ), and the
bug isn't present there. I also tried 3.2.28 ( also from the kernel-ppa ),
and the bug does happen there. In the end I was indeed able to trace it
back to 3.3-rc6, where you pushed a couple of GFS2 patches upstream.
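
For reference, this is roughly how I installed each mainline build for
testing (the actual .deb names differ per version, so treat them as
placeholders):

  wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/linux-image-<version>-generic_<version>_amd64.deb
  sudo dpkg -i linux-image-<version>-generic_<version>_amd64.deb
  sudo reboot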

inode.c, for example, is quite different between 3.2.28 and 3.3-rc1, and
I don't dare hack together a diff that incorporates your change myself,
fearing the result would probably be even less stable than what I have
already. :)

Would it be too much to ask you to backport your change to 3.2.x? I will
then test it, and try to push it upstream to -stable and/or the Ubuntu
LTS kernel.
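
In case it helps, I assume the backport would roughly start like this
against a 3.2.y stable tree (the commit id is taken from the link you
posted earlier); it's resolving the conflicts in fs/gfs2/inode.c that I
don't trust myself with:

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  cd linux-stable
  git checkout -b gfs2-unlink-fix v3.2.28
  git cherry-pick 718b97bd6b03445be53098e3c8f896aeebc304aa   # expect conflicts in fs/gfs2/inode.c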

Thanks a lot in advance!

Kind regards,

Bart

Steven Whitehouse wrote on 21.08.2012 14:37:
> Hi,
>
> On Tue, 2012-08-21 at 14:23 +0200, Bart Verwilst wrote:
>> Hi Steven,
>>
>> The kernel of Ubuntu 12.04 LTS is based on 3.2.0, while the patch you
>> mentioned seems to be for a newer(?) version.
>> What should I do: offer an altered version of this patch for inclusion
>> into Ubuntu's 3.2.0 kernel, or is it a little less straightforward than
>> this? :)
>>
>> Kind regards,
>>
>> Bart
>>
> Well, since you've not got that patch in your existing kernel, there
> are really two issues here: first, to verify that this patch really is
> a fix for the problem, and then to figure out what needs to be done
> with regard to Ubuntu distro kernels. One solution may be to post it
> for the upstream -stable kernel, as I think most distros will then
> pick this up.
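>
> Once verified, posting for -stable usually just means sending the
> upstream commit to the stable list, roughly along these lines, with a
> note saying which stable series needs it:
>
>   git format-patch -1 718b97bd6b03445be53098e3c8f896aeebc304aa
>   git send-email --to=stable@vger.kernel.org --cc=cluster-devel@redhat.com 0001-*.patch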
>
> Are you able to build a new Ubuntu kernel with that patch in it? That
> would be one way to test it. Another way, which doesn't require
> building kernels, is this:
>
> The problem occurs when the resource groups are not uptodate, and
> various actions taken on the filesystem will ensure that they are
> uptodate. Mounting a filesystem and immediately running an unlink of a
> file which is known to exist on the filesystem (before performing any
> other action) should trigger that bug if it is present.
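>
> Something along these lines should do it (the device and file names
> below are just placeholders for whatever exists on your setup):
>
>   mount -t gfs2 /dev/mapper/iscsi_cluster_vol /mnt/gfs2
>   rm /mnt/gfs2/some-existing-file    # unlink first, before touching anything else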
>
> It may not be that particular bug, but that looks the most likely of
> any recent patch to that bit of code. Are you able to try a more
> recent Ubuntu kernel? If you could try one based on a more recent
> upstream which has that patch in it, then that might also help narrow
> down the problem,
>
> Steve.
>
>> Steven Whitehouse wrote on 21.08.2012 13:27:
>> > Hi,
>> >
>> > On Tue, 2012-08-21 at 13:03 +0200, Bart Verwilst wrote:
>> >> Hi Steven,
>> >>
>> >> There is no drbd in the mix (which is why I changed the title of
>> >> the bug report now). I'm only using plain iSCSI. The original
>> >> poster had it with drbd :)
>> >>
>> >> Kind regards,
>> >>
>> >> Bart
>> >>
>> > Ah, I see, sorry. I misunderstood the report. I wonder whether your
>> > distro kernel has this patch:
>> >
>> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=718b97bd6b03445be53098e3c8f896aeebc304aa
>> >
>> > That's the most likely thing that I can see that has been fixed
>> > recently,
>> >
>> > Steve.
>> >
>> >
>> >> Steven Whitehouse wrote on 21.08.2012 12:59:
>> >> > Hi,
>> >> >
>> >> > On Tue, 2012-08-21 at 12:39 +0200, Bart Verwilst wrote:
>> >> >> Hi Steven
>> >> >>
>> >> >> Shared storage is iSCSI,
>> >> >>
>> >> >> <totem rrp_mode="none" secauth="off" token="20000"/>
>> >> >> <quorumd tko="4" interval="2"
>> >> >> device="/dev/mapper/iscsi_cluster_quorum"></quorumd>
>> >> >>
>> >> >> Actually I know why this is happening now, and can reproduce it
>> >> >> 100% of the time. I've added my findings as a comment to this bug
>> >> >> from somebody having the same problem:
>> >> >>
>> >> >> https://bugs.launchpad.net/ubuntu/+source/gfs2-utils/+bug/1020207
>> >> >>
>> >> >> Create a file on one node's gfs2 mount, rm it on the other ->
>> >> >> hanging mountpoint + kernel oops.
>> >> >>
>> >> >> Happy that I'm finally getting somewhere with this :P
>> >> >>
>> >> >> Anything I can do to help, Steven?
>> >> >>
>> >> >> Kind regards,
>> >> >>
>> >> >> Bart Verwilst
>> >> >>
>> >> > Can you reproduce this without drbd in the mix? That should remove
>> >> > one complication and make this easier to track down.
>> >> >
>> >> > I'll take a look and see what that dereference is likely to be in
>> >> > the meantime,
>> >> >
>> >> > Steve.
>> >> >
>> >> >> Steven Whitehouse wrote on 21.08.2012 12:17:
>> >> >> > Hi,
>> >> >> >
>> >> >> > On Tue, 2012-08-21 at 12:08 +0200, Bart Verwilst wrote:
>> >> >> >> As yet another reply to my own post, I found this on the node
>> >> >> >> where it hangs (this time it's vm01, and
>> >> >> >> /var/lib/libvirt/sanlock that's hanging):
>> >> >> >>
>> >> >> >>
>> >> >> >> [ 1219.640653] GFS2: fsid=: Trying to join cluster "lock_dlm", "kvm:sanlock"
>> >> >> >> [ 1219.660035] GFS2: fsid=kvm:sanlock.0: Joined cluster. Now mounting FS...
>> >> >> >> [ 1219.720108] GFS2: fsid=kvm:sanlock.0: jid=0, already locked for use
>> >> >> >> [ 1219.720113] GFS2: fsid=kvm:sanlock.0: jid=0: Looking at journal...
>> >> >> >> [ 1219.772606] GFS2: fsid=kvm:sanlock.0: jid=0: Acquiring the transaction lock...
>> >> >> >> [ 1219.772659] GFS2: fsid=kvm:sanlock.0: jid=0: Replaying journal...
>> >> >> >> [ 1219.772675] GFS2: fsid=kvm:sanlock.0: jid=0: Replayed 0 of 0 blocks
>> >> >> >> [ 1219.772679] GFS2: fsid=kvm:sanlock.0: jid=0: Found 1 revoke tags
>> >> >> >> [ 1219.782611] init: libvirt-bin main process ended, respawning
>> >> >> >> [ 1219.784161] GFS2: fsid=kvm:sanlock.0: jid=0: Journal replayed in 1s
>> >> >> >> [ 1219.784268] GFS2: fsid=kvm:sanlock.0: jid=0: Done
>> >> >> >> [ 1219.784329] GFS2: fsid=kvm:sanlock.0: jid=1: Trying to acquire journal lock...
>> >> >> >> [ 1219.788349] GFS2: fsid=kvm:sanlock.0: jid=1: Looking at journal...
>> >> >> >> [ 1219.886047] GFS2: fsid=kvm:sanlock.0: jid=1: Done
>> >> >> >> [ 1219.886110] GFS2: fsid=kvm:sanlock.0: jid=2: Trying to acquire journal lock...
>> >> >> >> [ 1219.891121] GFS2: fsid=kvm:sanlock.0: jid=2: Looking at journal...
>> >> >> >> [ 1219.943994] init: ttyS1 main process (20318) terminated with status 1
>> >> >> >> [ 1219.944037] init: ttyS1 main process ended, respawning
>> >> >> >> [ 1219.967054] init: ttyS0 main process (20320) terminated with status 1
>> >> >> >> [ 1219.967100] init: ttyS0 main process ended, respawning
>> >> >> >> [ 1219.972037] ttyS0: LSR safety check engaged!
>> >> >> >> [ 1220.226027] GFS2: fsid=kvm:sanlock.0: jid=2: Acquiring the transaction lock...
>> >> >> >> [ 1220.226160] GFS2: fsid=kvm:sanlock.0: jid=2: Replaying journal...
>> >> >> >> [ 1220.311801] GFS2: fsid=kvm:sanlock.0: jid=2: Replayed 3 of 106 blocks
>> >> >> >> [ 1220.311805] GFS2: fsid=kvm:sanlock.0: jid=2: Found 4487 revoke tags
>> >> >> >> [ 1220.322148] GFS2: fsid=kvm:sanlock.0: jid=2: Journal replayed in 1s
>> >> >> >> [ 1220.322253] GFS2: fsid=kvm:sanlock.0: jid=2: Done
>> >> >> >
>> >> >> > So that looks like it successfully recovered the journals for
>> >> >> > nodes one and two. How many nodes are in the cluster? What is
>> >> >> > the fencing/quorum setup being used?
>> >> >> >
>> >> >> >> [ 1221.457120] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
>> >> >> >
>> >> >> > So this is a dereference of something which is 24 bytes into
>> >> >> > some structure or other. Certainly something which should not
>> >> >> > happen, so we need to take a look at that.
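>> >> >> >
>> >> >> > If you have the debug symbols for that kernel installed, something
>> >> >> > like this should show the exact line (the path below is just the
>> >> >> > usual Ubuntu dbgsym layout, so it may differ on your machine):
>> >> >> >
>> >> >> >   gdb /usr/lib/debug/lib/modules/3.2.0-26-generic/kernel/fs/gfs2/gfs2.ko
>> >> >> >   (gdb) list *(gfs2_unlink+0x8a)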
>> >> >> >
>> >> >> > Was this a one off, or something that you can reproduce?
>> >> >> >
>> >> >> > Steve.
>> >> >> >
>> >> >> >
>> >> >> >> [ 1221.457508] IP: [<ffffffffa04f800a>] gfs2_unlink+0x8a/0x220 [gfs2]
>> >> >> >> [ 1221.457958] PGD 2170f2067 PUD 20c0ff067 PMD 0
>> >> >> >> [ 1221.458197] Oops: 0000 [#1] SMP
>> >> >> >> [ 1221.458374] CPU 0
>> >> >> >> [ 1221.458460] Modules linked in: gfs2 dlm ipmi_si mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables kvm_intel kvm ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs dm_round_robin ib_iser bridge rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bonding nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp joydev dm_multipath dcdbas mac_hid i7core_edac edac_core acpi_power_meter lp parport usbhid hid bnx2 mpt2sas scsi_transport_sas e1000e raid_class scsi_dh_rdac usb_storage [last unloaded: ipmi_si]
>> >> >> >> [ 1221.463058]
>> >> >> >> [ 1221.463146] Pid: 20611, comm: libvirtd Not tainted 3.2.0-26-generic #41-Ubuntu Dell Inc. PowerEdge R310/05XKKK
>> >> >> >> [ 1221.463626] RIP: 0010:[<ffffffffa04f800a>]  [<ffffffffa04f800a>] gfs2_unlink+0x8a/0x220 [gfs2]
>> >> >> >> [ 1221.464100] RSP: 0018:ffff88020cfe1d28  EFLAGS: 00010296
>> >> >> >> [ 1221.464344] RAX: 0000000000000000 RBX: ffff88021ef58000 RCX: ffff88020cfe1d40
>> >> >> >> [ 1221.464662] RDX: 0000000000000000 RSI: 00000000000183f3 RDI: ffff88022efa2440
>> >> >> >> [ 1221.464979] RBP: ffff88020cfe1e38 R08: 4000000000000000 R09: 0000000000000000
>> >> >> >> [ 1221.465297] R10: fdd50265775f720a R11: 0000000000000003 R12: ffff88021ef50000
>> >> >> >> [ 1221.465615] R13: ffff88020cfe1d80 R14: ffff8802178980c0 R15: ffff88022efa2000
>> >> >> >> [ 1221.466115] FS:  00007f4d2c0e7700(0000) GS:ffff880237200000(0000) knlGS:0000000000000000
>> >> >> >> [ 1221.466487] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> >> >> [ 1221.466748] CR2: 0000000000000018 CR3: 00000002175e4000 CR4: 00000000000006f0
>> >> >> >> [ 1221.467066] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> >> >> >> [ 1221.467384] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >> >> >> [ 1221.467702] Process libvirtd (pid: 20611, threadinfo ffff88020cfe0000, task ffff8802241cdbc0)
>> >> >> >> [ 1221.468091] Stack:
>> >> >> >> [ 1221.468199]  0000000000000003 ffff88022f108048 ffff88020cfe1d58 ffff88020cfe1d40
>> >> >> >> [ 1221.468581]  ffff88020cfe1d40 ffff88022f108000 ffff88022354aa00 0000000000000001
>> >> >> >> [ 1221.468963]  0000000000000000 0000000000000000 ffffffffa04f7fda ffff88020cfe1d80
>> >> >> >> [ 1221.469346] Call Trace:
>> >> >> >> [ 1221.469486]  [<ffffffffa04f7fda>] ? gfs2_unlink+0x5a/0x220 [gfs2]
>> >> >> >> [ 1221.469955]  [<ffffffffa04f7ff4>] ? gfs2_unlink+0x74/0x220 [gfs2]
>> >> >> >> [ 1221.470236]  [<ffffffff8129cb2c>] ? security_inode_permission+0x1c/0x30
>> >> >> >> [ 1221.470536]  [<ffffffff81184e70>] vfs_unlink.part.26+0x80/0xf0
>> >> >> >> [ 1221.470802]  [<ffffffff81184f1c>] vfs_unlink+0x3c/0x60
>> >> >> >> [ 1221.471040]  [<ffffffff8118758a>] do_unlinkat+0x1aa/0x1d0
>> >> >> >> [ 1221.471290]  [<ffffffff81177fc0>] ? vfs_write+0x110/0x180
>> >> >> >> [ 1221.471538]  [<ffffffff811782a7>] ? sys_write+0x67/0x90
>> >> >> >> [ 1221.471780]  [<ffffffff81188106>] sys_unlink+0x16/0x20
>> >> >> >> [ 1221.472019]  [<ffffffff81661fc2>] system_call_fastpath+0x16/0x1b
>> >> >> >> [ 1221.472290] Code: 00 00 49 83 c5 40 31 d2 4c 89 e9 be 01 00 00 00 e8 fc 1e ff ff 48 8b b3 28 02 00 00 4c 89 ff e8 ad 7e 00 00 48 8d 8d 08 ff ff ff <48> 8b 78 18 31 d2 be 01 00 00 00 48 83 e9 80 e8 d2 1e ff ff 48
>> >> >> >> [ 1221.473936] RIP  [<ffffffffa04f800a>] gfs2_unlink+0x8a/0x220 [gfs2]
>> >> >> >> [ 1221.474240]  RSP <ffff88020cfe1d28>
>> >> >> >> [ 1221.474408] CR2: 0000000000000018
>> >> >> >> [ 1221.474959] ---[ end trace f7df780fd45600a8 ]---
>> >> >> >>
>> >> >> >>
>> >> >> >> Bart Verwilst wrote on 20.08.2012 09:50:
>> >> >> >> > Nothing out of the ordinary, should have mentioned that!
>> >> >> >> >
>> >> >> >> > <snip>
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: read 20 bytes from fd 17
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: client command is 7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: About to process command
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command to process is 7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: get_all_members: allocated new buffer (retsize=1024)
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: get_all_members: retlen = 1760
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command return code is 4
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: Returning command data. length = 1760
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: sending reply 40000007 to fd 17
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: read 20 bytes from fd 17
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: client command is 800000b7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: About to process command
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command to process is 800000b7
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] memb: command return code is 0
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: Returning command data. length = 0
>> >> >> >> > Aug 19 00:08:00 vm02-test corosync[7394]:   [CMAN  ] daemon: sending reply c00000b7 to fd 17
>> >> >> >> > Aug 19 00:08:01 vm02-test corosync[7394]:   [CMAN  ] daemon: read 20 bytes from fd 17
>> >> >> >> > Aug 19 00:08:01 vm02-test corosync[7394]:   [CMAN  ] daemon: client command is 7
>> >> >> >> > Aug 19 00:08:01 vm02-test corosync[7394]:   [CMAN  ] daemon: About to process command
>> >> >> >> > </snip>
>> >> >> >> >
>> >> >> >> > Digimer wrote on 20.08.2012 00:01:
>> >> >> >> >> On 08/19/2012 05:34 PM, Bart Verwilst wrote:
>> >> >> >> >>> Aug 19 00:10:01 vm02-test kernel: [282120.240067] INFO: task kworker/1:0:3117 blocked for more than 120 seconds.
>> >> >> >> >>> Aug 19 00:10:01 vm02-test kernel: [282120.240182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> >> >> >> >>
>> >> >> >> >> Nothing around Aug 19 00:08:00 ?
>> >> >> >>
>> >> >> >> --
>> >> >> >> Linux-cluster mailing list
>> >> >> >> Linux-cluster at redhat.com
>> >> >> >> https://www.redhat.com/mailman/listinfo/linux-cluster
>> >> >>
>> >>
>>



