[Linux-cluster] 2.6.20-rc4 gfs2 bug

Thu Mar 8 18:07:51 UTC 2007

On Thu, 2007-01-25 at 00:07 -0500, Dan Merillat wrote:
> Running 2.6.20-rc4 _WITH_ the following patch: (Shouldn't be the issue,
> but just in case, I'm listing it here)

Not adding to the thread here, but Dan, check the date on your machine.
This just showed up in my mailbox (8 March) and its headers say it was
sent on 25 January!

> 
> Date:	Fri, 29 Dec 2006 21:03:57 +0100
> From:	Ingo Molnar <mingo at elte.hu>
> Subject: [patch] remove MAX_ARG_PAGES
> Message-ID: <20061229200357.GA5940 at elte.hu>
> 
> Linux fileserver 2.6.20-rc4MAX_ARGS #4 PREEMPT Fri Jan 12 03:58:25 EST 2007 x86_64 GNU/Linux
> 
> This happened when I started testing gfs2 for the first time.  I
> installed userspace from CVS, loaded the gfs2/dlm modules, mkfs.gfs2,
> then "mount -t gfs2 -v /dev/vg1/gfs2 /mnt/gfs"
> 
> This was the initial mount of the new filesystem.  I can create
> directories, but attempting a stress-test with bonnie seems to have
> deadlocked something.  (at "Start 'em", immediately.)
> 
> To clarify: the two oopses happened at first mount.  After that, I
> created files/directories, then attempted to stress it a bit with
> bonnie++.  No further oops/dmesg output.
> 
> For the GFS2 folks, latest CVS gfs_tool doesn't have lockdump, is there
> any way to examine what I'm stuck on?
> 
> This machine is specifically for testing new things before I put them
> into production, so I can leave it hung like this indefinitely for
> debugging.
> 
> 
> [845566.571468] GFS2 (built Jan 12 2007 04:02:27) installed
> [849416.113382] DLM (built Jan 12 2007 04:01:21) installed
> [849416.352219] Lock_DLM (built Jan 12 2007 04:02:46) installed
> [850966.368016] GFS2: fsid=: Trying to join cluster "lock_dlm", "internal:gfs-test"
> [850971.783223] dlm: gfs-test: recover 1
> [850971.783242] dlm: gfs-test: add member 1
> [850971.783246] dlm: gfs-test: total members 1 error 0
> [850971.783248] dlm: gfs-test: dlm_recover_directory
> [850971.783260] dlm: gfs-test: dlm_recover_directory 0 entries
> [850971.783270] dlm: gfs-test: recover 1 done: 0 ms
> [850971.783454] GFS2: fsid=internal:gfs-test.0: Joined cluster. Now mounting FS...
> [850973.409048] GFS2: fsid=internal:gfs-test.0: jid=0, already locked for use
> [850973.409135] GFS2: fsid=internal:gfs-test.0: jid=0: Looking at journal...
> [850973.504558] GFS2: fsid=internal:gfs-test.0: jid=0: Done
> [850973.504653] GFS2: fsid=internal:gfs-test.0: jid=1: Trying to acquire journal lock...
> [850973.517086] GFS2: fsid=internal:gfs-test.0: jid=1: Looking at journal...
> [850973.691546] GFS2: fsid=internal:gfs-test.0: jid=1: Done
> [850973.691635] GFS2: fsid=internal:gfs-test.0: jid=2: Trying to acquire journal lock...
> [850973.702646] GFS2: fsid=internal:gfs-test.0: jid=2: Looking at journal...
> [850973.846397] GFS2: fsid=internal:gfs-test.0: jid=2: Done
> 
> 
> [850973.869288] ------------[ cut here ]------------
> [850973.869294] kernel BUG at fs/gfs2/glock.c:738!
> [850973.869297] invalid opcode: 0000 [1] PREEMPT 
> [850973.869300] CPU 0 
> [850973.869302] Modules linked in: lock_dlm dlm gfs2 scsi_tgt bttv video_buf firmware_class ir_common compat_ioctl32 btcx_risc tveeprom videodev v4l2_common v4l1_compat radeon nbd eth1394 ohci1394 dm_crypt eeprom w83627hf hwmon_vid i2c_isa i2c_viapro snd_via82xx snd_mpu401_uart snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore
> [850973.869324] Pid: 31076, comm: gfs2_glockd Not tainted 2.6.20-rc4MAX_ARGS #4
> [850973.869327] RIP: 0010:[<ffffffff8816cabb>]  [<ffffffff8816cabb>] :gfs2:gfs2_glmutex_unlock+0x2b/0x40
> [850973.869355] RSP: 0018:ffff81001849be70  EFLAGS: 00010282
> [850973.869359] RAX: ffff810023ff4ee0 RBX: ffff810023ff4e68 RCX: ffffffff88185800
> [850973.869363] RDX: 0000000000000000 RSI: ffff810023ff4ec0 RDI: ffff810023ff4e68
> [850973.869366] RBP: ffff810023ff4f38 R08: 0000000000000000 R09: 0000000000006052
> [850973.869370] R10: 0000000000000000 R11: ffffffff8816de60 R12: ffff810023ff4e68
> [850973.869374] R13: ffff810023ff4eb0 R14: ffff81003ffd6850 R15: ffff81003ffd6870
> [850973.869378] FS:  00002aebf51826d0(0000) GS:ffffffff807fb000(0000) knlGS:00000000f72026c0
> [850973.869381] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [850973.869384] CR2: 00002b9e93097fe0 CR3: 0000000003a79000 CR4: 00000000000006e0
> [850973.869388] Process gfs2_glockd (pid: 31076, threadinfo ffff81001849a000, task ffff810000b82890)
> [850973.869390] Stack:  ffff810023ff4eb0 ffffffff8816cc08 ffff81001849beb0 ffff810024322000
> [850973.869397]  ffff8100243223b8 ffff8100074cf968 ffffffff88163510 ffffffff88163528
> [850973.869402]  0000000000000000 ffff810000b82890 ffffffff8029fe70 ffff81001849bec8
> [850973.869407] Call Trace:
> [850973.869421]  [<ffffffff8816cc08>] :gfs2:gfs2_reclaim_glock+0x138/0x180
> [850973.869434]  [<ffffffff88163510>] :gfs2:gfs2_glockd+0x0/0xf0
> [850973.869445]  [<ffffffff88163528>] :gfs2:gfs2_glockd+0x18/0xf0
> [850973.869453]  [<ffffffff8029fe70>] autoremove_wake_function+0x0/0x30
> [850973.869465]  [<ffffffff88163510>] :gfs2:gfs2_glockd+0x0/0xf0
> [850973.869471]  [<ffffffff80234d43>] kthread+0xd3/0x110
> [850973.869476]  [<ffffffff80229407>] schedule_tail+0x37/0xc0
> [850973.869481]  [<ffffffff8029fca0>] keventd_create_kthread+0x0/0xa0
> [850973.869485]  [<ffffffff80264618>] child_rip+0xa/0x12
> [850973.869490]  [<ffffffff8029fca0>] keventd_create_kthread+0x0/0xa0
> [850973.869497]  [<ffffffff80234c70>] kthread+0x0/0x110
> [850973.869501]  [<ffffffff8026460e>] child_rip+0x0/0x12
> [850973.869504] 
> [850973.869505] 
> [850973.869506] Code: 0f 0b 66 66 90 eb fe 66 66 66 90 66 66 66 90 66 66 90 66 66 
> [850973.869514] RIP  [<ffffffff8816cabb>] :gfs2:gfs2_glmutex_unlock+0x2b/0x40
> [850973.869528]  RSP <ffff81001849be70>
> [850973.869530]  <6>note: gfs2_glockd[31076] exited with preempt_count 1
> 
> 
> [850986.762341] ------------[ cut here ]------------
> [850986.762346] kernel BUG at fs/gfs2/glock.c:738!
> [850986.762349] invalid opcode: 0000 [2] PREEMPT 
> [850986.762351] CPU 0 
> [850986.762353] Modules linked in: lock_dlm dlm gfs2 scsi_tgt bttv video_buf firmware_class ir_common compat_ioctl32 btcx_risc tveeprom videodev v4l2_common v4l1_compat radeon nbd eth1394 ohci1394 dm_crypt eeprom w83627hf hwmon_vid i2c_isa i2c_viapro snd_via82xx snd_mpu401_uart snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore
> [850986.762376] Pid: 31075, comm: gfs2_scand Not tainted 2.6.20-rc4MAX_ARGS #4
> [850986.762379] RIP: 0010:[<ffffffff8816cabb>]  [<ffffffff8816cabb>] :gfs2:gfs2_glmutex_unlock+0x2b/0x40
> [850986.762405] RSP: 0000:ffff81001f221e80  EFLAGS: 00010286
> [850986.762408] RAX: ffff810023ff4940 RBX: ffff810023ff48c8 RCX: 0000000000000000
> [850986.762412] RDX: 0000000000000146 RSI: ffff810023ff4920 RDI: ffff810023ff48c8
> [850986.762416] RBP: ffff810024322000 R08: ffff81001f220000 R09: 00000000f6e88388
> [850986.762418] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
> [850986.762422] R13: 0000000000000000 R14: ffffffff8816d2f0 R15: ffff81003ffd6870
> [850986.762426] FS:  00002b5212e53ae0(0000) GS:ffffffff807fb000(0000) knlGS:00000000f6e88bb0
> [850986.762429] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [850986.762432] CR2: 00000000f706f000 CR3: 000000002b96b000 CR4: 00000000000006e0
> [850986.762436] Process gfs2_scand (pid: 31075, threadinfo ffff81001f220000, task ffff810025448850)
> [850986.762438] Stack:  ffff810023ff48c8 ffffffff8816a86c 0000000000000147 ffff810024322000
> [850986.762445]  ffff8100074cf968 ffffffff88163600 ffff81003ffd6850 ffffffff8816ae54
> [850986.762450]  ffff81003ffd6850 000000000000000f ffff810024322000 ffffffff88163618
> [850986.762454] Call Trace:
> [850986.762469]  [<ffffffff8816a86c>] :gfs2:examine_bucket+0x8c/0x100
> [850986.762481]  [<ffffffff88163600>] :gfs2:gfs2_scand+0x0/0x70
> [850986.762494]  [<ffffffff8816ae54>] :gfs2:gfs2_scand_internal+0x24/0x40
> [850986.762506]  [<ffffffff88163618>] :gfs2:gfs2_scand+0x18/0x70
> [850986.762514]  [<ffffffff80234d43>] kthread+0xd3/0x110
> [850986.762519]  [<ffffffff80229407>] schedule_tail+0x37/0xc0
> [850986.762525]  [<ffffffff8029fca0>] keventd_create_kthread+0x0/0xa0
> [850986.762530]  [<ffffffff80264618>] child_rip+0xa/0x12
> [850986.762535]  [<ffffffff8029fca0>] keventd_create_kthread+0x0/0xa0
> [850986.762542]  [<ffffffff80234c70>] kthread+0x0/0x110
> [850986.762545]  [<ffffffff8026460e>] child_rip+0x0/0x12
> [850986.762548] 
> [850986.762549] 
> [850986.762550] Code: 0f 0b 66 66 90 eb fe 66 66 66 90 66 66 66 90 66 66 90 66 66 
> [850986.762559] RIP  [<ffffffff8816cabb>] :gfs2:gfs2_glmutex_unlock+0x2b/0x40
> [850986.762572]  RSP <ffff81001f221e80>
> [850986.762575]  <6>note: gfs2_scand[31075] exited with preempt_count 1
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
----------------------------------------------------------------------
- Rick Stevens, Principal Engineer          rstevens at vitalstream.com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-            I'm afraid my karma just ran over your dogma            -
----------------------------------------------------------------------