[Linux-cluster] gfs2 partition withdrawn

Adam Hough adam at gradientzero.com
Sun Oct 18 05:19:17 UTC 2009


On Sat, Oct 3, 2009 at 6:13 AM, Nicolas Ferré <
nicolas.ferre at univ-provence.fr> wrote:

> Hi,
>
> We have a problem with our cluster, a gfs2 fs cannot be accessed some times
> after the system reboot. I have to manually umount/mount it.
>
> Here is the relevant part of /var/log/messages:
> Oct  3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: fatal: invalid
> metadata block
> Oct  3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1:   bh = 114419123
> (magic number)
> Oct  3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1:   function =
> gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 334
> Oct  3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: about to withdraw
> this file system
> Oct  3 11:46:14 slater kernel: GFS2: fsid=crcmm:home.1: telling LM to
> withdraw
> Oct  3 11:46:16 slater kernel: VFS:Filesystem freeze failed
> Oct  3 11:46:22 slater snmpd[7344]: Connection from UDP: [127.0.0.1]:58640
> Oct  3 11:46:22 slater snmpd[7344]: Received SNMP packet(s) from UDP:
> [127.0.0.1]:58640
> Oct  3 11:46:37 slater snmpd[7344]: Connection from UDP: [127.0.0.1]:47125
> Oct  3 11:46:37 slater snmpd[7344]: Received SNMP packet(s) from UDP:
> [127.0.0.1]:47125
> Oct  3 11:46:53 slater snmpd[7344]: Connection from UDP: [127.0.0.1]:33910
> Oct  3 11:46:53 slater snmpd[7344]: Received SNMP packet(s) from UDP:
> [127.0.0.1]:33910
> Oct  3 11:46:53 slater kernel: dlm: home: group leave failed -512 0
> Oct  3 11:46:53 slater kernel: GFS2: fsid=crcmm:home.1: withdrawn
> Oct  3 11:46:53 slater kernel:
> Oct  3 11:46:53 slater kernel: Call Trace:
> Oct  3 11:46:53 slater kernel:  [<ffffffff8863a3ce>]
> :gfs2:gfs2_lm_withdraw+0xc1/0xd0
> Oct  3 11:46:53 slater kernel:  [<ffffffff80063a06>]
> __wait_on_bit+0x60/0x6e
> Oct  3 11:46:53 slater kernel:  [<ffffffff800153ac>] sync_buffer+0x0/0x3f
> Oct  3 11:46:53 slater kernel:  [<ffffffff80063a80>]
> out_of_line_wait_on_bit+0x6c/0x78
> Oct  3 11:46:53 slater kernel:  [<ffffffff8009f6ef>]
> wake_bit_function+0x0/0x23
> Oct  3 11:46:53 slater kernel:  [<ffffffff8001a7ac>] submit_bh+0x10a/0x111
> Oct  3 11:46:53 slater kernel:  [<ffffffff8864d547>]
> :gfs2:gfs2_meta_check_ii+0x2c/0x38
> Oct  3 11:46:53 slater kernel:  [<ffffffff8863de01>]
> :gfs2:gfs2_meta_indirect_buffer+0x104/0x15f
> Oct  3 11:46:53 slater kernel:  [<ffffffff8863d993>]
> :gfs2:gfs2_getbuf+0x106/0x115
> Oct  3 11:46:53 slater kernel:  [<ffffffff8862e786>]
> :gfs2:recursive_scan+0x96/0x175
> Oct  3 11:46:53 slater kernel:  [<ffffffff8862e82c>]
> :gfs2:recursive_scan+0x13c/0x175
> Oct  3 11:46:53 slater kernel:  [<ffffffff8862f6bc>]
> :gfs2:do_strip+0x0/0x349
> Oct  3 11:46:53 slater kernel:  [<ffffffff8862e8fe>]
> :gfs2:trunc_dealloc+0x99/0xe7
> Oct  3 11:46:53 slater kernel:  [<ffffffff8862f6bc>]
> :gfs2:do_strip+0x0/0x349
> Oct  3 11:46:53 slater kernel:  [<ffffffff88645dd2>]
> :gfs2:gfs2_delete_inode+0xdd/0x191
> Oct  3 11:46:53 slater kernel:  [<ffffffff88645d3b>]
> :gfs2:gfs2_delete_inode+0x46/0x191
> Oct  3 11:46:53 slater kernel:  [<ffffffff88635e77>]
> :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a
> Oct  3 11:46:53 slater kernel:  [<ffffffff88645cf5>]
> :gfs2:gfs2_delete_inode+0x0/0x191
> Oct  3 11:46:53 slater kernel:  [<ffffffff8002f49e>]
> generic_delete_inode+0xc6/0x143
> Oct  3 11:46:53 slater kernel:  [<ffffffff8864a99c>]
> :gfs2:gfs2_inplace_reserve_i+0x63b/0x691
> Oct  3 11:46:53 slater kernel:  [<ffffffff80021f3f>] __up_read+0x19/0x7f
> Oct  3 11:46:53 slater kernel:  [<ffffffff88635dd8>]
> :gfs2:do_promote+0xf5/0x137
> Oct  3 11:46:53 slater kernel:  [<ffffffff8863f24a>]
> :gfs2:gfs2_write_begin+0x16c/0x339
> Oct  3 11:46:53 slater kernel:  [<ffffffff88640a7b>]
> :gfs2:gfs2_file_buffered_write+0xf3/0x26c
> Oct  3 11:46:53 slater kernel:  [<ffffffff88640e4c>]
> :gfs2:__gfs2_file_aio_write_nolock+0x258/0x28f
> Oct  3 11:46:53 slater kernel:  [<ffffffff88640fee>]
> :gfs2:gfs2_file_write_nolock+0xaa/0x10f
> Oct  3 11:46:54 slater kernel:  [<ffffffff800c5145>]
> generic_file_read+0xac/0xc5
> Oct  3 11:46:54 slater kernel:  [<ffffffff8009f6c1>]
> autoremove_wake_function+0x0/0x2e
> Oct  3 11:46:54 slater kernel:  [<ffffffff88635e77>]
> :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a
> Oct  3 11:46:54 slater kernel:  [<ffffffff8009f6c1>]
> autoremove_wake_function+0x0/0x2e
> Oct  3 11:46:54 slater kernel:  [<ffffffff8864113e>]
> :gfs2:gfs2_file_write+0x49/0xa7
> Oct  3 11:46:54 slater kernel:  [<ffffffff80016927>] vfs_write+0xce/0x174
> Oct  3 11:46:54 slater kernel:  [<ffffffff800171df>] sys_write+0x45/0x6e
> Oct  3 11:46:54 slater kernel:  [<ffffffff8006149d>]
> sysenter_do_call+0x1e/0x6a
> Oct  3 11:46:54 slater kernel:
> Oct  3 11:46:54 slater kernel: GFS2: fsid=crcmm:home.1: gfs2_delete_inode:
> -5
>
> Can someone explain the meaning of such messages? And how to cure the
> problem ...
>
> Regards,
>
> --
> Nicolas Ferre'
> Laboratoire Chimie Provence
> Universite' de Provence - France
> Tel: +33 491282733
> http://sites.univ-provence.fr/lcp-ct
>
Nicolas,

Any time you see a gfs/gfs2 filesystem withdrawn message, do yourself a favor
and run an fsck of the filesystem.


These two links might explain some of what you are seeing, especially after
you run an fsck.

http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Global_File_System/s1-manage-gfswithdraw.html
https://bugzilla.redhat.com/show_bug.cgi?id=210367
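
If it helps to catch these withdraws without reading the whole log by eye,
something like the following could be used. It is only a rough sketch: it
assumes the "withdrawn" kernel line always has the same shape as the one in
your log excerpt (fsid=<cluster>:<fsname>.<number>: withdrawn), and the
function name and default log path are just my own choices for illustration.

```python
import re
import sys

# Assumed line shape, copied from the log excerpt in this thread:
#   "Oct  3 11:46:53 slater kernel: GFS2: fsid=crcmm:home.1: withdrawn"
# Other kernel versions may word this differently.
WITHDRAWN_RE = re.compile(
    r"kernel: GFS2: fsid=(?P<cluster>[^:\s]+):(?P<fs>\S+?)\.(?P<suffix>\d+): "
    r"withdrawn\s*$"
)


def find_withdrawals(lines):
    """Return the sorted, de-duplicated (cluster, fsname) pairs that logged a withdraw."""
    hits = set()
    for line in lines:
        match = WITHDRAWN_RE.search(line)
        if match:
            hits.add((match.group("cluster"), match.group("fs")))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical default; pass another log file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        for cluster, fs in find_withdrawals(log):
            print(f"{cluster}:{fs} was withdrawn; unmount it on all nodes and fsck it")
```

Keep in mind that the fsck itself has to be run with the filesystem unmounted
on every node in the cluster, so the script only tells you which filesystem
needs attention.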