[Linux-cluster] Re: processes stalled reading gfs filesystem

Frank frank at si.ct.upc.edu
Wed Mar 25 14:14:14 UTC 2009


Hi again,
I haven't received any answer, but I will keep adding details about this issue.
In the end I unmounted the GFS filesystem on both nodes and ran
gfs_fsck; it fixed several filesystem elements.
After mounting it again, when we try to work in the previously damaged
directories we get these messages:

GFS: fsid=hr-pm:gfs01.0: warning: assertion "(gh->gh_flags & LM_FLAG_ANY)
|| !(tmp_gh->gh_flags & LM_FLAG_ANY)" failed
GFS: fsid=hr-pm:gfs01.0:   function = add_to_queue
GFS: fsid=hr-pm:gfs01.0:   file = fs/gfs/glock.c, line = 1420
GFS: fsid=hr-pm:gfs01.0:   time = 1237984594
BUG: warning at fs/gfs/util.c:287/gfs_assert_warn_i() (Tainted:  P     )
 [<f9ad7e91>] gfs_assert_warn_i+0x92/0xbd [gfs]
 [<f9aba680>] gfs_glock_nq+0x131/0x36f [gfs]
 [<f9aba8d1>] gfs_glock_nq_init+0x13/0x26 [gfs]
 [<f9acf378>] gfs_private_nopage+0x45/0x81 [gfs]
 [<c0460831>] __handle_mm_fault+0x23b/0xe08
 [<c04597a2>] __do_page_cache_readahead+0x1ab/0x1cc
 [<c06062fe>] do_page_fault+0x2a4/0x5ad
 [<c060605a>] do_page_fault+0x0/0x5ad
 [<c0607dfb>] error_code+0x4f/0x54
 [<c060007b>] __inet6_check_established+0x21f/0x394
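
For reference, the "time" field in the messages above is a Unix epoch value. Here is a minimal sketch that pulls the "key = value" fields out of the report and converts the time to UTC. It assumes the log lines look exactly like the ones pasted here; the function name parse_gfs_assert is just an illustration, not anything provided by GFS.

```python
import re
from datetime import datetime, timezone

# Matches the common prefix, e.g. "GFS: fsid=hr-pm:gfs01.0:   time = 1237984594".
PREFIX_RE = re.compile(r"^GFS: fsid=(\S+):\s+(.*)$")
# One line may carry several pairs, e.g. "file = fs/gfs/glock.c, line = 1420".
PAIR_RE = re.compile(r"(\w+) = ([^,]+)")


def parse_gfs_assert(lines):
    """Collect the 'key = value' fields of one GFS assertion report."""
    fields = {}
    for line in lines:
        m = PREFIX_RE.match(line.strip())
        if not m:
            continue  # stack-trace lines and anything else are ignored
        fields["fsid"] = m.group(1)
        for key, value in PAIR_RE.findall(m.group(2)):
            fields[key] = value.strip()
    if "time" in fields:
        # The kernel prints seconds since the epoch; interpret them as UTC.
        fields["time_utc"] = datetime.fromtimestamp(
            int(fields["time"]), tz=timezone.utc
        )
    return fields


report = [
    "GFS: fsid=hr-pm:gfs01.0:   function = add_to_queue",
    "GFS: fsid=hr-pm:gfs01.0:   file = fs/gfs/glock.c, line = 1420",
    "GFS: fsid=hr-pm:gfs01.0:   time = 1237984594",
]
info = parse_gfs_assert(report)
print(info["function"], info["file"], info["line"], info["time_utc"].isoformat())
```

With the three lines above, the time field converts to 2009-03-25 12:36:34 UTC, which is the same day this message was sent.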

Any ideas?
Thanks.

Frank
> Date: Fri, 20 Mar 2009 12:20:47 +0100
> From: Frank <frank at si.ct.upc.edu>
> Subject: [Linux-cluster] processes stalled reading gfs filesystem
> To: linux-cluster at redhat.com
> Message-ID: <49C37C0F.5020308 at si.ct.upc.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi,
> we have a couple of Dell servers with Red Hat 5.2 and OpenVZ, sharing a
> GFS filesystem.
>
> We have noticed that there is a directory where processes stall when
> they try to access it.
> For instance, look at these processes:
>
> [root at parmenides ~]# ps -fel | grep save
> 4 D root      8997     1  1  78   0 -  1780 339955 09:40 ?
> 00:02:31 /usr/sbin/save -s espai.upc.es -g Virtuals -LL -f - -m
> parmenides -t 1236294005 -l 4 -q -W 78 -N /mnt/gfs /mnt/gfs
> 0 S root     16736 21208  0  78   0 -   980 pipe_w 12:07 pts/1
> 00:00:00 grep save
> 4 D root     18796     1  1  78   0 -  1777 339955 08:46 ?
> 00:02:16 /usr/sbin/save -s espai.upc.es -g Virtuals -LL -f - -m
> parmenides -t 1236294005 -l 4 -q -W 78 -N /mnt/gfs /mnt/gfs
>
> Both processes are stalled reading a file:
>
> # lsof -p 8997 | grep gfs
> save    8997 root  cwd    DIR   253,7     2048   7022183
> /mnt/gfs/vz/private/109/usr/lib/openoffice/program
> save    8997 root    3r   DIR   253,7     3864        26 /mnt/gfs
> save    8997 root    6r   DIR   253,7     3864       232 /mnt/gfs/vz
> save    8997 root    7r   DIR   253,7     3864       233
> /mnt/gfs/vz/private
> save    8997 root    8r   DIR   253,7     3864 230761349
> /mnt/gfs/vz/private/109
> save    8997 root    9r   DIR   253,7     3864 230773154
> /mnt/gfs/vz/private/109/usr
> save    8997 root   12r   DIR   253,7     2048   7003944
> /mnt/gfs/vz/private/109/usr/lib
> save    8997 root   14r   DIR   253,7     3864   7022175
> /mnt/gfs/vz/private/109/usr/lib/openoffice
>
> # lsof -p 18796 | grep gfs
> save    18796 root  cwd    DIR   253,7     2048   7022183
> /mnt/gfs/vz/private/109/usr/lib/openoffice/program
> save    18796 root    3r   DIR   253,7     3864        26 /mnt/gfs
> save    18796 root    6r   DIR   253,7     3864       232 /mnt/gfs/vz
> save    18796 root    7r   DIR   253,7     3864       233
> /mnt/gfs/vz/private
> save    18796 root    8r   DIR   253,7     3864 230761349
> /mnt/gfs/vz/private/109
> save    18796 root    9r   DIR   253,7     3864 230773154
> /mnt/gfs/vz/private/109/usr
> save    18796 root   12r   DIR   253,7     2048   7003944
> /mnt/gfs/vz/private/109/usr/lib
> save    18796 root   14r   DIR   253,7     3864   7022175
> /mnt/gfs/vz/private/109/usr/lib/openoffice
>
> There is also a process, with glock_ as its wait channel, accessing the
> same directory:
>
> 0 D root      8425  6783  0  78   0 -   669 glock_ 08:24 ?
> 00:00:00 /usr/lib/openoffice/program/pagein
> -L/usr/lib/openoffice/program @pagein-common
>
> What could the problem be? Corruption in the filesystem?
> Would a "gfs_fsck" fix the problem?
> Regards.
>
> Frank
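
The ps check from the quoted message (looking for processes in state D and their wait channel) can also be scripted. This is only a rough sketch, assuming a Linux /proc layout where /proc/<pid>/stat and /proc/<pid>/wchan are readable; the function names parse_stat and stalled_processes are my own.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the last ')' instead of splitting on whitespace.
    left, _, right = text.rpartition(")")
    pid, _, comm = left.partition(" (")
    return int(pid), comm, right.split()[0]


def stalled_processes(proc="/proc"):
    """Yield (pid, comm, wchan) for processes in uninterruptible sleep (D)."""
    try:
        entries = os.listdir(proc)
    except OSError:
        return  # no /proc here (not Linux, or not mounted)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state != "D":
            continue
        try:
            with open(os.path.join(proc, entry, "wchan")) as f:
                wchan = f.read().strip() or "-"
        except OSError:
            wchan = "?"  # wchan may be unavailable on some kernels
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in stalled_processes():
        print(pid, comm, wchan)
```

Run as root on the affected node, this should list the same stalled processes that ps shows with a D in the state column.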

