[Linux-cluster] Kernel panic: GFS: Assertion failed on line 550 of file rgrp.c

Fri Feb 10 17:25:05 UTC 2006

Hi Kevin,

Thanks for the really fast reply :). The output of running gfs_fsck with
several "v" flags is below:

[root@rac3 root]# gfs_fsck -vvvvvvvvvy /dev/pool/oracle_u02
Initializing fsck
Initializing lists...
Initializing special inodes...
(file.c:45)     readi:  Offset (320) is >= the file size (320).
(super.c:211)   4 journals found.
(file.c:45)     readi:  Offset (45888) is >= the file size (45888).
(super.c:268)   478 resource groups found.
(util.c:112)    For 65773 Expected 1161970:2 - got 0:0
Buffer #65773 (1 of 5) is neither GFS_METATYPE_RB nor GFS_METATYPE_RG.
Resource group is corrupted.
Unable to read in rgrp descriptor.
Unable to fill in resource group information.
(initialize.c:364)      <backtrace> - init_sbp()

>It would be interesting to see if the partitions are identical after the
>snapshot.  How large are the LUNs?  Can you do a comparison of the
>volumes?  I would do those steps first before the fsck.  It is possible
>you have a problem with oracle_u02, so it would be interesting to run
>gfs_fsck if the snapped LUNs are identical.

We use the SnapView feature of our EMC CX500 SAN, so those two LUNs _should_
be identical. In fact, we have cloned other GFS LUNs many times in the past
without any problem. If gfs_fsck cannot help, tomorrow we'll drop the
destination LUN and try again.
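
For the volume comparison Kevin asks about, a minimal sketch in Python could
look like the one below. It assumes both the source LUN and its clone can be
opened read-only from the same host, which may not hold in this two-cluster
setup. The function name first_difference and the 1 MiB chunk size are
illustrative choices, not part of any GFS or EMC tool.

```python
CHUNK = 1024 * 1024  # read 1 MiB at a time (arbitrary choice)


def first_difference(path_a, path_b, chunk_size=CHUNK):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if block_a != block_b:
                # Narrow the mismatch down to the exact byte in this chunk.
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common part: one side is shorter.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                # Both sides hit end of data at the same point.
                return None
            offset += len(block_a)
```

Calling first_difference() with the two device paths (for example
/dev/pool/oracle_u02 and the path of the original volume, whatever it is on
the host in question) would return None if the clone is byte-for-byte
identical, or the offset where the two first diverge.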

Regards,

Thai Duong.

On 2/10/06, Kevin Anderson <kanderso at redhat.com> wrote:
>
> On Fri, 2006-02-10 at 22:59 +0700, Thai Duong wrote:
> > Hi Kevin,
> >
> > I did unmount oracle_u02 before cloning but still no luck.
> > When I tried to run gfs_fsck against oracle_u02 on the backup
> > cluster's node, it reported something like below:
> >
> > [root@rac3 root]# gfs_fsck -y /dev/pool/oracle_u02
> > Initializing fsck
> > Buffer #65773 (1 of 5) is neither GFS_METATYPE_RB nor GFS_METATYPE_RG.
>
> Add some -vvvvvvv flags to the gfs_fsck command line. Each "v" adds
> another layer of messages.  The DEBUG messages are at layer 7. This
> should print out more information about the resource group that it is
> failing to read.
>
> > Resource group is corrupted.
> > Unable to read in rgrp descriptor.
> > Unable to fill in resource group information.
> >
> > It seems that oracle_u02 somehow got broken. Running gfs_fsck against
> > oracle_u01 works like a charm. Do I need to run gfs_fsck against the
> > original oracle_u02? Please advise.
>
> It would be interesting to see if the partitions are identical after the
> snapshot.  How large are the LUNs?  Can you do a comparison of the
> volumes?  I would do those steps first before the fsck.  It is possible
> you have a problem with oracle_u02, so it would be interesting to run
> gfs_fsck if the snapped LUNs are identical.
>
> Kevin
>