[Linux-cluster] GFS Problem: invalid metadata block

Matt Eagleson matt at rebelbase.com
Tue Oct 10 21:00:53 UTC 2006


Thank you Robert and Wendy for taking the time to answer my question -- 
I appreciate it.

As you suggested, it was indeed a SAN problem.  Someone else in my 
organization was attempting to use the same section of disk for another 
filesystem on a different host.  I can understand why GFS would be 
unhappy with that.

--Matt

Robert Peterson wrote:
> Matt Eagleson wrote:
>> Hello,
>>
>> I have been evaluating a GFS cluster as an NFS solution and have 
>> unfortunately run into a serious problem which I cannot explain.  
>> Both of the GFS filesystems I am exporting became corrupt and unusable.
>>
>> The system is Redhat AS4 with 2.6.9-42.0.2.ELsmp.  I cannot find 
>> anything unusual on the host or the SAN at the time of this error.  
>> Nobody was logged in to the nodes.
>>
>> Can anyone help me understand what is happening here?
>>
>> Here are the logs:
> 
> Hi Matt,
> 
> These errors indicate file system corruption on your SAN.  The "bh ="
> value is the block number where the error was detected.  Two of the
> errors were found in GFS resource group ("RG") data, which are areas on
> disk that record which blocks on the SAN are allocated and which aren't.
> (Not to be confused with the resource groups in rgmanager, which are
> something completely different.)  The third error was found in a block
> that is usually reserved for the quota file inode.
> Corruption in the RG information is extremely rare and may indicate a
> hardware problem with your SAN.  The fact that both nodes detected
> problems in different areas is an indication that the problem might be
> in the SAN itself rather than in the motherboards, fibre channel cards,
> or memory of the nodes, although that's still not guaranteed.  Many
> things can cause data corruption.
> 
> I recommend you:
> 
> 1. Verify the hardware is working properly in all respects.  One way to
>    do this is to make a backup of the raw data to another device and
>    verify the copy against the original, without GFS or any of the
>    cluster software in the mix.  For example, unmount the file system
>    from all nodes in the cluster, then do something like
>    "dd if=/dev/my_vg/lvol0 of=/mnt/backup/sanbackup" followed by
>    "diff /dev/my_vg/lvol0 /mnt/backup/sanbackup" (assuming, of course,
>    that /dev/my_vg/lvol0 is the logical volume holding your GFS
>    partition and /mnt/backup/ is a scratch area big enough to hold that
>    much data).  The idea here is simply to test that reading from the
>    SAN gives you the same data twice.  If that works successfully on
>    one node, try it on the other node.  (A rough sketch of steps 1 and
>    2 follows after this list.)
> 2. Once you verify the hardware is working properly, run gfs_fsck on
>    the file system.  The latest version of gfs_fsck can repair most GFS
>    RG corruption.
> 3. If the file system is repaired successfully, back it up.
> 4. You may want to do a similar test, only writing data to the SAN,
>    then reading it back and verifying the results.  Obviously this will
>    destroy the data on your SAN unless you are careful, so if this is a
>    production machine, please take measures to protect the data before
>    trying anything like this.  (One possible tool for this is sketched
>    a bit further down.)
> 5. If you can read and write to the SAN reliably from both nodes
>    without GFS, then try using it again with GFS and see if the problem
>    comes back.
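> 
> For reference, here is roughly what steps 1 and 2 might look like on
> one node.  The device, mount point, and scratch path are placeholders
> from the example above, and the gfs_fsck options are from memory, so
> check the man page on your system and adjust names to your setup:
> 
>    # Unmount the GFS file system on every node in the cluster first.
>    umount /mnt/gfs
> 
>    # Step 1: image the raw device, then compare the copy against a
>    # second read of the same device.
>    dd if=/dev/my_vg/lvol0 of=/mnt/backup/sanbackup bs=1M
>    diff /dev/my_vg/lvol0 /mnt/backup/sanbackup && echo "reads match"
> 
>    # Step 2: check the file system read-only first, then repair it.
>    gfs_fsck -n /dev/my_vg/lvol0    # report problems, change nothing
>    gfs_fsck -y /dev/my_vg/lvol0    # repair, answering yes to prompts
> 
> (cmp would also do a byte-for-byte comparison in place of diff.)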
> 
> Perhaps someone else (the SAN manufacturer?) can recommend hardware
> tests you can run to verify the data integrity.
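> 
> For the destructive write/read test mentioned in step 4, one option
> (assuming the data has already been backed up, since this overwrites
> the entire device) is badblocks in write mode:
> 
>    # DESTRUCTIVE: writes test patterns over the whole device and reads
>    # them back, reporting any blocks that fail to verify.
>    badblocks -wsv /dev/my_vg/lvol0
> 
> Running it from each node in turn exercises both paths to the SAN.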
> 
> I realize these kinds of tests take a long time to do, but if it's a
> hardware problem, you really need to know.  There's an outside chance
> the problem is somewhere in the GFS core, but I've personally only seen
> this type of corruption once or twice, so I think it's unlikely.  If
> you can recreate this kind of corruption with some kind of test, please
> let us know how.
> 
> Regards,
> 
> Bob Peterson
> Red Hat Cluster Suite
> 



