[Linux-cluster] GFS file system corruption?

Matthew B. Brookover mbrookov at mines.edu
Mon Aug 29 14:14:59 UTC 2005


I have 6 computers running Red Hat Enterprise Linux 3 release 5, with kernel
2.4.21-32.0.1.ELsmp.

I compiled GFS 6.0.2.20-2 from the source code.  The SAN is an iSCSI
based storage system from LeftHand Networks.  On ext3, the PostMark
disk test runs cleanly; on a GFS file system, it reports a number of
errors.  The output from both PostMark runs is below.

I unmounted the file systems, and ran gfs_fsck on the GFS system.  It
produced a number of errors like these: 

[root at imagine root]# gfs_fsck -y /dev/pool/u_as
Initializing fsck
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
ondisk and fsck bitmaps differ at block 17887
Succeeded.
ondisk and fsck bitmaps differ at block 17888
Succeeded.
ondisk and fsck bitmaps differ at block 17889
Succeeded.
ondisk and fsck bitmaps differ at block 17890
Succeeded.
ondisk and fsck bitmaps differ at block 17891
Succeeded.
ondisk and fsck bitmaps differ at block 17892
Succeeded.
ondisk and fsck bitmaps differ at block 17893
Succeeded.
ondisk and fsck bitmaps differ at block 17894
Succeeded.
ondisk and fsck bitmaps differ at block 17895
Succeeded.
ondisk and fsck bitmaps differ at block 17896
Succeeded.
ondisk and fsck bitmaps differ at block 17897
Succeeded.
ondisk and fsck bitmaps differ at block 17898
Succeeded.
ondisk and fsck bitmaps differ at block 17899
Succeeded.
ondisk and fsck bitmaps differ at block 17900
Succeeded.
ondisk and fsck bitmaps differ at block 17901
Succeeded.

The complete output from gfs_fsck was 935k.
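
With that much output, it may help to collapse the reported block numbers into contiguous ranges before reading through them. The following is a minimal sketch in Python, assuming the pass5 lines always have the exact form shown above; the function name and the sample input are illustrative only.

```python
import re

# Matches the gfs_fsck pass5 lines shown above, e.g.
# "ondisk and fsck bitmaps differ at block 17887"
BITMAP_LINE = re.compile(r"ondisk and fsck bitmaps differ at block (\d+)")


def bitmap_block_ranges(lines):
    """Collapse reported block numbers into inclusive (first, last) ranges."""
    blocks = set()
    for line in lines:
        match = BITMAP_LINE.search(line)
        if match:
            blocks.add(int(match.group(1)))

    ranges = []
    for block in sorted(blocks):
        if ranges and block == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], block)  # extend the current run
        else:
            ranges.append((block, block))        # start a new run
    return ranges


# Sample input copied from the output above; "Succeeded." lines are ignored.
sample = [
    "ondisk and fsck bitmaps differ at block 17887",
    "Succeeded.",
    "ondisk and fsck bitmaps differ at block 17888",
    "Succeeded.",
    "ondisk and fsck bitmaps differ at block 17889",
    "Succeeded.",
]
for first, last in bitmap_block_ranges(sample):
    print(f"{first}-{last} ({last - first + 1} blocks)")  # 17887-17889 (3 blocks)
```

Feeding a saved copy of the full log through the same function would show whether the differences cluster in a few regions or are scattered across the file system.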

I have included the cluster configuration files below.  Fencing is
handled by a perl script that I wrote.  It uses SNMP to turn off the
ports in a Cisco 3750 switch.  There were no log entries from GULM or
GFS on any of the hosts.
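
For readers unfamiliar with this style of fencing, here is a rough sketch of the idea. It is written in Python rather than Perl and is not the script described above. It assumes the net-snmp `snmpset` command is installed and that a port is disabled by setting the standard IF-MIB `ifAdminStatus` object to down(2). The switch name, community string, and the mapping from port names to ifIndex values are all hypothetical.

```python
import subprocess

# IF-MIB ifAdminStatus column; the value 2 means administratively down.
IF_ADMIN_STATUS_OID = "1.3.6.1.2.1.2.2.1.7"
ADMIN_DOWN = 2

# Hypothetical mapping from the port names used in the fence configuration
# to switch ifIndex values; real values depend on the switch.
PORT_IFINDEX = {"imagine": 10101, "illuminate": 10102}


def fence_command(switch, community, port):
    """Build the snmpset command that would shut down one node's switch port."""
    oid = f"{IF_ADMIN_STATUS_OID}.{PORT_IFINDEX[port]}"
    return ["snmpset", "-v", "2c", "-c", community, switch, oid,
            "i", str(ADMIN_DOWN)]


def fence(switch, community, port):
    """Run snmpset and return its exit status (0 means the set succeeded)."""
    return subprocess.run(fence_command(switch, community, port)).returncode
```

Building the command separately from running it makes it possible to check what would be sent to the switch without touching a live port.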

PostMark is available from http://www.netapp.com/tech_library/3022.html.

Does anybody have any idea what would cause this?

Other GFS file systems on these servers have had similar problems.  It
seems that after some number of file creates and deletes, gfs_fsck
finds and repairs bitmap errors.  PostMark is the only program that I
have used on GFS that reports errors.  The ext3 fsck was clean after
PostMark was run.

Thank you for your assistance.

Matt Brookover
Academic Computing and Networking
Colorado School of Mines
303-273-3436
mbrookov at mines.edu

PostMark output when running on the ext3 file system:

PostMark v1.5 : 3/27/01
pm>set size 10000 100000000
pm>run
Creating files...Done
Performing transactions..........Done
Deleting files...Done
Time:
        1941 seconds total
        1350 seconds of transactions (0 per second)

Files:
        771 created (0 per second)
                Creation alone: 500 files (0 per second)
                Mixed with transactions: 271 files (0 per second)
        247 read (0 per second)
        253 appended (0 per second)
        771 deleted (0 per second)
                Deletion alone: 542 files (27 per second)
                Mixed with transactions: 229 files (0 per second)

Data:
        13100.09 megabytes read (6.75 megabytes per second)
        42926.13 megabytes written (22.12 megabytes per second)
pm>exit


PostMark output when running on the GFS file system:

PostMark v1.5 : 3/27/01
pm>set size 10000 100000000
pm>run
Creating files...Done
Performing transactions....Error: cannot open '615' for writing
Error: cannot open '616' for writing
Error: cannot open '617' for writing
Error: cannot open '619' for writing
Error: cannot open '623' for writing
.Error: cannot open '624' for writing
Error: cannot open '625' for writing
Error: cannot open '626' for writing
Error: cannot open '629' for writing
Error: cannot open '633' for writing
Error: cannot open '634' for writing
Error: cannot open '635' for writing
Error: cannot open '636' for writing
Error: cannot open '637' for writing
Error: cannot open '641' for writing
Error: cannot open '642' for writing
Error: cannot open '643' for writing
Error: cannot open '644' for writing
Error: Cannot delete '637'
Error: Cannot delete '615'
Error: Cannot delete '634'
.Error: cannot open '650' for writing
Error: Cannot delete '625'
Error: cannot open '667' for writing
Error: cannot open '668' for writing
Error: cannot open '669' for writing
.Error: cannot open '687' for writing
.Error: cannot open '696' for writing
Error: cannot open '709' for writing
Error: cannot open '712' for writing
Error: cannot open '719' for writing
Error: cannot open '720' for writing
.Error: cannot open '721' for writing
Error: cannot open '642' for reading
Error: cannot open '722' for writing
Error: cannot open '626' for append
Error: cannot open '642' for append
Error: Cannot delete '624'
Error: cannot open '731' for writing
Error: cannot open '735' for writing
Error: cannot open '736' for writing
Error: cannot open '720' for append
Error: cannot open '737' for writing
Error: cannot open '741' for writing
Error: cannot open '742' for writing
Error: cannot open '743' for writing
Error: cannot open '744' for writing
Error: cannot open '746' for writing
Error: cannot open '748' for writing
Error: Cannot delete '743'
.Error: Cannot delete '721'
Error: cannot open '741' for reading
Error: Cannot delete '719'
Error: cannot open '755' for writing
Error: cannot open '756' for writing
Error: cannot open '641' for reading
Error: cannot open '760' for writing
Error: cannot open '743' for reading
Error: Cannot delete '636'
Error: Cannot delete '669'
Error: cannot open '687' for reading
Done
Deleting files...Error: Cannot delete '615'
Error: Cannot delete '617'
Error: Cannot delete '619'
Error: Cannot delete '623'
Error: Cannot delete '624'
Error: Cannot delete '625'
Error: Cannot delete '626'
Error: Cannot delete '629'
Error: Cannot delete '633'
Error: Cannot delete '634'
Error: Cannot delete '635'
Error: Cannot delete '636'
Error: Cannot delete '637'
Error: Cannot delete '641'
Error: Cannot delete '642'
Error: Cannot delete '643'
Error: Cannot delete '644'
Error: Cannot delete '667'
Error: Cannot delete '668'
Error: Cannot delete '669'
Error: Cannot delete '687'
Error: Cannot delete '696'
Error: Cannot delete '709'
Error: Cannot delete '712'
Error: Cannot delete '719'
Error: Cannot delete '720'
Error: Cannot delete '721'
Error: Cannot delete '722'
Error: Cannot delete '731'
Error: Cannot delete '735'
Error: Cannot delete '736'
Error: Cannot delete '737'
Error: Cannot delete '741'
Error: Cannot delete '742'
Error: Cannot delete '743'
Error: Cannot delete '744'
Error: Cannot delete '746'
Error: Cannot delete '748'
Error: Cannot delete '755'
Error: Cannot delete '756'
Error: Cannot delete '760'
Done
Time:
        1773 seconds total
        1086 seconds of transactions (0 per second)

Files:
        768 created (0 per second)
                Creation alone: 500 files (0 per second)
                Mixed with transactions: 268 files (0 per second)
        239 read (0 per second)
        253 appended (0 per second)
        768 deleted (0 per second)
                Deletion alone: 505 files (16 per second)
                Mixed with transactions: 222 files (0 per second)

Data:
        12812.81 megabytes read (7.23 megabytes per second)
        40532.43 megabytes written (22.86 megabytes per second)
pm>exit
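
The errors in the GFS run can be tallied by operation to see which file names fail and how. This is a minimal sketch in Python, assuming the two message formats seen above ("cannot open 'N' for ..." and "Cannot delete 'N'") are the only ones PostMark prints; the function name is illustrative.

```python
import re
from collections import Counter

# Progress dots or "Performing transactions...." may precede "Error:" on a
# line, so the pattern is searched for anywhere in the line.
ERROR_LINE = re.compile(
    r"Error: (?:cannot open '(?P<open>\d+)' for (?P<mode>\w+)"
    r"|Cannot delete '(?P<delete>\d+)')"
)


def tally_errors(lines):
    """Return (error counts per operation, set of file names that failed)."""
    counts = Counter()
    files = set()
    for line in lines:
        match = ERROR_LINE.search(line)
        if not match:
            continue
        if match.group("open"):
            counts["open for " + match.group("mode")] += 1
            files.add(match.group("open"))
        else:
            counts["delete"] += 1
            files.add(match.group("delete"))
    return counts, files


# A few lines copied from the GFS run above.
sample = [
    "Performing transactions....Error: cannot open '615' for writing",
    ".Error: cannot open '624' for writing",
    "Error: cannot open '642' for append",
    "Error: Cannot delete '615'",
    "Done",
]
counts, files = tally_errors(sample)
```

Comparing the set of failing names against the names that later fail to delete would show whether every delete error traces back to a file that could not be created in the first place.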


cluster.ccs file:

cluster
{
        name = "CSM_ACN"
        lock_gulm
        {
                servers = ["imagine.Mines.EDU","illuminate.Mines.EDU","illusion.Mines.EDU"]
                heartbeat_rate = 3.0
                allowed_misses = 5
        }
}


fence.ccs file:

fence_devices
{
        CSMACN_fence
        {
                agent = "fence_cisco"
        }
}


nodes.ccs file:

nodes
{
        imagine.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.1"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="imagine"
                                }
                        }
                }
        }

        illuminate.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.2"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="illuminate"
                                }
                        }
                }
        }

        illusion.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.3"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="illusion"
                                }
                        }
                }
        }

        inspire.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.5"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="inspire"
                                }
                        }
                }
        }
        inception.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.4"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="inception"
                                }
                        }
                }
        }
        incantation.Mines.EDU
        {
                ip_interfaces
                {
                        eth0 = "138.67.130.6"
                }
                fence
                {
                        snmpfence
                        {
                                CSMACN_fence
                                {
                                        port="incantation"
                                }
                        }
                }
        }
}

