[Linux-cluster] GFS file system corruption?

Matthew B. Brookover mbrookov at mines.edu
Sun Aug 28 20:31:13 UTC 2005


I have six computers running Red Hat Enterprise Linux 3 release 5 with kernel
2.4.21-32.0.1.ELsmp.

I compiled GFS 6.0.2.20-2 from source.  The SAN is an iSCSI-based
storage system from LeftHand Networks.  On ext3, the PostMark disk test
runs cleanly; on a GFS file system, it reports a number of errors.
The output from both PostMark runs is below.

I unmounted the file systems, and ran gfs_fsck on the GFS system.  It
produced a number of errors like these:

Pass4 complete
Starting pass5
ondisk and fsck bitmaps differ at block 17887
Succeeded.
ondisk and fsck bitmaps differ at block 17888
Succeeded.
ondisk and fsck bitmaps differ at block 17889
Succeeded.
ondisk and fsck bitmaps differ at block 17890
Succeeded.
ondisk and fsck bitmaps differ at block 17891
Succeeded.
ondisk and fsck bitmaps differ at block 17892

(The complete gfs_fsck output is attached; I compressed it because the
file is quite large.)
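
Because the pass 5 output runs to many lines, it can help to collapse the
reported block numbers into contiguous ranges before reading it. What
follows is only a minimal sketch, assuming every mismatch line has the
same wording as the excerpt above; the function names are illustrative
and are not part of gfs_fsck.

```python
import re

# Matches the pass 5 message format shown in the gfs_fsck excerpt above.
BITMAP_LINE = re.compile(r"ondisk and fsck bitmaps differ at block (\d+)")

# Sample input copied from the excerpt; real input would be the full log.
SAMPLE = """\
ondisk and fsck bitmaps differ at block 17887
Succeeded.
ondisk and fsck bitmaps differ at block 17888
Succeeded.
ondisk and fsck bitmaps differ at block 17889
Succeeded.
ondisk and fsck bitmaps differ at block 17890
Succeeded.
ondisk and fsck bitmaps differ at block 17891
Succeeded.
ondisk and fsck bitmaps differ at block 17892
"""


def bitmap_blocks(lines):
    """Return the block numbers found on bitmap-mismatch lines."""
    blocks = []
    for line in lines:
        match = BITMAP_LINE.search(line)
        if match:
            blocks.append(int(match.group(1)))
    return blocks


def collapse_ranges(blocks):
    """Collapse block numbers into sorted, inclusive (first, last) runs."""
    ranges = []
    for block in sorted(set(blocks)):
        if ranges and block == ranges[-1][1] + 1:
            # Extend the current run by one block.
            ranges[-1] = (ranges[-1][0], block)
        else:
            # Start a new run.
            ranges.append((block, block))
    return ranges


for first, last in collapse_ranges(bitmap_blocks(SAMPLE.splitlines())):
    print(f"{first}-{last} ({last - first + 1} blocks)")
# prints: 17887-17892 (6 blocks)
```

A few long runs would suggest whole extents marked wrongly in the bitmap,
while many single-block runs would point at scattered blocks.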

I have attached the cluster configuration files if you are interested. 
Fencing is handled by a Perl script that I wrote.  It uses SNMP to turn
off the ports in a Cisco 3750 switch.  There were no log entries from
GULM or GFS on any of the hosts.
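
For readers unfamiliar with this style of fencing, the general idea can
be sketched as follows. This is not the Perl script itself, only an
illustration that assumes the switch port is disabled by setting the
standard IF-MIB ifAdminStatus object to 2 (down) with the net-snmp
snmpset tool. The switch host name, community string, and ifIndex values
are hypothetical placeholders.

```python
# IF-MIB::ifAdminStatus; writing 2 ("down") administratively disables a port.
IF_ADMIN_STATUS_OID = "1.3.6.1.2.1.2.2.1.7"
ADMIN_UP = 1
ADMIN_DOWN = 2

# Hypothetical mapping from fence port names to switch ifIndex values.
PORT_IFINDEX = {
    "imagine": 10101,
    "illuminate": 10102,
}


def fence_command(switch, community, port, state=ADMIN_DOWN):
    """Build the snmpset argument list that sets one port's admin status."""
    if_index = PORT_IFINDEX[port]
    return [
        "snmpset", "-v", "2c", "-c", community, switch,
        f"{IF_ADMIN_STATUS_OID}.{if_index}", "i", str(state),
    ]


# Hypothetical switch name and community string; nothing is executed here.
print(" ".join(fence_command("switch.example.edu", "private", "imagine")))
```

The sketch only builds the command line. A real agent would also need to
run it and confirm that the port actually went down before reporting
success to the lock server.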

PostMark is available from http://www.netapp.com/tech_library/3022.html.

Does anybody have any idea what would cause this?

Other GFS file systems on these servers have had similar problems.  It
seems that after some number of file creates and deletes, gfs_fsck
finds bitmap errors to repair.  PostMark is the only program that I have
used on GFS that reports errors.  The ext3 fsck was clean after PostMark
was run.

Thank you for your assistance.

Matt Brookover
Academic Computing and Networking
Colorado School of Mines
303-273-3436
mbrookov at mines.edu

PostMark output when running on EXT3 file system:

PostMark v1.5 : 3/27/01
pm>set size 10000 100000000
pm>run
Creating files...Done
Performing transactions..........Done
Deleting files...Done
Time:
        1941 seconds total
        1350 seconds of transactions (0 per second)

Files:
        771 created (0 per second)
                Creation alone: 500 files (0 per second)
                Mixed with transactions: 271 files (0 per second)
        247 read (0 per second)
        253 appended (0 per second)
        771 deleted (0 per second)
                Deletion alone: 542 files (27 per second)
                Mixed with transactions: 229 files (0 per second)

Data:
        13100.09 megabytes read (6.75 megabytes per second)
        42926.13 megabytes written (22.12 megabytes per second)
pm>exit


PostMark output when running on GFS file system:

PostMark v1.5 : 3/27/01
pm>set size 10000 100000000
pm>run
Creating files...Done
Performing transactions....Error: cannot open '615' for writing
Error: cannot open '616' for writing
Error: cannot open '617' for writing
Error: cannot open '619' for writing
Error: cannot open '623' for writing
.Error: cannot open '624' for writing
Error: cannot open '625' for writing
Error: cannot open '626' for writing
Error: cannot open '629' for writing
Error: cannot open '633' for writing
Error: cannot open '634' for writing
Error: cannot open '635' for writing
Error: cannot open '636' for writing
Error: cannot open '637' for writing
Error: cannot open '641' for writing
Error: cannot open '642' for writing
Error: cannot open '643' for writing
Error: cannot open '644' for writing
Error: Cannot delete '637'
Error: Cannot delete '615'
Error: Cannot delete '634'
.Error: cannot open '650' for writing
Error: Cannot delete '625'
Error: cannot open '667' for writing
Error: cannot open '668' for writing
Error: cannot open '669' for writing
.Error: cannot open '687' for writing
.Error: cannot open '696' for writing
Error: cannot open '709' for writing
Error: cannot open '712' for writing
Error: cannot open '719' for writing
Error: cannot open '720' for writing
.Error: cannot open '721' for writing
Error: cannot open '642' for reading
Error: cannot open '722' for writing
Error: cannot open '626' for append
Error: cannot open '642' for append
Error: Cannot delete '624'
Error: cannot open '731' for writing
Error: cannot open '735' for writing
Error: cannot open '736' for writing
Error: cannot open '720' for append
Error: cannot open '737' for writing
Error: cannot open '741' for writing
Error: cannot open '742' for writing
Error: cannot open '743' for writing
Error: cannot open '744' for writing
Error: cannot open '746' for writing
Error: cannot open '748' for writing
Error: Cannot delete '743'
.Error: Cannot delete '721'
Error: cannot open '741' for reading
Error: Cannot delete '719'
Error: cannot open '755' for writing
Error: cannot open '756' for writing
Error: cannot open '641' for reading
Error: cannot open '760' for writing
Error: cannot open '743' for reading
Error: Cannot delete '636'
Error: Cannot delete '669'
Error: cannot open '687' for reading
Done
Deleting files...Error: Cannot delete '615'
Error: Cannot delete '617'
Error: Cannot delete '619'
Error: Cannot delete '623'
Error: Cannot delete '624'
Error: Cannot delete '625'
Error: Cannot delete '626'
Error: Cannot delete '629'
Error: Cannot delete '633'
Error: Cannot delete '634'
Error: Cannot delete '635'
Error: Cannot delete '636'
Error: Cannot delete '637'
Error: Cannot delete '641'
Error: Cannot delete '642'
Error: Cannot delete '643'
Error: Cannot delete '644'
Error: Cannot delete '667'
Error: Cannot delete '668'
Error: Cannot delete '669'
Error: Cannot delete '687'
Error: Cannot delete '696'
Error: Cannot delete '709'
Error: Cannot delete '712'
Error: Cannot delete '719'
Error: Cannot delete '720'
Error: Cannot delete '721'
Error: Cannot delete '722'
Error: Cannot delete '731'
Error: Cannot delete '735'
Error: Cannot delete '736'
Error: Cannot delete '737'
Error: Cannot delete '741'
Error: Cannot delete '742'
Error: Cannot delete '743'
Error: Cannot delete '744'
Error: Cannot delete '746'
Error: Cannot delete '748'
Error: Cannot delete '755'
Error: Cannot delete '756'
Error: Cannot delete '760'
Done
Time:
        1773 seconds total
        1086 seconds of transactions (0 per second)

Files:
        768 created (0 per second)
                Creation alone: 500 files (0 per second)
                Mixed with transactions: 268 files (0 per second)
        239 read (0 per second)
        253 appended (0 per second)
        768 deleted (0 per second)
                Deletion alone: 505 files (16 per second)
                Mixed with transactions: 222 files (0 per second)

Data:
        12812.81 megabytes read (7.23 megabytes per second)
        40532.43 megabytes written (22.86 megabytes per second)
pm>exit
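
The error lines in the GFS run above are easier to compare once they are
tallied by operation. The following is a minimal sketch, assuming the
message wording shown in that output; the function name is illustrative
and the sample lines are copied from the run.

```python
import re
from collections import Counter

# Matches "Error: cannot open 'N' for MODE" and "Error: Cannot delete 'N'".
# search() is used because PostMark progress dots can precede the message.
ERROR_LINE = re.compile(
    r"Error: cannot (open|delete) '(\d+)'(?: for (\w+))?", re.IGNORECASE
)

# A few lines copied from the GFS run; real input would be the whole log.
SAMPLE = """\
Performing transactions....Error: cannot open '615' for writing
Error: Cannot delete '637'
.Error: cannot open '624' for writing
Error: cannot open '642' for reading
Error: cannot open '626' for append
Deleting files...Error: Cannot delete '615'
"""


def tally_errors(lines):
    """Count errors per operation and collect the affected file names."""
    by_operation = Counter()
    files = set()
    for line in lines:
        match = ERROR_LINE.search(line)
        if not match:
            continue
        verb = match.group(1).lower()
        name = match.group(2)
        mode = match.group(3)
        operation = f"{verb} for {mode}" if mode else verb
        by_operation[operation] += 1
        files.add(name)
    return by_operation, files


counts, files = tally_errors(SAMPLE.splitlines())
for operation, count in sorted(counts.items()):
    print(f"{operation}: {count}")
print(f"distinct files affected: {len(files)}")
```

Run against the full output, this would show whether the files that
failed to open for writing are the same ones that later fail to delete.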


-------------- next part --------------
A non-text attachment was scrubbed...
Name: gfs_fsck_all.out1.bz2
Type: application/x-bzip
Size: 36928 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050828/743faa0b/attachment.bin>
-------------- next part --------------
cluster
{ 
	name = "CSM_ACN" 
	lock_gulm
	{
		servers = ["imagine.Mines.EDU","illuminate.Mines.EDU","illusion.Mines.EDU"]
		heartbeat_rate = 3.0
		allowed_misses = 5
	} 
}
-------------- next part --------------
fence_devices
{
	CSMACN_fence
	{
		agent = "fence_cisco"
	}
}
-------------- next part --------------
nodes
{
	imagine.Mines.EDU
	{
		ip_interfaces
		{
			eth0 = "138.67.130.1"
		}
		fence
		{
			snmpfence
			{
				CSMACN_fence
				{
					port="imagine"
				}
			}
		}
	}

	illuminate.Mines.EDU
	{
		ip_interfaces
		{
			eth0 = "138.67.130.2"
		}
		fence
		{
			snmpfence
			{
				CSMACN_fence
				{
					port="illuminate"
				}
			}
		}
	}

	illusion.Mines.EDU
	{
		ip_interfaces
		{
			eth0 = "138.67.130.3"
		}
		fence
		{
			snmpfence
			{
				CSMACN_fence
				{
					port="illusion"
				} 
			} 
		} 
	}
	
	inspire.Mines.EDU
	{
		ip_interfaces
		{
			eth0 = "138.67.130.5"
		}
		fence
		{
			snmpfence
			{
				CSMACN_fence
				{
					port="inspire"
				}
			}
		}
	}
	inception.Mines.EDU
	{
		ip_interfaces
		{
			eth0 = "138.67.130.4"
		}
		fence
		{
			snmpfence
			{
				CSMACN_fence
				{
					port="inception"
				}
			}
		}
	}
	incantation.Mines.EDU
	{
		ip_interfaces
		{
			eth0 = "138.67.130.6"
		}
		fence
		{
			snmpfence
			{
				CSMACN_fence
				{
					port="incantation"
				}
			}
		}
	}
}

