[Linux-cluster] GFS1 NFS problems under 2.6.25-rc1 (was: GFS2 loses data under kernel 2.6.24...)

Glen Dosey doseyg at r-networks.net
Mon Feb 11 21:08:26 UTC 2008


Yes, it fixes that problem, thank you.

So I grabbed the latest code from kernel.org, 2.6.25-rc1, and tested it
successfully. The problem now is that I cannot seem to get the GFS1 code
working properly. I grabbed the latest HEAD from CVS and modified sys.c to
work with the new kobject calls. Or at least I think I have; perhaps that
is what is wrong. I have attached the diff.

The gfs module builds successfully against the 2.6.25-rc1 kernel and
loads. Locally on the server everything appears to work, but when the GFS
filesystem is NFS exported from the server, clients can create files that
sometimes fail to store any data. For example:

[root@nccws00 gfs]# echo test > newfile
[root@nccws00 gfs]# cat newfile
test
[root@nccws00 gfs]# dd if=/dev/zero of=test1.dd bs=8k count=2
dd: closing output file `test1.dd': Invalid argument
[root@nccws00 gfs]# cp /tmp/patch-2.6.24.2.bz2 .
cp: closing `./patch-2.6.24.2.bz2': Invalid argument
[root@nccws00 gfs]# ll
total 3152084
-rw-r--r-- 1 root root          5 Feb 11 16:00 newfile
-rw-r--r-- 1 root root          0 Feb 11 16:01 patch-2.6.24.2.bz2
-rw-rw-rw- 1 root root          0 Feb 11 16:01 test1.dd
-rw-r--r-- 1 root root 1073741824 Feb  7 17:16 test2.dd
-rw-r--r-- 1 root root 1073741824 Feb  7 17:19 test3.dd
-rw-r--r-- 1 root root 1073741824 Feb  7 19:38 test4.dd
[root@nccws00 gfs]#
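
To take dd and cp out of the picture, the same check can be scripted. What
follows is only a minimal Python sketch and not something from the original
test run: the function name write_and_verify and the default file name are
hypothetical, and it assumes nothing beyond what the transcript shows, namely
that the error surfaces when the file is flushed or closed and that the file
is left at zero length.

```python
import errno
import os
import sys
import tempfile


def write_and_verify(path, payload):
    """Write payload to path and return the size stat() reports afterwards.

    Errors from write(), fsync() and close() are all allowed to propagate,
    because on NFS a failed write is often only reported at flush/close time.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    finally:
        # An OSError raised here (e.g. EINVAL) is not swallowed.
        os.close(fd)
    return os.stat(path).st_size


if __name__ == "__main__":
    # Hypothetical default target; pass a path on the NFS mount instead.
    if len(sys.argv) > 1:
        target = sys.argv[1]
    else:
        target = os.path.join(tempfile.mkdtemp(), "test1.dd")
    payload = bytes(2 * 8192)  # same amount of data as dd bs=8k count=2
    try:
        size = write_and_verify(target, payload)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, exc.errno)
        print(f"{target}: failed with {name}: {exc.strerror}")
    else:
        status = "ok" if size == len(payload) else "SIZE MISMATCH"
        print(f"{target}: wrote {len(payload)} bytes, stat reports {size} ({status})")
```

Run with a path on the NFS-mounted GFS filesystem as the only argument. A
failure at close or a size mismatch would match the dd and cp behaviour above.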



Am I missing some other significant changes which need patches as well? I
only need GFS1 to work over NFS currently.


Thanks,
Glen Dosey


> Hi,
>
> Does this patch fix it for you?
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=9656b2c14c6ee0806c90a6be41dec71117fc8f50
>
> Or you can just upgrade to the latest upstream Linus kernel. It was a
> result of the write_end function not working in exactly the same way as
> the older commit_write used to.
>
> Steve.
>
> On Fri, 2008-02-08 at 00:01 -0500, Glen Dosey wrote:
>> I experienced this today at work on a RHEL5 system and have verified it
>> today at home on Fedora 8. Perhaps I am doing something foolish ....
>>
>> I have a fully patched RHEL5 x86_64 system which works fine with the Red
>> Hat supplied cluster stuff, except the NFS server performance is abysmal
>> (~640Mb/s NFS). After pulling my hair out trying to fix NFS I decided to
>> just grab the latest kernel, which fixed the problem (~980Mb/s NFS). But
>> it introduced another, much more serious problem, which I've duplicated
>> on my FC8 x86_64 system at home.
>>
>> I already have all the cman/clvmd/openais/gfs[2]-utils packages
>> installed through the package manager. I downloaded kernel 2.6.24 from
>> kernel.org and did a straight `make -j4 rpm` and installed the
>> resulting rpm in both instances. Both systems worked fine with
>> RHEL/Fedora kernels, but here's what happens under 2.6.24:
>>
>> [root@eclipse test]# dd if=/dev/zero of=test3.dd bs=512M count=1
>> 1+0 records in
>> 1+0 records out
>> 536870912 bytes (537 MB) copied, 7.95285 s, 67.5 MB/s
>> [root@eclipse test]# ll
>> total 2101312
>> -rw-r--r-- 1 root root          0 2008-02-07 23:25 test2.dd
>> -rw-r--r-- 1 root root  536870912 2008-02-07 23:42 test3.dd
>> -rw-r--r-- 1 root root 1073741824 2008-02-07 22:54 test.dd
>> [root@eclipse test]# cd ..
>> [root@eclipse mnt]# umount /mnt/test/
>> [root@eclipse mnt]# mount /mnt/test/
>> [root@eclipse mnt]# mount | grep test
>> /dev/mapper/disk00-test on /mnt/test type gfs2
>> (rw,hostdata=jid=0:id=524289:first=1)
>> [root@eclipse mnt]# cd /mnt/test/
>> [root@eclipse test]# ll
>> total 2101312
>> -rw-r--r-- 1 root root          0 2008-02-07 23:25 test2.dd
>> -rw-r--r-- 1 root root          0 2008-02-07 23:42 test3.dd
>> -rw-r--r-- 1 root root 1073741824 2008-02-07 22:54 test.dd
>>
>> Files that have data just go to zero size after an umount and remount. I've
>> tried a variety of file sizes and tried it with files containing data as
>> well (not all zeros). This worked under the RHEL kernels, so is there
>> something I'm doing wrong?
>>
>> Both systems are running cman and are a quorate 2 node cluster (where
>> the second node doesn't exist). At work it's a 1TB shared filesystem but
>> here at home it's just a local disk, so there's nothing else with any
>> access to it.
>>
>> If someone could maybe point out what I'm doing wrong I'd appreciate it,
>> or just let me know this won't work for whatever reason. I haven't even
>> touched on getting the GFS1 modules to build into this.
>>
>> Thanks,
>> Glen
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GFS_kobject.patch
Type: application/octet-stream
Size: 1396 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080211/e482b2ee/attachment.obj>