[Linux-cluster] Re: Linux-cluster Digest, Vol 20, Issue 14

Jonathan Woytek woytek+ at cmu.edu
Tue Dec 13 21:07:23 UTC 2005


I haven't kept up on this thread, but I can say that I fall into the
third camp you ask about below, and I am NOT a happy camper.  I've
actually been going through our setup, moving filesystems over to ext3
and doing without the advantages that GFS initially provided us.

Our initial setup went in about a year ago: four storage units
connected via iSCSI to a two-node Red Hat Cluster Suite (RHCS) and GFS
cluster acting as front-ends to the data.  We later added two GFS-only
nodes to handle specific applications.  The idea was that the RHCS
cluster would provide user services for data access, such as NFS,
Samba, CVS, Subversion, HTTP, etc.  Most users wanted Samba and NFS
access, with the bulk of that traffic coming from Windows users
connecting via Samba.
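
For what it's worth, there was nothing exotic about the Samba side of
this.  A share on top of a GFS mount is defined the same way as on any
other filesystem.  The sketch below is illustrative only: the share
name and path are made up, and the locking-related parameters are just
the standard smb.conf ones people tend to experiment with on cluster
filesystems, not necessarily the values we ran with.

    [projects]
        ; Hypothetical share on a GFS mount used by both front-ends.
        comment = project data on a GFS volume
        path = /mnt/gfs01/projects
        read only = no
        browseable = yes
        ; Standard locking-related parameters that people tend to tune
        ; on cluster filesystems; these values are illustrative only.
        oplocks = no
        level2 oplocks = no
        kernel oplocks = no
        posix locking = yes
        strict locking = no

None of that is GFS-specific, which is part of the frustration.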

Our results were quite disastrous, and they continue to be bad.  Our
initial roll-out looked good, until we started to get a lot of
concurrent access to files over Samba.  The symptom is that what
appears to be a memory leak is eventually triggered somewhere, and
Samba access slows dramatically.  Eventually the kernel goes into an
OOM-killer loop, killing processes until it kills the RHCS watchdog,
which causes a reboot of the machine.  While this is happening, GFS
performance for all filesystems grinds to a halt on the affected
machine (GFS on the other nodes keeps working, though somewhat slowed).

As a result, we've begun looking at other solutions and are moving as
many filesystems off of GFS as possible.  I've also ended up being a
bit critical of the support that we paid for on this issue: months of
having a call open with Red Hat support yielded nothing but requests
for additional logs, and capturing those logs usually triggered the
OOM loop when the machine was already in a bad state.

Until something that should be as simple as providing remote access to
GFS filesystems works, I maintain (publicly and privately) that GFS is
not ready for prime time, and certainly not worth the money that I
paid for two nodes' worth of GFS support.

While it is possible that the GFS shipped with RHEL 4 works and fixes
some of these issues, I can't upgrade our production machines to major
OS releases every few months on the unconfirmed and probably slim
chance that the upgrade will fix the problems.  I'm sure I'm not the
only one in this kind of situation.

Sorry for the rant-like post, but I am just a tiny bit frustrated here.

jonathan


Alan Wood wrote:
> You can look at my post from Nov 15, 2004 to see the effects I
> experienced running Samba on top of GFS.  Whether or not the problems
> stem purely from locking I don't know (I played extensively with the
> locking options in my smb.conf, to no avail), but the crashes [and
> delays] I saw when multiple users accessed the same file/share made
> the system unusable in production.  Whenever I've pushed on this
> question, people seem to fall into one of two camps:
> 1.  those who have never tried running Samba on top of GFS under high
> load, but think it should work
> 2.  those who acknowledge there might be some underlying problems
> 
> If there is a third camp out there of people who are running Samba
> shares on top of GFS, I'd love to hear about it.  My experience says
> it will start up fine and probably work OK under light load (say, 5
> users) or if users only ever access their own shares, but as soon as
> you have multiple users accessing a common Samba share you start
> experiencing [unacceptable] delays, and if something else is going on
> (say, a webserver serving the same path) you'll probably get a crash.
> -alan
> 
> On Tue, 13 Dec 2005 linux-cluster-request at redhat.com wrote:
> 
>> Date: Tue, 13 Dec 2005 09:26:38 -0600
>> From: Eric Anderson <anderson at centtech.com>
>> Subject: Re: [Linux-cluster] Re: Linux-cluster Digest, Vol 20, Issue 12
>> To: linux clustering <linux-cluster at redhat.com>
>> Message-ID: <439EE82E.2080106 at centtech.com>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Marco Masotti wrote:
>>
>>>> ==========================
>>>> Date: Tue, 13 Dec 2005 01:42:11 -0800 (PST)
>>>> From: Alan Wood <chekov at ucla.edu>
>>>> To: linux-cluster at redhat.com
>>>> Subject: [Linux-cluster] Re: Linux-cluster Digest, Vol 20, Issue 12
>>>> ==========================
>>>>
>>> [...]
>>>
>>>> SMB is stateful and not cluster aware,
>>>
>>> I'm definitely missing something in my assumptions.  By its very
>>> nature, shouldn't GFS be independent of the application, as with
>>> every other filesystem?
>>>
>>> Also, please allow the naive question: how many applications ever
>>> really need to be cluster aware, other than a very small number?
>>> And, however intuitive it may seem, shouldn't a properly written
>>> application be independent of the properties of the underlying
>>> filesystem?  Thanks.
>>>
>>>
>>
>> I agree here - GFS supposedly supports POSIX semantics, so the
>> application should not care about whether it is clustered or not, as
>> long as it is using locking correctly on its own [a sketch of that
>> kind of locking follows the quoted digest below].  At least, with
>> other clustered filesystems this is the case.  If GFS doesn't allow
>> this, I would say it isn't really a cluster-aware filesystem, but
>> more of a distributed lock/cache-coherent filesystem without fully
>> clustered semantics... (please correct me here!  I'm still learning)
>>
>> Eric
>>
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
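
For reference, the locking Eric describes is (as I read it) just
standard POSIX advisory locking via fcntl().  An application written
that way should not have to care whether the file lives on ext3 or on
GFS; the cluster filesystem is supposed to make the lock visible on
every node.  Here is a minimal sketch of the pattern (the path is
hypothetical, and this is the generic textbook form, not code from our
environment):

    /* Minimal sketch of POSIX advisory locking with fcntl().  The same
     * calls are used on a local filesystem and on a cluster filesystem
     * such as GFS, which is expected to honor the lock cluster-wide.
     * The path below is made up for illustration. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/mnt/gfs01/projects/counter.dat",
                      O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        struct flock fl;
        fl.l_type   = F_WRLCK;   /* exclusive (write) lock       */
        fl.l_whence = SEEK_SET;
        fl.l_start  = 0;
        fl.l_len    = 0;         /* 0 means lock the whole file  */

        /* Block until the lock is granted. */
        if (fcntl(fd, F_SETLKW, &fl) < 0) {
            perror("fcntl(F_SETLKW)");
            close(fd);
            return 1;
        }

        /* ... read, modify, and write the shared file here ... */

        fl.l_type = F_UNLCK;     /* release the lock */
        fcntl(fd, F_SETLK, &fl);
        close(fd);
        return 0;
    }

Whether GFS delivers on that promise under the kind of concurrent
Samba load we see is exactly the question.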

-- 
Jonathan Woytek                 w: 412-681-3463         woytek+ at cmu.edu
NREC Computing Manager          c: 412-401-1627         KB3HOZ
PGP Key available upon request



