[Linux-cluster] GFS vs GFS2

Wed May 7 12:09:21 UTC 2008

Hi,

On Wed, 2008-05-07 at 13:44 +0200, Jonas Björklund wrote:
> Hello,
> 
> I would like to know also...
> 
> /Jonas
> 
> On Wed, 7 May 2008, Vimal Gupta wrote:
> 
> > Hi,
> >
> > I have the same question.???
> > Anybody has the answer Please.......???
> >
> > Chris Picton wrote:
> >>  Hi All
> >>
> >>  I am investigating a new cluster installation.
> >>
> >>  Documentation from redhat indicates that GFS2 is not yet production ready.
> >>  Tests I have run show it is *much* faster than gfs for my workload.
> >>
> >>  Is GFS2 not production-ready due to lack of testing, or due to known bugs?
> >>
> >>  Any advice would be appreciated
> >>
> >>  Chris
> >>

The answer is a bit of both. We are getting to the stage where the known
bugs are mostly solved or will be very shortly. You can see the state of
the bug list at any time by going to bugzilla.redhat.com and looking for
any bug with gfs2 in the summary line. There are currently approx 70
such bugs, but please bear in mind that a large number of these are
asking for new features, and some of them are duplicates of the same bug
across different versions of RHEL and/or Fedora.

We are currently at a stage where having a large number of people
helping us in testing would be very helpful. If you have your own
favourite filesystem test, or if you are in a position to run a test
application, then we would be very interested in any reports of
success/failure.

If you do have any problems, then please do:
 o Check bugzilla to see if someone else has had the same problem
 o Report them (preferably via bugzilla, as that ensures that they won't
get lost somewhere)
 o Report them as "Fedora, rawhide" if they relate to the upstream
kernel (either Linus' tree or my -nmw git tree) and indicate in the
comments section which of these kernels you were using
 o Send patches if you have them, but please don't let that stop you
reporting bugs. All reports are useful. We might not be able to always
fix each and every report right away, but sometimes patterns emerge via
a number of reports which do allow us to home in on a particularly
tricky issue.
 o If you experience a hang, then please include (if possible):
    - A glock lock dump from all nodes (via debugfs)
    - A dlm lock dump from all nodes (via debugfs)
    - A stack trace from all nodes (echo t >/proc/sysrq-trigger)
 o If you experience an oops, then please make sure that you include all
the messages (including those which might have been logged just before
the oops itself).
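
For the hang case above, a minimal sketch of a collection script might
look like the following. It is only an illustration under stated
assumptions: that debugfs is mounted at /sys/kernel/debug, and that the
glock and dlm dumps appear as plain files under its gfs2/ and dlm/
directories. Check the paths on your own kernel. The function names are
made up for this example.

```python
"""Sketch: gather GFS2 hang diagnostics on one node (run as root)."""
from pathlib import Path

# Assumed debugfs mount point; adjust if debugfs is mounted elsewhere.
DEBUGFS = Path("/sys/kernel/debug")


def collect(debugfs, outdir):
    """Copy every file under debugfs/gfs2 and debugfs/dlm into outdir.

    Returns the list of files written, gfs2 dumps first.
    """
    outdir.mkdir(parents=True, exist_ok=True)
    saved = []
    for sub in ("gfs2", "dlm"):
        root = debugfs / sub
        if not root.is_dir():
            continue  # subsystem not loaded, or debugfs not mounted
        for src in sorted(p for p in root.rglob("*") if p.is_file()):
            # Flatten the path, e.g. gfs2/<fs>/glocks -> gfs2_<fs>_glocks
            dst = outdir / "_".join(src.relative_to(debugfs).parts)
            dst.write_bytes(src.read_bytes())
            saved.append(dst)
    return saved


def trigger_stack_traces(trigger=Path("/proc/sysrq-trigger")):
    """Same effect as `echo t >/proc/sysrq-trigger`.

    The traces go to the kernel log, not to a file written here.
    """
    trigger.write_text("t\n")
```

On each node you would call collect(DEBUGFS, Path("hang-report")) and
then trigger_stack_traces(), and attach the resulting files plus the
kernel log to the bugzilla entry.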

The more people we have testing & reporting bugs, the quicker we can
approach stability.

There is one issue which I'm currently working on relating to a (fairly
rare, but nonetheless possible) race. This happens when two threads
calling ->readpage() race with each other. The reason that this is
problematic is that it's the one place left where we are using "try
locks" to get around the page lock/glock lock ordering problem. The
VFS's AOP_TRUNCATED_PAGE return code is not guaranteed to result in
->readpage() being called again if another ->readpage() has raced with
it and brought the page uptodate, so "try locks" are the only option
here. For long and complicated reasons, though, when a "try lock" is
queued it might end up triggering a demotion (if a request is pending
from a remote node), which deadlocks due to page lock/glock ordering.

The patch I'm working on at the moment fixes that problem by failing
the glock (GLR_TRYFAILED) if a demote is needed and scheduling the glock
workqueue to deal with the demotion, thus avoiding the race. The try
lock will then be retried at a later date when it can be successful.
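
To make the idea a little more concrete, here is a toy user-space
model of the approach. It is not the kernel patch and does not use any
real GFS2 interfaces: the Glock class, the list standing in for the
glock workqueue, and the function names are all invented for
illustration. The point it tries to show is simply that the try lock
fails instead of demoting inline, the demotion runs later without the
page lock held, and the read is then retried.

```python
import threading


class Glock:
    """Toy stand-in for a glock (hypothetical, not the kernel structure)."""

    def __init__(self):
        self.demote_pending = False  # a remote node has asked for the lock
        self.demotions_done = 0
        self._lock = threading.Lock()

    def try_acquire(self, workqueue):
        """Try lock: never blocks and never demotes inline."""
        if self.demote_pending:
            workqueue.append(self.demote)  # defer the demotion
            return False                   # analogous to GLR_TRYFAILED
        return self._lock.acquire(blocking=False)

    def release(self):
        self._lock.release()

    def demote(self):
        """Runs from the work queue, with no page lock held."""
        with self._lock:
            self.demote_pending = False
            self.demotions_done += 1


def readpage(page_lock, glock, workqueue):
    """Entered with page_lock held. Returns True if the page was read."""
    if not glock.try_acquire(workqueue):
        page_lock.release()  # back off so the demotion cannot deadlock
        return False
    try:
        return True          # the actual page read would happen here
    finally:
        glock.release()
        page_lock.release()


def read_with_retry(page_lock, glock, workqueue):
    """Retry readpage() until it succeeds; returns the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        page_lock.acquire()
        if readpage(page_lock, glock, workqueue):
            return attempts
        while workqueue:     # let the deferred demotions run
            workqueue.pop(0)()
```

With demote_pending set before the first call, the first attempt fails
and queues the demotion, and the second attempt succeeds once the
queued work has run.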

The bugzilla for this is #432057 if you want to follow my progress.

Steve.