[Linux-cluster] Re: Why GFS is so slow? What it is waiting for?

Martin Fuerstenau martin.fuerstenau at oce.com
Fri May 9 12:39:40 UTC 2008


Hi,

Unfortunately not. According to my information (which comes mainly from
this list and from the wiki), this structure (the journal) is created on
the filesystem for each node of the cluster. If you read the man page
for gfs_fsck, you will see that the filesystem must be unmounted from
all nodes.

If you have the same problem I had, you should plan a maintenance window
as soon as possible.

As mentioned, my problem started with a slow GFS from the beginning and
led to cluster crashes after seven months. All of my problems were fixed
by the check. Perhaps it is the same with your system.
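
Roughly, the maintenance window boils down to the following (the device
and mount point here are only examples; use your own):

  umount /mnt/gfs                    # on every node of the cluster
  gfs_fsck -y /dev/vgcluster/lvgfs   # on one node only; -y answers yes to all repairs
  mount /mnt/gfs                     # remount on all nodes afterwards (assuming an fstab entry)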

Yours - Martin

On Fri, 2008-05-09 at 04:51 -0700, Ja S wrote:
> Hi Martin:
> 
> Thanks for your reply indeed.
> 
> --- Martin Fuerstenau <martin.fuerstenau at oce.com>
> wrote:
> 
> > Hi,
> > 
> > I had (nearly) the same problem: a slow GFS, from the beginning. Two
> > weeks ago the cluster crashed every time the load became heavier.
> > 
> > What was the reason? A rotten GFS. GFS uses leaf nodes for data and
> > leaf nodes for metadata within the filesystem, and the problem was in
> > the metadata leaf nodes.
> > 
> > Have you checked the filesystem? Unmount it from all nodes and run
> > gfs_fsck on it.
> 
> No, not yet. I am afraid I cannot unmount the filesystem and then run
> gfs_fsck, since server downtime is totally forbidden.
> 
> Is there any other way to reclaim the unused or lost blocks? (I guess
> the leaf nodes you mentioned are the disk blocks; correct me if I am
> wrong.)
> 
> Should "gfs_tool settune /mnt/points inoded_secs 10" work for a
> heavily loaded node with frequent create and delete file operations?
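> 
> (To be concrete, what I have in mind is to read the current value back
> first and then lower it; I believe gfs_tool gettune can list the
> tunables, and the mount point is the same as above:
> 
>   gfs_tool gettune /mnt/points | grep inoded_secs
>   gfs_tool settune /mnt/points inoded_secs 10
> 
> Please correct me if that is not how it is meant to be used.)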
> 
> 
> > In my case it reported (and repaired) tons of unused leaf nodes and
> > some other errors. The first time I started it without -y (for yes).
> > Well, after one hour of typing "y" I killed it and restarted it with
> > -y. The work was done within an hour for 1 TB. Now the filesystem is
> > clean, and it was like a turbocharger and nitrogen injection for a
> > car. Faster than it ever was before.
> 
> Great. Sounds fantastic. However, if the low performance was caused by
> the "rotten" GFS, could your now-clean filesystem get messed up again
> after a certain period? Do you have a smart way to monitor the status
> of your filesystem so that you can plan a regular downtime schedule
> and "force" your manager to approve it, :-) ? If you do, I am eager to
> know.
> 
> Thanks again, and I look forward to your next reply.
> 
> Best,
> 
> Jas
> 
> 
> 
> 
> > Maybe there is a bug in the mkfs command or something like that. I
> > will never again use a GFS without a filesystem check after creation.
> > 
> > Martin Fuerstenau
> > Senior System Engineer
> > Oce Printing Systems, Poing
> > 
> > On Fri, 2008-05-09 at 02:25 -0700, Ja S wrote:
> > > Hi, Klaus:
> > > 
> > > Thank you very much for your kind answer.
> > > 
> > > Tuning the parameters sounds really interesting. I should give it a
> > > try.
> > > 
> > > By the way, how did you come up with these new parameter values? Did
> > > you calculate them based on some measurements, or did you simply pick
> > > them and test?
> > > 
> > > Best,
> > > 
> > > Jas
> > > 
> > > 
> > > --- Klaus Steinberger
> > > <Klaus.Steinberger at physik.uni-muenchen.de> wrote:
> > > 
> > > > Hi,
> > > > 
> > > > > However, it took ages to list the subdirectory on an absolutely
> > > > > idle cluster node. See below:
> > > > >
> > > > > # time ls -la | wc -l
> > > > > 31767
> > > > >
> > > > > real    3m5.249s
> > > > > user    0m0.628s
> > > > > sys     0m5.137s
> > > > >
> > > > > About 3 minutes are spent somewhere. Does anyone have any clue
> > > > > what the system was waiting for?
> > > > 
> > > > Did you tune glocks? I found that it is very important for GFS
> > > > performance.
> > > > 
> > > > I'm currently doing the following tuning:
> > > > 
> > > > gfs_tool settune /export/data/etp quota_account 0
> > > > gfs_tool settune /export/data/etp glock_purge 50
> > > > gfs_tool settune /export/data/etp demote_secs 200
> > > > gfs_tool settune /export/data/etp statfs_fast 1
> > > > 
> > > > Switch off quota, of course, only if you don't need it. All of
> > > > these tunings have to be applied again every time after mounting,
> > > > so do them in an init.d script that runs after the GFS mount, and
> > > > of course do it on every node.
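> > > > 
> > > > A minimal sketch of such a script, using the mount point and the
> > > > values from above (adjust them for your own filesystem):
> > > > 
> > > >   #!/bin/sh
> > > >   # run on every node, after the GFS filesystem has been mounted
> > > >   MP=/export/data/etp
> > > >   gfs_tool settune $MP quota_account 0
> > > >   gfs_tool settune $MP glock_purge 50
> > > >   gfs_tool settune $MP demote_secs 200
> > > >   gfs_tool settune $MP statfs_fast 1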
> > > > 
> > > > Here is the link to the glock paper:
> > > > 
> > > > http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
> > > > 
> > > > The glock tuning (the glock_purge and demote_secs parameters)
> > > > definitely solved a problem we had here with the Tivoli Backup
> > > > Client. Before, it ran for days and sometimes even gave up, and we
> > > > observed heavy lock traffic.
> > > > 
> > > > After changing the glock parameters the backup times went down
> > > > dramatically; we can now run an incremental backup on a 4 TByte
> > > > filesystem in under 4 hours. So give it a try.
> > > > 
> > > > There is some more tuning which unfortunately can only be done
> > > > when the filesystem is created. The default number of resource
> > > > groups is far too large for today's TByte filesystems.
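> > > > 
> > > > If I remember the gfs_mkfs options correctly, the resource group
> > > > size is chosen at creation time with -r (in megabytes); a larger
> > > > value means fewer resource groups on a big filesystem. The cluster
> > > > name, filesystem name, journal count and device below are only
> > > > placeholders:
> > > > 
> > > >   gfs_mkfs -p lock_dlm -t mycluster:mygfs -j 4 -r 2048 /dev/vg0/lvgfs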
> > > > 
> > > > Sincerely,
> > > > Klaus
> > > > 
> > > > 
> > > > -- 
> > > > Klaus Steinberger         Beschleunigerlaboratorium
> > > > Phone: (+49 89)289 14287  Am Coulombwall 6, D-85748 Garching, Germany
> > > > FAX:   (+49 89)289 14280  EMail: Klaus.Steinberger at Physik.Uni-Muenchen.DE
> > > > URL: http://www.physik.uni-muenchen.de/~Klaus.Steinberger/
> > 
> > Martin Fürstenau      Tel.   : (49) 8121-72-4684
> > Oce Printing Systems  Fax    : (49) 8121-72-4996
> > OI-12                 E-Mail : martin.fuerstenau at oce.com
> > Siemensallee 2
> > 85586 Poing
> > Germany