[Linux-cluster] rm -r on gfs2 filesystem is very slow

Fri Jul 10 16:47:19 UTC 2009

Hi,

On Fri, 2009-07-10 at 09:07 -0700, Peter Schobel wrote:
> So, in your opinion, there are no known issues which would cause this
> particular problem and this poor performance while deleting files
> should be considered normal?
> 
> Thanks,
> 
> Peter
> ~
> 
It depends on just how bad it is. Deleting on GFS2 will be slower than on
local disk, but the question is how much slower. The challenge, of course,
is to minimise the amount by which it slows down,

Steve.
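
To put a number on "how much worse", one rough approach is to time the same
create-then-remove cycle on a local directory and on the GFS2 mount. The
sketch below is only an illustration and not something posted in this
thread: the function names, the tree shape, and the file sizes are
assumptions, and it uses Python's shutil.rmtree rather than rm -r, so
absolute timings will not match the shell commands exactly.

```python
import os
import shutil
import tempfile
import time


def make_tree(root, dirs=10, files_per_dir=100, size=1024):
    """Create `dirs` subdirectories under root, each holding small files.

    The shape loosely imitates a build tree full of object files; the
    defaults are arbitrary illustrative values.
    """
    payload = b"x" * size
    for d in range(dirs):
        sub = os.path.join(root, f"dir{d:03d}")
        os.makedirs(sub)
        for f in range(files_per_dir):
            with open(os.path.join(sub, f"file{f:04d}.o"), "wb") as fh:
                fh.write(payload)


def time_create_and_remove(parent, **tree_args):
    """Return (create_seconds, remove_seconds) for a scratch tree under parent."""
    # A fresh scratch directory keeps the test away from real data.
    root = tempfile.mkdtemp(prefix="rmtest-", dir=parent)
    t0 = time.perf_counter()
    make_tree(root, **tree_args)
    t1 = time.perf_counter()
    shutil.rmtree(root)  # recursive delete, comparable in spirit to rm -r
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

Calling time_create_and_remove() once with a local directory and once with
a directory on the GFS2 mount (for example /gfs, as in the commands quoted
below) gives two pairs of timings to compare. The create time does not
include flushing to disk, so treat it as a rough figure only.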

> On Fri, Jul 10, 2009 at 8:56 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
> > Hi,
> >
> > On Fri, 2009-07-10 at 08:49 -0700, Peter Schobel wrote:
> >> The initial writing is done via the network by checking out source
> >> trees from a Perforce repository. Beyond that, source trees are
> >> compiled causing the creation of many object files.
> >>
> >> Multiple source trees will be compiled from the same node or from
> >> multiple nodes.
> >>
> >> This performance problem exhibits itself even when using a single
> >> node. Writing to the filesystem seems to work fine. The time to do a
> >> cp -r dir /gfs/dir is comparable to writing to local disk;
> >> however, rm -r /gfs/dir takes considerably longer than it does on
> >> local disk. I am guessing this is a consequence of dlm checking for a
> >> lock on each individual file, but I'm not sure.
> >>
> >> Peter
> >
> > Partly that is the case. There are some things which can be done to
> > improve performance in the deallocation area, and so that is likely to
> > improve in future. The main issue is to ensure that we continue to
> > maintain the correct locking order in that code. It can be complex since
> > it involves the inode lock, transaction lock, and (maybe) multiple
> > resource group locks,
> >
> > Steve.
> >
> >> ~
> >>
> >> On Fri, Jul 10, 2009 at 8:27 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
> >> > Hi,
> >> >
> >> > On Fri, 2009-07-10 at 07:42 -0700, Peter Schobel wrote:
> >> >> When we did our initial proof of concept, we did not notice any
> >> >> performance problem of this magnitude. We were using OS release 2. Our
> >> >> QA engineers passed approval on the performance stats of the gfs2
> >> >> filesystem and now that we are in deployment phase they are calling it
> >> >> unusable.
> >> >>
> >> >> Have there been any recent software changes that could have caused
> >> >> degraded performance or something I may have missed in configuration?
> >> >> Are there any tunable parameters in gfs2 that may increase our
> >> >> performance?
> >> >>
> >> > Not that I'm aware of. There are no tunable parameters which might
> >> > affect this particular aspect of performance, but to be clear exactly
> >> > what the issue is, let me ask a few questions...
> >> >
> >> >> Our application is very write intensive. Basically we are compiling a
> >> >> source tree and running a make clean between builds.
> >> >>
> >> >> Thanks in advance,
> >> >>
> >> >> Peter
> >> >> ~
> >> >>
> >> > What is the nature of the writes? Are the different nodes writing into
> >> > different directories in the main?
> >> >
> >> > GFS2 is pretty good at large directories, given certain conditions.
> >> > Lookups should be pretty fast. Once a node is writing into a particular
> >> > directory, then ideally one would take care not to read or write that
> >> > directory from other nodes until the writer has finished.
> >> >
> >> > Listing a large directory can be slow, and counts as reading the
> >> > directory from a caching point of view. Lookups of individual files
> >> > should be fast though,
> >> >
> >> > Steve.
> >> >
> >> >
> >> >> On Wed, Jul 08, 2009 at 01:58:30PM -0700, Peter Schobel wrote:
> >> >>
> >> >> >> I am trying to set up a four-node cluster but am getting very poor
> >> >> >> performance when removing large directories. A directory approximately
> >> >> >> 1.6G in size takes around 5 minutes to remove from the gfs2 filesystem
> >> >> >> but is removed in around 10 seconds from the local disk.
> >> >> >>
> >> >> >> I am using CentOS 5.3 with kernel 2.6.18-128.1.16.el5PAE.
> >> >> >>
> >> >> >> The filesystem was formatted in the following manner:
> >> >> >>
> >> >> >>   mkfs.gfs2 -t wtl_build:dev_home00 -p lock_dlm -j 10 /dev/mapper/VolGroupGFS-LogVolDevHome00
> >> >> >>
> >> >> >> and is being mounted with the following options:
> >> >> >> _netdev,noatime,defaults.
> >> >> >
> >> >> > This is something you have to live with.  GFS(2) works great, but with
> >> >> > large(r) directories performance is extremely bad and for many
> >> >> > applications a real show-stopper.
> >> >> >
> >> >> > There have been many discussions on this list, with GFS parameter tuning
> >> >> > suggestions that at least for me didn't result in any improvements, with
> >> >> > promises that the problems would be solved in GFS2 (I see no significant
> >> >> > performance improvements between GFS and GFS2), etc.
> >> >>
> >> >> > --
> >> >> > --    Jos Vos <jos at xos.nl>
> >> >> > --    X/OS Experts in Open Systems BV   |   Phone: +31 20 6938364
> >> >> > --    Amsterdam, The Netherlands        |     Fax: +31 20 6948204
> >> >>
> >> >
> >> > --
> >> > Linux-cluster mailing list
> >> > Linux-cluster at redhat.com
> >> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >> >
> >>
> >