[Linux-cluster] Directories with >100K files
swhiteho at redhat.com
Wed Jan 21 10:10:07 UTC 2009
On Tue, 2009-01-20 at 22:32 -0500, Jeff Sturm wrote:
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of
> > nick at javacat.f2s.com
> > Sent: Tuesday, January 20, 2009 5:19 AM
> > To: linux-cluster at redhat.com
> > Subject: [Linux-cluster] Directories with >100K files
> > We have a GFS filesystem mounted over iSCSI. When doing an
> > 'ls' on directories with several thousand files it takes
> > around 10 minutes to get a response back -
> You don't say how many nodes you have, or anything about your
> Some general pointers:
> - A plain "ls" is probably much faster than any variant that fetches inode
> metadata, e.g. "ls -l". The latter performs a stat() on each
> individual file which in turn triggers locking activity of some sort.
> This is known to be slow on GFS1. (I've heard reports that GFS2 is/will
> be better.)
The latest gfs1 is also much better. It is a tricky thing to do
efficiently, and not doing the stats is a good plan.
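
To illustrate the difference between listing names and stat-ing every entry, here is a minimal Python sketch. The function names are made up for illustration; only os.listdir() and os.lstat() are standard library calls. The first function reads directory entries only, the second issues one lstat() per file, which is roughly what "ls -l" has to do.

```python
import os

def list_names(path):
    """Return sorted entry names; this code issues no per-file stat()."""
    return sorted(os.listdir(path))

def list_with_sizes(path):
    """Return (name, size) pairs; calls lstat() once per entry, like 'ls -l'."""
    result = []
    for name in sorted(os.listdir(path)):
        info = os.lstat(os.path.join(path, name))  # one metadata lookup per file
        result.append((name, info.st_size))
    return result
```

On a directory with >100K files, an application that only needs the names should prefer something like the first form.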
> - You want a fast, reliable low-latency network for your cluster. Intel
> GigE cards and a fast switch are a good bet.
> - Unless your application needs access times or quota support, mounting
> with "noquota,noatime" is a good idea. Maybe also "nodiratime".
> > Can anyone recommend any GFS tunables to help us out here ?
> You could try bumping demote_secs up from its default of 5 minutes.
> That'll cause locks to be held longer so they may not need to be
> reacquired so often. It won't help with the initial directory listing,
> but should help on subsequent invocations.
> In your case, with "ls" taking 8 minutes to run, some locks initially
> acquired during execution of the command have already been demoted once
Also the question to ask is how many nodes are accessing this
filesystem? If more than one node is accessing the same directory and at
least one of those does a write (i.e. inode create/delete) within the
demote_secs time, then the demote_secs time will not make much
difference since the locks will be pushed out by the other node's access
> > Should we set statfs_fast to 1 ?
> Probably good to set this, regardless.
> > What about glock_purge ?
> Glock_purge helps limit CPU time consumed by gfs_scand when a large
> number of unused glocks are present. See
> . This may make your system run better but I'm not sure it's going to
> help with listing your giant directories.
Better to disable this altogether unless there is a very good reason to
use it. It generally has the effect of pushing things out of cache early
so is to be avoided.
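
For reference, the tunables discussed in this thread are normally set with "gfs_tool settune <mountpoint> <tunable> <value>" on GFS1. The sketch below only builds and prints the command lines, it does not run them. The values are assumptions for illustration: 600 is simply double the 5-minute (300 second) demote_secs default mentioned above, and glock_purge 0 is assumed to mean "disabled". Check both against your own version before using them. The helper names are hypothetical.

```python
import shlex  # shlex.join needs Python 3.8 or later

# Illustrative values only (assumptions, not recommendations from the thread).
SETTINGS = [
    ("demote_secs", 600),  # hold locks longer than the 300 s default
    ("statfs_fast", 1),    # suggested above as worth setting regardless
    ("glock_purge", 0),    # assumed to disable purging
]

def settune_command(mountpoint, tunable, value):
    """Build the argv list for one 'gfs_tool settune' call."""
    return ["gfs_tool", "settune", mountpoint, tunable, str(value)]

def render_commands(mountpoint, settings=SETTINGS):
    """Return the shell-quoted command lines for all settings."""
    return [shlex.join(settune_command(mountpoint, name, value))
            for name, value in settings]

if __name__ == "__main__":
    for line in render_commands("/apps"):
        print(line)
```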
> > Here is the fstab entry for the GFS filesystem:
> > /dev/vggfs/lvol00 /apps gfs
> > _netdev 1 2
> Try "noatime,noquota" here.
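
As a sketch of what that change looks like, the small Python helper below (a hypothetical function, not part of any GFS tooling) appends options to the fourth field of an fstab line without duplicating ones already present. It assumes the usual six whitespace-separated fstab fields.

```python
def add_mount_options(fstab_line, extra_options):
    """Return fstab_line with extra_options merged into the options field."""
    fields = fstab_line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 whitespace-separated fstab fields")
    options = fields[3].split(",")
    for opt in extra_options:
        if opt not in options:  # avoid listing an option twice
            options.append(opt)
    fields[3] = ",".join(options)
    return " ".join(fields)

if __name__ == "__main__":
    line = "/dev/vggfs/lvol00 /apps gfs _netdev 1 2"
    print(add_mount_options(line, ["noatime", "noquota"]))
    # prints: /dev/vggfs/lvol00 /apps gfs _netdev,noatime,noquota 1 2
```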