[Linux-cluster] Why GFS is so slow? What it is waiting for?

Thu May 8 22:29:41 UTC 2008

Hi, Wendy:

Thanks for your prompt and kind explanation. It is
very helpful. Following your comments, I ran
another test. See below:

# stat abc/
  File: `abc/'
  Size: 8192            Blocks: 6024       IO Block: 4096   directory
Device: fc00h/64512d    Inode: 1065226     Links: 2
Access: (0770/drwxrwx---)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2008-05-08 06:18:58.000000000 +0000
Modify: 2008-04-15 03:02:24.000000000 +0000
Change: 2008-04-15 07:11:52.000000000 +0000

# cd abc/
# time ls | wc -l 
31764

real    0m44.797s
user    0m0.189s
sys     0m2.276s

The real time in this test is much shorter than in the
previous one. However, it is still quite long. As
you said, the ‘ls’ command only reads the single
directory file. In my case, the directory file itself
is only 8192 bytes. The time spent on disk IO should
be included in “sys 0m2.276s”. Although DLM needs time
to look up the location of the corresponding master
lock resource and to process the locking, the system
should not take about 42 seconds to complete the “ls”
command. So, what is the hidden issue, and is there a
way to identify possible bottlenecks?
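
One way to narrow this down might be to time the directory read and the
per-file attribute lookups separately, and to compare wall-clock time
with CPU time in each phase. In the run above, real minus (user + sys)
is about 42 seconds, which is time the process was not running on a CPU.
Below is a minimal Python sketch of that idea. The function names
(time_phase, profile_listing) are made up for illustration, and it only
approximates what ‘ls’ and ‘ls -la’ do internally.

```python
import os
import time


def time_phase(func):
    """Run func() and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + sys CPU time of this process
    result = func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


def profile_listing(path):
    """Time the directory read and the per-entry lookups separately."""
    # Phase 1: read only the directory entries, roughly what plain "ls" does.
    names, read_wall, read_cpu = time_phase(lambda: os.listdir(path))

    # Phase 2: fetch attributes for every entry, the extra work of "ls -la".
    def stat_all():
        for name in names:
            os.lstat(os.path.join(path, name))

    _, stat_wall, stat_cpu = time_phase(stat_all)

    # "blocked" is wall time minus CPU time: time spent waiting, not computing.
    return {
        "entries": len(names),
        "readdir_wall": read_wall,
        "readdir_blocked": read_wall - read_cpu,
        "stat_wall": stat_wall,
        "stat_blocked": stat_wall - stat_cpu,
    }


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for key, value in profile_listing(target).items():
        print(key, value)
```

If most of the wall time falls in the directory-read phase while its
CPU time stays small, the wait is in reading the directory itself; if it
falls in the second phase, it is in the per-file lookups.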

Many thanks in advance.

Jas

--- Wendy Cheng <s.wendy.cheng at gmail.com> wrote:

> Ja S wrote:
> > Hi, All:
> >
> > I posted this question before, but have not
> > received any comments yet. Please allow me to post it
> > again.
> >
> > I have a subdirectory containing more than 30,000
> > small files on SAN storage (GFS1+DLM, RAID10). No
> > user application knows of the existence of the
> > subdirectory. In other words, nothing is accessing
> > the subdirectory.
> >   
> Short answer is to remember "ls" and "ls -la" are very different
> commands. "ls" is a directory read (that reads from one single file) but
> "ls -la" needs to get file attributes (file size, modification times,
> ownership, etc) from *each* of the files from the subject directory. In
> your case, it needs to read more than 30,000 inodes to get them. The "ls
> -la" is slower for *any* filesystem but particularly troublesome for a
> cluster filesystem such as GFS due to:
> 
> 1. Cluster locking overheads (it needs read locks from *each* of the
> files involved).
> 2. File layout, which depends on when and how these files were created.
> If there was lock contention at file creation time, GFS has a tendency
> to spread the file locations all over the disk.
> 3. Do you use iSCSI, such that DLM lock traffic and file block access
> are on the same fabric? If so, you will more or less serialize the
> lock access.
> 
> Hope the above short answer eases your confusion.
> 
> -- Wendy
> > However, it took ages to list the subdirectory on an
> > absolutely idle cluster node. See below:
> >
> > # time ls -la | wc -l
> > 31767
> >
> > real    3m5.249s
> > user    0m0.628s
> > sys     0m5.137s
> >
> > About 3 minutes were spent somewhere. Does
> > anyone have any clue what the system was waiting for?
> >
> >
> > Thanks for your time; I hope to see your valuable
> > comments soon.
> >
> > Jas
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >   
