[Linux-cluster] An odd problem may be related to GFS+DLM

Ja S jas199931 at yahoo.com
Mon May 5 00:28:29 UTC 2008

Hi, All:

We have noticed a problem and suspect that it
might be related to GFS and DLM, so I am
sending this email to the group. If you think my
problem is off-topic here, please forgive me.


We have a SAN environment in which 5 nodes running RHEL
v4u4 and Red Hat Cluster Suite are connected to an EMC
AX150SCi iSCSI RAID storage array (GFS+DLM, RAID10).

We have a subdirectory on the storage, and we are sure
that no application on these five nodes is aware of
the subdirectory. In other words, the subdirectory
itself should be free of locks, although its parent
directories may have locks. The subdirectory holds
more than 31700 small files, and the total size of
these files is about 4.3G. About 1/3 of the files
are symbolic links pointing to other files in the
same subdirectory.

The subdirectory stat is:
  File: `abc'
  Size: 8192            Blocks: 6024       IO Block: 4096   directory
Device: fc00h/64512d    Inode: 1065226     Links: 2
Access: (0770/drwxrwx---)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2008-05-04 22:53:39.000000000 +0000
Modify: 2008-04-15 03:02:24.000000000 +0000
Change: 2008-04-15 07:11:52.000000000 +0000

Now, when I tried to ls the subdirectory from an idle
node, it took ages to output the information. I then
timed the ls command, and the results were shocking. 

# time ls -la > /dev/null

real    3m5.249s
user    0m0.628s
sys     0m5.137s

As I said, the node I used to access the subdirectory
was completely idle, so what could cause the
long delay?
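
For scale: 3m5s is roughly 185 seconds for about 31700
entries, i.e. on the order of 6 ms per entry, and user+sys
is under 6 seconds, so nearly all of it is waiting rather
than CPU. One way I could narrow this down is to time the
directory read separately from the per-entry lstat calls
that ls -la has to make. The following is only a minimal
sketch, assuming a Python 3 interpreter is available on the
node; the function name time_listing is made up for this
illustration.

```python
import os
import sys
import time


def time_listing(path):
    """Time the directory read and the per-entry lstat calls separately."""
    start = time.perf_counter()
    names = os.listdir(path)  # reads directory entries only, no per-file metadata
    readdir_secs = time.perf_counter() - start

    start = time.perf_counter()
    for name in names:
        # ls -l needs per-entry metadata; lstat does not follow symlinks
        os.lstat(os.path.join(path, name))
    lstat_secs = time.perf_counter() - start

    return len(names), readdir_secs, lstat_secs


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, readdir_secs, lstat_secs = time_listing(target)
    print("%d entries: readdir %.3fs, lstat %.3fs"
          % (count, readdir_secs, lstat_secs))
```

If most of the time shows up in the lstat loop, the cost is
in fetching per-file metadata rather than in reading the
directory itself. A second run right after the first would
also show how much of the delay goes away once things are
cached on the node.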

We asked EMC to check the hardware (including the
controller and hard drives), and they reported that
there was no problem at all.

Therefore, I would like to seek your kind answers to
the following questions:

Is the problem related to GFS and DLM? I heard GFS is
not suitable for many small files. Is that true? Is
the delay caused by locks applied to its parent
directories? Where should I look to figure out
what is happening and what the underlying reason is?

Thanks for your time. I look forward to your reply.


