Filesystem-local databases in mlocate

Bernardo Innocenti bernie at develer.com
Sun Mar 18 14:52:14 UTC 2007


Axel Thimm wrote:

>> As far as I can tell, NFSv4 is just catching up.  And as of today I still
>> find many trivial workloads for which NFSv4 still performs poorly. Try
>> "time find /nfs_share >/dev/null" versus the same command on a local
>> filesystem to see what I mean. 
> 
> Well, aren't you just arguing against your original proposal to move
> everything to NFSv4 and rely on the caching done by NFSv4? ;)

:-) Telling the truth should outweigh one's desire to always be
right.

Well, the NFSv4 performance problem I was talking about affects
stat()-ing many files.  Filesystem metadata is not cached as
efficiently as one would expect, and client requests are not
clustered together as much as possible.
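
The stat()-heavy workload above can also be timed without find.  What
follows is only a minimal Python sketch: the function name
time_stat_walk is made up for illustration, and it is meant to be run
once against an NFSv4 mount and once against a local directory so the
two timings can be compared.

```python
import os
import time


def time_stat_walk(root):
    """Walk `root`, lstat() every entry, and return (entries, seconds).

    Roughly the metadata traffic generated by `find root >/dev/null`.
    The function name is hypothetical, chosen for this example only.
    """
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            count += 1
    return count, time.perf_counter() - start
```

For example, time_stat_walk("/nfs_share") returns the number of
entries visited and the elapsed wall-clock time in seconds.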

*But* the contents of a single largish file such as mlocate.db should
be cached on the clients.  At least, that's what I'm experiencing
with NFSv4 in my LAN.

To measure NFSv4 read() performance, I create a file on the
server, read it once on the server to make sure it's in the buffer
cache (otherwise I'd be measuring the performance of the RAID array
too), then go to the client and cat the file to /dev/null.
To repeat the test, I must remove the test file from the server
and create a new one from scratch; otherwise the client would already
have it all cached and would get much better results.
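
The client-side step of this procedure (cat the file to /dev/null and
time it) could be sketched in Python as below.  This is an
illustration under assumptions, not the exact method used: time_read
and the 1 MiB block size are invented for the example, and the
server-side steps (creating a fresh file and warming the buffer cache)
still have to be done by hand before each run.

```python
import time


def time_read(path, block_size=1 << 20):
    """Read `path` sequentially and return (bytes_read, seconds).

    Opened unbuffered so that every read() goes to the kernel, as it
    would with `cat file >/dev/null`.  The name and the default block
    size are assumptions made for this sketch.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break  # end of file
            total += len(chunk)
    return total, time.perf_counter() - start
```

Dividing the byte count by the elapsed time gives the throughput; a
second run on the same file mostly measures the client's cache, which
is why the test file must be recreated on the server each time.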


>> Why would GFS be any slower than a non-clustered filesystem when it
>> comes to raw data read performance?  The DLM overhead would supposedly
>> not get in the way of every single block being read.
> 
> You should try and time GFS. When it drops the domain locks, no
> caching survives.

Are you talking about GFS2?  I've never had a chance to try it.

It's been a while since I used GFS, and it was GFS1 on RHAS3 or
maybe 4.  At that time, GFS performance was poor compared to ext3
even when the storage was locally attached to a single server.  But
what it did was so useful for an HA cluster that you would forgive it
for not being fast as well.


> You are trying to solve an easy-to-solve caching problem by requiring
> 
> o usage of NFSv4
> o high bandwidth of drives
> o gigabit ethernet
> o and more

Ouch... But these conditions were just ORed together!

The reason I try to steer away from the caching solution is that
most caches are more fragile and complex than their designers
initially thought.  Most of them break on users who are not even
aware of them (because caches are designed to be transparent).


> while the original poster mentioned he needs this for his wireless
> connection of his laptop ...

Then he's only got NFSv4 left...

-- 
   // Bernardo Innocenti - Develer R&D dept.
 \X/  http://www.develer.com/




More information about the fedora-devel-list mailing list