[Linux-cachefs] Fedora kernel testing

Daire Byrne Daire.Byrne at framestore.com
Wed Feb 25 11:11:08 UTC 2009


David,

----- "David Howells" <dhowells at redhat.com> wrote:

> Daire Byrne <Daire.Byrne at framestore.com> wrote:
> > Does it make any sense to have a concept of "lazy" netfs lookups whereby if
> > I know that the netfs filesystem won't be changed I can go straight to the
> > cache and not have to go to the network at all? Maybe not quite disconnected
> > in the sense that perhaps you can relookup the network every 10 mins or so.
> > This would be great for high latency networks and small file access.
> 
> That would mostly be a case of juggling the NFS access parameters, I
> would think, except for one thing: as you mention below, the contents of
> directories and the results of lookups are not cached.
> 
> > Also would it be possible to cache the contents of a dir in a similar
> > fashion? So for example if I mount my home dir over a slow link and have 
> > some subdirs in my PATH, applications will only need to go to the network
> > the first time to check the existence of a file in a dir (e.g. PATH etc.).
> > With something like Lustre perhaps this all becomes far more efficient due 
> > to the DLM - you only go to the network when you are notified of a PAGE
> > change? Or would you need to keep locks on all cached files for that?
> 
> Caching the contents of directories for NFS is something I want to look at.
> There are two parts to this: (1) storing a list of dirents that belong
> to a directory and (2) storing a list of name to FH mappings (the result of
> doing lookups).

This would be really cool for the VPN/homedir use case. I have found that trying
to run your home directory over a VPN results in very slow application load times,
simply because when searching their paths apps (e.g. Maya, Python etc.) will make
many, many open() calls, most of which simply return ENOENT (No such file or
directory). If we could cache the entries of the directory, subsequent file
lookups would not need to go to the network.

There are even more access() calls to non-existent files than open() calls, which 
really slows things down a lot. Does it make sense to cache the perms of the files 
in a dir too?
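
Just to illustrate what I mean, here is roughly what every one of those searches
boils down to (the directories and binary name below are invented for the example;
the real apps do the same thing through their own plugin/module loaders). On an
uncached NFS mount every miss is a LOOKUP over the wire that comes back ENOENT:

/* Rough sketch of a PATH-style search: one probe per directory, and every
 * miss is an ENOENT that currently has to go over the network.  Paths and
 * the binary name are made up for the example. */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *path_dirs[] = {
        "/usr/local/bin", "/usr/bin", "/bin",
        "/mnt/nfs/home/daire/bin",          /* the slow, VPN-mounted one */
    };
    const char *binary = "maya";
    char candidate[4096];

    for (size_t i = 0; i < sizeof(path_dirs) / sizeof(path_dirs[0]); i++) {
        snprintf(candidate, sizeof(candidate), "%s/%s", path_dirs[i], binary);

        /* Each access() on a non-existent file is a lookup that returns
         * ENOENT; with cached directory contents it could be answered
         * locally instead of going to the server. */
        if (access(candidate, X_OK) == 0) {
            printf("found %s\n", candidate);
            return 0;
        }
        if (errno == ENOENT)
            printf("miss (ENOENT): %s\n", candidate);
    }
    return 1;
}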

> With AFS, a directory is read as if it were a file, and then there are
> rules for parsing the blob.  This means that lookup is done locally.  This
> doesn't always appear to be possible with NFS, though, and depends on whether
> READDIRPLUS is available.  This means that scattered data need to be
> stored in the cache, but I'm unsure of the best way to do that.
> 
> Whatever, it's something that I have on the plan to implement at some
> point.
 
That's good to know. I thought READDIRPLUS was a standard NFSv3 feature? It's 
funny how after all this time AFS still has some pretty cool and unique features.

> > Another application for NFS caching I was thinking about was for storing VM
> > images. It is necessary to keep VM images on shared storage for easy
> > migration but you want to minimise the NFS traffic and utilise local storage
> > if possible. Would caching the read parts of the image on disk help? 
> > Like before if I know that the VM instance is based on a read-only QCOW2 image
> > would it help to force the NFS cache to never revalidate what is on the net
> > and go straight to the cache? Obviously writes always have to go to the net.
> 
> So, caching of writable files?  That can be made possible for NFS with
> relative ease, but you cannot tell whether a write overlapped with another
> write on the server from another client, except on NFS4 and then only if the
> change attribute is properly implemented on the server.
> 
> On the other hand, the pagecache for writable files operates under
> this same issue, so it may not be a problem really.

Sorry, I wasn't very clear. I'm interested in testing the NFS cache when we have a
single common QCOW2 disk image accessed by many (100+) clients, but with all writes
going to separate per-client QCOW2 overlay images (which QCOW2 supports). I suppose
this is the VM equivalent of caching a "diskless" network-boot Linux distro, except
the image is a single file instead of many files. Again, knowing that the read-only
master image never changes, it would be good if reads went straight to the cache
without doing any network lookups first. We *could* just copy the entire image to
each machine every time, but getting cachefilesd to manage it automatically is
obviously more elegant. The performance may be worse, though, if the pages get
written into the corresponding local cache file randomly and out of order.
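
To make the read/write split clearer, here is a toy sketch of the backing-file
idea (this is not QCOW2's actual format, and the paths, block size and file names
are just made up): reads fall through to the shared read-only master unless the
local overlay already owns that block, and writes only ever land in the overlay.
In the real setup the master would sit on the NFS mount (and so be cacheable by
cachefilesd) while qemu handles the overlay logic for us:

/* Toy copy-on-write overlay, purely to illustrate the idea.  Reads prefer
 * the per-VM overlay and fall back to the shared read-only base; writes
 * only ever touch the overlay. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096
#define NBLOCKS    1024              /* toy image: 4 MiB */

static int base_fd, overlay_fd;
static unsigned char in_overlay[NBLOCKS];   /* which blocks the overlay owns */

/* Read one block: prefer the overlay copy, else the shared read-only base. */
static ssize_t read_block(size_t blk, void *buf)
{
    int fd = in_overlay[blk] ? overlay_fd : base_fd;
    return pread(fd, buf, BLOCK_SIZE, (off_t)blk * BLOCK_SIZE);
}

/* Write one block: always to the overlay, never to the shared base image. */
static ssize_t write_block(size_t blk, const void *buf)
{
    ssize_t n = pwrite(overlay_fd, buf, BLOCK_SIZE, (off_t)blk * BLOCK_SIZE);
    if (n == BLOCK_SIZE)
        in_overlay[blk] = 1;
    return n;
}

int main(void)
{
    char buf[BLOCK_SIZE];

    /* The base lives on NFS (and could be cached locally by cachefilesd);
     * the overlay is a local, per-VM scratch file. */
    base_fd = open("/mnt/nfs/master.img", O_RDONLY);
    overlay_fd = open("/var/tmp/overlay.img", O_RDWR | O_CREAT, 0600);
    if (base_fd < 0 || overlay_fd < 0) {
        perror("open");
        return 1;
    }

    read_block(0, buf);              /* served from base (or its local cache) */
    memset(buf, 0xff, sizeof(buf));
    write_block(0, buf);             /* lands in the overlay only */
    read_block(0, buf);              /* now served from the overlay */

    close(base_fd);
    close(overlay_fd);
    return 0;
}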

Thanks for the feedback,

Daire



