[Linux-cluster] Re: GFS, what's remaining

kurt.hackel at oracle.com kurt.hackel at oracle.com
Mon Sep 5 19:11:59 UTC 2005


On Mon, Sep 05, 2005 at 05:24:33PM +0800, David Teigland wrote:
> On Mon, Sep 05, 2005 at 01:54:08AM -0700, Andrew Morton wrote:
> > David Teigland <teigland at redhat.com> wrote:
> > >
> > >  We export our full dlm API through read/write/poll on a misc device.
> > >
> > 
> > inotify did that for a while, but we ended up going with a straight syscall
> > interface.
> > 
> > How fat is the dlm interface?   ie: how many syscalls would it take?
> 
> Four functions:
>   create_lockspace()
>   release_lockspace()
>   lock()
>   unlock()

FWIW, it looks like we can agree on the core interface.  ocfs2_dlm
exports essentially the same functions:
    dlm_register_domain()
    dlm_unregister_domain()
    dlmlock()
    dlmunlock()

I also implemented dlm_migrate_lockres() to explicitly remaster a lock
on another node, but this isn't used by any callers today (except for
debugging purposes).  There is also some wiring between the fs and the
dlm (eviction callbacks) to deal with some ordering issues between the
two layers, but these could go if we get stronger membership.

There are quite a few other functions in the "full" spec(1) that we
didn't even attempt, either because we didn't require direct 
user<->kernel access or we just didn't need the function.  As for the
rather thick set of parameters expected in dlm calls, we managed to get
dlmlock down to *ahem* eight, and the rest are fairly slim.

Looking at the misc device that gfs uses, it seems like there is pretty
much complete interface to the same calls you have in kernel, validated
on the write() calls to the misc device.  With dlmfs, we were seeking to
lock down and simplify user access by using standard ast/bast/unlockast
calls, using a file descriptor as an opaque token for a single lock,
letting the vfs lifetime on this fd help with abnormal termination, etc.
I think both the misc device and dlmfs are helpful and not necessarily
mutually exclusive, and probably both are better approaches than
exporting everything via loads of syscalls (which seems to be the 
VMS/opendlm model).

-kurt

1. http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf


Kurt C. Hackel
Oracle
kurt.hackel at oracle.com




More information about the Linux-cluster mailing list