[Linux-cluster] Re: EAGAIN from dlm_lock: bug or feature?

Daniel Phillips phillips at redhat.com
Mon Oct 18 16:36:14 UTC 2004


On Monday 18 October 2004 03:51, Patrick Caulfield wrote:
> On Mon, Oct 18, 2004 at 01:01:30AM -0400, Daniel Phillips wrote:
> > Hi Patrick,
> >
> > If dlm_lock collides with a lock mastered locally it returns EAGAIN
> > and the ast is never called.  Is this a bug or a feature?  I see it
> > as a bug because it forces me to handle the EAGAIN in two places to
> > handle the same event, which is creeping cruft.  It is also sure to
> > lead to bugs where a distributed program works perfectly until
> > somebody runs two instances on the same host.
>
> It's a matter of opinion I suppose. If you get an error return from
> dlm_lock and ignore it then you deserve everything you get.
> Particularly with something like a NOQUEUE lock where you must expect
> that there is a reasonable probability of failure.

Of course I'm handling errors from dlm_lock.  The problem is that you
are also returning a non-error there, one that is normally delivered to
another routine (the ast), and this behavior changes depending on
whether the lock is mastered locally or not.  That is bad taste, in my
opinion: it violates the principle of least surprise.
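
To make the complaint concrete, here is a rough sketch of the wrapper
every caller ends up writing.  Nothing in it is the real dlm interface:
mock_lock_noqueue, struct mock_request, lock_noqueue_uniform and
record_ast are hypothetical names, the ast is called synchronously only
to keep the example small, and the negative errno return convention is
an assumption.  The mock just models the behavior described above, where
a NOQUEUE collision comes back as a return value when the lock is
mastered locally and through the ast otherwise.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real dlm API. */
typedef void (*ast_fn)(void *arg, int status);

struct mock_request {
    bool mastered_locally; /* where the resource happens to be mastered */
    bool contended;        /* a conflicting lock is already held */
};

/*
 * Models the behavior under discussion (assumed -errno convention):
 * a local collision is reported by return value and the ast is never
 * called; a remote collision is reported through the ast.
 */
static int mock_lock_noqueue(const struct mock_request *req,
                             ast_fn ast, void *arg)
{
    if (req->contended && req->mastered_locally)
        return -EAGAIN;                     /* ast never runs */
    ast(arg, req->contended ? -EAGAIN : 0); /* a real ast is asynchronous */
    return 0;
}

/*
 * Caller-side workaround: feed the synchronous EAGAIN into the same
 * ast so the collision is handled in exactly one place.
 */
static int lock_noqueue_uniform(const struct mock_request *req,
                                ast_fn ast, void *arg)
{
    int err = mock_lock_noqueue(req, ast, arg);

    if (err == -EAGAIN) {
        ast(arg, -EAGAIN);
        return 0;
    }
    return err; /* genuine errors still come back to the caller */
}

/* Example ast: records how often it ran and the last status seen. */
struct result {
    int calls;
    int status;
};

static void record_ast(void *arg, int status)
{
    struct result *r = arg;

    r->calls++;
    r->status = status;
}
```

With the wrapper, a collision looks the same to the caller whether the
lock is mastered locally or remotely.  My point is that this
normalization belongs inside dlm_lock rather than in every program that
uses it.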

> As VMS also has this behaviour I suspect it'll be something people
> are used to.

Are you sure?  IBM's dlm doesn't have this behavior as far as I can tell
(see the section on local locking).  And even if VMS does, the behavior
is broken.  Fixing it won't break any working programs, and it will fix
some programs that nobody knows are broken until some lock randomly
ends up being mastered locally.

Regards,

Daniel



