[Linux-cluster] DLM behavior after lockspace recovery
Daniel Phillips
phillips at redhat.com
Sun Oct 17 00:50:04 UTC 2004
On Saturday 16 October 2004 17:14, Jeff wrote:
> In your example of a counter which tracks the # of operations
> in progress, regenerating the LVB value during failover from
> the last known good value among the surviving nodes doesn't
> do any good. There is no way to avoid recalculating the correct
> value during the failover process.
>
> OTOH, in my example where the value in the lock value block
> is used as a block version # it makes perfect sense to use
> the last known value from the surviving nodes.
Going back over the early part of the thread, you weren't originally
advocating this; you just thought that might be the way VAXcluster did
it, and that RSB$L_VALSEQNUM might have something to do with it.
Let's keep hunting around for a way of handling this with flags alone,
ok? Sequencing the lvbs is a rather, ahem, heavyweight approach that
consumes memory in every lvb user (more probably, every dlm user
whether they use lvbs or not) and benefits only a small subset of lvb
applications. This extra sequence number has to travel over the net,
through sockets and through various other interface bits. More bloat
in this department has to be seen as a bad thing.
It seems to me that the VALNOTVALID flag by itself isn't enough for you
because one of your nodes might update the lvb, and consequently some
other node may not ever see the VALNOTVALID flag, and therefore not
know that it should reset its cached counter. So how about an
additional flag, say, INVALIDATED, that the lock master hands out to
any lvb reader the first time it reads an lvb for which recovery was
not possible, whether the lvb was subsequently written or not. Your
application looks at INVALIDATED to know that it has to reset its
counter and ignores VALNOTVALID. Does this work for you?
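
To make the proposed semantics concrete, here is a toy model in Python. It is
only a sketch under assumptions: the LockMaster class, its method names, and
the numeric flag values are hypothetical and are not part of any existing dlm
interface. The point it illustrates is that a later write clears VALNOTVALID
but does not cancel the INVALIDATED notice still owed to other readers.

```python
# Hypothetical flag values and master model; not the real gdlm interface.
VALNOTVALID = 0x1   # lvb contents are currently not valid
INVALIDATED = 0x2   # proposed: lvb was lost in recovery since this node last read


class LockMaster:
    """Toy model of the master copy of a single resource's lvb."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.lvb = 0
        self.valnotvalid = False
        # Nodes that have not yet been told about a failed lvb recovery.
        self.pending = set()

    def recover_without_lvb(self):
        """Recovery ran and no valid copy of the lvb could be found."""
        self.lvb = 0
        self.valnotvalid = True
        self.pending = set(self.nodes)

    def write(self, node, value):
        """Store a new value; this clears VALNOTVALID but not pending notices."""
        self.lvb = value
        self.valnotvalid = False

    def read(self, node):
        """Return (lvb, flags); INVALIDATED is reported once per node."""
        flags = 0
        if self.valnotvalid:
            flags |= VALNOTVALID
        if node in self.pending:
            flags |= INVALIDATED
            self.pending.discard(node)
        return self.lvb, flags
```

With two nodes "a" and "b", if recovery fails and "a" then writes 7, the first
read by "b" returns the value 7 with INVALIDATED set and VALNOTVALID clear, so
"b" still learns that it must reset its cached counter. A second read by "b"
returns no flags.
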
I agree that VALNOTVALID is a useful flag that we should have, but it
isn't useful here. I also agree with your discomfort about setting the lvb
arbitrarily to zero, but having a flag to detect that reduces the
annoyance considerably, don't you think?
> Another example is a lock whose LVB doesn't change once it has
> been initialized. In this case it doesn't matter whether the
> value block is marked invalid or not. The contents are still
> useful.
But this value could still disappear if all the readers drop the lock,
so presumably you have a way of recovering it, in which case the above
flags would work for this problem as well.
Also, in this case, randomly picking a value from one of the surviving
nodes (and setting VALNOTVALID) will do as well as a sequence number
scheme. Such an API change would require changes to the gfs harness
plugin, which isn't such a big deal at this point since gfs+gdlm is
still pre-alpha.
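
The "pick any surviving copy" rule for a write-once lvb can be sketched as
follows. This is an illustration only: rebuild_lvb and the numeric flag value
are hypothetical names, and taking the first copy stands in for an arbitrary
choice, which is safe here because every surviving copy of a write-once lvb
holds the same value.

```python
VALNOTVALID = 0x1  # hypothetical numeric value for the flag


def rebuild_lvb(surviving_copies):
    """Return (lvb, flags) for the new master after recovery.

    surviving_copies: lvb values cached by the surviving lock holders.
    """
    if surviving_copies:
        # Any copy will do; the flag still tells readers that a recovery happened.
        return surviving_copies[0], VALNOTVALID
    # No holder survived: nothing to recover, so hand out a zeroed lvb.
    return 0, VALNOTVALID
```

If no holder survives, the value is gone either way, which is the case where
the application needs its own means of regenerating it.
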
Regards,
Daniel