[Linux-cluster] DLM behavior after lockspace recovery

Daniel Phillips phillips at redhat.com
Fri Oct 15 04:32:38 UTC 2004


Hi Jeff,

On Thursday 07 October 2004 07:26, Jeff wrote:
> Here's the problem with simply resetting the value block to zero.
> We're using the value block as a counter to track whether a block
> on disk has changed or not. Each cluster member keeps a copy of the
> value block counter in memory along with the associated disk block.
> When a process converts an NL lock to a higher mode, it reads the
> current copy of the value block to decide whether it needs to re-read
> the block from disk.
>
> When the lock request completes with VALNOTVALID as a status, the
> process knows that it needs to re-read the block from disk. The big
> question, though, is what it should write into the lock value block
> at that point so that the other systems will know this as well. If
> the lock value block is guaranteed to hold the most recent value
> seen by the surviving nodes, then the process can simply increment
> that value, knowing the result will not match what any other system
> has cached. If the lock value block is instead zeroed, or set to an
> arbitrary value taken from any one of the surviving nodes, then it
> might be lower than the value cached on one or more of the other
> nodes. There are ways we can deal with this, but it means more
> bookkeeping.
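
To pin down the scheme Jeff describes, here is a minimal sketch of the 
reader side. The lksb layout, the SBF_VALNOTVALID flag value and the 
synchronous dlm_convert_sync() helper are illustrative stand-ins for 
the real (asynchronous) dlm interface, and reread_block() stands in 
for actual disk I/O:

/* lvb_counter.c: sketch of using a lock value block (lvb) as a change
 * counter guarding a cached disk block.  All dlm_* names here are
 * illustrative stubs, not the real dlm interface. */
#include <stdint.h>
#include <string.h>

#define LVB_LEN 32                      /* dlm value blocks are 32 bytes */
#define SBF_VALNOTVALID 0x02            /* "lvb may be stale" status flag */

enum { LOCK_NL, LOCK_CR, LOCK_CW, LOCK_PR, LOCK_PW, LOCK_EX };

struct lksb {                           /* trimmed lock status block */
    int      sb_status;
    uint32_t sb_flags;
    char     sb_lvb[LVB_LEN];
};

/* Stub: convert an existing lock to newmode and refresh the lvb.  A
 * real implementation would call into the cluster dlm here. */
static int dlm_convert_sync(struct lksb *lksb, int newmode)
{
    (void)newmode;
    return lksb->sb_status;
}

static uint64_t cached_seq;             /* counter seen at the last read */
static char     cached_block[4096];     /* in-memory copy of the block */

static void reread_block(void)          /* stand-in for real disk I/O */
{
    memset(cached_block, 0, sizeof cached_block);
}

/* Convert NL -> PR, then decide whether the cached block is stale. */
static int refresh_block(struct lksb *lksb)
{
    uint64_t seq;

    if (dlm_convert_sync(lksb, LOCK_PR) < 0)
        return -1;

    if (lksb->sb_flags & SBF_VALNOTVALID) {
        /* Recovery ran and the lvb contents are suspect.  Re-reading
         * the block is the easy part; the hard part is what counter
         * value to publish so that no other node keeps trusting a
         * stale cached copy. */
        reread_block();
        return 1;
    }

    memcpy(&seq, lksb->sb_lvb, sizeof seq);
    if (seq != cached_seq) {            /* another node bumped it */
        reread_block();
        cached_seq = seq;
    }
    return 0;                           /* cached copy is still valid */
}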

But do you really think the dlm should pretend that a potentially 
corrupt value is in fact good?  This seems like a very bad idea to me.
In return for saving some bookkeeping in the very special case where you 
have an incrementing lvb, you suggest imposing extra overhead on every 
lvb update (the dlm would have to track which surviving node holds the 
newest copy) and having the dlm make false promises about data integrity.  
I don't think this is a good trade.

I'd suggest biting the bullet and initiating application-level recovery 
as soon as you see VALNOTVALID. In your case that means telling every 
node to reset its cached sequence number.
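
If the lvb is simply cleared during recovery, that application-level 
step could look roughly like this, continuing the stubs above. The 
broadcast_reset() message is made up for illustration, since the dlm 
itself offers no such call:

/* On VALNOTVALID, restart the counter and tell every peer, out of
 * band, to forget its cached value.  (The lvb is only written back
 * while the lock is held in PW or EX mode.) */
static void broadcast_reset(void)
{
    /* e.g. send "discard your cached_seq" to every cluster member;
     * the transport is the application's business, not the dlm's. */
}

static void handle_valnotvalid(struct lksb *lksb)
{
    uint64_t fresh = 1;                 /* new counter epoch */

    reread_block();                     /* our copy is now current */
    cached_seq = fresh;
    memcpy(lksb->sb_lvb, &fresh, sizeof fresh);
    broadcast_reset();                  /* peers drop their cached
                                         * counters, so a restarted low
                                         * value can't be mistaken for
                                         * "unchanged" */
}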

Regards,

Daniel