[Cluster-devel] [PATCH] gfs2: fix lock cancelling

Thu Sep 20 15:31:54 UTC 2007

On Thu, Sep 20, 2007 at 10:55:29AM -0400, J. Bruce Fields wrote:
> +int gdlm_plock_cancel(void *lockspace, struct lm_lockname *name,
> +			struct file *file, struct file_lock *fl)
> +{
> +	struct gdlm_ls *ls = lockspace;
> +	struct plock_xop *xop;
> +	struct plock_op *op;
> +
> +	spin_lock(&ops_lock);
> +	list_for_each_entry(op, &recv_list, list) {
> +		xop = (struct plock_xop *)op;
> +		if (!xop->callback)
> +			continue;
> +		if (xop->fl != fl)
> +			continue;
> +		list_del_init(&op->list);
> +		goto found;
> +	}

If the op is found on the recv_list, it has already been sent up to the
lock manager in userspace and is still outstanding there.  Removing it
from the recv_list means, as you say, that the lock manager could get an
error back later when it does dev_write() to complete the op.  (Currently
dev_write() just prints an error message; it doesn't return an error to
userspace.)

This assumes, of course, that seeing an error, the lock manager could do
something sensible to bring itself back in sync with the application... as
we've discussed before, that's a hard problem that we may never solve :-)
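
To make the recv_list case concrete, here is a small userspace model of
what I mean.  It is only a sketch: model_op, model_list and
model_complete() are hypothetical stand-ins, not the real plock code, and
returning -ENOENT from the completion path is an assumption about what
dev_write() could do, not what it does today.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct plock_op; only what the model needs. */
struct model_op {
	struct model_op *next;
	unsigned long long number;	/* identifies the lock */
	int done;			/* set once a completion is matched */
};

/* Hypothetical model of recv_list: ops handed to the lock manager. */
struct model_list {
	struct model_op *head;
};

static void model_add(struct model_list *l, struct model_op *op)
{
	assert(l != NULL && op != NULL);
	op->next = l->head;
	l->head = op;
}

/* Unlink op if present; returns 1 if it was on the list, 0 otherwise. */
static int model_del(struct model_list *l, struct model_op *op)
{
	struct model_op **pp;

	for (pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == op) {
			*pp = op->next;
			op->next = NULL;
			return 1;
		}
	}
	return 0;
}

/*
 * Model of the completion path: the manager writes back a result and we
 * look for the matching op.  If a cancel already removed it, there is
 * nothing to complete, and (by assumption) the manager is told so.
 */
static int model_complete(struct model_list *recv, unsigned long long number)
{
	struct model_op *op;

	for (op = recv->head; op; op = op->next) {
		if (op->number == number) {
			model_del(recv, op);
			op->done = 1;
			return 0;
		}
	}
	return -ENOENT;
}
```

With two ops on the list, cancelling one (model_del) and then completing
both gives -ENOENT for the cancelled one and 0 for the other.  What the
lock manager should do with that -ENOENT is the hard part.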

> +	list_for_each_entry(op, &send_list, list) {
> +		xop = (struct plock_xop *)op;
> +		if (!xop->callback)
> +			continue;
> +		if (xop->fl != fl)
> +			continue;
> +		list_del_init(&op->list);
> +		goto found;
> +	}

If found on the send_list, it means the op hasn't been sent up to the lock
manager yet, so the cancel can be considered a success.
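
The patch returns 0 for both lists, but the two cases are not equivalent
from the lock manager's point of view.  A rough sketch of how they could
be told apart follows.  The names (sketch_op, sketch_cancel, the
CANCEL_* values) are made up for illustration, and the distinct return
value for the in-flight case is my assumption, not something in the
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Where a hypothetical op currently sits. */
enum sketch_state {
	OP_QUEUED,	/* on send_list: userspace has not seen it */
	OP_IN_FLIGHT,	/* on recv_list: userspace is working on it */
	OP_DONE,	/* already completed, on neither list */
	OP_CANCELLED	/* removed by an earlier cancel */
};

struct sketch_op {
	enum sketch_state state;
	const void *fl;	/* stands in for the struct file_lock pointer */
};

enum {
	CANCEL_CLEAN = 0,	/* userspace never saw the request */
	CANCEL_IN_FLIGHT = 1	/* userspace may still complete it */
};

/*
 * Cancel the op matching fl.  Returns CANCEL_CLEAN, CANCEL_IN_FLIGHT,
 * or -ENOENT when nothing cancellable is found (the "too late" case).
 */
static int sketch_cancel(struct sketch_op *ops, size_t n, const void *fl)
{
	size_t i;

	assert(ops != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (ops[i].fl != fl)
			continue;
		switch (ops[i].state) {
		case OP_QUEUED:
			ops[i].state = OP_CANCELLED;
			return CANCEL_CLEAN;
		case OP_IN_FLIGHT:
			ops[i].state = OP_CANCELLED;
			return CANCEL_IN_FLIGHT;
		case OP_DONE:
		case OP_CANCELLED:
			break;	/* nothing left to cancel for this op */
		}
	}
	return -ENOENT;
}
```

Only CANCEL_CLEAN is a cancel that nobody else can observe; in the
CANCEL_IN_FLIGHT case the lock manager still has to be brought back in
sync somehow.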

> +	spin_unlock(&ops_lock);
> +	/* Too late; the lock's probably already been granted. */
> +	return -ENOENT;

It's up to the caller to sort out what happens in this case.

> +found:
> +	spin_unlock(&ops_lock);
> +	/* XXX: Is any other cleanup necessary here? */
> +	kfree(op);
> +	return 0;
> +}
> +