[linux-lvm] potential locking issues

Jaco Kroon jaco at uls.co.za
Wed Feb 20 20:26:03 UTC 2013


Hi All,

LVM2's locking scheme relies on flock() on per-volume-group lock files,
by default /var/lock/lvm/V_${vgname}.  These lock files are opened, then
flock()ed, and eventually either unlocked (and later locked again) or
potentially just unlink()ed with the lock still held.

The unlink() can cause the lock to desync and cause problems.  Consider
the following scenario with three processes (the steps occur in the
order listed; the leading number is the process number):

1.  open()
2.  open()
1.  flock() <-- succeeds
2.  flock() <-- blocks.
1.  unlink()
1.  close() <-- at this point process 2's flock succeeds.
3.  open() <-- note that this ends up being a *different* file.
3.  flock() <-- succeeds.

At this point both 2 and 3 think they have the lock, and that's wrong:
2 holds a lock on the unlinked file, 3 on the newly created one.
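
The sequence can be replayed in a single process, since flock() locks
belong to the open file description and separate open() calls conflict
with each other just as separate processes would.  What follows is only
a minimal sketch in Python, not LVM code: the function name, the
temporary directory and the "V_demo" file name are made up for
illustration, and LOCK_NB is used so the "blocks" step can be observed
without hanging.

```python
import fcntl
import os
import tempfile


def demo_unlink_race():
    """Replay the scenario; fd1/fd2/fd3 stand in for processes 1/2/3."""
    lock_dir = tempfile.mkdtemp()
    path = os.path.join(lock_dir, "V_demo")  # hypothetical lock file name
    flags = os.O_CREAT | os.O_RDWR

    fd1 = os.open(path, flags, 0o600)        # 1. open()
    fd2 = os.open(path, flags, 0o600)        # 2. open()
    fcntl.flock(fd1, fcntl.LOCK_EX)          # 1. flock() succeeds

    # 2. flock() would block here; LOCK_NB turns that into an exception.
    try:
        fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
        second_blocked = False
    except BlockingIOError:
        second_blocked = True

    os.unlink(path)                          # 1. unlink()
    os.close(fd1)                            # 1. close() drops the lock
    fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)  # 2. now succeeds

    fd3 = os.open(path, flags, 0o600)        # 3. open() creates a new file
    try:
        fcntl.flock(fd3, fcntl.LOCK_EX | fcntl.LOCK_NB)  # 3. flock()
        both_hold_lock = True
    except BlockingIOError:
        both_hold_lock = False

    # fd2 refers to the unlinked file, fd3 to the freshly created one.
    same_file = os.path.samestat(os.fstat(fd2), os.fstat(fd3))

    # Clean up the demo files.
    os.close(fd2)
    os.close(fd3)
    os.unlink(path)
    os.rmdir(lock_dir)
    return {
        "second_blocked": second_blocked,
        "both_hold_lock": both_hold_lock,
        "same_file": same_file,
    }


if __name__ == "__main__":
    print(demo_unlink_race())
```

On a filesystem with working flock() support, the result should show
that the second locker was initially blocked, that 2 and 3 both end up
holding an exclusive lock, and that their descriptors refer to
different files.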

I actually saw an instance today where dmeventd had a file descriptor
open to a deleted V_vggroup lockfile, so this *does* happen in the
field.  This also explains various lockups I've seen in the past, which
I later figured out usually happened when dmeventd was running (so I put
much effort into ensuring dmeventd never ever started up - which helped
a lot).

Assuming I'm right, the fix would be to change _undo_flock in
lib/locking/file_locking.c to not unlink the lockfile - ever.  The same
goes for any other file that is used for locking purposes anywhere in
the codebase.
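
As a rough illustration of the "never unlink" idea (again a Python
sketch, not the actual _undo_flock code; acquire_lock, release_lock,
demo_no_unlink and the "V_demo" name are hypothetical), releasing the
lock only unlocks and closes the descriptor.  The path then always
names the same file, so a late opener contends for the same lock:

```python
import fcntl
import os
import tempfile


def acquire_lock(path, blocking=True):
    """Open (creating if needed) and exclusively flock() the lock file."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    op = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
    try:
        fcntl.flock(fd, op)
    except OSError:
        os.close(fd)
        raise
    return fd


def release_lock(fd):
    """Unlock and close; deliberately does NOT unlink the lock file."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def demo_no_unlink():
    """Same ordering as the scenario, minus the unlink()."""
    lock_dir = tempfile.mkdtemp()
    path = os.path.join(lock_dir, "V_demo")  # hypothetical lock file name

    fd1 = acquire_lock(path)                 # 1. open() + flock()
    fd2 = os.open(path, os.O_RDWR)           # 2. open() while 1 holds it
    release_lock(fd1)                        # 1. unlock + close, no unlink
    fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)  # 2. gets the lock

    # 3. opens by name and reaches the same file, so it is excluded.
    try:
        fd3 = acquire_lock(path, blocking=False)
        third_excluded = False
        release_lock(fd3)
    except BlockingIOError:
        third_excluded = True

    release_lock(fd2)
    os.unlink(path)   # demo cleanup only, done while nobody uses the lock
    os.rmdir(lock_dir)
    return third_excluded


if __name__ == "__main__":
    print(demo_no_unlink())
```

The cost of this approach is that stale lock files accumulate in the
lock directory, which is usually harmless for a small, fixed set of
volume group names.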
-- 
Kind Regards,
Jaco Kroon
 



