[dm-devel] Review of dm-block-manager.c
Mikulas Patocka
mpatocka at redhat.com
Mon Aug 1 21:00:32 UTC 2011
Hi
This is review of dm-block-manager.c:
char buffer_cache_name[32];
sprintf(bm->buffer_cache_name, "dm_block_buffer-%d:%d",
--- it may not fit in 32 bytes.
__wait_block uses TASK_INTERRUPTIBLE sleep and returns error code
-ERESTARTSYS if interrupted by a signal. But this error code is never
checked. Consequently, if the process receives a signal, this signal will
interrupt waiting, and the rest of the buffer management code will
mistakenly think that the event to wait for happened.
This should be replaced by TASK_UNINTERRUPTIBLE sleep and functions
__wait_io, __wait_unlocked, __wait_read_lockable, __wait_all_writes,
__wait_all_io, __wait_clean be changed to return void (because their
return code is never checked anyway).
The code uses only a spinlock to protect it state. When the spinlock is
dropped (for example during wait), the buffer may have been reused for
other purposes, but it is not checked. There is a comment "/* FIXME: Can b
have been recycled between io completion and here? */" indicating that Joe
is aware of the problem.
b->write_lock_pending++;
__wait_unlocked(b, &flags);
b->write_lock_pending--;
if (b->where != block)
goto retry;
If the buffer was reused while we were waiting, b->write_lock_pending was
already reset to zero (in __transition BS_EMPTY). We decrement it to
0xffffffff.
Error buffers are linked in error_list and this list is only flushed at a
specific case (in __wait_flush). If there are many i/o errors (for
example, the disk is unplugged) and __wait_flush is not called
sufficiently often, all existing buffers will be moved to error_list and
then the code deadlocks as there would be no empty or clean buffers.
The code uses fixed-size cache of 4096 buffers and a single process may
hold more than one buffer. This may deadlock in case of massive
parallelism --- for example, imagine that 4096 processes come
concurrently, each process requesting two buffers --- each process
allocates one buffer and then a deadlock happens, each process is waiting
for some free buffer that never comes. (this bug existed already the last
year when I looked at the code)
Mikulas
More information about the dm-devel
mailing list