[Cluster-devel] [PATCH 4 of 5] Bz #248176: GFS2: invalid metadata block - REVISED
Wendy Cheng
wcheng at redhat.com
Fri Aug 10 13:12:14 UTC 2007
Steven Whitehouse wrote:
>Hi,
>
>On Thu, 2007-08-09 at 14:21 -0400, Wendy Cheng wrote:
>
>
>>Bob Peterson wrote:
>>
>>
>>>On Thu, 2007-08-09 at 09:46 -0400, Wendy Cheng wrote:
>>>
>>>
>>>
>>>>Set aside "after this patch, the problem goes away" thing ...
>>>>
>>>>I haven't checked previous three patches yet so I may not have the
>>>>overall picture ... but why adding the journal flush spin lock here
>>>>could prevent the new inode to get re-used before its associated buffer
>>>>are flushed to the logs ? Could you elaborate more ?
>>>>
>>>>
>>>>
>>>>
>>>>>+ down_write(&sdp->sd_log_flush_lock);
>>>>> block = rgblk_search(rgd, goal, GFS2_BLKST_UNLINKED,
>>>>> GFS2_BLKST_UNLINKED);
>>>>>+ up_write(&sdp->sd_log_flush_lock);
>>>>>
>>>>>
>>>>>
>>>IIRC, if we don't protect rgblk_search from finding GFS2_BLKST_UNLINKED
>>>blocks, a "deleted" inode may be returned to function
>>>gfs2_inplace_reserve_i which will do an iput on the inode,
>>>which may reference buffers that are being flushed to disk.
>>>If almost all blocks in that bitmap are allocated, I think the
>>>deleted block may sometimes be reused and the buffer
>>>associated with the reused block may be changed before it's
>>>actually written out to disk.
>>>
>>>
>>>
>>Log flushing is an asynchronous event. I still don't see how this can
>>*protect* the condition you just described (i.e., prevents the block
>>being assigned to someone else before log flush occurs). Or do I
>>understand your statement right (i.e., the log flushing must occur
>>before the block is used by someone else) ? It may *reduce* the
>>possibility (if log flushing happens at the same time as this
>>assignment) but I don't see how it can *stop* the condition.
>>
>>You may "reduce" the (rare) possibility but the real issue is still
>>hanging there ? If this is true, then I don't agree we have to pay the
>>price of moving a journal flushing lock into resource handling code.
>>
>>-- Wendy
>>
>>
>>
>
>Due to the way in which the locking is defined, the journal lock is also
>used to keep other processes out of the rgrp bitmaps. This prevents the
>state of the rgrp bitmaps changing while we are scanning them in case a
>journal flush might occur.
>
>The sd_log_flush_lock is an rwsem which is held in read mode by each and
>every transaction and in write mode when flushing the journal. Log
>flushing ought to be an asynchronous event, but due to the design of the
>journaling, it unfortunately isn't in GFS2. It is something that we need
>to review in the future,
>
>
>
>
It is still not clear what exactly does this lock protect ? The unlinked
rgrp bitmap itself or the buffers associated with these disk blocks ? If
it is later (the buffers as Bob said), it implies for every block GFS2
takes from this unlinked bitmap, journal flush *has* to happen before it
can be used ? Could you elaborate more ?
-- Wendy
More information about the Cluster-devel
mailing list