[Cluster-devel] [PATCH 2/2] gfs2: Fix lru_count going negative

Ross Lagerwall ross.lagerwall at citrix.com
Thu Jan 31 15:23:58 UTC 2019


On 1/31/19 2:36 PM, Bob Peterson wrote:
> Hi Ross,
> 
> Comments below. Sorry if this is a bit incoherent; it's early and I'm
> not properly caffeinated yet.
> 
> ----- Original Message -----
>> Under certain conditions, lru_count may drop below zero resulting in
>> a large amount of log spam like this:
>>
>> vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \
>>      negative objects to delete nr=-1
>>
>> This happens as follows:
>> 1) A glock is moved from lru_list to the dispose list and lru_count is
>>     decremented.
>> 2) The dispose function calls cond_resched() and drops the lru lock.
>> 3) Another thread takes the lru lock and tries to add the same glock to
>>     lru_list, checking if the glock is on an lru list.
>> 4) It is on a list (actually the dispose list) and so it avoids
>>     incrementing lru_count.
>> 5) The glock is moved to lru_list.
>> 5) The original thread doesn't dispose it because it has been re-added
>>     to the lru list but the lru_count has still decreased by one.
>>
>> Fix by checking if the LRU flag is set on the glock rather than checking
>> if the glock is on some list and rearrange the code so that the LRU flag
>> is added/removed precisely when the glock is added/removed from lru_list.
>>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall at citrix.com>
>> ---
>>   fs/gfs2/glock.c | 16 +++++++++-------
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
>> index b92740edc416..53e6c7e0c1b3 100644
>> --- a/fs/gfs2/glock.c
>> +++ b/fs/gfs2/glock.c
>> @@ -185,13 +185,14 @@ void gfs2_glock_add_to_lru(struct gfs2_glock *gl)
>>   {
>>   	spin_lock(&lru_lock);
>>   
>> -	if (!list_empty(&gl->gl_lru))
>> -		list_del_init(&gl->gl_lru);
>> -	else
>> +	list_del(&gl->gl_lru);
>> +	list_add_tail(&gl->gl_lru, &lru_list);
> 
> This looks like a bug, and I like your idea of using the GLF_LRU bit
> to determine whether or not to do the manipulation, but I have some
> concerns. First, does it work with kernel list debugging turned on?
> 
> To me it looks like the list_del (as opposed to list_del_init) above
> will set entry->next and prev to LIST_POISON values, then the
> list_add_tail() calls __list_add() which checks:
> 	if (!__list_add_valid(new, prev, next))
> 		return;
> Without list debugging, the value is always returned true, but with
> list debugging it checks for circular values of list->prev and list->next
> which, since they're LIST_POISON, ought to fail.
> So it seems like the original list_del_init is correct.

No, __list_add_valid() checks if prev & next correctly link to each 
other and checks that new is not the same as prev or next. It doesn't 
check anything to do with new's pointers. I've tested with DEBUG_LIST 
and it appears to work.

> 
> The intent was: if the glock is already on the lru, take it off
> before re-adding it, and the count ought to be okay, because if it's
> on the LRU list, it's already been incremented. So taking it off and
> adding it back on is a net 0 on the count. But that's only
> true if the GLF_LRU bit is set. If it's on a different list (the
> dispose list), as you noted, it still needs to be incremented.
> 
> If the glock is on the dispose_list, rather than the lru list, we
> want to take it off the dispose list and move it to the lru_list,
> but in that case, we need to increment the lru count, and not
> poison the list_head.
> 
> So to me it seems like we should keep the list_del_init, and only
> do it if the list isn't empty, but trigger off the GLF_LRU flag
> for managing the count. The lru_lock ought to prevent races.

I think I understand the intent, but IMO this patch makes the logic 
clearer than relying on both the LRU bit and the state of the list to 
determine what to do. Thoughts?

Thanks,
-- 
Ross Lagerwall




More information about the Cluster-devel mailing list