[Cluster-devel] [GFS2 PATCH 1/3] GFS2: Set of distributed preferences for rgrps

Tue Oct 28 13:34:43 UTC 2014

Hi,

On 27/10/14 14:07, Bob Peterson wrote:
> ----- Original Message -----
>>> +				if ((loops < 3) &&
>>> +				    gfs2_rgrp_used_recently(rs, 1000) &&
>>> +				    gfs2_rgrp_congested(rs->rs_rbm.rgd, loops))
>>> +					goto next_rgrp;
>>> +			}
>> This makes no sense, we end the loop when loops == 3 so that these
>> conditions will be applied in every case which is not what we want. We
>> must always end up doing a search of every rgrp in the worst case, in
>> order that if there is some space left somewhere, we will eventually
>> find it.
>>
>> Definitely better wrt figuring out which rgrps to prefer, but I'm not
>> yet convinced about this logic. The whole point of the congestion logic
>> is to figure out ahead of time, whether it will take a long time to
>> access that rgrp, so it seems that this is not quite right, otherwise
>> there should be no need to bypass it like this. The fast_to_acquire
>> logic should at least by merged into the rgrp_congested logic, possibly
>> by just reducing the threshold at which congestion is measured.
>>
>> It might be useful to introduce a tracepoint for when we reject and rgrp
>> during allocation, with a reason as to why it was rejected, so that it
>> is easier to see whats going on here,
>>
>> Steve.
> Hi,
>
> Sigh. You're right: Good catch. The problem is that I've done more than 30
> attempts at trying to get this right in my git tree, each of which has
> up to 15 patches for various things. Somewhere around iteration 20, I
> dropped an important change. My intent was always to add another layer
> of rgrp criteria, so loops would be 4 rather than 3. I had done this
> with a different patch, but it got dropped by accident. The 3 original
> layers are as follows:
>
> loop 0: Reject rgrps that are likely congested (based on past history)
>          and rgrps where we just found congestion.
>          Only accept rgrps for which we can get a multi-block reservation.
> loop 1: Reject rgrps that are likely congested (based on past history)
>          and rgrps where we just found congestion. Accept rgrps that have
>          enough free space, even if we can't get a reservation.
> loop 2: Don't ever reject rgrps because we're out of ideal conditions.
That is not how it is supposed to work. Loop 0 is the one when we try 
and avoid rgrps which are congested. Loop 1 and 2 are the same in that 
both are supposed to do a full scan of the rgrps. The only reason for 
loop 2 is that we flush out any unlinked inodes that may have 
accumulated between loop 1 and loop 2, but otherwise they should be 
identical.

> The new scheme was intended to add a new layer 0 which only accepts rgrps
> within a preferred subset of rgrps. In other words:
>
> loop 0: Reject rgrps that aren't in our preferred subset of rgrps.
>          Reject rgrps that are likely congested (based on past history)
>          and rgrps where we just found congestion.
>          Only accept rgrps for which we can get a multi-block reservation.
> loop 1: Reject rgrps that are likely congested (based on past history)
>          and rgrps where we just found congestion.
>          Only accept rgrps for which we can get a multi-block reservation.
> loop 2: Reject rgrps that are likely congested (based on past history)
>          and rgrps where we just found congestion. Accept any rgrp that has
>          enough free space, even if we can't get a reservation.
> loop 3: Don't ever reject rgrps because we're out of ideal conditions.
>
> But is 4 loops too many? I could combine 0 and 1, and in fact, today's code
> accidentally does just that. The mistake was probably that I had been
> experimenting with 3 versus 4 layers and had switched them back and forth
> a few times for various tests.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems
Yes, definitely too many. If we are looping that many times, it suggests 
that something is wrong with the way in which we are searching for 
rgrps. It would be better to use fewer loops if at all possible, rather 
than more, since this looping will be very slow,

Steve.