[dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu
Jens Axboe
axboe at kernel.dk
Wed Jun 28 22:19:11 UTC 2017
On 06/28/2017 04:07 PM, Brian King wrote:
> On 06/28/2017 04:59 PM, Jens Axboe wrote:
>> On 06/28/2017 03:54 PM, Jens Axboe wrote:
>>> On 06/28/2017 03:12 PM, Brian King wrote:
>>>> -static inline int part_in_flight(struct hd_struct *part)
>>>> +static inline unsigned long part_in_flight(struct hd_struct *part)
>>>> {
>>>> -	return atomic_read(&part->in_flight[0]) + atomic_read(&part->in_flight[1]);
>>>> +	return part_stat_read(part, in_flight[0]) + part_stat_read(part, in_flight[1]);
>>>
>>> One obvious improvement would be to not do this twice, but only have to
>>> loop once. Instead of making this an array, make it a structure with a
>>> read and write count.
>>>
>>> It still doesn't really fix the issue of someone running on a kernel
>>> with a ton of possible CPUs configured. But it does reduce the overhead
>>> by 50%.
>>
>> Or something as simple as this:
>>
>> #define part_stat_read_double(part, field1, field2) \
>> ({ \
>> 	typeof((part)->dkstats->field1) res = 0; \
>> 	unsigned int _cpu; \
>> 	for_each_possible_cpu(_cpu) { \
>> 		res += per_cpu_ptr((part)->dkstats, _cpu)->field1; \
>> 		res += per_cpu_ptr((part)->dkstats, _cpu)->field2; \
>> 	} \
>> 	res; \
>> })
>>
>> static inline unsigned long part_in_flight(struct hd_struct *part)
>> {
>> 	return part_stat_read_double(part, in_flight[0], in_flight[1]);
>> }
>>
>
> I'll give this a try and also see about running some more exhaustive
> runs to see if there are any cases where we go backwards in performance.
>
> I'll also run with partitions and see how that impacts this.
And do something nuts, like setting NR_CPUS to 512 or whatever. What do
distros ship with?
--
Jens Axboe
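
For anyone following along outside the kernel tree, here is a rough
userspace sketch of the one-pass versus two-pass read discussed above.
It is not the kernel code: struct disk_stats_sketch, NR_CPUS_SKETCH and
both function names are made up for illustration, and a plain array
stands in for the per-CPU dkstats area. NR_CPUS_SKETCH is set to 512
only to mirror the "something nuts" case.

```c
#include <stddef.h>

/* Hypothetical stand-in for a large NR_CPUS configuration. */
#define NR_CPUS_SKETCH 512

/* Hypothetical stand-in for one CPU's slice of the per-CPU stats. */
struct disk_stats_sketch {
	unsigned long in_flight[2];	/* [0] and [1], as in the patch */
};

/*
 * Two full walks over the per-CPU slots, which is what two separate
 * part_stat_read() calls amount to.
 */
static unsigned long in_flight_two_pass(const struct disk_stats_sketch *stats,
					size_t ncpus)
{
	unsigned long res = 0;
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		res += stats[cpu].in_flight[0];
	for (cpu = 0; cpu < ncpus; cpu++)
		res += stats[cpu].in_flight[1];
	return res;
}

/*
 * One walk that picks up both fields per slot, in the spirit of
 * part_stat_read_double() above.
 */
static unsigned long in_flight_one_pass(const struct disk_stats_sketch *stats,
					size_t ncpus)
{
	unsigned long res = 0;
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		res += stats[cpu].in_flight[0];
		res += stats[cpu].in_flight[1];
	}
	return res;
}
```

Both functions return the same sum; the one-pass version just touches
each slot once. The read cost still scales with the number of slots
either way, which is why the large NR_CPUS case is worth measuring.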