[dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

Jens Axboe axboe at kernel.dk
Wed Jun 28 22:19:11 UTC 2017


On 06/28/2017 04:07 PM, Brian King wrote:
> On 06/28/2017 04:59 PM, Jens Axboe wrote:
>> On 06/28/2017 03:54 PM, Jens Axboe wrote:
>>> On 06/28/2017 03:12 PM, Brian King wrote:
>>>> -static inline int part_in_flight(struct hd_struct *part)
>>>> +static inline unsigned long part_in_flight(struct hd_struct *part)
>>>>  {
>>>> -	return atomic_read(&part->in_flight[0]) + atomic_read(&part->in_flight[1]);
>>>> +	return part_stat_read(part, in_flight[0]) + part_stat_read(part, in_flight[1]);
>>>
>>> One obvious improvement would be to not do this twice, but only have to
>>> loop once. Instead of making this an array, make it a structure with a
>>> read and write count.
>>>
>>> It still doesn't really fix the issue of someone running on a kernel
>>> with a ton of possible CPUs configured. But it does reduce the overhead
>>> by 50%.
>>
>> Or something as simple as this:
>>
>> #define part_stat_read_double(part, field1, field2)			\
>> ({									\
>> 	typeof((part)->dkstats->field1) res = 0;			\
>> 	unsigned int _cpu;						\
>> 	for_each_possible_cpu(_cpu) {					\
>> 		res += per_cpu_ptr((part)->dkstats, _cpu)->field1;	\
>> 		res += per_cpu_ptr((part)->dkstats, _cpu)->field2;	\
>> 	}								\
>> 	res;								\
>> })
>>
>> static inline unsigned long part_in_flight(struct hd_struct *part)
>> {
>> 	return part_stat_read_double(part, in_flight[0], in_flight[1]);
>> }
>>
> 
> I'll give this a try and also do some more exhaustive runs to see
> if there are any cases where performance regresses.
> 
> I'll also run with partitions and see how that impacts this.

And do something nuts, like setting NR_CPUS to 512 or whatever. What do
distros ship with?
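
To make the NR_CPUS concern concrete, here is a rough user-space sketch, not
kernel code. Everything prefixed sketch_ or SKETCH_ is a hypothetical
stand-in: a plain array replaces the per-cpu allocation, SKETCH_NR_CPUS
replaces the configured number of possible CPUs, and a visit counter is added
only to compare the cost of the two read paths.

```c
#include <stddef.h>

/* Hypothetical stand-in for the number of possible CPUs (assumption). */
#define SKETCH_NR_CPUS 512

/* Hypothetical per-CPU slot holding the read/write in-flight counts. */
struct sketch_stats {
	unsigned long in_flight[2];
};

/* A plain array stands in for the per-cpu allocation. */
static struct sketch_stats sketch_percpu[SKETCH_NR_CPUS];

/* Counts how many per-CPU slots the readers touch (illustration only). */
static unsigned long sketch_slots_visited;

/* Sum a single field: one full walk over all slots per call. */
static unsigned long sketch_read_one(int idx)
{
	unsigned long res = 0;
	size_t cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		res += sketch_percpu[cpu].in_flight[idx];
		sketch_slots_visited++;
	}
	return res;
}

/* Sum both fields in one walk, in the spirit of part_stat_read_double(). */
static unsigned long sketch_read_double(void)
{
	unsigned long res = 0;
	size_t cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		res += sketch_percpu[cpu].in_flight[0];
		res += sketch_percpu[cpu].in_flight[1];
		sketch_slots_visited++;
	}
	return res;
}
```

Calling sketch_read_one() for both fields visits 2 * SKETCH_NR_CPUS slots,
while sketch_read_double() visits SKETCH_NR_CPUS. Either way the read cost
grows linearly with the number of possible CPUs. Note that a single slot can
wrap below zero when a request is issued on one CPU and completed on another;
with unsigned arithmetic the sum across all slots still comes out right.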

-- 
Jens Axboe



