[dm-devel] [patch 0/5] device mapper percpu patches

Jens Axboe axboe at kernel.dk
Wed Nov 7 22:59:31 UTC 2018


On 11/7/18 3:47 PM, Mikulas Patocka wrote:
> 
> 
> On Wed, 7 Nov 2018, Mike Snitzer wrote:
> 
>> On Tue, Nov 06 2018 at  4:34pm -0500,
>> Mikulas Patocka <mpatocka at redhat.com> wrote:
>>
>>> Hi
>>>
>>> These are the device mapper percpu patches.
>>>
>>> Note that I didn't test request-based device mapper because I don't have
>>> hardware for it (the patches don't convert request-based targets to percpu
>>> values, but there are a few inevitable changes in dm-rq.c).
>>
>> Patches 1 - 3 make sense.  But the use of percpu inflight counters isn't
>> something I can get upstream.  Any more scalable counter still needs to
>> be wired up to the block stats interfaces (the one you did in patch 5 is
>> only for the "inflight" sysfs file, there is also the generic diskstats
>> callout to part_in_flight(), etc).  Wiring up both part_in_flight() and
>> part_in_flight_rw() to optionally call out to a new callback isn't going
>> to fly, especially if that callout is looping to sum the percpu
>> counters.
>>
>> I checked with Jens and now that in 4.21 all of the old request-based IO
>> path is gone (and given that blk-mq bypasses use of ->in_flight[]): the
>> only consumer of the existing ->in_flight[] is the bio-based IO path.
>>
>> Given that now only bio-based is consuming it, and your work was focused
>> on making bio-based DM's "pending" IO accounting more scalable, it is
>> best to just change block core's ->in_flight[] directly.
>>
>> But Jens is against switching to using percpu counters because they are
>> really slow when summing the counts.  And diskstats does that
>> frequently.  Jens said at least 2 other attempts were made and rejected
>> to switch over to percpu counters.
> 
> I'd like to know - which kernel part needs to sum the percpu IO counters 
> frequently?
> 
> My impression was that the counters need to be summed only when the user 
> is reading the files in sysfs and that is not frequent at all.

part_round_stats() does it on IO completion - only every jiffy, but it's
enough that previous attempts at percpu inflight counters only worked
for some cases, and were worse for others.

-- 
Jens Axboe
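
[Editorial illustration, not part of the original message.] The trade-off
discussed in this thread can be sketched in userspace C. This is a
simplified model under stated assumptions, not the kernel's percpu API or
the block layer's code: NR_CPUS, CACHELINE, struct pcpu_inflight and the
inflight_*() helpers are all hypothetical names. Increments and decrements
touch only the caller's own cache-line-sized slot, while reading the total
has to walk every slot, which is the cost that a once-per-jiffy caller such
as part_round_stats() would pay on the IO completion path.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   64   /* hypothetical CPU count for this sketch */
#define CACHELINE 64   /* assumed cache line size in bytes */

/* One slot per CPU, padded so that slots do not share a cache line. */
struct pcpu_slot {
	long count;
	char pad[CACHELINE - sizeof(long)];
};

struct pcpu_inflight {
	struct pcpu_slot slot[NR_CPUS];
};

/* Fast path: IO submission touches only the submitting CPU's slot. */
static void inflight_inc(struct pcpu_inflight *c, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	c->slot[cpu].count++;
}

/*
 * Fast path: IO completion touches only the completing CPU's slot.
 * The completing CPU may differ from the submitting one, so a single
 * slot can go negative; only the sum over all slots is meaningful.
 */
static void inflight_dec(struct pcpu_inflight *c, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	c->slot[cpu].count--;
}

/* Slow path: reading the total walks every slot, O(NR_CPUS). */
static long inflight_sum(const struct pcpu_inflight *c)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < NR_CPUS; i++)
		sum += c->slot[i].count;
	return sum;
}
```

With a single shared counter the read is one load but every update contends
on the same cache line; with the layout above the updates are cheap and the
read is what grows with the number of CPUs.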



