Queue upcall locking (was: [dm-devel] [RFC][PATCH] fix dm_any_congested() to properly sync up with suspend code path)

Mikulas Patocka mpatocka at redhat.com
Tue Nov 11 22:08:43 UTC 2008


On Tue, 11 Nov 2008, Christoph Hellwig wrote:

> On Mon, Nov 10, 2008 at 09:19:21AM -0500, Mikulas Patocka wrote:
> > For device mapper, congested_fn asks every device in the tree and make OR 
> > of their bits --- so if the user has 50 devices, it asks them all.
> > 
> > For md-linear, md-raid0, md-raid1, md-raid10 and md-multipath it does the 
> > same --- asking every device.
> > 
> > If you have a better idea how to implement congested_fn, say it.
> 
> Well, even asking 50 devices shouldn't take that long normally.

There are some atomic operations inside congested_fn.

I remember that there was some database benchmark on a setup with 48 
devices. Just dropping one spinlock from unplug function resulted in 
overall improvement (not microbenchmarks, the whole run) by 18%. The 
slowdown with unplug is similar as with congested_fn --- unplugs walks all 
the physical devices for a given logical device and unplugs them.

So simplifying congested_fn might help too. Do you have an idea, how often 
is it called? I thought about caching the return value for one jiffy.

> Do you take some sleeping lock that might block forever?  In that case I 
> would just return congested for that device as it would delay pdflush 
> otherwise.

Sometimes we do, and we shouldn't, because congested_fn is called with 
spinlock. That's what I'm working on now.

BTW. how "realtime" do you expect the Linux kernel to be? Is there some 
effort to make all spinlocks locked for finite time? (so that someone 
could one day calculate the times of all spinlocks, take the maximum time 
and give Linux a "realtime" label). If so, then bdi_congested call should 
be moved out of spinlocks. If there are already many cases when spinlock 
could be taken and list of indefinite length walked inside, it isn't worth 
optimizing for it.

Mikulas




More information about the dm-devel mailing list