[linux-lvm] Discussion: performance issue on event activation mode

Peter Rajnoha prajnoha at redhat.com
Thu Sep 30 11:29:07 UTC 2021

On 9/30/21 09:51, Martin Wilck wrote:
> On Thu, 2021-09-30 at 00:06 +0200, Peter Rajnoha wrote:
>> On Tue 28 Sep 2021 12:42, Benjamin Marzinski wrote:
>>> On Tue, Sep 28, 2021 at 03:16:08PM +0000, Martin Wilck wrote:
>>>> I have pondered this quite a bit, but I can't say I have a
>>>> concrete
>>>> plan.
>>>> To avoid depending on "udev settle", multipathd needs to
>>>> partially
>>>> revert to udev-independent device detection. At least during
>>>> initial
>>>> startup, we may encounter multipath maps with members that don't
>>>> exist
>>>> in the udev db, and we need to deal with this situation
>>>> gracefully. We
>>>> currently don't, and it's a tough problem to solve cleanly. Not
>>>> relying
>>>> on udev opens up a Pandora's box wrt WWID determination, for
>>>> example.
>>>> Any such change would without doubt carry a large risk of
>>>> regressions
>>>> in some scenarios, which we wouldn't want to happen in our large
>>>> customer's data centers.
>>> I'm not actually sure that it's as bad as all that. We just may
>>> need a
>>> way for multipathd to detect if the coldplug has happened.  I'm
>>> sure if
>>> we say we need it to remove the udev settle, we can get some method
>>> to
>>> check this. Perhaps there is one already, that I don't know about.
>>> If
>> The coldplug events are synthesized and as such, they all now contain
>> SYNTH_UUID=<UUID> key-value pair with kernel>=4.13:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-uevent
>> I've already tried to propose a patch for systemd/udev that would
>> mark
>> all uevents coming from the trigger (including the one used at boot
>> for
>> coldplug) with an extra key-value pair that we could easily match in
>> rules,
>> but that was not accepted. So right now, we could detect that a
>> synthesized uevent happened, though we can't be sure it was the
>> actual
>> udev trigger at boot. For that, we'd need the extra marks. I can give
>> it
>> another try though, maybe if there are more people asking for this
>> functionality, we'll be in a better position for this to be accepted.
> That would allow us to discern synthetic events, but I'm unsure how
> this would help us. Here, what matters is to figure out when we don't
> expect any more of them to arrive.

I think this would require a different approach on the systemd/udev side.
Currently, "udevadm trigger --settle" uses a different UUID for each
synthesized uevent's SYNTH_UUID. This is not exactly how it was meant to be
used. The SYNTH_UUID was also meant to serve as a form of grouping, so in the
case of "udevadm trigger", a single UUID should be used to group all the
uevents generated by that call. This logic could then be enhanced so that a
different SYNTH_UUID is used for each subsystem (e.g. block), hence we could
wait for each subsystem's devices separately, without being held up waiting
for anything else.

So then we could have services like:

And then place our services after that. We'd need to elaborate a bit on
whether finer-grained separation would be needed or not...

If we see this udev settle as the key point, then I think we should probably 
concentrate on enhancing systemd/udev to provide this functionality (and 
primarily the udevadm trigger functionality and waiting for related 
synthesized events). I think the infrastructure to accomplish this is already 
there. It just needs suitable user-space changes (the udevadm trigger).
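
As a rough illustration of the grouping idea (this is not an existing udevadm
feature): the kernel's uevent file accepts "ACTION UUID", and the resulting
synthetic uevent then carries SYNTH_UUID=<UUID>. A minimal Python sketch could
track one shared UUID per subsystem and report when every triggered device has
been seen. The class SynthGroup, its methods and the devpaths used with it are
hypothetical names chosen for this example only.

```python
import uuid


class SynthGroup:
    """Hypothetical helper: one shared SYNTH_UUID for a group of devices."""

    def __init__(self, devpaths):
        # str(uuid.uuid4()) has the xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx form
        # that the sysfs uevent interface expects for the UUID argument.
        self.group_uuid = str(uuid.uuid4())
        # Devices we triggered and have not yet seen a uevent for.
        self.pending = set(devpaths)

    def payload(self, action="add"):
        """Text to write to a device's /sys/.../uevent file: 'ACTION UUID'."""
        return f"{action} {self.group_uuid}"

    def handle_uevent(self, env):
        """Feed one uevent environment (a dict); return True once the group is done."""
        if env.get("SYNTH_UUID") == self.group_uuid:
            self.pending.discard(env.get("DEVPATH"))
        return not self.pending
```

A waiter for the block subsystem would create one SynthGroup for all block
devices, write payload() to each device's uevent file, and then pass every
received uevent to handle_uevent() until it returns True. Uevents with a
different SYNTH_UUID (or none) are ignored, so other subsystems do not delay
the wait.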

> I guess it would be possible to compare the list of (interesting)
> devices in sysfs with the list of devices in the udev db. For
> multipathd, we could
>   - scan set U of udev devices on startup
>   - scan set S of sysfs devices on startup

Well, I think that's exactly the functionality that could be provided by the 
settle separation as described above... And then everybody could benefit from 

>   - listen for uevents for updating both S and U
>   - after each uevent, check if the difference set of S and U is empty
>   - if yes, coldplug has finished
>   - otherwise, continue waiting, possibly until some timeout expires.
> It's more difficult for LVM because you have no daemon maintaining
> state.
> Martin
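
The set comparison Martin outlines can be sketched roughly as follows. This is
only an illustration under stated assumptions, not multipathd code:
ColdplugTracker, wait_for_coldplug and the ("kernel"/"udev", action, devpath)
event tuples are hypothetical, and how the initial sets and the events are
obtained (sysfs scan, udev db query, uevent monitor) is left out.

```python
import time


class ColdplugTracker:
    """Tracks sysfs devices (S) and udev db devices (U) as sets of devpaths."""

    def __init__(self, sysfs_devices, udev_devices):
        self.sysfs = set(sysfs_devices)  # S: devices present in sysfs
        self.udev = set(udev_devices)    # U: devices known to the udev db

    def handle_event(self, source, action, devpath):
        """Update S or U from one event; source is "kernel" or "udev"."""
        if action == "remove":
            # A removed device no longer needs to be waited for.
            self.sysfs.discard(devpath)
            self.udev.discard(devpath)
        elif source == "kernel":
            self.sysfs.add(devpath)
        else:
            self.udev.add(devpath)

    def coldplug_finished(self):
        # Finished when no sysfs device is missing from the udev db.
        return not (self.sysfs - self.udev)


def wait_for_coldplug(tracker, events, timeout, clock=time.monotonic):
    """Consume events until S - U is empty or the timeout expires."""
    deadline = clock() + timeout
    if tracker.coldplug_finished():
        return True
    for source, action, devpath in events:
        tracker.handle_event(source, action, devpath)
        if tracker.coldplug_finished():
            return True
        if clock() >= deadline:
            break
    return False
```

The timeout is only checked between events here, so a real implementation
would also need a bounded wait on the event source itself.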

