[linux-lvm] Discussion: performance issue on event activation mode
prajnoha at redhat.com
Tue Jun 8 13:56:49 UTC 2021
On Tue 08 Jun 2021 15:46, Zdenek Kabelac wrote:
> On 08. 06. 21 at 15:41, Peter Rajnoha wrote:
> > On Tue 08 Jun 2021 13:23, Martin Wilck wrote:
> > > On Tue, 2021-06-08 at 14:29 +0200, Peter Rajnoha wrote:
> > > > On Mon 07 Jun 2021 16:48, David Teigland wrote:
> > > > > If there are say 1000 PVs already present on the system, there could be
> > > > > real savings in having one lvm command process all 1000, and then switch
> > > > > over to processing uevents for any further devices afterward. The switch
> > > > > over would be delicate because of the obvious races involved with new devs
> > > > > appearing, but probably feasible.
> > > > Maybe to avoid the race, we could possibly write the proposed
> > > > "/run/lvm2/boot-finished" right before we initiate scanning in
> > > > "vgchange -aay" that is a part of the lvm2-activation-net.service (the
> > > > last service to do the direct activation).
> > > >
> > > > A few event-based pvscans could fire during the window between
> > > > "scan initiated phase" in lvm2-activation-net.service's
> > > > "ExecStart=vgchange -aay..."
> > > > and the originally proposed
> > > > "ExecStartPost=/bin/touch /run/lvm2/boot-finished",
> > > > but I think still better than missing important uevents completely in
> > > > this window.
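
To make the ordering concrete, here is a minimal Python sketch of the
marker-file handoff. It is purely illustrative: the function names
(start_direct_activation, handle_uevent) and the callbacks are hypothetical,
and only the /run/lvm2/boot-finished path comes from the proposal above.
The marker is written before the scan starts, so a uevent arriving during
the scan is handled by the event path (possibly redundantly) rather than
dropped.

```python
import os
import tempfile


def start_direct_activation(marker_path, scan_and_activate):
    """Create the marker first, then run the one-shot scan (hypothetical helper)."""
    os.makedirs(os.path.dirname(marker_path), exist_ok=True)
    # Writing the marker BEFORE scanning means events that arrive while the
    # scan is running are already handled by the event-based path.
    with open(marker_path, "w"):
        pass
    scan_and_activate()


def handle_uevent(marker_path, device, activate_device):
    """Event-based path: act on a device only once the marker exists."""
    if not os.path.exists(marker_path):
        # Direct activation has not started yet; it will pick this device up.
        return False
    activate_device(device)
    return True


if __name__ == "__main__":
    # Use a temporary directory instead of /run/lvm2 so the demo is harmless.
    marker = os.path.join(tempfile.mkdtemp(), "lvm2", "boot-finished")
    handled = []
    print(handle_uevent(marker, "sda", handled.append))  # before the marker
    start_direct_activation(marker, lambda: None)
    print(handle_uevent(marker, "sdb", handled.append))  # after the marker
    print(handled)
```

The cost of this ordering is that a device may be processed twice (once by
the scan, once by its event), which the activation step would have to
tolerate.
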
> > > That sounds reasonable. I was thinking along similar lines. Note that
> > > in the case where we had problems lately, all actual activation (and
> > > slowness) happened in lvm2-activation-early.service.
> > >
> > Yes, I think most of the activations are covered with the first service
> > where most of the devices are already present, then the rest is covered
> > by the other two services.
> > Anyway, I'd still like to know why exactly
> > obtain_device_list_from_udev=1 is so slow. The only thing that it does
> > is that it calls libudev's enumeration for "block" subsystem devs. We
> > don't even check if the device is initialized in udev in this case if I
> > remember correctly, so if there's any udev processing in parallel happening,
> > it shouldn't be slowing down. BUT we're waiting for udev records to
> > get initialized for filtering reasons, like mpath and MD component detection.
> > We should probably inspect this in detail and see where the time is really
> > taken underneath before we do any further changes...
> This reminds me - did we already fix the annoying problem of 'repeated' sleep
> for every 'unfinished' udev initialization?
> I believe there should be exactly one sleep attempt to wait for udev and if
> it doesn't work - go without.
> But I've seen some trace where the sleep was repeated for each device where
> udev was 'uninitialized'.
> Clearly this doesn't fix the problem of 'uninitialized udev' but at least
> avoids an extremely lengthy sleeping lvm command.
The sleep + iteration is still there!
The issue is that we're relying now on udev db records that contain
info about mpath and MD components - without this, the detection (and
hence filtering) could fail in certain cases. So if we go without checking
udev db, that'll be a step back. As an alternative, we'd need to call
out mpath and MD directly from LVM2 if we really wanted to avoid
checking udev db (but then, we're checking the same thing that is
already checked by udev means).
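
For illustration only, the "exactly one wait" idea could look roughly like
the Python sketch below. This is not LVM2's code: wait_for_udev, its
parameters and the default values are made up for the example, and
is_initialized stands in for whatever check tells us a device's udev db
record is ready. The point is a single deadline shared by all devices, so
the total sleep time does not grow with the number of uninitialized devices.

```python
import time


def wait_for_udev(devices, is_initialized, budget_s=1.0, poll_s=0.1,
                  sleep=time.sleep, clock=time.monotonic):
    """Wait for udev records using ONE deadline shared by all devices.

    Returns (ready, not_ready). Names and defaults are hypothetical.
    """
    deadline = clock() + budget_s
    pending = list(devices)
    ready = []
    while pending:
        still_pending = []
        for dev in pending:
            if is_initialized(dev):
                ready.append(dev)
            else:
                still_pending.append(dev)
        pending = still_pending
        # Stop as soon as everything is ready or the shared budget is spent.
        if not pending or clock() >= deadline:
            break
        sleep(poll_s)
    return ready, pending


if __name__ == "__main__":
    # Pretend only "sda" ever gets its udev record initialized.
    ready, not_ready = wait_for_udev(["sda", "sdb", "sdc"],
                                     lambda dev: dev == "sda",
                                     budget_s=0.3, poll_s=0.1)
    print(ready, not_ready)
```

Devices left in not_ready would still need some fallback for the mpath and
MD component detection, which is exactly the part that is hard to do
without the udev db.
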