[linux-lvm] Discussion: performance issue on event activation mode

Tom Yan tom.ty89 at gmail.com
Thu Jul 8 10:10:56 UTC 2021


I wonder if using a shell wrapper to detect whether pvscan actually
exits way earlier than the unit does is a more accurate approach.

On Sat, 3 Jul 2021 at 19:50, heming.zhao at suse.com <heming.zhao at suse.com> wrote:
>
> On 7/3/21 6:02 AM, David Teigland wrote:
> > On Fri, Jul 02, 2021 at 09:22:03PM +0000, Martin Wilck wrote:
> >> On Fr, 2021-07-02 at 16:09 -0500, David Teigland wrote:
> >>> On Sun, Jun 06, 2021 at 02:15:23PM +0800, heming.zhao at suse.com wrote:
> >>>> dev_cache_scan //order: O(n^2)
> >>>>   + _insert_dirs //O(n)
> >>>>   | if obtain_device_list_from_udev() true
> >>>>   |   _insert_udev_dir //O(n)
> >>>>   |
> >>>>   + dev_cache_index_devs //O(n)
> >>>
> >>> I've been running some experiments and trying some patches to improve
> >>> this.  By setting obtain_device_list_from_udev=0, and using the
> >>> attached
> >>> patch to disable dev_cache_index_devs, the pvscan is much better.
> >>>
> >>> systemctl status lvm2-pvscan appears to show that the pvscan command
> >>> itself runs for only 2-4 seconds, while the service as a whole takes
> >>> around 15 seconds.  See the 16 sec gap below from the end of pvscan
> >>> to the systemd Started message.  If that's accurate, the remaining
> >>> delay
> >>> would lie outside lvm.
> >>>
> >>> Jul 02 15:27:57 localhost.localdomain systemd[1]: Starting LVM event
> >>> activation on device 253:1710...
> >>> Jul 02 15:28:00 localhost.localdomain lvm[65620]:   pvscan[65620] PV
> >>> /dev/mapper/mpathalz online, VG 1ed02c7d-0019-43c4-91b5-f220f3521ba9
> >>> is complete.
> >>> Jul 02 15:28:00 localhost.localdomain lvm[65620]:   pvscan[65620] VG
> >>> 1ed02c7d-0019-43c4-91b5-f220f3521ba9 run autoactivation.
> >>> Jul 02 15:28:00 localhost.localdomain lvm[65620]:   1 logical
> >>> volume(s) in volume group "1ed02c7d-0019-43c4-91b5-f220f3521ba9" now
> >>> active
> >>
> >> Printing this message is really the last thing that pvscan does?
> >
> > I've not seen anything measurable after that message.  However, digging
> > through the command init/exit paths I did find libudev setup/destroy calls
> > that can also be skipped when the command is not accessing libudev info.
> > A quick check seemed to show some further improvement from dropping those.
> > (That will be part of the larger patch isolating libudev usage.)
> >
> > I'm still seeing significant delay between the apparent command exit and
> > the systemd "Started" message, but this will require a little more work to
> > prove.
> >
>
> I applied the patch and got the very similar result.
>
>
> ```with patch
> # systemd-analyze blame
>           27.679s lvm2-pvscan at 134:896.service
>           27.664s lvm2-pvscan at 134:800.service
>           27.648s lvm2-pvscan at 134:960.service
>           27.645s lvm2-pvscan at 132:816.service
>           27.589s lvm2-pvscan at 134:816.service
>           27.582s lvm2-pvscan at 133:992.service
>           ... ...
>
> # systemctl status lvm2-pvscan at 134:896.service
> ... ...
> Jul 03 19:43:02 sle15sp2-base40g systemd[1]: Starting LVM event activation on device 134:896...
> Jul 03 19:43:03 sle15sp2-base40g lvm[24817]:   pvscan[24817] PV /dev/sdalm online, VG vg_sdalm is complet>
> Jul 03 19:43:03 sle15sp2-base40g lvm[24817]:   pvscan[24817] VG vg_sdalm run autoactivation.
> Jul 03 19:43:03 sle15sp2-base40g lvm[24817]:   1 logical volume(s) in volume group "vg_sdalm" now active
> Jul 03 19:43:30 sle15sp2-base40g systemd[1]: Started LVM event activation on device 134:896.
> ```
>
> the 27.679s get from 19:43:30 minus 19:43:02.
>
> and 27s is 10s quick than the lvm without patch (by today's test result)
> ```without patch
> # systemd-analyze blame
>           37.650s lvm2-pvscan at 133:992.service
>           37.650s lvm2-pvscan at 133:1008.service
>           37.649s lvm2-pvscan at 133:896.service
>           37.649s lvm2-pvscan at 134:960.service
>           37.612s lvm2-pvscan at 133:880.service
>           37.612s lvm2-pvscan at 133:864.servic
>           ... ...
>
> # systemctl status lvm2-pvscan at 133:992.service
> ... ...
> Jul 03 19:31:28 sle15sp2-base40g systemd[1]: Starting LVM event activation on device 133:992...
> Jul 03 19:31:30 sle15sp2-base40g lvm[24243]:   pvscan[24243] PV /dev/sdalc online, VG vg_sdalc is complet>
> Jul 03 19:31:30 sle15sp2-base40g lvm[24243]:   pvscan[24243] VG vg_sdalc run autoactivation.
> Jul 03 19:31:30 sle15sp2-base40g lvm[24243]:   1 logical volume(s) in volume group "vg_sdalc" now active
> Jul 03 19:32:05 sle15sp2-base40g systemd[1]: Started LVM event activation on device 133:992.
> ```
>
> I added log in lvm2 code, the time of log "1 logical volume(s) in volume
> group "vg_sdalm" now active" is the real pvscan finished time. Martin
> comment in previous mail is may right, systemd is may busy and delayed to
> detect/see the pvscan service completion.
>
> Thanks,
> heming
>
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm at redhat.com
> https://listman.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/




More information about the linux-lvm mailing list