[linux-lvm] Discussion: performance issue on event activation mode
heming.zhao at suse.com
Tue Jun 8 16:18:17 UTC 2021
On 6/8/21 5:30 AM, David Teigland wrote:
> On Mon, Jun 07, 2021 at 10:27:20AM +0000, Martin Wilck wrote:
>> Most importantly, this was about LVM2 scanning of physical volumes. The
>> number of udev workers has very little influence on PV scanning,
>> because the udev rules only activate systemd service. The actual
>> scanning takes place in lvm2-pvscan@.service. And unlike udev, there's
>> no limit for the number of instances of a given systemd service
>> template that can run at any given time.
> Excessive device scanning has been the historical problem in this area,
> but Heming mentioned dev_cache_scan() specifically as a problem. That was
> surprising to me since it doesn't scan/read devices, it just creates a
> list of device names on the system (either readdir in /dev or udev
> listing.) If there are still problems with excessive scanning/reading,
> we'll need some more diagnosis of what's happening, there could be some
> cases we've missed.
dev_cache_scan() doesn't issue disk I/O directly, but libudev scans/reads the
udev db (located in /run/udev/data), which issues real disk I/O.
We can see that the combination "obtain_device_list_from_udev=0 &
event_activation=1" largely reduces boot time, from 2min6s to 40s.
The key is that dev_cache_scan() then scans for devices by itself (scanning "/dev").
I am not very familiar with systemd-udev; below is a little more info
about the libudev path. The top function is _insert_udev_dir, which:
1. scans/reads /sys/class/block/. O(n)
2. scans/reads the udev db (/run/udev/data). Probably O(n).
   udev calls device_read_db => handle_db_line to handle every
   line of a db file.
3. does qsort & deduplication of the device list. O(n log n) + O(n)
4. does lots of memory allocation & string copying while working.
It takes too much memory. On the host side, 'top' shows:
- direct activation used only 2G of memory during boot
- event activation cost ~20G of memory.
I didn't test the related udev code, but I guess step <2> takes too much time.
There are thousands of scanning jobs running in parallel in /run/udev/data, while
at the same time many devices need to generate their udev db files in the same
dir. I am not sure the filesystem can handle this scenario well.
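
To make steps 1 to 3 above concrete, here is a rough Python sketch of what a libudev-style enumeration does per caller. It is not the libudev or LVM2 code. The function names are made up for illustration, and it assumes the usual layout where /sys/class/block/<name>/dev holds "major:minor" and the db file for a block device is /run/udev/data/b<major>:<minor>, with "S:" lines for symlinks and "E:" lines for properties.

```python
import os

def list_block_devices(sys_block="/sys/class/block"):
    """Step 1: one entry per block device, O(n) directory reads."""
    devs = []
    for name in os.listdir(sys_block):
        try:
            with open(os.path.join(sys_block, name, "dev")) as f:
                devs.append((name, f.read().strip()))  # "major:minor"
        except OSError:
            continue  # device vanished or has no dev attribute
    return devs

def read_udev_db(devnum, db_dir="/run/udev/data"):
    """Step 2: parse one db file line by line (assumed S:/E: format)."""
    links, props = [], {}
    try:
        with open(os.path.join(db_dir, "b" + devnum)) as f:
            for line in f:
                line = line.rstrip("\n")
                if line.startswith("S:"):
                    links.append("/dev/" + line[2:])
                elif line.startswith("E:") and "=" in line:
                    key, value = line[2:].split("=", 1)
                    props[key] = value
    except OSError:
        pass  # db file not written yet (udev still processing the device)
    return links, props

def enumerate_device_names(sys_block="/sys/class/block",
                           db_dir="/run/udev/data"):
    """Steps 1-3: collect names and aliases, then sort and deduplicate."""
    names = []
    for name, devnum in list_block_devices(sys_block):
        links, _props = read_udev_db(devnum, db_dir)
        names.append("/dev/" + name)
        names.extend(links)
    return sorted(set(names))
```

Every caller opens and parses one db file per device, so with thousands of callers running at once the number of file reads in /run/udev/data grows roughly as (callers x devices).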
The other code path, obtain_device_list_from_udev=0, scans/reads "/dev"
instead; this dir has fewer write I/Os than /run/udev/data.
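
For comparison, the "/dev" path can be sketched in the same spirit. Again this is only an illustration and not the LVM2 implementation. The function names and the "wanted" parameter are hypothetical. It walks the directory tree and keeps entries that stat() reports as block devices, without opening any udev db file.

```python
import os
import stat

def is_block_device(path):
    """True if path (following symlinks) is a block device node."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        return False  # dangling symlink or node removed during the scan

def scan_dev_dir(dev_dir="/dev", wanted=is_block_device):
    """Walk dev_dir and return the sorted paths accepted by 'wanted'."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(dev_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if wanted(path):
                found.append(path)
    return sorted(found)
```

Here the per-device cost is one stat() call rather than reading and parsing a db file.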