[linux-lvm] Discussion: performance issue on event activation mode

Martin Wilck martin.wilck at suse.com
Mon Jun 7 10:27:20 UTC 2021

On So, 2021-06-06 at 11:35 -0500, Roger Heflin wrote:
> This might be a simpler way to control the number of threads at the
> same time.
> On large machines (cpu wise, memory wise and disk wise).   I have
> only seen lvm timeout when udev_children is set to default.   The
> default seems to be set wrong, and the default seemed to be tuned for
> a case where a large number of the disks on the machine were going to
> be timing out (or otherwise really really slow), so to support this
> case a huge number of threads was required..    I found that with it
> set to default on a close to 100 core machine that udev got about 87
> minutes of time during the boot up (about 2 minutes).  Changing the
> number of children to =4 resulted in udev getting around 2-3 minutes
> in the same window, and actually resulted in a much faster boot up
> and a much more reliable boot up (no timeouts).

Wow, setting the number of children to 4 is pretty radical. We often
decrease this parameter on large machines, but we have never gone all
the way down to a single-digit number. If that's really necessary under
any circumstances, it's clear evidence of udev's deficiencies.
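For reference, the worker limit can be capped in a few places (as
documented in systemd-udevd(8); verify the exact option names against
your systemd version). A sketch:

```shell
# /etc/udev/udev.conf -- persistent cap on udev worker processes,
# takes effect after udevd is restarted:
children_max=4

# Or on the kernel command line, so the cap applies from early boot:
#   udev.children_max=4

# Or at runtime, without restarting udevd:
#   udevadm control --children-max=4
```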

I am not sure it's better than Heming's suggestion, though. It would
affect every device in the system: udev could not process more than 4
events at the same time, even for totally unrelated devices.

Most importantly, this discussion was about LVM2 scanning of physical
volumes. The number of udev workers has very little influence on PV
scanning, because the udev rules only activate a systemd service; the
actual scanning takes place in lvm2-pvscan@.service. And unlike udev,
there is no limit on the number of instances of a given systemd service
template that can run at any given time.
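For illustration, the rule in question looks roughly like the following
(simplified from lvm2's udev rules; the exact rule text varies by lvm2
version). It merely queues one templated service instance per PV, so the
scan itself runs outside the udev worker:

```shell
# Sketch of the udev rule: tag LVM2 PVs so that systemd starts one
# lvm2-pvscan@<major>:<minor>.service instance per detected device.
ACTION=="add|change", SUBSYSTEM=="block", ENV{ID_FS_TYPE}=="LVM2_member", \
    ENV{SYSTEMD_WANTS}+="lvm2-pvscan@$major:$minor.service"
```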

Note that there have been various changes in the way udev calculates
the default number of workers; what udev will use by default depends on
the systemd version and may even be patched by the distribution.

> Below is one case, but I know there are several other similar cases
> for other distributions.    Note the number of default workers = 8 +
> number_of_cpus * 64 which is going to be a disaster as it will result
> in one thread per disk/lun being started at the same time or the
> max_number_of_workers. 

What distribution are you using? This is not the default formula for
children-max any more, and hasn't been for a while.

