[RFC 0/1] Check for pid re-use before killing domain process

Jonathon Jongsma jjongsma at redhat.com
Tue Oct 11 16:20:00 UTC 2022


I believe the pidfd syscalls were only introduced in the 5.x kernels 
(pidfd_send_signal in 5.1 and pidfd_open in 5.3). Judging by our CI build 
setup, the oldest distribution that we support is Alma Linux 8, which still 
has kernel version 4.18.


On 10/6/22 4:33 AM, manish.mishra wrote:
> Libvirt stores the pid of the domain (e.g. QEMU) process when the process is
> started, and uses the same pid when destroying it. There is always a possible
> race: before libvirt tries to kill the domain process, the actual process may
> already be dead and its pid re-used by another process. In that case libvirt
> may kill an innocent process.
> 
> With this patch we store the start time of the domain process when the domain
> is started, and compare it against the current start time (stime) before
> killing the process. If the two do not match, we can assume the domain process
> is already dead.
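
For illustration, here is a minimal sketch of what such a start-time check
could look like on Linux. It assumes the start time is taken from field 22
(starttime, in clock ticks since boot) of /proc/<pid>/stat; the patch itself
may obtain it differently. The names read_proc_start_time and
start_time_matches are hypothetical and not taken from the patch.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read field 22 (starttime) of /proc/<pid>/stat.
 * Returns 0 and fills *start on success, -1 on any failure. */
static int
read_proc_start_time(pid_t pid, unsigned long long *start)
{
    char path[64];
    char buf[1024];
    FILE *fp;
    size_t len;
    char *p;
    int field;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    len = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[len] = '\0';

    /* Field 2 (comm) may itself contain spaces and ')', so locate the
     * last ')' and count fields from there. */
    p = strrchr(buf, ')');
    if (!p)
        return -1;
    p++;
    if (*p != ' ')
        return -1;

    /* p is at the space before field 3; advance to the space before
     * field 22. */
    for (field = 3; field < 22; field++) {
        p = strchr(p + 1, ' ');
        if (!p)
            return -1;
    }
    return sscanf(p, "%llu", start) == 1 ? 0 : -1;
}

/* Hypothetical helper: returns 1 if pid currently has the stored start
 * time, 0 if it differs or the process cannot be inspected (treated as
 * "the original process is gone"). */
static int
start_time_matches(pid_t pid, unsigned long long stored)
{
    unsigned long long now;

    if (read_proc_start_time(pid, &now) < 0)
        return 0;
    return now == stored;
}
```

On its own this comparison still leaves a window between the check and the
kill(), which is what the pidfd steps in the proposed changes are meant to
close.
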
> 
> This patch series tries to create a generic interface which can be used for
> non-domain processes too, even though the initial series handles pid re-use
> for the domain process only.
> 
> Proposed changes:
> 1. While creating a daemon process through virExec, create a file which stores
>     the start time of the process, alongside the pid file. A separate file is
>     used instead of also storing the start time in the *.pid file so that
>     existing external users of the <domain-id>*.pid file do not break.
> 2. Persist stime in domstatus in domain xml.
> 3. In virProcessKill, verify the start time of the process before sending it a
>     signal. The signal is sent with pidfd_send_signal to avoid any race between
>     verifying the start time and sending the signal. The following steps are
>     taken while killing a process:
>     3.1 Open a pid-fd for the given pid with pidfd_open.
>     3.2 Verify the start time of the pid against the stored start time. If the
>         start time does not match, assume the process is already dead.
>     3.3 Send the signal to the pid-fd with pidfd_send_signal. Even if the pid
>         was re-used just after verifying the start time, the signal is sent to
>         the pidfd, which was opened before the start time was verified.



> 
> manish.mishra (1):
>    Check for pid re-use before killing qemu process
> 
>   src/conf/domain_conf.c                        |  14 ++-
>   src/conf/domain_conf.h                        |   1 +
>   src/libvirt_private.syms                      |   7 ++
>   src/qemu/qemu_domain.h                        |   1 +
>   src/qemu/qemu_process.c                       |  49 ++++++--
>   src/util/vircommand.c                         |  37 ++++++
>   src/util/vircommand.h                         |   3 +
>   src/util/virpidfile.c                         | 113 +++++++++++++-----
>   src/util/virpidfile.h                         |  17 +++
>   src/util/virprocess.c                         |  73 ++++++++++-
>   src/util/virprocess.h                         |   8 ++
>   .../qemustatusxml2xmldata/backup-pull-in.xml  |   2 +-
>   .../blockjob-blockdev-in.xml                  |   2 +-
>   .../blockjob-mirror-in.xml                    |   2 +-
>   .../migration-in-params-in.xml                |   2 +-
>   .../migration-out-nbd-bitmaps-in.xml          |   2 +-
>   .../migration-out-nbd-in.xml                  |   2 +-
>   .../migration-out-nbd-out.xml                 |   2 +-
>   .../migration-out-nbd-tls-in.xml              |   2 +-
>   .../migration-out-nbd-tls-out.xml             |   2 +-
>   .../migration-out-params-in.xml               |   2 +-
>   tests/qemustatusxml2xmldata/modern-in.xml     |   2 +-
>   tests/qemustatusxml2xmldata/upgrade-in.xml    |   2 +-
>   tests/qemustatusxml2xmldata/upgrade-out.xml   |   2 +-
>   .../qemustatusxml2xmldata/vcpus-multi-in.xml  |   2 +-
>   25 files changed, 295 insertions(+), 56 deletions(-)
> 


