[PATCH 4/4] qemu_passt: Don't let passt fork off

Stefano Brivio sbrivio at redhat.com
Thu Feb 16 09:15:12 UTC 2023


On Thu, 16 Feb 2023 09:52:27 +0100
Michal Prívozník <mprivozn at redhat.com> wrote:

> On 2/15/23 19:30, Stefano Brivio wrote:
> > On Wed, 15 Feb 2023 18:04:56 +0100
> > Michal Prívozník <mprivozn at redhat.com> wrote:
> >   
> >> On 2/15/23 08:50, Laine Stump wrote:  
> >>> On 2/14/23 8:02 AM, Stefano Brivio wrote:    
> >>>> On Tue, 14 Feb 2023 12:51:22 +0100
> >>>> Michal Privoznik <mprivozn at redhat.com> wrote:
> >>>>    
> >>>>> When passt starts it tries to do some security measures to
> >>>>> restrict itself. For instance, it creates its own namespaces,
> >>>>> umounts basically everything, drops capabilities, forks off to
> >>>>> further restrict itself (the child is where all interesting work
> >>>>> takes place now). This is sound, except it's causing two
> >>>>> problems:
> >>>>>
> >>>>> 1) the PID file FD, which we leak into the passt process, gets
> >>>>>     closed (and thus our virPidFile*() helpers see unlocked PID
> >>>>>     file, which makes them think the process is gone),    
> >>>>
> >>>> I didn't realise this was the case, but giving passt write (unless I'm
> >>>> missing something) access to a file created by libvirtd doesn't look
> >>>> desirable to me.    
> >>>     
> >>>>    
> >>>>> 2) the PID file no longer reflects true PID of the process.
> >>>>>
> >>>>> Worse, the child calls setsid() so we can't even kill the whole
> >>>>> process group. I mean, we can but it won't be any good.    
> >>>
> >>> I think that (incorrect PID in the pidfile) is  happening because Michal
> >>> is using the original version of my patches that were pushed - I had
> >>> mimicked the behavior of slirp, where libvirt deamonizes the new
> >>> process. If that process then daemonizes itself, we have some sort of
> >>> "double daemon"; libvirt has saved off the pid of what it thinks is
> >>> going to be the final process, but then that process further forks and
> >>> exits from the process whose pid libvirt saved. But because passt was
> >>> cleaning up after itself I hadn't noticed the discrepancy in pids when
> >>> testing.
> >>>
> >>> Without going into all the details of the pidfile and locking and etc, I
> >>> just want to say that if we can fork/exec dnsmasq and let it daemonize
> >>> itself and create its own pidfile, then certainly we can do the same
> >>> thing for passt. (and if there's a fundamental problem, then it's a
> >>> fundamental problem for dnsmasq as well).    
> >>
> >> Alright. I think I have a solution that would please everybody involved.
> >> I'll post it tomorrow though. I need to test it thoroughly. We would be
> >> able to get passt's PID (which is needed not only for killing it, but
> >> also for CGroup placement), NOT use --foreground and still pass errors
> >> from it to users (that is unless logfile was specified, because
> >> unfortunately, --log-file and --stderr are mutually exclusive).  
> > 
> > That doesn't need to be the case (--log-file and --stderr being
> > mutually exclusive)... if you have a use case for it, let's change that
> > in passt. I just wanted to keep it simple for users ("give a log file,
> > and be sure it won't spam").
> > 
> > Also mind that Laine's series:
> >   https://archives.passt.top/passt-dev/20230215082437.110151-1-laine@redhat.com/  
> 
> Thanks, this looks exactly like what we need. So for now I can just pass
> --stderr if there's no --log-file, to deal with those "releases" that
> don't have those patches merged yet.

I wouldn't even bother with that, the user base (especially with
libvirt) is small enough that we can be quite confident that 100% of the
users will upgrade as soon as a new release (most likely coming this
week) includes them. I'd suggest to keep it simple, because you really
can.

Those are actual releases. Their naming just doesn't follow "semantic
versioning".

> > *should* already cover all the cases where libvirt is interested in
> > relaying "early" errors back to the user.
> > 
> > By the way, the one below is pretty much the patch I would have proposed
> > for libvirt. I prepared it earlier today and didn't have a chance to
> > test it yet, it's compile-tested only, and doesn't take cgroups into
> > account (which, it seems, is needed no matter the lifecycle).
> > 
> > So I'm sharing it here as reference (that's how simple I wanted it to
> > be -- minus cgroups), or if it's convenient for you to copy and paste
> > something.
> 
> This effectively disables placing passt into the CGroup set up for
> emulator thread. And I don't think we want that. Firstly, it makes
> statistics gathering report incorrect values. Secondly, these helper
> processes are "implementation detail" - I mean, users don't really care
> (from accounting POV) whether a task runs in emulator thread inside of
> QEMU or in a separate process. It's still an emulation and as such
> should be accounted for. And also, on NUMA machines we definitely want
> to place passt as close to the emulator as possible (i.e. if emulator
> thread is pinned than helper processes should be pinned too).

Yes, definitely, I see now -- I thought, earlier, that cgroups were just
used to handle lifecycles at the moment.

> [...]

-- 
Stefano


More information about the libvir-list mailing list