[libvirt] New QEMU daemon for persistent reservations

Mon Aug 28 11:11:31 UTC 2017

On 08/25/2017 12:41 AM, Paolo Bonzini wrote:
> On 22/08/2017 18:27, Paolo Bonzini wrote:
>> Hi all,

Hey, sorry for the late reply. I was enjoying my PTO by not reading e-mails :-)

>>
>> I am adding a new daemon to QEMU, that QEMU can connect to in order to
>> issue persistent reservation commands.

Persistent reservation of what?

>>
>> The daemon can only issue the commands on file descriptors that QEMU
>> already has.  In addition normal users shouldn't have access to the
>> daemon's Unix socket in /run, so the daemon is protected against misuse.
>>
>> My question is what is the best way to handle the connection to the
>> daemon socket.  Currently, the path to the socket is passed to QEMU on
>> the command line:
>>
>>  -object pr-manager-helper,id=mgr,path=/run/qemu-pr-helper.sock \
>>  -drive if=none,id=hd,driver=raw,filename=/dev/sdb,file.pr-manager=mgr \
>>  -device scsi-block,drive=hd
>>
>> (the new parts are "-object pr-manager-helper" and "file.pr-manager").
>>
>> I could just make it root:root and pass a file descriptor from libvirt
>> to QEMU, but this would make it impossible for QEMU to reconnect to the
>> daemon in case someone does a "systemctl restart" or even just kills it
>> inadvertently.  The daemon is stateless, so transparent reconnection
>> would be a nice feature to have.
>>
>> The alternative is to somehow label the daemon socket so that it can be
>> accessed by QEMU, but I'm not very well versed in SELinux.
> 
> Thinking more about it, Libvirt could start the daemon on its own,
> assigning a socket per virtual machine.  SELinux MCS should then just
> work, because the same category is assigned to the daemon instance and QEMU.

Whoa, libvirt is not really prepared for qemu spawning processes, or
having more than one process per qemu domain. But it shouldn't be that
hard to prepare internal structs for that.

> 
> In particular, Libvirt could create the socket itself, label it, and
> pass it to the daemon through the systemd socket activation protocol
> (which qemu-pr-helper supports).

We can pass FDs to qemu (in fact, to any process), even while it's
running. So that shouldn't be a problem.
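
To make the socket activation part concrete, here is a minimal sketch
in Python (not libvirt or qemu-pr-helper code) of how the activated
side finds the sockets it inherited. The protocol itself is the
documented one: inherited descriptors start at fd 3, LISTEN_FDS holds
their count and LISTEN_PID must match the receiving process. The
function names listen_fds() and activation_env() are made up for
illustration.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited fd in the socket activation protocol


def activation_env(child_pid, nfds):
    """Environment the spawning side (e.g. libvirt) would set for the child."""
    return {"LISTEN_PID": str(child_pid), "LISTEN_FDS": str(nfds)}


def listen_fds(environ, pid):
    """Return the inherited listening fds, or [] if none are meant for `pid`."""
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # variables were set for some other process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed
    if count < 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


if __name__ == "__main__":
    # What a daemon would do at startup.
    print(listen_fds(os.environ, os.getpid()))
```

The catch for the spawning side is that LISTEN_PID has to be the
child's PID, so it can only be set after fork() and before exec().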

> 
> The XML to use the helper with a predefined socket could be:
> 
> 	<disk ...>
> 	   <pr mode='connect'>/path/to/unix.socket</pr>
> 	</disk>

Do we want or need to expose the path here? I mean, is the user expected
to do something with it? We don't expose the monitor path anywhere but
keep it private (of course we store it in the so-called status XML,
which is persistent storage kept solely for reloading the internal
state when the daemon is restarted).

> 
> while to use it with a dedicated daemon
> 
> 	<disk ...>
> 	   <pr mode='private'>/path/to/qemu-pr-helper</pr>
> 	</disk>

Ah, so there isn't a 1:1 relationship between the qemu process and the
daemon helper. One daemon can serve multiple qemu processes, am I right?
Also, how would libvirt know if the daemon helper dies? I mean, if
libvirt is to start it again (there are some systemd-less distros), we
have to know that such a situation happened. For instance, could we get
an event on the monitor, in response to which we start the daemon and
pass a new FD for its socket to qemu? Although, this would mean
significant work on the libvirt side in case there's a 1:N relationship,
because on delivery of the event from two domains we have to figure out
whether the domains are supposed to have their own daemons or one shared.
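
For reference, handing a descriptor to an already running process is
done on Linux with SCM_RIGHTS ancillary data over a Unix socket. A
minimal Python sketch of just that mechanism (not libvirt code;
send_fd() and recv_fd() are made-up names, and a socketpair stands in
for the real connection):

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor over a connected Unix socket."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    sock.sendmsg([b"F"], ancillary)  # at least one byte of real data is required


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before decoding.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


if __name__ == "__main__":
    a, b = socket.socketpair()   # stand-in for the libvirt <-> qemu channel
    r, w = os.pipe()             # stand-in for the new daemon socket
    send_fd(a, w)
    w2 = recv_fd(b)              # a duplicate of `w` in the receiver
    os.write(w2, b"ok")
    print(os.read(r, 2))
```

The receiver gets its own duplicate of the descriptor, so the sender
can close its copy once the message has been sent.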

Also, what happens when the daemon dies? What's the qemu state at that
point? Is the guest still running or is it paused (e.g. like on ENOSPC
error on disks)?

Michal