[libvirt] [Qemu-devel] [RFC 0/5] block: File descriptor passing using -open-hook-fd

Anthony Liguori anthony at codemonkey.ws
Tue May 1 21:52:05 UTC 2012


On 05/01/2012 03:56 PM, Eric Blake wrote:
> On 05/01/2012 02:25 PM, Anthony Liguori wrote:
>> Thanks for sending this out Stefan.
>
> Indeed.
>
>
>>> This series adds the -open-hook-fd command-line option.  Whenever QEMU
>>> needs to open an image file it sends a request over the given UNIX domain
>>> socket.  The response includes the file descriptor or an errno on failure.
>>> Please see the patches for details on the protocol.
>>>
>>> The -open-hook-fd approach allows QEMU to support file descriptor passing
>>> without changing -drive.  It also supports snapshot_blkdev and other
>>> commands that re-open image files.
>>>
>>> Anthony Liguori <aliguori at us.ibm.com> wrote most of these patches.  I
>>> added a demo -open-hook-fd server and added some small fixes.  Since
>>> Anthony is traveling right now I'm sending the RFC for discussion.
>>
>> What I like about this approach is that it's useful outside the block
>> layer and is conceptually simple from a QEMU PoV.  We simply delegate
>> open() to libvirt and let libvirt enforce whatever rules it wants.
>>
>> This is not meant to be an alternative to blockdev; even with blockdev,
>> I think we still want to use a mechanism like this.
>
> The overall series looks like it would be rather interesting.  What sort
> of timing restrictions are there?  For example, the proposed
> 'drive-reopen' command (probably now deferred to qemu 1.2) would mean
> that qemu would be calling back into libvirt in order to do the reopen.
> If libvirt takes its time in passing back an open fd, is it going to
> starve qemu from answering unrelated monitor commands in the meantime?

s/libvirt/kernel/g and your concerns are equally valid.

open() should never be called in a path that could block things.  There's
always the possibility that we're on top of NFS and the open could time out.

For something like drive_reopen, we should use an asynchronous open() that
dispatches the open() in the posix-aio thread pool.

That's part of what's nice about this approach: we could still call file_open()
in the posix-aio thread pool...
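
To make the idea concrete, here is a rough sketch of dispatching open() to a
worker pool so the caller never blocks on it.  This is illustration only, not
QEMU code: it is written in Python rather than C, a ThreadPoolExecutor stands
in for the posix-aio thread pool, and open_async() and its callback signature
are made-up names.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in worker pool (assumption); QEMU's posix-aio pool is not modeled here.
pool = ThreadPoolExecutor(max_workers=4)


def open_async(path, flags, on_done):
    """Run open() on a worker thread and report (fd, err) to on_done.

    fd is -1 and err is an errno value on failure; err is 0 on success.
    """
    def worker():
        try:
            return os.open(path, flags), 0
        except OSError as e:
            return -1, e.errno

    future = pool.submit(worker)
    # The callback runs on the worker thread that completes the future (or
    # immediately in the caller if it already finished).  A real event loop
    # would hand the result back to the main loop instead.
    future.add_done_callback(lambda f: on_done(*f.result()))
    return future


def demo():
    """Issue two opens without blocking on either, then collect the results."""
    results = {}
    done = threading.Semaphore(0)

    def record(name):
        def on_done(fd, err):
            results[name] = err
            if fd >= 0:
                os.close(fd)
            done.release()
        return on_done

    with tempfile.NamedTemporaryFile() as tmp:
        open_async(tmp.name, os.O_RDONLY, record("existing"))
        open_async(tmp.name + ".missing", os.O_RDONLY, record("missing"))
        # The caller could keep servicing other events here.
        done.acquire()
        done.acquire()
    return results


if __name__ == "__main__":
    print(demo())
```

A slow open (a hung NFS server, or a management daemon taking its time to
answer) then only ties up one worker, not the thread answering monitor
commands.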

> I definitely want to make sure we avoid deadlock where libvirt is
> waiting on a monitor command, but the monitor command is waiting on
> libvirt to pass an fd.
>
> Is this also an opportunity to request whether a particular fd must be
> seekable vs. acceptable as a one-pass read or write, perhaps by whether
> the command is 1 (seekable open) or 2 (one-pass open)?

I'm not really sure where the distinction lies...

I want the RPC to behave exactly like open().  So if we're assuming that open() 
of a /dev/ file returns something that is ioctl()'able, then that's what libvirt 
should return.

If we want to sort of do fd-transformation where a special protocol is used for 
things like ioctl, that's fine, but it ought to be a different mechanism (that's 
probably not nearly as generic).
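
As a rough sketch of what "an RPC that behaves like open()" could look like:
the client sends a path and flags, and gets back either an fd (passed as
SCM_RIGHTS ancillary data) or an errno.  The wire format below is invented for
illustration and is NOT the protocol from the patches; serve_one_open(),
remote_open() and the directory-based policy check are hypothetical.  It needs
Python 3.9+ on a Unix host for socket.send_fds()/recv_fds().

```python
import errno
import json
import os
import socket
import struct
import tempfile
import threading

# Hypothetical wire format (an assumption, not the RFC's protocol):
#   request:  JSON {"path": ..., "flags": ...}, assumed to fit in one recv()
#   response: 4-byte signed errno, 0 on success; on success the fd is
#             attached as SCM_RIGHTS ancillary data.


def serve_one_open(sock, allowed_dir):
    """Management side: apply a policy, open(), and pass the fd back."""
    allowed_dir = os.path.realpath(allowed_dir)
    req = json.loads(sock.recv(4096).decode())
    path = os.path.realpath(req["path"])
    if os.path.commonpath([path, allowed_dir]) != allowed_dir:
        sock.sendall(struct.pack("!i", errno.EACCES))
        return
    try:
        fd = os.open(path, req["flags"])
    except OSError as e:
        sock.sendall(struct.pack("!i", e.errno))
        return
    try:
        socket.send_fds(sock, [struct.pack("!i", 0)], [fd])
    finally:
        os.close(fd)  # the receiver holds its own copy now


def remote_open(sock, path, flags):
    """Client side stand-in for open(): returns an fd or raises OSError."""
    sock.sendall(json.dumps({"path": path, "flags": flags}).encode())
    data, fds, _msg_flags, _addr = socket.recv_fds(sock, 4, 1)
    (err,) = struct.unpack("!i", data)
    if err != 0:
        raise OSError(err, os.strerror(err), path)
    return fds[0]


def demo():
    """Round-trip one allowed and one denied open over a socketpair."""
    results = []
    with tempfile.TemporaryDirectory() as d:
        image = os.path.join(d, "disk.img")
        with open(image, "wb") as f:
            f.write(b"image-bytes")
        for path in (image, "/etc/hostname"):
            client, server = socket.socketpair()  # AF_UNIX, SOCK_STREAM
            t = threading.Thread(target=serve_one_open, args=(server, d))
            t.start()
            try:
                fd = remote_open(client, path, os.O_RDONLY)
                results.append(os.read(fd, 64))
                os.close(fd)
            except OSError as e:
                results.append(e.errno)
            t.join()
            client.close()
            server.close()
    return results


if __name__ == "__main__":
    print(demo())
```

The point of the shape is that the caller sees nothing but open() semantics:
an fd it can use however it would use a locally opened one, or an errno.  All
policy lives on the other end of the socket.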

> For example,
> migration is one-pass (and therefore libvirt passes a pipe which is
> hooked up to a helper app that uses O_DIRECT), while block devices must
> be seekable.

But migration doesn't involve doing an open().  This is not a replacement for fd 
passing.  This is a replacement for open() to make up for the facts that (1) 
some management tools like libvirt cannot isolate guests with DAC and (2) 
SELinux cannot be used to isolate guests across all file systems.

I would really prefer that the kernel fix this problem for us, but from what I'm
told, the problem lies with the NFS standards committee, so short of forking the
NFS protocol, there isn't much that the kernel can do.

Regards,

Anthony Liguori
