[libvirt] [PATCHv9 5/9] blockjob: make drive-reopen safer

Peter Krempa pkrempa at redhat.com
Fri Oct 26 13:08:46 UTC 2012


On 10/23/12 04:10, Eric Blake wrote:
> Since libvirt drops locks between issuing a monitor command and
> getting a response, it is possible for libvirtd to be restarted
> before getting a response on a drive-reopen command; worse, it is
> also possible for the guest to shut itself down during the window
> while libvirtd is down, ending the qemu process.  A management app
> needs to know if the pivot happened (and the destination file
> contains guest contents not in the source) or failed (and the source
> file contains guest contents not in the destination), but since
> the job is finished, 'query-block-jobs' no longer tracks the
> status of the job, and if the qemu process itself has disappeared,
> even 'query-block' cannot be checked to ask qemu its current state.
>
> This is mainly a problem for the RHEL 6.3 drive-reopen command, which
> partly explains why upstream qemu 1.3 abandoned that command and
> went with block-job-complete plus persistent bitmap instead.  At
> the time of this patch, the design for persistent bitmap has not
> been clarified, so a followup patch will be needed once we actually
> figure out how to use the qemu 1.3 interface.
>
> If we surround 'drive-reopen' with a pause/resume pair, then we can
> guarantee that the guest cannot modify either source or destination
> files in the window of libvirtd uncertainty.  The management app is
> then guaranteed one of two outcomes: either libvirt knows the result
> and reported it correctly, or, on libvirtd restart, the guest is
> still paused and the qemu process cannot have disappeared due to
> guest shutdown.  In the latter case, the paused state is a clue that
> the management app must implement a recovery protocol, with both
> source and destination files still in sync and with 'query-block'
> still available as part of that recovery.  My testing of the RHEL 6.3
> implementation of 'drive-reopen' shows that the pause window will
> typically be only a fraction of a second.
>
> * src/qemu/qemu_driver.c (qemuDomainBlockPivot): Pause around
> drive-reopen.
> (qemuDomainBlockJobImpl): Update caller.
> ---
>   src/qemu/qemu_driver.c | 37 +++++++++++++++++++++++++++++++++++--
>   1 file changed, 35 insertions(+), 2 deletions(-)
>
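
For readers following the thread, here is a minimal sketch of the
pause/reopen/resume ordering the commit message describes. It is not the
code from the patch: the Guest class, MonitorLost exception, and the
reopen_* helpers are hypothetical stand-ins, and a lost monitor reply
(for example, a libvirtd restart) is modeled as an exception. The point
it illustrates is that the guest is resumed only when the outcome of the
reopen step is known, so an interrupted pivot leaves the guest paused.

```python
from dataclasses import dataclass, field


class MonitorLost(Exception):
    """Hypothetical: the reply to the reopen command was never seen."""


@dataclass
class Guest:
    """Hypothetical stand-in for a running qemu guest."""
    running: bool = True
    active_file: str = "source"
    log: list = field(default_factory=list)

    def pause(self):
        self.running = False
        self.log.append("pause")

    def resume(self):
        self.running = True
        self.log.append("resume")


def pivot_with_pause(guest, reopen):
    """Pause, run the reopen step, and resume only if the outcome is known."""
    was_running = guest.running
    if was_running:
        guest.pause()
    # If this raises MonitorLost, we never reach resume(): the guest
    # stays paused, which is the clue that recovery is needed.
    ok = reopen(guest)
    if was_running:
        guest.resume()
    return ok


def reopen_ok(guest):
    """Hypothetical reopen step that succeeds and pivots to the destination."""
    guest.log.append("reopen")
    guest.active_file = "destination"
    return True


def reopen_lost(guest):
    """Hypothetical reopen step whose reply is lost before it is seen."""
    guest.log.append("reopen")
    raise MonitorLost("reply never seen")
```

With reopen_ok, the log reads pause, reopen, resume and the guest ends up
running on the destination. With reopen_lost, the exception propagates and
the guest remains paused, so neither file can diverge while the outcome is
unknown.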

ACK with the RHEL stuff in, but it should/could be dropped if we end up
supporting only the upstream functionality.

Peter



