[libvirt] [PATCHv4 12/18] blockjob: implement block copy for qemu
Eric Blake
eblake at redhat.com
Fri Apr 13 15:15:18 UTC 2012
On 04/13/2012 08:46 AM, Jiri Denemark wrote:
> On Mon, Apr 09, 2012 at 21:52:21 -0600, Eric Blake wrote:
>> Minimal patch to wire up all the pieces in the previous patches
>> to actually enable a block copy job. By minimal, I mean that
>> qemu creates the file (that is, no REUSE_EXT flag support yet),
>> SELinux must be disabled, a lock manager is not informed, and
>> the audit logs aren't updated. But those will be added as
>> improvements in future patches.
>>
>> * src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
>> (qemuDomainBlockRebase): Call it when appropriate.
>> ---
>> src/qemu/qemu_driver.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++-
>> 1 files changed, 113 insertions(+), 1 deletions(-)
>> +
>> + priv = vm->privateData;
>> + if (!(qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) &&
>> + qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_REOPEN))) {
>> + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
>
> We usually use VIR_ERR_CONFIG_UNSUPPORTED in such cases.
Sure, easy to fix. Actually, looking at 'git grep -B3 -i "qemu binary"
src/qemu', I counted an INTERNAL_ERROR, an OPERATION_FAILED, and a
couple of OPERATION_INVALID that should all be cleaned up, to match the
fact that the majority of uses did indeed favor CONFIG_UNSUPPORTED.
I'll split the cleanup into a separate patch.
>
>> + _("block copy is not supported with this QEMU binary"));
>> + goto cleanup;
>> + }
>> + if (vm->persistent) {
>> + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
>> + _("domain is not transient"));
>> + goto cleanup;
>> + }
>
> I guess I wasn't paying enough attention somewhere but why do we forbid block
> copy for persistent domains? I understand why we want to forbid certain
> operations when block copy is active but I don't see a reason for penalizing
> persistent domains.
It was in patch 8/18 where I added the restrictions, and hopefully
documented in that commit message why limiting things to transient is a
good first step:
1. the first client of live migration is oVirt, which uses transient domains
2. qemu does not (yet) provide a way to resume a mirror when restarting
a domain, so anything that would require restoring a domain from saved
state is broken: incoming migration, virsh restore, and virsh start
after a managed save. But users would be upset if they saved a domain,
only to find out that they cannot then restore it, so I squelched things
one step earlier in the process, by preventing any save of a domain so
that we never have a broken save image in the first place.
My worry now comes from the fact that managedsave is on the list of
forbidden operations. If a domain is transient, managedsave is already
forbidden (it is assumed that you either don't care about the domain if
the host dies, or that you are running a higher-level app like oVirt
that knows how to rebuild the guest on a different host). But if a
guest is persistent, and you use the libvirt-guests init script, then
you have a right to assume that rebooting your host will resume your
guests in the same state that they were prior to the host going down -
because libvirt-guests uses managedsave. If we allow a mirror job on a
persistent domain, we violate this assumption (libvirt-guests will fail
to save the guest). Therefore, I forbid to start a mirror job on a
persistent domain, just as I forbid to 'virsh define' a transient domain
into a persistent domain if a mirror job is active.
If, at a later date, qemu comes up with a way to resume mirroring when
restarting a domain, we can relax these restrictions.
>> +
>> + /* Actually start the mirroring */
>> + qemuDomainObjEnterMonitorWithDriver(driver, vm);
>> + ret = qemuMonitorDriveMirror(priv->mon, NULL, device, dest, format, flags);
>> + if (ret == 0 && bandwidth != 0)
>> + ret = qemuMonitorBlockJob(priv->mon, device, NULL, bandwidth, NULL,
>> + BLOCK_JOB_SPEED_INTERNAL);
>> + qemuDomainObjExitMonitorWithDriver(driver, vm);
>
> Can BLOCK_JOB_SPEED_INTERNAL fail while block copy remains running? If so, we
> should try to abort the job in case we fail to set the speed, otherwise we
> will report an error and forget about the job while qemu will keep copying
> data.
Good catch, and virDomainBlockPull() suffers from the same
ramifications. I think both code paths need the same fix.
>> @@ -11889,6 +12000,7 @@ static int
>> qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth,
>> unsigned int flags)
>> {
>> + virCheckFlags(0, -1);
>> return qemuDomainBlockRebase(dom, path, NULL, bandwidth, flags);
>> }
>
> Hmm, personally, I would make qemuDomainBlockPull call qemuDomainBlockJobImpl
> directly instead of going through qemuDomainBlockRebase while adding the flags
> check...
Easy enough to change.
--
Eric Blake eblake at redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 620 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20120413/ce012ee4/attachment-0001.sig>
More information about the libvir-list
mailing list