[libvirt] [PATCH v2 3/3] qemuMigrationDriveMirror: Listen to events

Michal Privoznik mprivozn at redhat.com
Wed Feb 11 16:45:00 UTC 2015


On 11.02.2015 17:42, Jiri Denemark wrote:
> On Wed, Feb 11, 2015 at 13:51:11 +0100, Michal Privoznik wrote:
>> When migrating with storage, libvirt iterates over domain disks and
>> instruct qemu to migrate the ones we are interested in (shared, RO and
>> source-less disks are skipped). The disks are migrated in series. No
>> new disk is transferred until the previous one hasn't been quiesced.
>> This is checked on the qemu monitor via 'query-jobs' command. If the
>> disk has been quiesced, it practically went from copying its content
>> to mirroring state, where all disk writes are mirrored to the other
>> side of migration too. Having said that, there's one inherent error in
>> the design. The monitor command we use reports only active jobs. So if
>> the job fails for whatever reason, we will not see it anymore in the
>> command output. And this can happen fairly simply: just try to migrate
>> a domain with storage. If the storage migration fails (e.g. due to
>> ENOSPC on the destination) we resume the host on the destination and
>> let it run on partly copied disk.
>>
>> The proper fix is what even the comment in the code says: listen for
>> qemu events instead of polling. If storage migration changes state an
>> event is emitted and we can act accordingly: either consider disk
>> copied and continue the process, or consider disk mangled and abort
>> the migration.
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>>  src/qemu/qemu_migration.c | 37 +++++++++++++++++--------------------
>>  1 file changed, 17 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
>> index 4506d87..14a4ec6 100644
>> --- a/src/qemu/qemu_migration.c
>> +++ b/src/qemu/qemu_migration.c
>> @@ -1739,7 +1739,6 @@ qemuMigrationDriveMirror(virQEMUDriverPtr driver,
> ...
>> -            /* XXX Frankly speaking, we should listen to the events,
>> -             * instead of doing this. But this works for now and we
>> -             * are doing something similar in migration itself anyway */
>> +            /* XXX Turn this into virCond someday. */
> 
> Exactly, this is what I wanted to suggest when seeing this code. The
> code should be even simpler with virCond so why not doing the change
> right away? I think this could be done as a separate patch (in this
> series) to make reviewing the code easier.

Exactly. It has not direct influence on this patcheset. The virCond
patch should be easy and small though, so after this is merged, I can
propose one. But I'd really love to fix the bug asap.

Michal




More information about the libvir-list mailing list