[libvirt] [PATCH] virsh: improve waiting for block job readiness

Peter Krempa pkrempa at redhat.com
Wed Jan 6 06:42:59 UTC 2016


On Tue, Jan 05, 2016 at 11:19:44 +1100, Michael Chapman wrote:
> On Mon, 4 Jan 2016, Peter Krempa wrote:
> > On Thu, Dec 31, 2015 at 16:34:49 +1100, Michael Chapman wrote:
> >> Wait for a block job event after the job has reached 100% only if
> >> exactly one of the BLOCK_JOB and BLOCK_JOB_2 callbacks were successfully
> >> registered.
> >>
> >> If neither callback was registered, then no event will ever be received.
> >> If both were registered, then any user-supplied path is guaranteed to
> >> match one of the events.
> >>
> >> Signed-off-by: Michael Chapman <mike at very.puzzling.org>
> >> ---
> >>

[...]

> >
> > The new statement is not true. If the user provides a filesystem path
> > different from what is in the definition but matching the same volume [1]
> > the callback will still return the path present in the configuration and
> > thus might not ever match and the job would hang forever.
> >
> > [1] e.g. /path/to/image and /path/to/../to/image are equivalent but
> > won't be equal using strcmp. The problem is even more prominent if you
> > mix in some symlinks.
> 
> As far I can tell the block job won't even be able to be identified if the 
> user specifies a different path than the one in the domain XML.
> 
> At present, the only implementation of the virGetBlockJobInfo API 
> is qemuDomainGetBlockJobInfo. The disk definition is found in the call 
> chain:
> 
>    qemuDomainGetBlockJobInfo
>      -> virDomainDiskByName
>        -> virDomainDiskIndexByName
> 
> and virDomainDiskIndexByName only does STREQ tests to match the supplied 
> path against the <source> or <target> elements.

When responding I didn't look at the code and remembered that we used
the path supplied by the user verbatim in some cases. This one is not
one of them though, so ...

> 
> So at present, the disk path supplied by the user of virGetBlockJobInfo 
> has to be *exactly* the source path or the target name, and these are 
> precisely the two strings used in the two events.

... You are right. qemuBlockJobEventProcess looks up the strings in the
definition by the disk alias and fires both events one with disk->path
and the second one with disk->dst.

> 
> The only safe way to allow different-but-equivalent paths to match the one 
> disk definition would be record the device+inode numbers in the disk 
> definition. We can't simply follow symlinks to resolve the path, since the 

That becomes even harder with various remote and network filesystems.

> symlinks could have changed since the domain was started -- e.g. 
> /path/to/../to/image may now be equivalent to /path/to/image, but 
> /path/to/image may not be the same as what it was when the domain was 
> started.
> 
> So on that basis I think we have to settle for requiring the 
> virGetBlockJobInfo path to match the disk definition exactly.
> 
> >> +        if (data->cb_id2 < 0 || data->cb_id2 < 0) {
> >> +            if (result == 0)
> >> +                break;
> >>
> >> -        /* for two-phase jobs we will try to wait in the synchronized phase
> >> -         * for event arrival since 100% completion doesn't necessarily mean that
> >> -         * the block job has finished and can be terminated with success */
> >> -        if (info.end == info.cur && --retries == 0) {
> >> -            ret = VIR_DOMAIN_BLOCK_JOB_READY;
> >> -            goto cleanup;
> >> +            /* only wait in the synchronized phase for event arrival if
> >> +             * either callback was registered */
> >> +            if (info.end == info.cur &&
> >> +                ((data->cb_id2 < 0 && data->cb_id2 < 0) || --retries == 0)) {
> >> +                ret = VIR_DOMAIN_BLOCK_JOB_READY;
> >> +                goto cleanup;
> >> +            }
> >>          }
> >
> > NACK to this approach, if there was a way how this ugly stuff could be
> > avoided or deleted I'd already do so.

I take back, I've remembered the code incorrectly.
qemuDomainBlockJobAbort passes the user provided string to
qemuDiskPathToAlias which uses virDomainDiskIndexByName.
virDomainDiskIndexByName uses the same data as is present in the two
events combined.

> >
> > Peter
> 
> OK. At any rate, I do think 2.5 seconds is not enough. On a hypervisor 
> with a large amount of memory I was able to generate block jobs that would 
> take 10-15 seconds to fully flush out to disk.

Actually I've witnessed some failures when attempting to pivot to a new
image after a blockcopy operation which made virsh to attempt the pivot
after that so this actually might fix it.

If you might post this again with the bug in the event checking fixed
I'll review it then.

Thanks and sorry for the noise.

Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20160106/2483dc7e/attachment-0001.sig>


More information about the libvir-list mailing list