[PATCH 1/3] qemu: add option to process offloaded legacy blockjob event earlier

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Thu Oct 29 12:49:14 UTC 2020



On 29.10.2020 10:59, Peter Krempa wrote:
> On Thu, Oct 22, 2020 at 17:46:28 +0300, Nikolay Shirokovskiy wrote:
>>
>>
>> On 22.10.2020 15:12, Peter Krempa wrote:
>>> On Tue, Oct 20, 2020 at 16:44:07 +0300, Nikolay Shirokovskiy wrote:
>>>> Currently in qemuProcessHandleBlockJob we either offload blockjob event
>>>> processing to the worker thread or notify another thread that waits for
>>>> the blockjob event and processes it there. But sometimes, after an event
>>>> has been offloaded to the worker thread, we need to process it in a
>>>> different thread.
>>>>
>>>> To be able to do that, let's always set newstate/errmsg and then let
>>>> whichever thread gets there first process the event.
>>>>
>>>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>>>> ---
>>>>  src/qemu/qemu_driver.c  | 17 ++++-------------
>>>>  src/qemu/qemu_process.c | 20 ++++++++++++--------
>>>>  2 files changed, 16 insertions(+), 21 deletions(-)
>>>>
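
To restate the idea of the patch in code -- a simplified sketch rather than
the exact hunk, with the processEvent setup omitted:

    static void
    qemuProcessHandleBlockJob(...)
    {
        ...
        /* record the event for whichever thread gets to process it first */
        job->newstate = status;
        VIR_FREE(job->errmsg);
        job->errmsg = g_strdup(error);

        if (job->synchronous)
            virDomainObjBroadcast(vm);      /* wake a thread waiting in sync mode */
        else
            qemuProcessEventSubmit(driver, &processEvent);  /* worker thread */
    }

Before the patch the newstate/errmsg assignments live only in the
synchronous branch, so an offloaded event is visible to the worker thread
alone.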
>>>
>>> [...]
>>>
>>>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>>>> index 6422881..4d63e7d 100644
>>>> --- a/src/qemu/qemu_process.c
>>>> +++ b/src/qemu/qemu_process.c
>>>> @@ -953,13 +953,19 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon G_GNUC_UNUSED,
>>>>      if (!(disk = qemuProcessFindDomainDiskByAliasOrQOM(vm, diskAlias, NULL)))
>>>>          goto cleanup;
>>>>  
>>>> -    job = qemuBlockJobDiskGetJob(disk);
>>>> +    if (!(job = qemuBlockJobDiskGetJob(disk))) {
>>>> +        VIR_DEBUG("creating new block job object for '%s'", diskAlias);
>>>> +        if (!(job = qemuBlockJobDiskNew(vm, disk, type, diskAlias)))
>>>
>>> So this actually writes out the status XML. I'm not sure that is a
>>> good idea if we don't hold the domain job. The premise is that a
>>> thread holding the job lock might have partially modified the state.
>>>
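
If I read qemu_blockjob.c right, the write happens when the job is
registered -- roughly (a sketch, not the exact code):

    /* inside qemuBlockJobRegister(), roughly: the job is added to the
     * table and the status XML is written out right away */
    if (virHashAddEntry(priv->blockjobs, job->name, virObjectRef(job)) < 0)
        return -1;
    qemuDomainSaveStatus(vm);  /* may serialize state half-modified by the
                                * thread that currently holds the domain job */
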
>>
>> Hi, Peter.
>>
>> Yeah, I did not notice that.
>>
>> There is another reason to have this patch (modified, of course, to prevent
>> the issue you mentioned). Without it, it is just hard to synchronize block
>> jobs correctly on reconnect.
>>
>> We call "query-block-jobs" to do the sync and set the block job state based
>> on the query result. But first, we can have pending events that were received
>> before querying, so after the reconnect we have to deal with these outdated
>> events. Second, we can receive events after querying but before reconnection
>> is finished, and these events will also be offloaded. This can be an issue if
>> we use sync mode, as in qemuMigrationSrcCancel, because we won't see these
>> events once they are offloaded. I don't think qemuMigrationSrcCancel is
>> really affected, though, as all we want to do there is just cancel the block
>> jobs.
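
To make the sync-mode part concrete, the waiting side looks roughly like
this (a sketch of the qemuBlockJobUpdate()-based wait, not the literal code
of qemuMigrationSrcCancel):

    /* thread waiting for a block job in sync mode, simplified */
    qemuBlockJobUpdate(vm, job, QEMU_ASYNC_JOB_NONE);  /* consumes job->newstate */
    while (job->state == QEMU_BLOCKJOB_STATE_RUNNING) {
        if (virDomainObjWait(vm) < 0)       /* woken by virDomainObjBroadcast() */
            break;
        qemuBlockJobUpdate(vm, job, QEMU_ASYNC_JOB_NONE);
    }

If an event is offloaded to the worker thread instead of being recorded in
job->newstate, nothing in this loop ever sees it.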
> 
> I'm still worried about moving the code which adds the blockjob to the 
> blockjob list outside of a MODIFY domain job. I considered the blockjob
> list to be modified only with the MODIFY domain job and thus this would
> at the very least require audit of all the code paths related to
> blockjobs.
> 
>> So we can miss some block job events in sync mode during reconnection and
>> can see outdated block job events after reconnection. It can probably be
>> handled another way, but with the approach in this patch we can eliminate
>> both possibilities.
> 
> The legacy block job code should be robust against those though. Any
> state changes are done by redetection rather than internal state
> handling, so firing an event twice should be fine. In the same way, the
> code must be robust against missing an event, especially on
> reconnection. With the old code we don't store the job list, so we
> don't even know whether we missed something in the first place.
> 
>> Also, the "new" block job code uses the same approach as this patch, namely
>> saving state changes in the job so that other threads can see them, not just
>> the worker thread. And that mechanism is used by the reconnection code for
>> new block jobs.
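
For reference, the reconnect path in the new code boils down to re-querying
qemu and feeding the answer through the same job->newstate mechanism (rough
sketch, see qemuBlockJobRefreshJobs()):

    /* reconnect in the new blockjob code, simplified */
    job->newstate = queried_state;  /* hypothetical variable: the state
                                     * derived from the query reply */
    qemuBlockJobUpdate(vm, job, QEMU_ASYNC_JOB_NONE);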
> 
> Yes, but my worry is modifying the old block job code at all. Ideally
> we'll just delete it ASAP. My testing is nowadays limited to new qemu
> with new block job code and hopefully everybody upgrades ASAP.

Yeah, I understand. We in Virtuozzo are still based on CentOS 7 (but
moving to CentOS 8, whose latest updates have qemu/libvirt versions that
support the new blockjobs). I have to add some Vz-specific changes to
libvirt (to support our Virtuozzo Storage), so I inspected the old block
job code closely, and that is why I am sending these patches.

> 
> Thus I'm very worried by changes to the old block job code, and
> especially those which would impact the semantics of job handling with
> regard to the new block job code (allowing additions to the block job
> table outside of the MODIFY domain job).
> 
> If you can pull off the same thing without adding a block job while not
> holding the MODIFY domain job, I might be okay with such a patch. But
> preferably just upgrade to use the new code, which makes sense to fix.
> 

Well, I need to change the patches, at least in our version of libvirt, to
address the issues/worries you mentioned. I want to send the next version
here too, so that you can take a look and give some advice or share your
opinion. It was helpful that you noticed things in this version that I
didn't.
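
One direction I'm considering, to avoid touching the block job table
outside of the MODIFY domain job, is to only record the state on an already
existing job and leave the offload path otherwise as is -- a sketch of the
idea, not a finished patch:

    if ((job = qemuBlockJobDiskGetJob(disk))) {
        /* record on the existing job only; creating a job here would
         * modify the blockjob table without the MODIFY domain job */
        job->newstate = status;
        VIR_FREE(job->errmsg);
        job->errmsg = g_strdup(error);
    }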

Nikolay



