[PATCH v11 13/13] block: apply COR-filter to block-stream jobs

Andrey Shinkevich andrey.shinkevich at virtuozzo.com
Thu Oct 15 17:16:40 UTC 2020


On 14.10.2020 19:24, Max Reitz wrote:
> On 12.10.20 19:43, Andrey Shinkevich wrote:

[...]

>> ---
>>   block/stream.c             | 93 +++++++++++++++++++++++++++++-----------------
>>   tests/qemu-iotests/030     | 51 +++----------------------
>>   tests/qemu-iotests/030.out |  4 +-
>>   tests/qemu-iotests/141.out |  2 +-
>>   tests/qemu-iotests/245     | 19 +++++++---
>>   5 files changed, 81 insertions(+), 88 deletions(-)
> 
> Looks like stream_run() could be a bit streamlined now (the allocation
> checking should be unnecessary, unconditionally calling
> stream_populate() should be sufficient), but not necessary now.
> 

That is what I had in mind when I tackled this patch, but there is a 
hidden pitfall in streamlining it. Namely, how does the block-stream 
job learn about a long unallocated tail so that it can exit the loop in 
stream_run() early? Shall we return '-EOF' or some other error code 
from cor_co_preadv_part() and handle it in stream_run()? Any other 
suggestions would be appreciated.
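
Something along these lines is what I mean, purely as an illustration of
the simplified loop in stream_run(); the STREAM_EOF code and its
propagation out of cor_co_preadv_part()/stream_populate() are made up
for this sketch:

    /*
     * Sketch only: assume stream_populate() passed through a hypothetical
     * STREAM_EOF code from cor_co_preadv_part() once the rest of the image
     * is unallocated in the backing chain, so the copy loop can stop early.
     */
    for ( ; offset < len; offset += n) {
        n = MIN(STREAM_BUFFER_SIZE, len - offset);

        ret = stream_populate(blk, offset, n);
        if (ret == STREAM_EOF) {
            /* assumed: long unallocated tail, nothing left to copy */
            ret = 0;
            break;
        }
        if (ret < 0) {
            break;    /* or run the existing on-error handling */
        }
        job_progress_update(&s->common.job, n);
    }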

>> diff --git a/block/stream.c b/block/stream.c
>> index d3e1812..93564db 100644
>> --- a/block/stream.c
>> +++ b/block/stream.c
> 
> [...]

>> +
>> +    cor_filter_bs = bdrv_cor_filter_append(bs, opts, BDRV_O_RDWR, errp);
>> +    if (cor_filter_bs == NULL) {
>> +        goto fail;
>> +    }
>> +
>> +    if (bdrv_freeze_backing_chain(cor_filter_bs, bs, errp) < 0) {
> 
> Is there a reason why we can’t combine this with the
> bdrv_free_backing_chain() from bs down to above_base?  I mean, the
> effect should be the same, just asking.
> 

bdrv_freeze_backing_chain(bs, above_base, errp) is called before 
bdrv_reopen_set_read_only() to keep the backing chain safe across the 
context switch. After that, we also want to freeze the 'COR -> top BS' 
link. Freezing/unfreezing the chain in parts is simpler to manage than 
doing it for the whole chain at once.
If we invoked bdrv_reopen_set_read_only() only after freezing the 
backing chain together with the COR-filter, we would not be able to get 
the 'write' permission on the read-only node.
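
Roughly, the intended ordering in stream_start() is the following
(heavily simplified, error paths and details trimmed, so treat it as an
illustration rather than the exact code):

    if (bdrv_freeze_backing_chain(bs, above_base, errp) < 0) {
        return;    /* keep bs..above_base stable across context switches */
    }

    /* must happen before bs gets frozen under the filter, otherwise the
     * read-only node cannot hand out the 'write' permission anymore */
    bs_read_only = bdrv_is_read_only(bs);
    if (bs_read_only && bdrv_reopen_set_read_only(bs, false, errp) != 0) {
        goto fail;
    }

    cor_filter_bs = bdrv_cor_filter_append(bs, opts, BDRV_O_RDWR, errp);
    if (cor_filter_bs == NULL) {
        goto fail;
    }

    /* now also freeze the new 'COR -> top BS' link */
    if (bdrv_freeze_backing_chain(cor_filter_bs, bs, errp) < 0) {
        goto fail;
    }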


>> +        bdrv_cor_filter_drop(cor_filter_bs);
>> +        cor_filter_bs = NULL;
>> +        goto fail;
>> +    }
>> +
>> +    s = block_job_create(job_id, &stream_job_driver, NULL, cor_filter_bs,
>> +                         BLK_PERM_CONSISTENT_READ,
>> +                         basic_flags | BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD,
> 
> Not that I’m an expert on the GRAPH_MOD permission, but why is this
> shared here but not below?  Shouldn’t it be the same in both cases?
> (Same for taking it as a permission.)
> 

When we invoke block_job_add_bdrv(&s->common, "active node", bs, ...) 
below (in particular, we need it to block operations on the top node 
via bdrv_op_block_all()), we take the GRAPH_MOD permission on the top 
node. To allow that, the parent filter node has to share that 
permission on its child. Otherwise, we get an assertion failure in 
bdrv_check_update_perm(), called from bdrv_replace_node() when the 
filter is removed.
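
To spell it out with the two calls side by side (same hunks as in the
patch, just with the permission relationship annotated):

    /* the job's main node is the filter; it must *share* GRAPH_MOD on its
     * child so that the job may take it on the top node below */
    s = block_job_create(job_id, &stream_job_driver, NULL, cor_filter_bs,
                         BLK_PERM_CONSISTENT_READ,
                         basic_flags | BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD,
                         speed, creation_flags, NULL, NULL, errp);

    /* ...which is what happens here; without the shared permission above,
     * bdrv_check_update_perm() asserts when bdrv_replace_node() drops the
     * filter at job completion */
    block_job_add_bdrv(&s->common, "active node", bs,
                       basic_flags | BLK_PERM_GRAPH_MOD,
                       basic_flags | BLK_PERM_WRITE, &error_abort);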

>>                            speed, creation_flags, NULL, NULL, errp);
>>       if (!s) {
>>           goto fail;
>>       }
>>   
>> +    /*
>> +     * Prevent concurrent jobs trying to modify the graph structure here, we
>> +     * already have our own plans. Also don't allow resize as the image size is
>> +     * queried only at the job start and then cached.
>> +     */
>> +    if (block_job_add_bdrv(&s->common, "active node", bs,
>> +                           basic_flags | BLK_PERM_GRAPH_MOD,
>> +                           basic_flags | BLK_PERM_WRITE, &error_abort)) {
>> +        goto fail;
>> +    }
>> +
>>       /* Block all intermediate nodes between bs and base, because they will
>>        * disappear from the chain after this operation. The streaming job reads
>>        * every block only once, assuming that it doesn't change, so forbid writes
> 
> [...]
> 
>> diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
>> index e60c832..940e85a 100755
>> --- a/tests/qemu-iotests/245
>> +++ b/tests/qemu-iotests/245
>> @@ -899,17 +899,26 @@ class TestBlockdevReopen(iotests.QMPTestCase):
>>           # make hd1 read-only and block-stream requires it to be read-write
>>           # (Which error message appears depends on whether the stream job is
>>           # already done with copying at this point.)
> 
> Hm.  Let’s look at the set of messages below... [1]
> 
>> -        self.reopen(opts, {},
>> +        # As the COR-filter node is inserted into the backing chain with the
>> +        # 'block-stream' operation, we move the options to their proper nodes.
>> +        opts = hd_opts(1)
> 
> Oh, so this patch changes it so that only the subtree below hd1 is
> reopened, and we don’t have to deal with the filter options.  Got it.
> (I think.)
> 

Yes, that's right.

>> +        opts['backing'] = hd_opts(2)
>> +        opts['backing']['backing'] = None
>> +        self.reopen(opts, {'read-only': True},
>>               ["Can't set node 'hd1' to r/o with copy-on-read enabled",
> 
> [1]
> 
> This isn’t done anymore as of this patch.  So I don’t think this error
> message can still appear.  Will some other message appear in its stead,
> or is it always going to be the second one?
> 

Only the second message appears in the test case when I run it on my 
node. So, I will remove the first one, as bdrv_enable_copy_on_read() is 
no longer called for the top BS on the frozen backing chain.
I will also delete this part of the comment:
"(Which error message appears depends on whether the stream job is 
already done with copying at this point.)"

>>                "Cannot make block node read-only, there is a writer on it"])
>>   
>>           # We can't remove hd2 while the stream job is ongoing
>> -        opts['backing']['backing'] = None
>> -        self.reopen(opts, {'backing.read-only': False}, "Cannot change 'backing' link from 'hd1' to 'hd2'")
>> +        opts['backing'] = None
>> +        self.reopen(opts, {'read-only': False},
>> +                    "Cannot change 'backing' link from 'hd1' to 'hd2'")
>>   
>> -        # We can detach hd1 from hd0 because it doesn't affect the stream job
>> +        # We can't detach hd1 from hd0 because there is the COR-filter implicit
>> +        # node in between.
>> +        opts = hd_opts(0)
>>           opts['backing'] = None
>> -        self.reopen(opts)
>> +        self.reopen(opts, {},
>> +                    "Cannot change backing link if 'hd0' has an implicit backing file")
> 
> Does “has an implicit backing file” mean that hd0 has an implicit node
> (the COR filter) as its backing file?  And then reopening isn’t allowed
> because the user supposedly doesn’t know about that implicit node?  If
> so, makes sense.

Yes, exactly.

Andrey

> 
> Max
> 
>>   
>>           self.vm.run_job('stream0', auto_finalize = False, auto_dismiss = True)
>>   
>>
> 
> 