[PATCH 8/8] qapi: add blockdev-replace command

Thu Sep 23 11:54:30 UTC 2021

23.09.2021 13:09, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov at virtuozzo.com> writes:
> 
>> Thanks a lot for reviewing!
>>
>> 20.09.2021 09:44, Markus Armbruster wrote:
>>> Vladimir Sementsov-Ogievskiy <vsementsov at virtuozzo.com> writes:
>>>
>>>> Add command that can add and remove filters.
>>>>
>>>> Key points of functionality:
>>>>
>>>> What the command does is simply replace some BdrvChild.bs by some other
>>>> nodes. The tricky thing is selecting these BdrvChild objects.
>>>> To be able to select any kind of BdrvChild we use a generic parent_id,
>>>> which may be a node-name, or qdev id or block export id. In future we
>>>> may support block jobs.
>>>>
>>>> Any kind of ambiguity leads to error. If we have both device named
>>>> device0 and block export named device0 and they both point to same BDS,
>>>> user can't replace root child of one of these parents. So, to be able
>>>> to do replacements, user should avoid duplicating names in different
>>>> parent namespaces.
>>>>
>>>> So, command allows to replace any single child in the graph.
>>>>
>>>> On the other hand we want to realize a kind of bdrv_replace_node(),
>>>> which works well when we want to replace all parents of some node. For
>>>> this kind of task @parents-mode argument implemented.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov at virtuozzo.com>
>>>> ---
>>>>    qapi/block-core.json | 78 +++++++++++++++++++++++++++++++++++++++++
>>>>    block.c              | 82 ++++++++++++++++++++++++++++++++++++++++++++
>>>>    2 files changed, 160 insertions(+)
>>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index 675d8265eb..8059b96341 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -5433,3 +5433,81 @@
>>>>    { 'command': 'blockdev-snapshot-delete-internal-sync',
>>>>      'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
>>>>      'returns': 'SnapshotInfo' }
>>>> +
>>>> +##
>>>> +# @BlockdevReplaceParentsMode:
>>>> +#
>>>> +# Alternative (to directly set @parent) way to choose parents in
>>>> +# @blockdev-replace
>>>> +#
>>>> +# @exactly-one: Exactly one parent should match a condition, otherwise
>>>> +#               @blockdev-replace fails.
>>>> +#
>>>> +# @all: All matching parents are taken into account. If replacing lead
>>>> +#       to loops in block graph, @blockdev-replace fails.
>>>> +#
>>>> +# @auto: Same as @all, but automatically skip replacing parents if it
>>>> +#        leads to loops in block graph.
>>>> +#
>>>> +# Since: 6.2
>>>> +##
>>>> +{ 'enum': 'BlockdevReplaceParentsMode',
>>>> +  'data': ['exactly-one', 'all', 'auto'] }
>>>> +
>>>> +##
>>>> +# @BlockdevReplace:
>>>> +#
>>>> +# Declaration of one replacement.
>>>
>>> Replacement of what?  A node in the block graph?
>>
>> A specific child node in one or in several edges
> 
> Spell that out in the doc comment, please.
> 
>>>
>>>> +#
>>>> +# @parent: id of parent. It may be qdev or block export or simple
>>>> +#          node-name.
>>>
>>> It may also be a QOM path, because find_device_state() interprets
>>> arguments starting with '/' as QOM paths.
>>>
>>> When is a node name "simple"?
>>>
>>> Suggest: It may be a qdev ID, a QOM path, a block export ID, or a node
>>> name.
>>
>> OK
>>
>>>
>>> The trouble is of course that we're merging three separate name spaces.
>>
>> Yes. Alternatively we can also add an enum of node-type [bds, device, export[, job]], and select graph nodes more explicitly (by a pair of id/path/name and type)
>>
>> But if we have to use these things in one context, it seems good to force users to use different names for them. And in this way, we can avoid strict typing.
> 
> Is there precedent in QMP for merging ID name spaces, or for selecting
> a name space?

Hmm, I haven't seen either.

> 
>>> Aside: a single name space for IDs would be so much saner, but we
>>> screwed that up long ago.
> 
> Throwing some of the multiple name spaces together some of the time
> feels like another mistake.
> 
>>>
>>>>                          If id is ambiguous (for example node-name of
>>>> +#          some BDS equals to block export name), blockdev-replace
>>>> +#          fails.
>>>
>>> Is there a way out of this situation, or is replacement simply
>>> impossible then?
>>
>> In my idea, it's simply impossible. If someone wants to use this new interface, they should take care to use different names for different things.
> 
> Reminds me of device_del, which simply could not delete a device without
> an ID.  Made many users go "oh" (or possibly a more colorful version
> thereof), until Daniel fixed it in commit 6287d827d4 "monitor: allow
> device_del to accept QOM paths" for v2.5.
> 
>>>
>>>>                      If not specified, blockdev-replace goes through
>>>> +#          @parents-mode scenario, see below. Note, that @parent and
>>>> +#          @parents-mode can't be specified simultaneously.
>>>
>>> What if neither is specified?  Hmm, @parents-mode has a default, so
>>> that's what we get.
>>>
>>>> +#          If @parent is specified, only one edge is selected. If
>>>> +#          several edges match the condition, blockdev-replace fails.
>>>> +#
>>>> +# @edge: name of the child. If omitted, any child name matches.
>>>> +#
>>>> +# @child: node-name of the child. If omitted, any child matches.
>>>> +#         Must be present if @parent is not specified.
>>>
>>> Is @child useful when @parent is present?
>>
>> You may specify @child and @parent, to replace child in specific edge. Or @parent and @edge. Or all three fields: just to be strict.
>>
>>>
>>> What's the difference between "name of the child" and "node name of the
>>> child"?
>>
>> Although we have to deal with different kinds of nodes (BDS, exports, blks, ...),
>> children are always BDS.
>>
>> But maybe in this context it's better to say "id of the child".
> 
> I'm confused about the difference between "@edge: name of the child",
> and "@child: node-name of the child".
> 
>>>
>>>> +#
>>>> +# @parents-mode: declares how to select edge (or edges) when @parent
>>>> +#                is omitted. Default is 'one'.
>>>
>>> 'exactly-one'
>>>
>>> Minor combinatorial explosion.  There are four optional arguments, one
>>> of them an enum, and only some combinations of argument presence and enum
>>> value are valid.  For a serious review, I'd have to make a table of
>>> combinations, then think through every valid row.
>>>
>>> Have you considered making this type a union?  Can turn some of your
>>> semantic constraints into syntactical ones.  Say you turn
>>> BlockdevReplaceParentsMode into a tag enum by adding value 'by-id'.
>>> Then branch 'by-id' has member @parent, and the others don't.
>>
>>
>> OK. Now, after some time passed, I see that some additional clarifications are needed. Even for me :)
> 
> Sounds familiar :)
> 
>> So, the actual modes I have in mind:
>>
>> 1. Replacement for backup: we want to inject copy-before-write filter F above some node X, so that all parents of node X start to access X through filter F. But we want to automatically skip parents if the modification would lead to loops in the graph (so we can first create node F with X as a child, then do the replacement, and not replace the child of F by F :).
>>
>> That's parents-mode=auto & parent=None & edge=None & child=X
>>
>> 2. Replacement of any specific edge in the graph.
>>
>> An edge may be specified in different ways: by parent, by child, by edge name, and by some combinations of these things. It seems reasonable to allow any combination, as long as it specifies exactly one edge. Assume we have an A -- backing --> B relation in the graph, and want to replace B by filter F in that relation.
> 
> An edge always goes from a source node (a.k.a. parent) to a target node
> (a.k.a. child).
> 
> Each edge from a source node has a unique name in the source node, such
> as "backing".
> 
> Correct?

I'm not sure. Of course a node can't have several backing children, and quorum takes care to name its children differently.

But, for example, a block-stream job may have several children named simply "intermediate node".

But block-job children are such an internal feature that I'm not sure we can allow the user to simply replace them. That's why this series doesn't allow selecting jobs as parents.

> 
> The obvious way to identify an edge is (source node name, edge name).
> 
> Throwing in the target name is redundant.  Observation, not criticism.
> 
> All other ways can be ambiguous:
> 
>      (source node name, target node name), because multiple edges can
>      connect the two.

Still, I have never seen such a use case)

> 
>      (edge name, target node name), because multiple source nodes can use
>      the same edge name to connect to the target node.
> 
>      ...
> 
> Even ways that can be ambiguous need not be in a specific graph:
> 
>      Just source node name suffices when there is only one edge leaving
>      it.
> 
>      Even just edge name can theoretically suffice.
> 
>      ...
> 
> Do we really *need* this much flexibility?  Why can't we simply require
> (source node name, edge name), and call it a day?

I don't know :) That's just what came to my mind. It's simple enough to restrict the flexibility for now and add it in the future if needed.
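
To make the ambiguity argument above concrete, here is a minimal Python sketch of "select exactly one edge by any combination of parent, edge name and child". It is a toy model, not QEMU code: the Edge tuple, select_one_edge() and the example graph are all hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Optional


class Edge(NamedTuple):
    """One parent -> child relation in a toy block graph (hypothetical)."""
    parent: str  # source node: node-name, qdev id, export id, ...
    name: str    # child name inside the parent, e.g. "backing"
    child: str   # target node-name


def select_one_edge(graph: List[Edge],
                    parent: Optional[str] = None,
                    name: Optional[str] = None,
                    child: Optional[str] = None) -> Edge:
    """Return the single edge matching all given fields, else raise."""
    matches = [e for e in graph
               if (parent is None or e.parent == parent)
               and (name is None or e.name == name)
               and (child is None or e.child == child)]
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} edges match, expected exactly one")
    return matches[0]


# Toy graph: A and C both use B as backing; A also has a file child.
GRAPH = [
    Edge("A", "backing", "B"),
    Edge("A", "file", "A-file"),
    Edge("C", "backing", "B"),
]
```

In this toy graph, (parent="A", name="backing") selects one edge, while (name="backing", child="B") matches two edges (from A and from C) and just parent="A" matches two as well, so both of those raise. Whether (parent, edge name) is always unique in a real graph is exactly the open question about job children above.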

> 
>> 2.1 Specify parent:
>>
>> We may specify all information bits, to be sure that we do what we want, and so that the command is likely to fail if we have the wrong impression of what's going on in the graph:
>>
>> parents-mode=None & parent=A & edge=backing & child=B
>>
>> We can omit edge:
>>
>> parents-mode=None & parent=A & edge=None & child=B
>>
>>    - that should fail as ambiguous if B is "double child" of A, with two edges from A to B. But I think, that's unused combination for now)
>>
>> Or we can omit child
>>
>> parents-mode=None & parent=A & edge=backing & child=None
>>
>>    - that should work well, as node shouldn't have more than one backing child.
>>
>> and we can omit both edge and child:
>>
>> parents-mode=None & parent=A & edge=None & child=None
>>
>>    - that will work only if A has exactly one child and fails otherwise. So, that's bad for format nodes but good for filters and for block devices.
>>
>> 2.2 Don't specify parent but specify child:
>>
>> parents-mode=exactly-one & parent=None & edge=backing & child=B
>>
>>    - works if B has only one parent with B as backing child
>>
>> parents-mode=exactly-one & parent=None & edge=None & child=B
>>
>>    - works if B has only one parent
>>
>> ======================
>>
>> Now, what's more?
>>
>> parents-mode=auto & parent=None & edge=root & child=X
>>
>> - replace the child only in root parents of X; may make sense
>>
>>
>> And all other combinations are
>>
>> parents-mode=ANY & parent=None & edge=ANY & child=None
>>
>>    - specify neither parent nor child. That works badly with any mode. Theoretically, we can still support it by looking through the whole graph: if edge=backing and there is only one backing edge in the whole graph, we can serve the request. But we can simply fail and not care.
>>
>> =====================
>>
>> What's bad, is that 2.1 and 2.2 are not symmetrical. So, right, it seems better to turn it into union:
>>
>> 1. mode = auto
>>
>> Replace child in all its parents where the edge matches @edge, avoiding creating loops in the graph
>>
>> child: required, specify child
>> edge: optional, if specified, do replacement only in such edges
> 
> This is almost the same as a transaction of one-edge replacements for
> all parents, optionally filtered by @edge.
> 
> They differ when the parents can change spontaneously.  The transaction
> then might be for a stale set of parents.  Can this happen?
> 
> The other difference is of course that having to enumerate the edges
> could be bothersome.  Some amount of bother is okay.  QMP provides basic
> building blocks.  When we try to provide more, we tend to fail.
> 
>> 2. mode = one-edge
>>
>> Replace child in exactly one edge. If more than one edge matches, replace nothing and fail.
>>
>> parent: optional
>> edge: optional
>> child: optional
>>
>>    - all fields optional, but the user is responsible for not being ambiguous. Still, we can enforce that at least one of @parent and @child is specified.
> 
> Do we really need this much flexibility in edge selection?
> 
>>
>>>
>>>> +#
>>>> +# Since: 6.2
>>>> +#
>>>> +# Examples:
>>>> +#
>>>> +# 1. Change root node of some device.
>>>> +#
>>>> +# Note, that @edge name is omitted, as
>>>
>>> Scratch "name".
>>>
>>> Odd line break.
>>>
>>>> +# devices always has only one child. As well, no need in specifying
>>>> +# old @child.
>>>
>>> "the old @child".
>>>
>>>> +#
>>>> +# -> { "parent": "device0", "new-child": "some-node-name" }
>>>> +#
>>>> +# 2. Insert copy-before-write filter.
>>>> +#
>>>> +# Assume, after blockdev-add we have block-node 'source', with several
>>>> +# writing parents and one copy-before-write 'filter' parent. And we want
>>>> +# to actually insert the filter. We do:
>>>> +#
>>>> +# -> { "child": "source", "parents-mode": "auto", "new-child": "filter" }
>>>> +#
>>>> +# All parents of source would be switched to 'filter' node, except for
>>>> +# 'filter' node itself (otherwise, it will make a loop in block-graph).
>>>
>>> Good examples.  I think we need more, to give us an idea on the use
>>> cases for the combinatorial explosion.  I need to know them to be able
>>> to review the interface.
>>>
>>>> +##
>>>> +{ 'struct': 'BlockdevReplace',
>>>> +  'data': { '*parent': 'str', '*edge': 'str', '*child': 'str',
>>>> +            '*parents-mode': 'BlockdevReplaceParentsMode',
>>>> +            'new-child': 'str' } }
>>>> +
>>>> +##
>>>> +# @blockdev-replace:
>>>> +#
>>>> +# Do one or several replacements transactionally.
>>>> +##
>>>> +{ 'command': 'blockdev-replace',
>>>> +  'data': { 'replacements': ['BlockdevReplace'] } }
>>>
>>> Ignorant question: integration with transaction.json makes no sense?
>>
>> Recently we allowed doing several reopens in one blockdev-reopen. So it's reasonable to behave the same way in blockdev-replace.
> 
> I didn't see that going in.  I trust reopening multiple in one
> transaction is useful, but commit 3908b7a899 fails to explain why.
> Mistake; we should *insist* on capturing the "why" in the commit
> message.

The reason was that to remove a filter, we should do two replacements in one transaction, otherwise the filter may conflict with the original parent after the replacement.

But in the end, I had to add an "if (!QLIST_EMPTY(&bs->parents))" hack to cbw_child_perm() of the copy-before-write filter, so it should be possible to remove the filter in two steps: 1. replace the child in the original parent, 2. remove the filter (and the filter will not conflict, thanks to the hack).

And at that time we thought blockdev-reopen was good for manipulating filters. Now it's obvious that it is not.

> 
> I dislike having multiple ways to do the same thing (here:
> transactions).  If there are reasons why the transaction command cannot
> be used, fine, provide another suitable interface.  But when the
> existing interface serves, please don't reinvent it.
> 
>> Still, I think combining different commands in a transaction makes sense too. So, as I see it, transaction support for blockdev-* graph modification commands is a TODO.
>>
>>>
>>> [...]
>>>
> 

Oh, that all raises more questions than answers :)

1. It's OK to use one BlockdevReplace instead of a list and concentrate on transaction support. That's a mission I keep in mind: moving qapi transactions to use util/transactions.c engine for native integration with modern block modifications.

2. It's OK to limit "one-edge" flexibility; anyway, I don't know whether we need it or not.

Still, are you sure that for the user it will be simpler to replace the root node by qdev path than by node-name? Both variants allow determining the edge in the graph: qdev --root--> node-name. But node-name may be preferable in graph operations.
Hmm, on the other hand, if the user relies on the possibility of specifying an edge by child, they'll write an implementation which will fail to support several qdev parents for one driver node. So maybe you are right, better not to allow it.

3. I'm not sure that we can avoid "auto" mode. It makes inserting a copy-before-write filter rather simple. If we force the user to specify all parents by hand, it may complicate the implementation in libvirt. Note also that we don't have a good way to query all parents of a node.
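
For reference, the "auto" behaviour I have in mind can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the graph is a plain dict of parent -> [(edge name, child)], and reachable() and replace_auto() are hypothetical helpers, not anything that exists in block.c.

```python
from typing import Dict, List, Set, Tuple

# Toy graph: parent -> list of (edge name, child). Not a QEMU structure.
Graph = Dict[str, List[Tuple[str, str]]]


def reachable(graph: Graph, start: str) -> Set[str]:
    """All nodes reachable from start via child edges, including start."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(child for _, child in graph.get(node, []))
    return seen


def replace_auto(graph: Graph, child: str, new_child: str) -> List[str]:
    """Point parents of child at new_child, skipping parents that would
    create a loop. Returns one parent name per replaced edge."""
    # A parent that is reachable from new_child would close a cycle if it
    # pointed back at new_child, so its edges are left untouched.
    below_new = reachable(graph, new_child)
    switched = []
    for parent, edges in graph.items():
        if parent in below_new:
            continue
        for i, (name, target) in enumerate(edges):
            if target == child:
                edges[i] = (name, new_child)
                switched.append(parent)
    return switched
```

For a graph with device0 and export0 both pointing at 'source' and a 'filter' node whose file child is 'source', replace_auto(graph, "source", "filter") switches device0 and export0 to 'filter' and leaves the filter's own edge alone, which is the behaviour described in example 2 of the patch.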

4. And I'm not sure how bad merging the id namespaces is. Now I tend to agree that it seems unsafe.

Do we want to force users to use different names globally, or instead use pairs (type, id) to identify them? Like ("qdev", "<qdev_id>"), ("export", "<export-name>"), ("driver", "<node-name>").
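
A rough Python sketch of the two alternatives, just to compare them side by side. The NAMESPACES table and both resolve_*() functions are hypothetical and only stand in for the real qdev / export / node lookups. The sketch also ignores the refinement that an ambiguity only matters when both parents point to the same BDS.

```python
from typing import Dict, Set, Tuple

# Hypothetical registries standing in for the separate id namespaces.
NAMESPACES: Dict[str, Set[str]] = {
    "qdev": {"device0", "disk1"},
    "export": {"device0", "exp0"},
    "driver": {"node0", "exp0"},
}


def resolve_merged(ident: str) -> Tuple[str, str]:
    """Merged lookup: the id must exist in exactly one namespace."""
    hits = [ns for ns, ids in NAMESPACES.items() if ident in ids]
    if len(hits) != 1:
        raise LookupError(f"'{ident}' matches {len(hits)} namespaces: {hits}")
    return hits[0], ident


def resolve_typed(kind: str, ident: str) -> Tuple[str, str]:
    """Typed lookup: the caller names the namespace, so no ambiguity."""
    if ident not in NAMESPACES.get(kind, set()):
        raise LookupError(f"no {kind} with id '{ident}'")
    return kind, ident
```

With these tables, resolve_merged("device0") fails because the id exists both as a qdev and as an export, while resolve_typed("export", "device0") works regardless of what other namespaces contain.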

Peter, Kevin, what do you think about this all?

-- 
Best regards,
Vladimir