[Virtio-fs] [RFC 2/2] vhost-user-fs: Implement stateful migration
Anton Kuchin
antonkuchin at yandex-team.ru
Wed Mar 22 11:18:30 UTC 2023
On 21/03/2023 19:49, Hanna Czenczek wrote:
> On 20.03.23 13:39, Anton Kuchin wrote:
>> On 20/03/2023 11:33, Hanna Czenczek wrote:
>>> On 17.03.23 19:37, Anton Kuchin wrote:
>>>> On 17/03/2023 19:52, Hanna Czenczek wrote:
>>>>> On 17.03.23 18:19, Anton Kuchin wrote:
>>>>>> On 13/03/2023 19:48, Hanna Czenczek wrote:
>>>>>>> A virtio-fs device's VM state consists of:
>>>>>>> - the virtio device (vring) state (VMSTATE_VIRTIO_DEVICE)
>>>>>>> - the back-end's (virtiofsd's) internal state
>>>>>>>
>>>>>>> We get/set the latter via the new vhost-user operations
>>>>>>> FS_SET_STATE_FD,
>>>>>>> FS_GET_STATE, and FS_SET_STATE.
>>>>>>>
>>>
>>> [...]
>>>
>>>>>>> static const VMStateDescription vuf_vmstate = {
>>>>>>> .name = "vhost-user-fs",
>>>>>>> - .unmigratable = 1,
>>>>>>> + .version_id = 1,
>>>>>>> + .fields = (VMStateField[]) {
>>>>>>> + VMSTATE_VIRTIO_DEVICE,
>>>>>>> + {
>>>>>>> + .name = "back-end",
>>>>>>> + .info = &(const VMStateInfo) {
>>>>>>> + .name = "virtio-fs back-end state",
>>>>>>> + .get = vuf_load_state,
>>>>>>> + .put = vuf_save_state,
>>>>>>> + },
>>>>>>> + },
>>>>>>
>>>>>> I've been working on stateless migration patch [1] and there was
>>>>>> discussed that we
>>>>>> need to keep some kind of blocker by default if orchestrators
>>>>>> rely on unmigratable
>>>>>> field in virtio-fs vmstate to block the migration.
>>>>>> For this purpose I've implemented flag that selects "none" or
>>>>>> "external" and is checked
>>>>>> in pre_save, so it could be extended with "internal" option.
>>>>>> We didn't come to conclusion if we also need to check incoming
>>>>>> migration, the discussion
>>>>>> has stopped for a while but I'm going back to it now.
>>>>>>
>>>>>> I would appreciate if you have time to take a look at the
>>>>>> discussion and consider the idea
>>>>>> proposed there to store internal state as a subsection of vmstate
>>>>>> to make it as an option
>>>>>> but not mandatory.
>>>>>>
>>>>>> [1]
>>>>>> https://patchew.org/QEMU/20230217170038.1273710-1-antonkuchin@yandex-team.ru/
>>>>>
>>>>> So far I’ve mostly considered these issues orthogonal. If your
>>>>> stateless migration goes in first, then state is optional and I’ll
>>>>> adjust this series.
>>>>> If stateful migration goes in first, then your series can simply
>>>>> make state optional by introducing the external option, no?
>>>>
>>>> Not really. State can be easily extended by subsections but not
>>>> trimmed. Maybe this can be worked around by defining two types of
>>>> vmstate and selecting the correct one at migration, but I'm not sure.
>>>
>>> I thought your patch included a switch on the vhost-user-fs device
>>> (on the qemu side) to tell qemu what migration data to expect. Can
>>> we not transfer a 0-length state for 'external', and assert this on
>>> the destination side?
>>
>> Looks like I really need to make the description of my patch and the
>> documentation more clear :) My patch proposes switch on _source_ side
>> to select which data to save in the stream mostly to protect old
>> orchestrators that don't expect virtio-fs to be migratable (and for
>> internal case it can be extended to select if qemu needs to request
>> state from backend), Michael insists that we also need to check on
>> destination but I disagree because I believe that we can figure this
>> out from stream data without additional device flags.
>>
>>>
>>>>>
>>>>> But maybe we could also consider making stateless migration a
>>>>> special case of stateful migration; if we had stateful migration,
>>>>> can’t we just implement stateless migration by telling virtiofsd
>>>>> that it should submit a special “I have no state” pseudo-state,
>>>>> i.e. by having a switch on virtiofsd instead?
>>>>
>>>> Sure. Backend can send empty state (as your patch treats 0 length
>>>> as a valid response and not error) or dummy state that can be
>>>> recognized as stateless. The only potential problem is that then we
>>>> need support in backend for new commands even to return dummy
>>>> state, and if backend can support both types then we'll need some
>>>> switch in backend to reply with real or empty state.
>>>
>>> Yes, exactly.
>>>
>>>>>
>>>>> Off the top of my head, some downsides of that approach would be
>>>>> (1) it’d need a switch on the virtiofsd side, not on the qemu side
>>>>> (not sure if that’s a downside, but a difference for sure),
>>>>
>>>> Why would you? It seems to me that this affects only how qemu
>>>> treats the vmstate of device. If the state was requested backend
>>>> sends it to qemu. If state subsection is present in stream qemu
>>>> sends it to the backend for loading. Stateless one just doesn't
>>>> request state from the backend. Or am I missing something?
>>>>
>>>>> and (2) we’d need at least some support for this on the virtiofsd
>>>>> side, i.e. practically can’t come quicker than stateful migration
>>>>> support.
>>>>
>>>> Not much, essentially this is just a reconnect. I've sent a draft
>>>> of a reconnect patch for old C-virtiofsd, for rust version it takes
>>>> much longer because I'm learning rust and I'm not really good at it
>>>> yet.
>>>
>>> I meant these two downsides not for your proposal, but instead if we
>>> implemented stateless migration only in the back-end; i.e. the
>>> front-end would only implement stateful migration, and the back-end
>>> would send and accept an empty state.
>>>
>>> Then, qemu would always request a “state” (even if it turns out
>>> empty for stateless migration), and qemu would always treat it the
>>> same, so there wouldn’t be a switch on the qemu side, but only on
>>> the virtiofsd side. Doesn’t really matter, but what does matter is
>>> that we’d need to implement the migration interface in virtiofsd,
>>> even if in the end no state is transferred.
>>>
>>> So exactly what you’ve said above (“The only potential problem is
>>> […]”). :)
>>>
>>> Hanna
>>>
>>
>> Oh, yes, we were talking about the same thing. So do you agree that
>> storing internal state data in subsection will allow backend code to
>> be more straightforward without additional switches?
>
> Sounds good. I think we should rename VHOST_USER_MIGRATION_TYPE_NONE
> to ..._INTERNAL, though, and then use that (default) mode for stateful
> migration (via VMState), because I think that should be the default
> migration type (and while there’s no support for it, it will just
> continue to block migration).
My plan was to add new ..._INTERNAL option and keep ..._NONE as default
one that blocks migration unless orchestrator explicitly selects
migration type to avoid accidental migrations of old orchestrators that
expect blocker for virtio-fs. But if INTERNAL blocks migration when
backend doesn't support new commands your idea to make it default may be
even better.
>
>
> So I suppose you mean something like this, where
> vuf_stateful_migration() basically returns `fs->migration_type ==
> VHOST_USER_MIGRATION_TYPE_INTERNAL`, and on the destination, you’d
> check whether the subsection is present to decide which kind of
> migration it is, right?
Yes, exactly.
>
>
> static const VMStateDescription vuf_backend_vmstate;
>
> static const VMStateDescription vuf_vmstate = {
> .name = "vhost-user-fs",
> .version_id = 0,
> .fields = (VMStateField[]) {
> VMSTATE_VIRTIO_DEVICE,
> VMSTATE_END_OF_LIST()
> },
> .subsections = (const VMStateDescription*[]) {
> &vuf_backend_vmstate,
> NULL,
> }
> };
>
> static const VMStateDescription vuf_backend_vmstate = {
> .name = "vhost-user-fs-backend",
> .version_id = 0,
> .needed = vuf_stateful_migration,
> .fields = (VMStateField[]) {
> {
> .name = "back-end",
> .info = &(const VMStateInfo) {
> .name = "virtio-fs back-end state",
> .get = vuf_load_state,
> .put = vuf_save_state,
> },
> },
> VMSTATE_END_OF_LIST()
> },
> };
>
More information about the Virtio-fs
mailing list