[Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration

Thu Feb 2 00:54:40 UTC 2023

On 01/02/2023 16:26, Juan Quintela wrote:
> Anton Kuchin <antonkuchin at yandex-team.ru> wrote:
>> On 19/01/2023 18:02, Stefan Hajnoczi wrote:
>>> On Thu, 19 Jan 2023 at 10:29, Anton Kuchin <antonkuchin at yandex-team.ru> wrote:
>>>> On 19/01/2023 16:30, Stefan Hajnoczi wrote:
>>>>> On Thu, 19 Jan 2023 at 07:43, Anton Kuchin <antonkuchin at yandex-team.ru> wrote:
>>>>>> On 18/01/2023 17:52, Stefan Hajnoczi wrote:
>>>>>>> On Sun, 15 Jan 2023 at 12:21, Anton Kuchin <antonkuchin at yandex-team.ru> wrote:
> Hi
>
> Sorry to come so late into the discussion.

You are just in time, I'm working on v2 of this patch right now :-)

>
>>>>>>>> +static int vhost_user_fs_pre_save(void *opaque)
>>>>>>>> +{
>>>>>>>> +    MigrationState *s = migrate_get_current();
>>>>>>>> +
>>>>>>>> +    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_VHOST_USER_FS]) {
>>>>>>>> +        error_report("Migration of vhost-user-fs devices requires internal FUSE "
>>>>>>>> +                     "state of backend to be preserved. If orchestrator can "
>>>>>>>> +                     "guarantee this (e.g. dst connects to the same backend "
>>>>>>>> +                     "instance or backend state is migrated) set 'vhost-user-fs' "
>>>>>>>> +                     "migration capability to true to enable migration.");
>>>>>>>> +        return -1;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>      static const VMStateDescription vuf_vmstate = {
>>>>>>>>          .name = "vhost-user-fs",
>>>>>>>> -    .unmigratable = 1,
>>>>>>>> +    .minimum_version_id = 0,
>>>>>>>> +    .version_id = 0,
>>>>>>>> +    .fields = (VMStateField[]) {
>>>>>>>> +        VMSTATE_VIRTIO_DEVICE,
>>>>>>>> +        VMSTATE_END_OF_LIST()
>>>>>>>> +    },
>>>>>>>> +   .pre_save = vhost_user_fs_pre_save,
>>>>>>>>      };
> I don't object to extend the vmstate this way.
>
> But I object to the migration capability for several reasons:
> - This feature has _nothing_ to do with migration, the problem, what it
>    describes, etc is related to vhost_user_fs.
> - The number of migration capabilities is limited
>    And we add checks to see if they are valid, consistent, etc
>    see qemu/migration/migration.c:migrate_caps_check()
> - As Stefan says, we can have several vhost_user_fs devices, and each
>    one should know if it can migrate or not.
> - We have to options for the orchestator:
>    * it knows that all the vhost_user_fs devices can be migration
>      Then it can add a property to each vhost_user_fs device
>    * it don't know it
>      Then it is a good idea that we are not migrating this VM.
>
>> I think we'd be better without a new marker because migration code
>> has standard generic way of solving such puzzles that I described
>> above. So adding new marker would go against existing practice.
>> But if you could show me where I missed something I'll be grateful
>> and will fix it to avoid potential problems.
>> I'd also be happy to know the opinion of Dr. David Alan Gilbert.
> If everybody agrees that any vhost_user_fs device is going to have a
> virtio device, then I agree with you that the marker is not needed at
> this point.
>
> Once told that, I think that you are making your live harder in the
> future when you add the other migratable devices.
>
> I am assuming here that your "underlying device" is:
>
> enum VhostUserFSType {
>      VHOST_USER_NO_MIGRATABLE = 0,
>      // The one we are doing here
>      VHOST_USER_EXTERNAL_MIGRATABLE,
>      // The one you describe for the future
>      VHOST_USER_INTERNAL_MIGRATABLE,
> };
>
> struct VHostUserFS {
>      /*< private >*/
>      VirtIODevice parent;
>      VHostUserFSConf conf;
>      struct vhost_virtqueue *vhost_vqs;
>      struct vhost_dev vhost_dev;
>      VhostUserState vhost_user;
>      VirtQueue **req_vqs;
>      VirtQueue *hiprio_vq;
>      int32_t bootindex;
>      enum migration_type;
>      /*< public >*/
> };
>
>
> static int vhost_user_fs_pre_save(void *opaque)
> {
>      VHostUserFS *s = opaque;
>
>      if (s->migration_type == VHOST_USER_NO_MIGRATABLE)) {
>          error_report("Migration of vhost-user-fs devices requires internal FUSE "
>                       "state of backend to be preserved. If orchestrator can "
>                       "guarantee this (e.g. dst connects to the same backend "
>                       "instance or backend state is migrated) set 'vhost-user-fs' "
>                       "migration capability to true to enable migration.");
>          return -1;
>      }
>      if (s->migration_type == VHOST_USER_EXTERNAL_MIGRATABLE) {
>          return 0;
>      }
>      if (s->migration_type == VHOST_USER_INTERNAL_MIGRATABLE) {
>          error_report("still not implemented");
>          return -1;
>      }
>      assert("we don't reach here");
> }
>
> Your initial vmstateDescription
>
> static const VMStateDescription vuf_vmstate = {
>      .name = "vhost-user-fs",
>      .unmigratable = 1,
>      .minimum_version_id = 0,
>      .version_id = 0,
>      .fields = (VMStateField[]) {
>          VMSTATE_INT8(migration_type, struct VHostUserFS),
>          VMSTATE_VIRTIO_DEVICE,
>          VMSTATE_END_OF_LIST()
>      },
>      .pre_save = vhost_user_fs_pre_save,
> };
>
> And later you change it to something like:
>
> static bool vhost_fs_user_internal_state_needed(void *opaque)
> {
>      VHostUserFS *s = opaque;
>
>      return s->migration_type == VMOST_USER_INTERNAL_MIGRATABLE;
> }
>
> static const VMStateDescription vmstate_vhost_user_fs_internal_sub = {
>      .name = "vhost-user-fs/internal",
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .needed = &vhost_fs_user_internal_state_needed,
>      .fields = (VMStateField[]) {
>          .... // Whatever
>          VMSTATE_END_OF_LIST()
>      }
> };
>
> static const VMStateDescription vuf_vmstate = {
>      .name = "vhost-user-fs",
>      .minimum_version_id = 0,
>      .version_id = 0,
>      .fields = (VMStateField[]) {
>          VMSTATE_INT8(migration_type, struct VHostUserFS),
>          VMSTATE_VIRTIO_DEVICE,
>          VMSTATE_END_OF_LIST()
>      },
>      .pre_save = vhost_user_fs_pre_save,
>      .subsections = (const VMStateDescription*[]) {
>          &vmstate_vhost_user_fs_internal_sub,
>          NULL
>      }
> };
>
> And you are done.
>
> I will propose to use a property to set migration_type, but I didn't
> want to write the code right now.
>
> I think that this proposal will make Stephan happy, and it is just
> adding and extra uint8_t that is helpul to implement everything.

That is exactly the approach I'm trying to implement it right now. Single
flag can be convenient for orchestrator but makes it too hard to account in
all cases for all devices on qemu side without breaking future 
compatibility.
So I'm rewriting it with properties.

BTW do you think each vhost-user device should have its own enum of 
migration
types or maybe we could make them common for all device types?

>
> Later, Juan.
>
> PD.  One of the few things that Pascal got right and C got completely
>       wrong were pascal variant registers vs C union's.  If you have a
>       union, if should be "required" that there is a field in the
>       enclosing struct that specifies what element of the union we have.
>       This is exactly that case.
>