[Virtio-fs] Regression save/restore of vsock: (was [PULL 07/55] hw/virtio: move vm_running check to virtio_device_started)

Alex Bennée alex.bennee at linaro.org
Fri Oct 14 11:07:17 UTC 2022


Christian Borntraeger <borntraeger at linux.ibm.com> writes:

> Am 14.10.22 um 09:30 schrieb Christian Borntraeger:
>> Am 10.10.22 um 19:29 schrieb Michael S. Tsirkin:
>>> From: Alex Bennée <alex.bennee at linaro.org>
>>>
>>> All the boilerplate virtio code does the same thing (or at least
>>> should): check whether the VM is running before attempting to
>>> start VirtIO. Push the logic up to the common function to avoid
>>> getting a copy and paste wrong.
>>>
>>> Signed-off-by: Alex Bennée <alex.bennee at linaro.org>
>>> Message-Id: <20220802095010.3330793-11-alex.bennee at linaro.org>
>>> Reviewed-by: Michael S. Tsirkin <mst at redhat.com>
>>> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
>> This results in a regression for our s390x CI when doing
>> save/restore of guests with vsock:
>>                  #1  0x000003ff9a248580 raise (libc.so.6 + 0x48580)
>>                  #2  0x000003ff9a22b5c0 abort (libc.so.6 + 0x2b5c0)
>>                  #3  0x000003ff9a2409da __assert_fail_base (libc.so.6 + 0x409da)
>>                  #4  0x000003ff9a240a4e __assert_fail (libc.so.6 + 0x40a4e)
>>                  #5  0x000002aa2d69a066 vhost_vsock_common_pre_save (qemu-system-s390x + 0x39a066)
>>                  #6  0x000002aa2d55570e vmstate_save_state_v (qemu-system-s390x + 0x25570e)
>>                  #7  0x000002aa2d556218 vmstate_save_state (qemu-system-s390x + 0x256218)
>>                  #8  0x000002aa2d570ba4 qemu_savevm_state_complete_precopy_non_iterable (qemu-system-s390x + 0x270ba4)
>>                  #9  0x000002aa2d5710b6 qemu_savevm_state_complete_precopy (qemu-system-s390x + 0x2710b6)
>>                  #10 0x000002aa2d564d0e migration_completion (qemu-system-s390x + 0x264d0e)
>>                  #11 0x000002aa2d8db25c qemu_thread_start (qemu-system-s390x + 0x5db25c)
>>                  #12 0x000003ff9a296248 start_thread (libc.so.6 + 0x96248)
>>                  #13 0x000003ff9a31183e thread_start (libc.so.6 + 0x11183e)
>> 
>
>
> Something like
> diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
> index 7dc3c7393122..b4d056ae6f01 100644
> --- a/hw/virtio/vhost-vsock.c
> +++ b/hw/virtio/vhost-vsock.c
> @@ -73,6 +73,10 @@ static void vhost_vsock_set_status(VirtIODevice *vdev, uint8_t status)
>      bool should_start = virtio_device_started(vdev, status);
>      int ret;
>
> +    if (!vdev->vm_running) {
> +        should_start = false;
> +    }
> +
>      if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
>          return;
>      }
>
> helps.
>
> The problem seems to be that virtio_device_started ignores
> vm_running when use_started is set.
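
As far as I can tell the helper currently reads like this (quoting from
memory of the tree after this commit, so treat it as a sketch):

  static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status)
  {
      if (vdev->use_started) {
          return vdev->started;
      }

      if (!vdev->vm_running) {
          return false;
      }

      return status & VIRTIO_CONFIG_S_DRIVER_OK;
  }

so once use_started is set the vm_running test is never reached.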

Wouldn't it make more sense to re-order the check there, something like:

  static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status)
  {
      if (!vdev->vm_running) {
          return false;
      }

      if (vdev->use_started) {
          return vdev->started;
      }

      return status & VIRTIO_CONFIG_S_DRIVER_OK;
  }

Is the problem that vdev->started gets set during the migration, but
because the VM isn't running yet we can never actually start?
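
For reference, the assertion we are tripping over looks like this (from
hw/virtio/vhost-vsock-common.c, if I'm reading the right tree):

  static int vhost_vsock_common_pre_save(void *opaque)
  {
      VHostVSockCommon *vvc = opaque;

      /*
       * At this point the backend must be stopped, otherwise it
       * might keep writing to guest memory during the save.
       */
      assert(!vhost_dev_is_started(&vvc->vhost_dev));

      return 0;
  }

i.e. the save path hard-requires the vhost backend to have been stopped
before the vmstate is serialised.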

-- 
Alex Bennée

