[Virtio-fs] Regression save/restore of vsock: (was [PULL 07/55] hw/virtio: move vm_running check to virtio_device_started)

Christian Borntraeger borntraeger at linux.ibm.com
Fri Oct 14 08:44:48 UTC 2022


Am 14.10.22 um 10:37 schrieb Alex Bennée:
> 
> Christian Borntraeger <borntraeger at linux.ibm.com> writes:
> 
>> Am 10.10.22 um 19:29 schrieb Michael S. Tsirkin:
>>> From: Alex Bennée <alex.bennee at linaro.org>
>>> All the boilerplate virtio code does the same thing (or should at
>>> least) of checking to see if the VM is running before attempting to
>>> start VirtIO. Push the logic up to the common function to avoid
>>> getting a copy and paste wrong.
>>> Signed-off-by: Alex Bennée <alex.bennee at linaro.org>
>>> Message-Id: <20220802095010.3330793-11-alex.bennee at linaro.org>
>>> Reviewed-by: Michael S. Tsirkin <mst at redhat.com>
>>> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
>>
>> This results in a regression for our s390x CI when doing save/restore of guests with vsock:
>>
>>
>>                  #1  0x000003ff9a248580 raise (libc.so.6 + 0x48580)
>>                  #2  0x000003ff9a22b5c0 abort (libc.so.6 + 0x2b5c0)
>>                  #3  0x000003ff9a2409da __assert_fail_base (libc.so.6 + 0x409da)
>>                  #4  0x000003ff9a240a4e __assert_fail (libc.so.6 + 0x40a4e)
>>                  #5  0x000002aa2d69a066 vhost_vsock_common_pre_save (qemu-system-s390x + 0x39a066)
>>                  #6  0x000002aa2d55570e vmstate_save_state_v (qemu-system-s390x + 0x25570e)
>>                  #7  0x000002aa2d556218 vmstate_save_state (qemu-system-s390x + 0x256218)
>>                  #8 0x000002aa2d570ba4
>> qemu_savevm_state_complete_precopy_non_iterable (qemu-system-s390x +
>> 0x270ba4)
>>                  #9  0x000002aa2d5710b6 qemu_savevm_state_complete_precopy (qemu-system-s390x + 0x2710b6)
>>                  #10 0x000002aa2d564d0e migration_completion (qemu-system-s390x + 0x264d0e)
>>                  #11 0x000002aa2d8db25c qemu_thread_start (qemu-system-s390x + 0x5db25c)
>>                  #12 0x000003ff9a296248 start_thread (libc.so.6 + 0x96248)
>>                  #13 0x000003ff9a31183e thread_start (libc.so.6 + 0x11183e)
> 
> Which test does this break?

migrate to file and restore.

> 
> Looking at the change the only thing I can think of is there is a subtle
> change in the order of checks because if the device is set as
> use_started we return the result regardless of vm or config state:
> 
>      if (vdev->use_started) {
>          return vdev->started;
>      }
> 
> Could some printfs confirm that?

Right. The problem is we now ignore the vm state and thus run into the assertion in vhost_vsock_common_pre_save.
Removing the asserting then results in virtio errors, which really indicates that the device must not be started.



More information about the Virtio-fs mailing list