[Virtio-fs] Regression save/restore of vsock: (was [PULL 07/55] hw/virtio: move vm_running check to virtio_device_started)
Christian Borntraeger
borntraeger at linux.ibm.com
Fri Oct 14 11:58:50 UTC 2022
Am 14.10.22 um 13:07 schrieb Alex Bennée:
>
> Christian Borntraeger <borntraeger at linux.ibm.com> writes:
>
>> Am 14.10.22 um 09:30 schrieb Christian Borntraeger:
>>> Am 10.10.22 um 19:29 schrieb Michael S. Tsirkin:
>>>> From: Alex Bennée <alex.bennee at linaro.org>
>>>>
>>>> All the boilerplate virtio code does (or at least should do) the same
>>>> thing: check whether the VM is running before attempting to start
>>>> VirtIO. Push the logic up to the common function to avoid getting a
>>>> copy-and-paste wrong.
>>>>
>>>> Signed-off-by: Alex Bennée <alex.bennee at linaro.org>
>>>> Message-Id: <20220802095010.3330793-11-alex.bennee at linaro.org>
>>>> Reviewed-by: Michael S. Tsirkin <mst at redhat.com>
>>>> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
>>> This results in a regression for our s390x CI when doing
>>> save/restore of guests with vsock:
>>> #1 0x000003ff9a248580 raise (libc.so.6 + 0x48580)
>>> #2 0x000003ff9a22b5c0 abort (libc.so.6 + 0x2b5c0)
>>> #3 0x000003ff9a2409da __assert_fail_base (libc.so.6 + 0x409da)
>>> #4 0x000003ff9a240a4e __assert_fail (libc.so.6 + 0x40a4e)
>>> #5 0x000002aa2d69a066 vhost_vsock_common_pre_save (qemu-system-s390x + 0x39a066)
>>> #6 0x000002aa2d55570e vmstate_save_state_v (qemu-system-s390x + 0x25570e)
>>> #7 0x000002aa2d556218 vmstate_save_state (qemu-system-s390x + 0x256218)
>>> #8 0x000002aa2d570ba4 qemu_savevm_state_complete_precopy_non_iterable (qemu-system-s390x + 0x270ba4)
>>> #9 0x000002aa2d5710b6 qemu_savevm_state_complete_precopy (qemu-system-s390x + 0x2710b6)
>>> #10 0x000002aa2d564d0e migration_completion (qemu-system-s390x + 0x264d0e)
>>> #11 0x000002aa2d8db25c qemu_thread_start (qemu-system-s390x + 0x5db25c)
>>> #12 0x000003ff9a296248 start_thread (libc.so.6 + 0x96248)
>>> #13 0x000003ff9a31183e thread_start (libc.so.6 + 0x11183e)
>>>
>>
>>
>> Something like
>> diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
>> index 7dc3c7393122..b4d056ae6f01 100644
>> --- a/hw/virtio/vhost-vsock.c
>> +++ b/hw/virtio/vhost-vsock.c
>> @@ -73,6 +73,10 @@ static void vhost_vsock_set_status(VirtIODevice *vdev, uint8_t status)
>>      bool should_start = virtio_device_started(vdev, status);
>>      int ret;
>>  
>> +    if (!vdev->vm_running) {
>> +        should_start = false;
>> +    }
>> +
>>      if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
>>          return;
>>      }
>>
>> helps.
>>
>> The problem seems to be that virtio_device_started ignores
>> vm_running when use_started is set.
>
> Wouldn't it make more sense to re-order the check there, something like:
>
> static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status)
> {
>     if (!vdev->vm_running) {
>         return false;
>     }
>
>     if (vdev->use_started) {
>         return vdev->started;
>     }
>
>     return status & VIRTIO_CONFIG_S_DRIVER_OK;
> }
That works as well (and it restores the original ordering, so it makes sense).
>
> Is the problem that vdev->started gets filled during the migration but
> because the VM isn't running yet we can never actually run?
I don't know.