Issue report: Snapshot-create-as block while the vm's process is killed out.

13520781732 at 189.cn 13520781732 at 189.cn
Mon Aug 16 03:04:12 UTC 2021


Hello, 

I have meet one issues while using snapshot-create-as with external file.

Centos 8, 
Libivirt 7.0.0
Qemu: 5.2.0

While making one snapshot with  memory and disk extaernal file,  and killing the process of the vm qemu same time mostly,  the snapshot-create-as command blocked.
So it will failed to  start the vm again for "cannot acquire state change lock (held by monitor=remoteDispatchDomainSnapshotCreateXML)".

The log is as bellow: 
2021-08-12 10:27:38.373+0000: 4344: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:27:38.573+0000: 4340: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:27:38.757+0000: 128211: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:27:39.397+0000: 4341: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:27:39.887+0000: 4309: error : virNetSocketReadWire:1817 : End of file while reading data: Input/output error
2021-08-12 10:27:53.735+0000: 4340: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:27:54.822+0000: 4344: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:27:55.230+0000: 4340: error : qemuDomainAgentAvailable:8125 : Guest agent is not responding: QEMU guest agent is not connected
2021-08-12 10:46:40.994+0000: 4344: warning : qemuDomainObjTaint:6187 : Domain id=95 name='vm_6_135' uuid=b2812e01-331a-4e65-912b-73be5ace2271 is tainted: high-privileges
2021-08-12 10:48:05.785+0000: 4309: error : virNetSocketReadWire:1817 : End of file while reading data: Input/output error
2021-08-12 10:49:50.440+0000: 1643666: error : qemuMonitorIO:583 : internal error: End of file from qemu monitor
2021-08-12 10:50:13.809+0000: 4341: warning : qemuDomainObjTaint:6187 : Domain id=96 name='vm_6_135' uuid=b2812e01-331a-4e65-912b-73be5ace2271 is tainted: high-privileges
2021-08-12 10:51:21.587+0000: 1646757: error : qemuMonitorIO:583 : internal error: End of file from qemu monitor
2021-08-12 10:52:28.421+0000: 4342: warning : qemuDomainObjBeginJobInternal:953 : Cannot start job (none, none, start) for domain vm_6_135; current job is (none, none, snapshot) owned by (0 <null>, 0 <null>, 4340 remoteDispatchDomainSnapshotCreateXML (flags=0x280)) for (0s, 0s, 67s)
2021-08-12 10:52:28.421+0000: 4342: error : qemuDomainObjBeginJobInternal:975 : Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainSnapshotCreateXML)
2021-08-12 10:53:10.522+0000: 4342: warning : qemuDomainObjBeginJobInternal:953 : Cannot start job (none, none, start) for domain vm_6_135; current job is (none, none, snapshot) owned by (0 <null>, 0 <null>, 4340 remoteDispatchDomainSnapshotCreateXML (flags=0x280)) for (0s, 0s, 109s)
2021-08-12 10:53:10.522+0000: 4342: error : qemuDomainObjBeginJobInternal:975 : Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainSnapshotCreateXML)
2021-08-12 10:53:17.821+0000: 128211: warning : qemuDomainObjBeginJobInternal:953 : Cannot start job (none, none, start) for domain vm_6_135; current job is (none, none, snapshot) owned by (0 <null>, 0 <null>, 4340 remoteDispatchDomainSnapshotCreateXML (flags=0x280)) for (0s, 0s, 117s)
2021-08-12 10:53:17.821+0000: 128211: error : qemuDomainObjBeginJobInternal:975 : Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainSnapshotCreateXML)
... ... 
2021-08-13 05:49:41.240+0000: 4342: warning : qemuDomainObjBeginJobInternal:953 : Cannot start job (none, none, start) for domain vm_6_135; current job is (none, none, snapshot) owned by (0 <null>, 0 <null>, 4340 remoteDispatchDomainSnapshotCreateXML (flags=0x280)) for (0s, 0s, 68300s)


Gdb attach the proces of the libvirt, found there are two suspicious threads, stack as below:
gdb) bt
#0  0x00007f8f91a492fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f8f9574ed7a in virCondWait () from /lib64/libvirt.so.0
#2  0x00007f8f95784ef5 in virDomainObjWait () from /lib64/libvirt.so.0
#3  0x00007f8f422eb26d in qemuMigrationSrcWaitForCompletion () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#4  0x00007f8f422ee5d3 in qemuMigrationSrcToFile () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#5  0x00007f8f42331caa in qemuSaveImageCreate () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#6  0x00007f8f42336ac0 in qemuSnapshotCreateXML () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#7  0x00007f8f422b6ac1 in qemuDomainSnapshotCreateXML () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#8  0x00007f8f9591f604 in virDomainSnapshotCreateXML () from /lib64/libvirt.so.0
#9  0x0000562390841b81 in remoteDispatchDomainSnapshotCreateXMLHelper ()
#10 0x00007f8f9580e807 in virNetServerProgramDispatch () from /lib64/libvirt.so.0
#11 0x00007f8f95813b26 in virNetServerHandleJob () from /lib64/libvirt.so.0
#12 0x00007f8f9574f85f in virThreadPoolWorker () from /lib64/libvirt.so.0
#13 0x00007f8f9574eecb in virThreadHelper () from /lib64/libvirt.so.0
#14 0x00007f8f91a4314a in start_thread () from /lib64/libpthread.so.0
#15 0x00007f8f941f1f23 in clone () from /lib64/libc.so.6
 
(gdb) bt
#0  0x00007f8f91a492fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f8f9574ed7a in virCondWait () from /lib64/libvirt.so.0
#2  0x00007f8f95784ef5 in virDomainObjWait () from /lib64/libvirt.so.0
#3  0x00007f8f42257ac8 in qemuBlockStorageSourceCreateGeneric () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#4  0x00007f8f4225b88d in qemuBlockStorageSourceCreate () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#5  0x00007f8f423346ee in qemuSnapshotDiskPrepareOne () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#6  0x00007f8f423353f8 in qemuSnapshotCreateXML () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#7  0x00007f8f422b6ac1 in qemuDomainSnapshotCreateXML () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#8  0x00007f8f9591f604 in virDomainSnapshotCreateXML () from /lib64/libvirt.so.0
#9  0x0000562390841b81 in remoteDispatchDomainSnapshotCreateXMLHelper ()
#10 0x00007f8f9580e807 in virNetServerProgramDispatch () from /lib64/libvirt.so.0
#11 0x00007f8f95813b26 in virNetServerHandleJob () from /lib64/libvirt.so.0
#12 0x00007f8f9574f85f in virThreadPoolWorker () from /lib64/libvirt.so.0
#13 0x00007f8f9574eecb in virThreadHelper () from /lib64/libvirt.so.0
#14 0x00007f8f91a4314a in start_thread () from /lib64/libpthread.so.0
#15 0x00007f8f941f1f23 in clone () from /lib64/libc.so.6
(gdb) 


The call work flow one is: 
qemuSnapshotCreateActiveExternal -> qemuSaveImageCreate -> qemuMigrationSrcToFile -> qemuMigrationSrcWaitForCompletion -> virDomainObjWait 

The others is similar :
 qemuSnapshotCreateXML  -> ...


It may be resume after the restart libvirtd.service.

Is it correct that the stat change lock should be unlocked while the vm is terminal or shutdown and how to fix it?

Thanks!




13520781732 at 189.cn
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20210816/6a0d7932/attachment-0001.htm>


More information about the libvir-list mailing list