race condition? virsh migrate --copy-storage-all

Valentijn Sessink valentijn at sessink.nl
Tue Apr 19 13:51:32 UTC 2022


Hi Peter,

Thanks.

On 19-04-2022 13:22, Peter Krempa wrote:
> It would be helpful if you provide the VM XML file to see how your disks
> are configured and the debug log file when the bug reproduces:

I created a random VM to show the effect. XML file attached.

> Without that my only hunch would be that you ran out of disk space on
> the destination which caused the I/O error.

... it's an LVM2 volume with exactly the same size as the one on the
source machine, so that would be rather odd ;-)

My guess is that it is related to this weird message on the destination
machine:

2022-04-19 13:31:09.394+0000: 1412559: error : 
virKeepAliveTimerInternal:137 : internal error: connection closed due to 
keepalive timeout

Source machine says:
2022-04-19 13:31:09.432+0000: 2641309: debug : 
qemuMonitorJSONIOProcessLine:220 : Line [{"timestamp": {"seconds": 
1650375069, "microseconds": 432613}, "event": "BLOCK_JOB_ERROR", "data": 
{"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}]
2022-04-19 13:31:09.432+0000: 2641309: debug : 
virJSONValueFromString:1822 : string={"timestamp": {"seconds": 
1650375069, "microseconds": 432613}, "event": "BLOCK_JOB_ERROR", "data": 
{"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}
2022-04-19 13:31:09.432+0000: 2641309: info : 
qemuMonitorJSONIOProcessLine:234 : QEMU_MONITOR_RECV_EVENT: 
mon=0x7f70080028a0 event={"timestamp": {"seconds": 1650375069, 
"microseconds": 432613}, "event": "BLOCK_JOB_ERROR", "data": {"device": 
"drive-virtio-disk2", "operation": "write", "action": "report"}}
2022-04-19 13:31:09.432+0000: 2641309: debug : qemuMonitorEmitEvent:1198 
: mon=0x7f70080028a0 event=BLOCK_JOB_ERROR
2022-04-19 13:31:09.432+0000: 2641309: debug : 
qemuMonitorJSONIOProcessLine:220 : Line [{"timestamp": {"seconds": 
1650375069, "microseconds": 432668}, "event": "BLOCK_JOB_ERROR", "data": 
{"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}]
2022-04-19 13:31:09.432+0000: 2641309: debug : 
virJSONValueFromString:1822 : string={"timestamp": {"seconds": 
1650375069, "microseconds": 432668}, "event": "BLOCK_JOB_ERROR", "data": 
{"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}
2022-04-19 13:31:09.433+0000: 2641309: info : 
qemuMonitorJSONIOProcessLine:234 : QEMU_MONITOR_RECV_EVENT: 
mon=0x7f70080028a0 event={"timestamp": {"seconds": 1650375069, 
"microseconds": 432668}, "event": "BLOCK_JOB_ERROR", "data": {"device": 
"drive-virtio-disk2", "operation": "write", "action": "report"}}
2022-04-19 13:31:09.433+0000: 2641309: debug : qemuMonitorEmitEvent:1198 
: mon=0x7f70080028a0 event=BLOCK_JOB_ERROR
2022-04-19 13:31:09.433+0000: 2641309: debug : 
qemuMonitorJSONIOProcessLine:220 : Line [{"timestamp": {"seconds": 
1650375069, "microseconds": 432688}, "event": "BLOCK_JOB_ERROR", "data": 
{"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}]
2022-04-19 13:31:09.433+0000: 2641309: debug : 
virJSONValueFromString:1822 : string={"timestamp": {"seconds": 
1650375069, "microseconds": 432688}, "event": "BLOCK_JOB_ERROR", "data": 
{"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}
2022-04-19 13:31:09.433+0000: 2641309: info : 
qemuMonitorJSONIOProcessLine:234 : QEMU_MONITOR_RECV_EVENT: 
mon=0x7f70080028a0 event={"timestamp": {"seconds": 1650375069, 
"microseconds": 432688}, "event": "BLOCK_JOB_ERROR", "data": {"device": 
"drive-virtio-disk2", "operation": "write", "action": "report"}}
2022-04-19 13:31:09.433+0000: 2641309: debug : qemuMonitorEmitEvent:1198 
: mon=0x7f70080028a0 event=BLOCK_JOB_ERROR

... and more of these.
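
To get a feel for how many of these events there are and which device
they hit, the log can be tallied with a few lines of Python. This is
only a minimal sketch: it assumes each log record sits on a single line
in the actual log file (the wrapping above comes from the mail client),
and the function name count_block_job_errors is made up for
illustration.

```python
import json
import re
from collections import Counter

# Captures the JSON payload of a QEMU_MONITOR_RECV_EVENT log record.
EVENT_RE = re.compile(r"QEMU_MONITOR_RECV_EVENT: mon=\S+ event=(\{.*\})\s*$")


def count_block_job_errors(lines):
    """Count BLOCK_JOB_ERROR events per (device, operation) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        try:
            event = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip truncated or wrapped records
        if event.get("event") != "BLOCK_JOB_ERROR":
            continue
        data = event.get("data", {})
        counts[(data.get("device"), data.get("operation"))] += 1
    return counts
```

Passing an open log file object to count_block_job_errors() returns a
Counter keyed by (device, operation), which makes it easy to see
whether only one disk is affected.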

Does that show anything? Please note that there is no real "block error"
anywhere: there is an LVM volume of exactly the same size on the other
side. I use a script that extracts the name of the volume at the source,
reads the source volume size, and creates a destination volume of
exactly that size before I start the migration. The disks are RAID
volumes and there are no read or write errors.

Best regards,

Valentijn
-- 
Durgerdamstraat 29, 1507 JL Zaandam; telefoon 075-7100071
-------------- next part --------------
A non-text attachment was scrubbed...
Name: water.xml
Type: text/xml
Size: 3707 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvirt-users/attachments/20220419/11f9643b/attachment.xml>

