[libvirt] [BUG?] qemu is accidentally killed by libvirtd after restarting libvirtd with the service command

Hailiang Zhang zhang.zhanghailiang at huawei.com
Thu Dec 17 15:45:11 UTC 2015


Hi all,

We came across a problem: QEMU was accidentally killed by libvirtd after we restarted libvirtd with the service command.
Steps to reproduce:
1) Attach a net device to the VM with the 'virsh attach-device' command.
2) Detach that net device with the 'virsh detach-device' command.
3) Repeat the two steps above, and at the same time restart the libvirtd daemon.
Eventually, after one of the restarts, QEMU is killed unexpectedly by the new libvirtd (a minimal sketch of the repro loop follows).
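For reference, the attach/detach loop can also be driven with the public libvirt API, roughly as below. This is only a minimal sketch: the domain name 'VMName', the connection URI 'qemu:///system', the iteration count and the simplified interface XML are placeholders modeled on the log further down (in our actual test we simply reran the virsh commands above). libvirtd is restarted in parallel, e.g. with 'service libvirtd restart', while the loop runs.

/* Minimal repro sketch using the public libvirt API (build: gcc repro.c -lvirt).
 * "VMName" and the interface XML are placeholders based on the log below.
 * Each iteration opens a fresh connection, like running virsh repeatedly, so a
 * concurrent "service libvirtd restart" only makes individual iterations fail. */
#include <stdio.h>
#include <libvirt/libvirt.h>

static const char *ifxml =
    "<interface type='bridge'>"
    "  <source bridge='br-1'/>"
    "  <mac address='00:16:3e:76:85:17'/>"
    "  <model type='virtio'/>"
    "</interface>";

int main(void)
{
    for (int i = 0; i < 100000; i++) {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        if (!conn)
            continue;   /* libvirtd is probably restarting right now */

        virDomainPtr dom = virDomainLookupByName(conn, "VMName");
        if (dom) {
            /* Hotplug the NIC into the running guest, then unplug it again. */
            if (virDomainAttachDeviceFlags(dom, ifxml, VIR_DOMAIN_AFFECT_LIVE) < 0)
                fprintf(stderr, "iteration %d: attach failed\n", i);
            if (virDomainDetachDeviceFlags(dom, ifxml, VIR_DOMAIN_AFFECT_LIVE) < 0)
                fprintf(stderr, "iteration %d: detach failed\n", i);
            virDomainFree(dom);
        }
        virConnectClose(conn);
    }
    return 0;
}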

The libvirtd log shows:
[2015-12-17 11:34:36]: libvirtd : 216046: info : libvirt version: 1.2.17
[2015-12-17 11:34:36]: libvirtd : 216046: warning : virDriverLoadModule:65 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not accessible
[2015-12-17 11:34:36]: libvirtd : 216046: info : virNetlinkEventServiceStart:662 : starting netlink event service with protocol 0
[2015-12-17 11:34:36]: libvirtd : 216046: info : virNetlinkEventServiceStart:712 : set netlink socket big buffer size
[2015-12-17 11:34:36]: libvirtd : 216046: info : virNetlinkEventServiceStart:662 : starting netlink event service with protocol 15
[2015-12-17 11:34:36]: libvirtd : 216046: info : virNetlinkEventServiceStart:712 : set netlink socket big buffer size
[2015-12-17 11:34:36]: libvirtd : 216079: error : dnsmasqCapsRefreshInternal:741 : Cannot check dnsmasq binary /usr/sbin/dnsmasq: No such file or directory
[2015-12-17 11:34:36]: libvirtd : 216079: info : networkReloadFirewallRules:1861 : Reloading iptables rules
[2015-12-17 11:34:36]: libvirtd : 216079: info : networkRefreshDaemons:1830 : Refreshing network daemons
[2015-12-17 11:34:36]: libvirtd : 216079: error : virNodeSuspendSupportsTarget:332 : internal error: Cannot probe for supported suspend types
[2015-12-17 11:34:36]: libvirtd : 216079: warning : virQEMUCapsInit:1037 : Failed to get host power management capabilities
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23039 : Scanning for configs in /var/run/libvirt/qemu
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName2.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName3.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23039 : Scanning for configs in /etc/libvirt/qemu
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName2.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName3.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : qemuDomainSnapshotLoad:490 : Scanning for snapshots for domain VMName in /var/lib/libvirt/qemu/snapshot/VMName
[2015-12-17 11:34:36]: libvirtd : 216079: info : qemuDomainSnapshotLoad:490 : Scanning for snapshots for domain VMName2 in /var/lib/libvirt/qemu/snapshot/VMName2
[2015-12-17 11:34:36]: libvirtd : 216079: info : qemuDomainSnapshotLoad:490 : Scanning for snapshots for domain VMName3 in /var/lib/libvirt/qemu/snapshot/VMName3
[2015-12-17 11:34:36]: libvirtd : 216047: info : remoteDispatchDomainDetachDevice:4105 : Domain is detached a device by client: 127.0.0.1;0
[2015-12-17 11:34:36]: libvirtd : 216050: info : remoteDispatchDomainDetachDevice:4105 : Domain is detached a device by client: 127.0.0.1;0
[2015-12-17 11:34:36]: libvirtd : 216047: info : virDomainDetachDevice:8508 : enter virDomainDetachDevice domainname=VMName2, xml= <interface type='bridge'>^M
   <source bridge='br-1'/>^M
           <mac address='00:16:3e:76:85:17'/>^M
<virtualport type='openvswitch' />^M
       <model type='virtio'/>^M
</interface>^M
^M

[2015-12-17 11:34:36]: libvirtd : 216050: info : virDomainDetachDevice:8508 : enter virDomainDetachDevice domainname=VMName, xml= <interface type='bridge'>^M
   <source bridge='br-1'/>^M
           <mac address='00:16:3e:76:85:15'/>^M
[2015-12-17 11:34:36]: libvirtd : 216095: error : qemuMonitorJSONCheckError:380 : internal error: unable to execute QEMU command 'netdev_del': Device 'hostnet5' not found
[2015-12-17 11:34:36]: libvirtd : 216096: error : qemuMonitorJSONCheckError:380 : internal error: unable to execute QEMU command 'netdev_del': Device 'hostnet2' not found
[2015-12-17 11:34:36]: libvirtd : 216097: error : qemuMonitorJSONCheckError:380 : internal error: unable to execute QEMU command 'netdev_del': Device 'hostnet1' not found
[2015-12-17 11:34:37]: libvirtd : 216095: info : virSecurityDACRestoreSecurityFileLabelInternal:325 : Restoring DAC user and group on '/data/suse11_sp3_64_1'
[2015-12-17 11:34:37]: libvirtd : 216095: info : virSecurityDACSetOwnershipInternal:248 : Setting DAC user and group on '/data/suse11_sp3_64_1' to '0:0'
[2015-12-17 11:34:37]: libvirtd : 216096: info : virSecurityDACRestoreSecurityFileLabelInternal:325 : Restoring DAC user and group on '/data/suse11_sp3_64_2'
[2015-12-17 11:34:37]: libvirtd : 216096: info : virSecurityDACSetOwnershipInternal:248 : Setting DAC user and group on '/data/suse11_sp3_64_2' to '0:0'
[2015-12-17 11:34:37]: libvirtd : 216095: info : qemuProcessStop:5720 : VM 'VMName' is dying,remove vBMC
[2015-12-17 11:34:37]: libvirtd : 216095: error : rtev_vBMC_send:209 : Failed to remove domain vBMC bind path for 33ae1cb0-c451-4620-8cee-1bbddd124b82, errno is 2
[2015-12-17 11:34:37]: libvirtd : 216050: info : remoteDispatchDomainAttachDevice:2855 : Domain is attached a device by client: 127.0.0.1;0
[2015-12-17 11:34:37]: libvirtd : 216050: info : virDomainAttachDevice:8390 : enter virDomainAttachDevice domainname=VMName, xml=<interface type='bridge'>^M

The VM's QEMU log shows:
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: device_del, qmp_cmd_arguments: {"id": "net4"}
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_pci_device_unplugged:966 virtio-net
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 402473}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net4/virtio-backend"}}
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 402586}, "event": "DEVICE_DELETED", "data": {"device": "net4", "path": "/machine/peripheral/net4"}}
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: netdev_del, qmp_cmd_arguments: {"id": "hostnet4"}
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: qom-list, qmp_cmd_arguments: {"path": "/machine/peripheral"}
---------------------------------------------- libvirtd is restarted right at this point ------------------------------------------------------------
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: device_del, qmp_cmd_arguments: {"id": "net5"}
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_pci_device_unplugged:966 virtio-net
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 810030}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net5/virtio-backend"}}
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 810150}, "event": "DEVICE_DELETED", "data": {"device": "net5", "path": "/machine/peripheral/net5"}}
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: netdev_del, qmp_cmd_arguments: {"id": "hostnet5"}
[2015-12-17 11:34:35] send_control_event:225 virtio serial port 2 send control message event = 6, value = 0
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: qmp_capabilities
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-commands
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: add-fd, qmp_cmd_arguments: {"fdset-id": 0, "opaque": "/dev/null"}
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: block-commit, qmp_cmd_arguments: {"device": "bogus"}
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-events
[2015-12-17 11:34:36] send_control_event:225 virtio serial port 2 send control message event = 6, value = 1
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-status
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-block
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-chardev
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: qom-list, qmp_cmd_arguments: {"path": "/machine/peripheral"}
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: netdev_del, qmp_cmd_arguments: {"id": "hostnet5"}
[2015-12-17 11:34:36]: shutting down
[2015-12-17 11:34:36] send_control_event:225 virtio serial port 2 send control message event = 6, value = 0
qemu: terminating on signal 15 from pid 216046(libvirtd)
[2015-12-17 11:34:36] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352076, "microseconds": 267427}, "event": "SHUTDOWN"}
[2015-12-17 11:34:36] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352076, "microseconds": 267773}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-0-1", "tray-open": true}}

We have analyzed this problem and believe it is triggered by an inconsistent view of the net device
between QEMU and the newly started libvirtd.
After libvirtd starts, it reconnects to QEMU and refreshes the device list with the 'qom-list' command,
and it also reads the domain configuration from the stored XML.
The code call path is:
qemuProcessReconnect
  -> qemuProcessUpdateDevices
       -> if (qemuDomainUpdateDeviceList(driver, vm, QEMU_ASYNC_JOB_NONE) < 0)
              goto cleanup;
       -> if ((tmp = old)) {
              while (*tmp) {
                  if (!virStringArrayHasString(priv->qemuDevices, *tmp) &&
                      virDomainDefFindDevice(vm->def, *tmp, &dev, false) == 0 &&
                      qemuDomainRemoveDevice(driver, vm, &dev) < 0) {
                      goto cleanup;  /* If this fails, qemuProcessUpdateDevices() returns -1,
                                        and qemuProcessReconnect() then calls qemuProcessStop()
                                        to kill the VM. */
                  }
                  tmp++;
              }
          }
It is a little hard to hit this problem with the steps above, but it can be reproduced deliberately as follows:
1) Attach a net device with a virsh command.
2) Stop libvirtd.
3) Detach the net device by sending QMP commands directly to QEMU through the monitor, for example:
# ./qmp-shell /var/lib/libvirt/qemu/redhat.monitor
Welcome to the QMP low-level shell!
Connected to QEMU 2.3.0

(QEMU) qom-list path=/machine/peripheral
{u'return': [{u'type': u'child<virtio-net-pci>', u'name': u'net1'}, {u'type': u'child<cirrus-vga>', u'name': u'video0'}, {u'type': u'child<usb-tablet>', u'name': u'input0'}, {u'type': u'child<isa-serial>', u'name': u'serial0'}, {u'type': u'child<virtio-net-pci>', u'name': u'net0'}, {u'type': u'child<scsi-hd>', u'name': u'scsi0-0-0-0'}, {u'type': u'child<virtio-scsi-pci>', u'name': u'scsi0'}, {u'type': u'child<piix3-usb-uhci>', u'name': u'usb'}, {u'type': u'string', u'name': u'type'}]}
(QEMU) device_del id=net1
{u'return': {}}
(QEMU) netdev_del id=hostnet1
{u'return': {}}
4) Restart the libvirtd daemon; you will then find that QEMU is killed by libvirtd.

The root cause of the inconsistent state is that libvirtd was restarted before it had finished handling
the 'detach device' request.
Killing such a VM outright seems rather drastic.
Could libvirtd instead finish the leftover cleanup work for the VM after it starts up? A rough sketch of
the idea follows.
(IMHO, we cannot guarantee that libvirtd is never restarted while it is in the middle of detaching a device.)
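To make the idea concrete, below is an isolated sketch of the reconciliation we have in mind. It is not a patch against libvirt; the device aliases are stub data modeled on the log above. The point is that a device which was recorded before the restart but is no longer reported by 'qom-list' has already finished hot-unplug on the QEMU side, so libvirtd could finish the leftover cleanup for it (and at worst log a warning) instead of letting the failure propagate up so that qemuProcessStop() kills the guest.

/* Standalone sketch of the reconciliation idea (not libvirt code): devices that
 * were known before the restart but are no longer reported by QEMU's qom-list
 * are treated as "unplug already finished, do the leftover cleanup", instead of
 * letting a removal failure abort the reconnect and kill the guest. */
#include <stdio.h>
#include <string.h>

static int has_string(const char **list, const char *name)
{
    for (; *list; list++)
        if (strcmp(*list, name) == 0)
            return 1;
    return 0;
}

int main(void)
{
    /* Aliases recorded in the status XML before libvirtd was restarted. */
    const char *before_restart[] = { "net0", "net1", "net5", "video0", NULL };
    /* Aliases QEMU reports via qom-list after libvirtd reconnects
     * (net5 finished hot-unplug while libvirtd was down). */
    const char *reported_by_qemu[] = { "net0", "net1", "video0", NULL };

    for (const char **tmp = before_restart; *tmp; tmp++) {
        if (!has_string(reported_by_qemu, *tmp)) {
            /* Proposed behaviour: finish the leftover cleanup (drop the device
             * from the definition, release its host resources) and warn,
             * rather than treating any failure here as fatal. */
            printf("device '%s' vanished while libvirtd was down: "
                   "finish cleanup, do not kill the VM\n", *tmp);
        }
    }
    return 0;
}

In qemuProcessUpdateDevices() this would roughly mean not treating a failure of qemuDomainRemoveDevice() for an already-gone device as fatal, but that is only our suggestion and we would welcome other approaches.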


Thanks.
zhanghailiang






