[rhelv6-list] KVM issues post RHEL6-1->6.2 update
Ben
bda20 at cam.ac.uk
Thu Dec 8 09:31:39 UTC 2011
Just thought I'd share my experiences of updating a KVM host and guests this
morning. I'll acknowledge up front that I didn't do things in the right
order so the mistakes were mine.
Start: RHEL6.1 KVM host, two RHEL6.1 guests using .img files (LVM partitions
inside). Fully up to date as of just before the RHEL6.2 errata release.
I did "yum clean all ; yum update" on both the host and the guests at the
same time (yeah, I know). In my defence, the same procedure worked without
issues yesterday on a seemingly identical setup.
While the host was completing the cleanup phase of the update, this appeared
in /var/log/messages:
Dec 8 07:14:47 frazil libvirtd: 07:14:47.926: 14778: warning : qemudDispatchSignalEvent:403 : Shutting down on signal 15
Dec 8 07:14:49 frazil yum[1235]: Updated: libvirt-0.9.4-23.el6_2.1.x86_64
and further down
Dec 8 07:15:00 frazil kernel: br1: port 2(vnet1) entering disabled state
Dec 8 07:15:00 frazil kernel: device vnet1 left promiscuous mode
Dec 8 07:15:00 frazil kernel: br1: port 2(vnet1) entering disabled state
Dec 8 07:15:02 frazil ntpd[2194]: Deleting interface #23 vnet1, fe80::fc54:ff:fe01:6b3b#123, interface stats: received=0, sent=0, dropped=0, active_time=7241352 secs
Dec 8 07:15:05 frazil kernel: br0: port 2(vnet0) entering disabled state
Dec 8 07:15:05 frazil kernel: device vnet0 left promiscuous mode
Dec 8 07:15:05 frazil kernel: br0: port 2(vnet0) entering disabled state
Dec 8 07:15:07 frazil ntpd[2194]: Deleting interface #25 vnet0, fe80::fc54:ff:fe49:fae6#123, interface stats: received=0, sent=0, dropped=0, active_time=7238050 secs
At this point I lost connection to the guests. According to the SSH
sessions I had open to them, both had apparently finished the cleanup phase
of the yum update (going by the right-hand X/Y counter) but hadn't returned
a prompt yet, so they were evidently still busy.
I guess the restart of the libvirtd service dropped the guests, although the
same lines appear in the messages file of the other server, on which the
guests didn't get killed.
Given I was rebooting the host anyway I didn't bother to bring the guests
back up again and rebooted the host (yeah, I know). On reboot neither of
the guests autostarted, so I logged in to the host and tried to start them
with "virsh start <domain>". Both complained that
error: internal error unable to reserve PCI address 0:0:2.0
and didn't start. Checking the .xml files for both guests I noted that
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
was listed for the 'disk' device. I also noticed that the following lines
were missing
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='5901' autoport='no'/>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
whereas they were in place for the other KVM host and guests, which had
updated successfully. I added in the lines, changed the 'disk' PCI address
to a different slot and, after restarting libvirtd, tried booting the guests
again.
Still no joy. Still the same error. In the end I commented out the
"address type='pci'" line for 'video' and attempted to boot again. This
time I got failures booting the newly installed kernel at the point at which
the root LVM mount was attempted. It recommended I look at the "root=" part
of the boot line, but didn't give me suggestions as to what to put there.
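The error above suggests two devices were competing for the same PCI
address, so a quick check for clashing addresses in a domain XML might look
like the sketch below. It assumes each <address .../> element sits on a
single line with its attributes in the same order, as in the fragments
quoted above; the function name dup_pci_addresses is made up for
illustration. It only sees addresses written in the file, not any that
libvirt assigns on its own.

```shell
# Print each PCI <address .../> element that occurs more than once in a
# libvirt domain XML file (hypothetical helper, not part of libvirt).
# Assumption: every <address> element is on one line, attributes in the
# same order throughout the file.
dup_pci_addresses() {
    # -o prints only the matching element; uniq -d keeps repeated lines only
    grep -o "<address type='pci'[^>]*>" "$1" | sort | uniq -d
}
```

Usage would be along the lines of
"dup_pci_addresses /etc/libvirt/qemu/<domain>.xml"; no output means no
explicitly duplicated address was found.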
At this point I tried mounting the guests' disk images to check whether the
kernel update had failed to complete and left grub.conf in a mess:
# losetup /dev/loop0 foo.img
# kpartx -av /dev/loop0
# mount /dev/mapper/loop0p1 /mnt
...
# umount /mnt
# kpartx -dv /dev/loop0
# losetup -d /dev/loop0
Once inside the image I looked at the grub.conf files and couldn't see any
issues. I unmounted the image and tried booting into an older kernel, and
the guests booted successfully. "yum update" indicated an incomplete
transaction, so I ran "yum-complete-transaction" and then "yum update
kernel" and rebooted both guests successfully into the new kernel. All now
seems well. Phew.
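The losetup/kpartx commands above only reach the first partition of the
image. Since the guests keep their root filesystem on LVM inside the .img,
getting at it from the host would need the guest's volume group activated
as well. A rough sketch follows, assuming the guest VG and LV are called
vg_guest and lv_root (hypothetical names; "vgs" and "lvs" show the real
ones) and that the guest VG name doesn't clash with one on the host. The
run wrapper and DRY_RUN variable are illustrative helpers that print the
commands instead of executing them.

```shell
# run: print the command when DRY_RUN=1, otherwise execute it (needs root).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Map the image's partitions, activate the guest VG and mount one LV.
# Arguments: image file, VG name, LV name, mount point.
mount_guest_lv() {
    run losetup /dev/loop0 "$1"
    run kpartx -av /dev/loop0      # creates /dev/mapper/loop0pN
    run vgscan                     # let LVM notice the PV inside the image
    run vgchange -ay "$2"          # activate the guest's volume group
    run mount "/dev/$2/$3" "$4"
}

# Undo everything in reverse order. Arguments: VG name, mount point.
umount_guest_lv() {
    run umount "$2"
    run vgchange -an "$1"          # deactivate before removing the mappings
    run kpartx -dv /dev/loop0
    run losetup -d /dev/loop0
}
```

Setting DRY_RUN=1 first and calling
"mount_guest_lv foo.img vg_guest lv_root /mnt" prints the sequence without
touching anything, which is a cheap way to check the names before running
it for real.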
My questions are:
1) Is it a bad idea to patch the host's libvirtd while guests are running?
2) Should libvirtd have killed the guests like that?
3) With this update to KVM/qemu/libvirtd, are the "address type='pci'" lines
now unnecessary and removable from /etc/libvirt/qemu/<domain>.xml files, as
PCI IDs are now dynamically assigned?
Ben
--
Unix Support, MISD, University of Cambridge, England
Plugger of wire, typer of keyboard, imparter of Clue
Life Is Short. It's All Good.