[libvirt-users] 100% CPU when using nested virtualization

Digimer lists at alteeve.ca
Fri Mar 11 13:22:46 UTC 2016


On 11/03/16 04:11 AM, Kashyap Chamarthy wrote:
> On Thu, Mar 10, 2016 at 10:29:08PM -0500, Digimer wrote:
>> Hi all,
>>
>>   I got a new laptop recently and what worked before no longer works
>> (Fedora 23 on the laptops in both cases)...
>>
>>   I'm trying to get nested virtualization to work because I use the VMs
>> on the laptop to simulate an HA cluster that itself hosts VMs. I don't
>> care much at all about the performance of the nested VM, it's just there
>> so that I can work on the cluster's code.
>>
>>
>>   When I try to provision a VM inside a VM (the host VM is CentOS/RHEL
>> 6.7), the CPU load spikes to such a high degree that my ssh session
>> times out after a while. The VM appears in libvirtd (as viewed by
>> virt-manager on another machine), but the VM itself never starts.
> 
> Just to clearly restate your environment (on the problematic newer
> laptop):
> 
>     - Physical host (L0) == Fedora 23
>     - Guest hypervisor (L1) == CentOS 6.7
>     - Nested guest (L2) == ?  (I'd assume CentOS 6.7; correct me if I'm
>       wrong)
>     
> Hmm, then it's relatively old libvirt/QEMU/Kernel versions on the host.
> It's one of the downsides with Nested Virt -- the explosion of test
> matrix involving different Kernel combinations.  FWIW, I had the most
> productive experience when I run newest stable Kernel on the physical
> host; even better experience if all levels are running it too.

The nested guest was going to be CentOS 6, but it never booted.

After sending this email, I tested a fresh VM (host) install, rather
than using a VM I migrated from my previous laptop, and it worked. So I
undefined and recreated the cluster nodes and then, magically, nested
VMs worked...

Below is the diff of the definition file I migrated from the old laptop
(an-a04n01) and the new one that works (an-a03n01). Ignore the name
change... I had previously dumped the broken an-a03n01's XML to create a
new VM.

>>   In one case, the VM host remained somewhat functional and killing
>> kvm/qemu/libvirtd didn't reduce the CPU load.
>>
>>   The main difference between the setups is that the older laptop had a
>> Sandy bridge(? Thinkpad W530) and the new laptop is a Broadwell
>> (Thinkpad P70).
> 
> [A side note on Broadwell CPUs, you might've noticed by now: Intel
> released a microcode update to remove TSX; if you're using the model
> without the TSX, your guest hypervisor (L1) should be using  the CPU
> model Haswell-noTSX.]

I tried setting the CPU type to 'SandyBridge' without success, and the
(now working) 'Broadwell' is ok. Good to know about the TSX things though.
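
For anyone hitting something similar, two quick things to look at are
whether the 'vmx' flag is visible inside the guest hypervisor (L1) at
all, and which CPU model the domain XML actually requests. Below is a
rough sketch only; the function names and the optional file arguments
are made up for illustration and are not part of any libvirt tooling:

```shell
# Succeeds if the vmx flag (Intel VT-x) is listed in a cpuinfo-style file.
# $1: file to read; defaults to /proc/cpuinfo (hypothetical helper).
has_vmx() {
    grep -E -q '^flags.*[[:space:]]vmx([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Print the first <model>...</model> value found in a libvirt domain XML
# file. Self-closing tags such as <model type='e1000'/> are not matched.
domxml_cpu_model() {
    sed -n -E 's/.*<model[^>]*>([^<]+)<\/model>.*/\1/p' "$1" | head -n 1
}
```

The idea would be to run has_vmx inside the L1 guest, and to run
domxml_cpu_model on L0 against a file saved from 'virsh dumpxml'.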

>>   I've tried loading vhost_net without much luck.
> 
> What is preventing you from loading `vhost_net`?  Assuming VHOST_NET is
> compiled in your Kernel, `sudo modprobe vhost_net` fails for you?  If
> so, in what way?
> 
> On Fedora 23, it is compiled in by default:
> 
>     $ grep CONFIG_VHOST_NET /boot/config-4.3.5-300.fc23.x86_64 
>     CONFIG_VHOST_NET=m

Sorry, I wasn't clear; I meant that I loaded it and it didn't help with
the CPU load problem.
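
The two host-side prerequisites that come up in this thread (the
'nested' parameter of kvm_intel and the vhost_net module) can be
checked in one go. This is a minimal sketch under a few assumptions:
the helper names are hypothetical, the second argument exists only so
the functions can be pointed at test files, and the nested parameter
is assumed to read either 'Y' or '1' when enabled, depending on the
kernel version:

```shell
# Succeeds if nested virtualization is enabled for a KVM module.
# $1: module name (kvm_intel or kvm_amd)
# $2: sysfs module root, defaults to /sys/module
# Returns 2 if the parameter file cannot be read.
nested_enabled() {
    param="${2:-/sys/module}/$1/parameters/nested"
    [ -r "$param" ] || return 2
    case "$(cat "$param")" in
        Y|1) return 0 ;;
        *)   return 1 ;;
    esac
}

# Succeeds if a module is listed in a /proc/modules-style file.
# $1: module name; $2: file to read, defaults to /proc/modules
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}
```

For example: nested_enabled kvm_intel && module_loaded vhost_net &&
echo "prerequisites look fine".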

>> I have, of course, enabled nesting on the actual hardware:
>>
>> cat /sys/module/kvm_intel/parameters/nested
>> Y
>>
>>   Any tips on how to debug?
>>
>>   I'm in quite a pickle with this, so any and all help is much
>>   appreciated.
> 
> You might want to try a few things to identify where the problem might
> be:
> 
>   - `mpstat` shows processor details and CPU utilization, probably you
>     might want to run that to get a general view.  There are several
>     values it presents: %guest (guest code); %usr (QEMU device
>     emulation), etc.  Refer to its man page.
> 
>   - `kvm_stat` (provided by 'qemu-kvm-tools' package on Fedora-based
>     systems), a `top`-like tool to show runtime statistics of KVM events.
>     An example of what the results look like[1].
> 
>   - Or use 'perf' to record nested virtualization related KVM
>     events:
> 
>         $ perf record -a -e kvm:kvm_exit -e kvm:kvm_entry \
>             -e kvm:kvm_nested_vmexit -e kvm:kvm_nested_vmrun
>     
>     Or the 'perf kvm' utility.  See here[2] for more details.
> 
> [1]
> https://kashyapc.fedorapeople.org/virt/kvm_stat-VMCS-Shadowing/kvm_stat-L0-VMCS-Shadowing-enabled.txt
> [2] http://www.linux-kvm.org/page/Perf_events

Thank you kindly for all this!

digimer

ps - diff of broken vs working XML:

====
digimer@pulsar:~$ diff -u an-a04n01.xml an-a03n01.xml
--- an-a04n01.xml	2016-03-09 23:21:53.196100400 -0500
+++ an-a03n01.xml	2016-03-11 08:16:55.753793610 -0500
@@ -1,6 +1,6 @@
-<domain type='kvm' id='22'>
-  <name>an-a04n01</name>
-  <uuid>2e431895-d1cf-425a-8820-9fecb75e1141</uuid>
+<domain type='kvm' id='29'>
+  <name>an-a03n01</name>
+  <uuid>e7e1bf1d-924f-4559-9672-a5c7d01e9699</uuid>
   <memory unit='KiB'>8388608</memory>
   <currentMemory unit='KiB'>8388608</currentMemory>
   <vcpu placement='static'>4</vcpu>
@@ -9,15 +9,14 @@
   </resource>
   <os>
     <type arch='x86_64' machine='pc-i440fx-2.4'>hvm</type>
-    <bootmenu enable='yes'/>
   </os>
   <features>
     <acpi/>
     <apic/>
-    <pae/>
+    <vmport state='off'/>
   </features>
   <cpu mode='custom' match='exact'>
-    <model fallback='allow'>SandyBridge</model>
+    <model fallback='allow'>Broadwell</model>
   </cpu>
   <clock offset='utc'>
     <timer name='rtc' tickpolicy='catchup'/>
@@ -35,7 +34,7 @@
     <emulator>/usr/bin/qemu-kvm</emulator>
     <disk type='file' device='disk'>
       <driver name='qemu' type='qcow2'/>
-      <source file='/var/lib/libvirt/images/an-a04n01.qcow2'/>
+      <source file='/var/lib/libvirt/images/an-a03n01.qcow2'/>
       <backingStore/>
       <target dev='vda' bus='virtio'/>
       <boot order='1'/>
@@ -44,7 +43,6 @@
     </disk>
     <disk type='file' device='cdrom'>
       <driver name='qemu' type='raw'/>
-      <source file='/data0/VMs/files/rhel-server-6.7-x86_64-dvd.iso'/>
       <backingStore/>
       <target dev='hda' bus='ide'/>
       <readonly/>
@@ -53,110 +51,116 @@
     </disk>
     <controller type='usb' index='0' model='ich9-ehci1'>
       <alias name='usb'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x7'/>
     </controller>
     <controller type='usb' index='0' model='ich9-uhci1'>
       <alias name='usb'/>
       <master startport='0'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0' multifunction='on'/>
     </controller>
     <controller type='usb' index='0' model='ich9-uhci2'>
       <alias name='usb'/>
       <master startport='2'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x1'/>
     </controller>
     <controller type='usb' index='0' model='ich9-uhci3'>
       <alias name='usb'/>
       <master startport='4'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x2'/>
     </controller>
     <controller type='pci' index='0' model='pci-root'>
       <alias name='pci.0'/>
     </controller>
-    <controller type='virtio-serial' index='0'>
-      <alias name='virtio-serial0'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
-    </controller>
     <controller type='ide' index='0'>
       <alias name='ide'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
     </controller>
+    <controller type='virtio-serial' index='0'>
+      <alias name='virtio-serial0'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
+    </controller>
     <interface type='network'>
-      <mac address='52:54:00:ef:b3:db'/>
-      <source network='bcn_bridge1' bridge='bcn_bridge1'/>
-      <target dev='vnet4'/>
+      <mac address='52:54:00:85:7e:66'/>
+      <source network='ifn_bridge1' bridge='ifn_bridge1'/>
+      <target dev='vnet8'/>
       <model type='e1000'/>
       <link state='up'/>
-      <boot order='2'/>
       <alias name='net0'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
     </interface>
     <interface type='network'>
-      <mac address='52:54:00:b4:a3:57'/>
-      <source network='bcn_bridge1' bridge='bcn_bridge1'/>
-      <target dev='vnet5'/>
+      <mac address='52:54:00:7b:99:7d'/>
+      <source network='ifn_bridge1' bridge='ifn_bridge1'/>
+      <target dev='vnet9'/>
       <model type='e1000'/>
       <link state='up'/>
       <alias name='net1'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
     </interface>
     <interface type='network'>
-      <mac address='52:54:00:0d:c6:9f'/>
+      <mac address='52:54:00:f6:0f:f7'/>
       <source network='sn_bridge1' bridge='sn_bridge1'/>
-      <target dev='vnet6'/>
+      <target dev='vnet10'/>
       <model type='e1000'/>
       <link state='up'/>
       <alias name='net2'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
     </interface>
     <interface type='network'>
-      <mac address='52:54:00:fc:15:2d'/>
+      <mac address='52:54:00:a4:c4:aa'/>
       <source network='sn_bridge1' bridge='sn_bridge1'/>
-      <target dev='vnet7'/>
+      <target dev='vnet11'/>
       <model type='e1000'/>
       <link state='up'/>
       <alias name='net3'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
     </interface>
     <interface type='network'>
-      <mac address='52:54:00:b7:9b:b4'/>
-      <source network='ifn_bridge1' bridge='ifn_bridge1'/>
-      <target dev='vnet8'/>
+      <mac address='52:54:00:c0:5a:f2'/>
+      <source network='bcn_bridge1' bridge='bcn_bridge1'/>
+      <target dev='vnet12'/>
       <model type='e1000'/>
       <link state='up'/>
+      <boot order='2'/>
       <alias name='net4'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
     </interface>
     <interface type='network'>
-      <mac address='52:54:00:d3:a4:63'/>
-      <source network='ifn_bridge1' bridge='ifn_bridge1'/>
-      <target dev='vnet9'/>
+      <mac address='52:54:00:4a:e4:4f'/>
+      <source network='bcn_bridge1' bridge='bcn_bridge1'/>
+      <target dev='vnet13'/>
       <model type='e1000'/>
       <link state='up'/>
       <alias name='net5'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/>
     </interface>
     <serial type='pty'>
-      <source path='/dev/pts/15'/>
+      <source path='/dev/pts/6'/>
       <target port='0'/>
       <alias name='serial0'/>
     </serial>
-    <console type='pty' tty='/dev/pts/15'>
-      <source path='/dev/pts/15'/>
+    <console type='pty' tty='/dev/pts/6'>
+      <source path='/dev/pts/6'/>
       <target type='serial' port='0'/>
       <alias name='serial0'/>
     </console>
-    <channel type='spicevmc'>
-      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
+    <channel type='unix'>
+      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/an-a03n01.org.qemu.guest_agent.0'/>
+      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
       <alias name='channel0'/>
       <address type='virtio-serial' controller='0' bus='0' port='1'/>
     </channel>
+    <channel type='spicevmc'>
+      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
+      <alias name='channel1'/>
+      <address type='virtio-serial' controller='0' bus='0' port='2'/>
+    </channel>
     <input type='tablet' bus='usb'>
       <alias name='input0'/>
     </input>
     <input type='mouse' bus='ps2'/>
     <input type='keyboard' bus='ps2'/>
-    <graphics type='spice' port='5901' autoport='yes' listen='127.0.0.1'>
+    <graphics type='spice' port='5902' autoport='yes' listen='127.0.0.1'>
       <listen type='address' address='127.0.0.1'/>
     </graphics>
     <sound model='ich6'>
@@ -175,13 +179,14 @@
       <alias name='redir1'/>
     </redirdev>
     <memballoon model='virtio'>
+      <stats period='5'/>
       <alias name='balloon0'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
     </memballoon>
   </devices>
   <seclabel type='dynamic' model='selinux' relabel='yes'>
-    <label>system_u:system_r:svirt_t:s0:c37,c557</label>
-    <imagelabel>system_u:object_r:svirt_image_t:s0:c37,c557</imagelabel>
+    <label>system_u:system_r:svirt_t:s0:c805,c1011</label>
+    <imagelabel>system_u:object_r:svirt_image_t:s0:c805,c1011</imagelabel>
   </seclabel>
 </domain>
====
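
Much of the diff above is per-instance noise (domain id, UUID, MAC
addresses, vnet and pts device names, SELinux labels). One way to make
the meaningful changes stand out is to filter those lines before
diffing. The sketch below is only an illustration: the function names
are hypothetical, and the filter is line-based, so it assumes the XML
is laid out one element per line as 'virsh dumpxml' prints it:

```shell
# Print a domain XML file with per-instance lines removed.
normalize_domxml() {
    grep -v -E "<(uuid|name|label|imagelabel)>|<(mac|alias) |<target dev='vnet|/dev/pts/" "$1" \
        | sed -E "s/(<domain[^>]*) id='[0-9]+'/\1/"
}

# Unified diff of two domain XML files after normalization.
# Exit status is that of diff: 0 if equal, 1 if different.
domxml_diff() {
    tmpdir=$(mktemp -d) || return 2
    normalize_domxml "$1" > "$tmpdir/a.xml"
    normalize_domxml "$2" > "$tmpdir/b.xml"
    diff -u "$tmpdir/a.xml" "$tmpdir/b.xml"
    status=$?
    rm -rf "$tmpdir"
    return $status
}
```

Usage would be 'domxml_diff an-a04n01.xml an-a03n01.xml'. Dumping the
definitions with 'virsh dumpxml --inactive' in the first place also
leaves out most of the runtime-only values.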

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



