[libvirt-users] cgroup blkio.weight working, but not for KVM guests

Ben Clay rbclay at ncsu.edu
Wed Oct 17 17:31:03 UTC 2012


I'm running libvirt 0.10.2 and qemu-kvm-1.2.0, both compiled from source, on
CentOS 6.  I've got a working blkio cgroup hierarchy which I'm attaching
guests to using the following XML guest configs:

 

VM1 (foreground):

  <cputune>
    <shares>2048</shares>
  </cputune>
  <blkiotune>
    <weight>1000</weight>
  </blkiotune>

VM2 (background):

  <cputune>
    <shares>2</shares>
  </cputune>
  <blkiotune>
    <weight>100</weight>
  </blkiotune>

I've tested write throughput on the host using cgexec and dd, demonstrating
that libvirt has correctly set up the cgroups:

 

cgexec -g blkio:libvirt/qemu/foreground time dd if=/dev/zero of=trash1.img \
    oflag=direct bs=1M count=4096 &
cgexec -g blkio:libvirt/qemu/background time dd if=/dev/zero of=trash2.img \
    oflag=direct bs=1M count=4096 &

Snapshot from iotop, showing roughly an 8:1 ratio (should be 10:1 given the
weights, but 8:1 is acceptable):

 

Total DISK READ: 0.00 B/s | Total DISK WRITE: 91.52 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 9602 be/4 root        0.00 B/s   10.71 M/s  0.00 % 98.54 % dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096
 9601 be/4 root        0.00 B/s   80.81 M/s  0.00 % 97.76 % dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096
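
For reference, the shares implied by the weights and by the iotop figures
above can be worked out with a small sketch like the one below. It assumes
the two dd writers are the only groups competing for the disk, and the
function names are made up for illustration.

```shell
#!/bin/sh
# Expected blkio share of one cgroup: its weight / sum of all competing weights.
# Assumption: only the listed groups are actively competing for the disk.
# $1 = this group's weight, remaining args = every competing weight (including $1)
expected_share() {
    w=$1; shift
    total=0
    for x in "$@"; do total=$((total + x)); done
    LC_ALL=C awk -v w="$w" -v t="$total" 'BEGIN { printf "%.1f\n", 100 * w / t }'
}

# Observed share: one writer's throughput / total throughput (same units).
observed_share() {
    LC_ALL=C awk -v a="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * a / t }'
}

expected_share 1000 1000 100   # foreground share implied by the weights
observed_share 80.81 91.52     # foreground share seen in the iotop sample
```

With weights of 1000 and 100 the foreground group would ideally get about
90.9% of the bandwidth; the iotop sample gives it about 88.3%, which is
where the "roughly 8:1 instead of 10:1" comes from.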

 

Further, checking the task list inside each cgroup shows the guest's main
PID, plus those of the virtio kernel threads.  It's hard to tell if all the
virtio kernel threads are listed, but all the ones I've hunted down appear
to be there.
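
One way to make that check less manual is to compare the cgroup's tasks
file against the thread list of the qemu process. The sketch below is only
an illustration: the helper name is invented, and the /cgroup/blkio mount
point in the usage comment is an assumption that depends on how the
hierarchy is mounted. It only covers qemu's own threads; kernel threads
show up as separate PIDs and would still need to be looked up by name.

```shell
#!/bin/sh
# Print thread IDs of a process that are NOT listed in a cgroup's tasks file.
# $1 = path to the cgroup "tasks" file
# $2 = path to the process's task directory, e.g. /proc/<pid>/task
missing_from_cgroup() {
    tasks_file=$1
    task_dir=$2
    for d in "$task_dir"/*; do
        [ -d "$d" ] || continue          # skip anything that is not a thread dir
        tid=${d##*/}                      # directory name is the thread ID
        grep -qx "$tid" "$tasks_file" || echo "$tid"
    done
}

# Example (mount point and match pattern are assumptions, adjust to your host):
#   pid=$(pgrep -f 'qemu-kvm -name foreground')
#   missing_from_cgroup /cgroup/blkio/libvirt/qemu/foreground/tasks "/proc/$pid/task"
```

No output means every thread of that process is listed in the cgroup.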

 

However, when running the same dd commands inside the guests, I get roughly
equal performance, nowhere near the ~8:1 relative bandwidth enforcement I
get from the host. (Both were started within 1s of each other; I pressed
Ctrl-C on the background run right after the foreground one finished.)


[ben@foreground ~]$ dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 104.645 s, 41.0 MB/s

[ben@background ~]$ dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096
^C4052+0 records in
4052+0 records out
4248829952 bytes (4.2 GB) copied, 106.318 s, 40.0 MB/s

The Red Hat Resource Management Guide says: "Currently, the Block I/O
subsystem does not work for buffered write operations. It is primarily
targeted at direct I/O, although it works for buffered read operations."

https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Subsystems_and_Tunable_Parameters.html

Based on that, I thought this problem might be due to host-side buffering,
but I have that explicitly disabled in my guest configs:


  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type="file" device="disk">
      <driver name="qemu" type="raw" cache="none"/>
      <source file="/path/to/disk.img"/>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x04" function="0x0"/>
    </disk>

Here is the qemu command line from ps, showing that cache=none is clearly
being passed through from the guest XML config:


root      5110 20.8  4.3 4491352 349312 ?      Sl   11:58   0:38 /usr/bin/qemu-kvm
    -name background -S -M pc-1.2 -enable-kvm -m 2048
    -smp 2,sockets=2,cores=1,threads=1
    -uuid ea632741-c7be-36ab-bd69-da3cbe505b38
    -no-user-config -nodefaults
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/background.monitor,server,nowait
    -mon chardev=charmonitor,id=monitor,mode=control
    -rtc base=utc -no-shutdown
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
    -drive file=/path/to/disk.img,if=none,id=drive-virtio-disk0,format=raw,cache=none
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
    -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=22
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:11:22:33:44:55,bus=pci.0,addr=0x3
    -chardev pty,id=charserial0
    -device isa-serial,chardev=charserial0,id=serial0
    -device usb-tablet,id=input0
    -vnc 127.0.0.1:1 -vga cirrus
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

For fun I tried a few other cache options to try to force a bypass of the
host buffer cache, including writethrough and directsync, but the number of
virtio kernel threads appeared to explode (especially for directsync) and
the throughput dropped sharply: ~50% of "none" for writethrough and ~5%
for directsync.


With cache=none, when I generate write loads inside the VMs, I do see growth
in the host's buffer cache.  Further, if I use non-direct I/O inside the
VMs, and inflate the balloon (forcing the guest's buffer cache to flush), I
don't see a corresponding drop in background throughput.  Is it possible
that the cache="none" directive is not being respected?  
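
One possible way to check this from the host is to look at the open flags
of the file descriptor qemu holds on the disk image, which the kernel
exposes in /proc/<pid>/fdinfo/<fd>. The sketch below is a hedged
illustration, not a definitive test: the helper name is invented, and the
O_DIRECT value of 040000 (octal) is the x86/x86_64 one, so it is an
assumption that the host is one of those architectures.

```shell
#!/bin/sh
# Succeed if the "flags:" line of a /proc/<pid>/fdinfo/<fd> file has O_DIRECT set.
# Assumption: x86/x86_64, where O_DIRECT is 040000 (octal); other arches differ.
# Returns 2 if no flags line is found.
fd_is_direct() {
    flags=$(awk '/^flags:/ { print $2 }' "$1")
    [ -n "$flags" ] || return 2
    # fdinfo prints the flags in octal; the leading 0 keeps the shell reading octal
    [ $(( 0$flags & 040000 )) -ne 0 ]
}

# Example (pid and image path are placeholders taken from the config above):
#   for fd in /proc/$pid/fd/*; do
#       [ "$(readlink "$fd")" = /path/to/disk.img ] || continue
#       fd_is_direct "/proc/$pid/fdinfo/${fd##*/}" && echo "O_DIRECT set on fd ${fd##*/}"
#   done
```

If the image's descriptor does not have the flag set, that would point at
cache="none" not taking effect in this build.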

 

Since cgroups are working for host-side processes, I think my blkio
subsystem is correctly set up (using cfq, group_isolation=1, etc.).  Maybe I
compiled qemu without some needed direct I/O support?  Has anyone seen this
before?
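
For anyone wanting to compare setups, the scheduler side can be checked
with something like the sketch below. The helper name is invented, and the
device name (sda) and the /cgroup/blkio mount point in the usage comments
are assumptions; substitute whatever backs the disk image on your host.

```shell
#!/bin/sh
# Print the active I/O scheduler from a /sys/block/<dev>/queue/scheduler file,
# where the active one is shown in square brackets,
# e.g. "noop anticipatory deadline [cfq]".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Example (device and mount point are assumptions):
#   active_scheduler /sys/block/sda/queue/scheduler
#   cat /sys/block/sda/queue/iosched/group_isolation
#   cat /cgroup/blkio/libvirt/qemu/foreground/blkio.weight
#   cat /cgroup/blkio/libvirt/qemu/background/blkio.weight
```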

 

Ben Clay

rbclay at ncsu.edu

 
