[libvirt] blkio cgroup

Fri Feb 18 19:31:49 UTC 2011

On Fri, Feb 18, 2011 at 11:31:37AM -0500, Vivek Goyal wrote:

[..]
> > So I went and tried to throttle I/O of kernel3-8 to 10MB/s instead of
> > weighing I/O. First I rebooted everything so that no old configuration
> > of cgroup was left in place and then setup everything except the 100 and
> > 1000 weight configuration.
> > 
> > quote from blkio.txt:
> > ------------
> > - blkio.throttle.write_bps_device
> >         - Specifies upper limit on WRITE rate to the device. IO rate is
> >           specified in bytes per second. Rules are per deivce. Following is
> >           the format.
> > 
> >   echo "<major>:<minor>  <rate_bytes_per_second>" >
> > /cgrp/blkio.write_bps_device
> > -------------
> > 
> > for vm in kernel1 kernel2 kernel3 kernel4 kernel5 kernel6 kernel7
> > kernel8; do ls -lH /dev/vdisks/$vm; done
> > brw-rw---- 1 root root 254, 23 Feb 18 13:45 /dev/vdisks/kernel1
> > brw-rw---- 1 root root 254, 24 Feb 18 13:45 /dev/vdisks/kernel2
> > brw-rw---- 1 root root 254, 25 Feb 18 13:45 /dev/vdisks/kernel3
> > brw-rw---- 1 root root 254, 26 Feb 18 13:45 /dev/vdisks/kernel4
> > brw-rw---- 1 root root 254, 27 Feb 18 13:45 /dev/vdisks/kernel5
> > brw-rw---- 1 root root 254, 28 Feb 18 13:45 /dev/vdisks/kernel6
> > brw-rw---- 1 root root 254, 29 Feb 18 13:45 /dev/vdisks/kernel7
> > brw-rw---- 1 root root 254, 30 Feb 18 13:45 /dev/vdisks/kernel8
> > 
> > /bin/echo 254:25 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > /bin/echo 254:26 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > /bin/echo 254:27 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > /bin/echo 254:28 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > /bin/echo 254:29 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > /bin/echo 254:30 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > /bin/echo 254:30 10000000 >
> > /mnt/notimportant/blkio.throttle.write_bps_device
> > 
> > Then I ran the previous test again. This resulted in an ever increasing
> > load (last I checked was ~ 300) on the host system. (This is perfectly
> > reproducible).
> > 
> > uptime
> > Fri Feb 18 14:42:17 2011
> > 14:42:17 up 12 min,  9 users,  load average: 286.51, 142.22, 56.71
> 
> Have you run top or something to figure out why load average is shooting
> up. I suspect that because of throttling limit, IO threads have been
> blocked and qemu is forking more IO threads. Can you just run top/ps
> and figure out what's happening.
> 
> Again, is it some kind of linear volume group from which you have carved
> out logical volumes for each virtual machine?
> 
> For throttling to begin with, can we do a simple test first. That is
> run a single virtual machine, put some throttling limit on logical volume
> and try to do READs. Once READs work, lets test WRITES and check why
> does system load go up.

Ok, I was playing a bit with blkio throttling policy and it is working
for me.

Following is my setup.

- Kernel 2.6.38-rc5

- I have some LUNs exported from a storage array. I created a multipath
  device on top of them and then partitioned it. I exported one of the
  partitions to the virtual machine as a virtio disk. So in the guest I
  see virtual disk /dev/vdb, and the corresponding device on the host is
  /dev/dm-2. I am using the cache=none option.

- I created a test cgroup /cgroup/blkio/test1/ and moved the virtual
  machine there.

- Run time dd if=/dev/zero of=/mnt/vdb/testfile bs=1M count=1500 with
  and without any throttling limits.
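
The cgroup step in the setup above could be sketched roughly as follows.
This is an assumption about the mechanics, not the exact commands used
here: it assumes a cgroup v1 blkio hierarchy mounted at /cgroup/blkio, and
move_to_cgroup and its arguments are hypothetical names for illustration.
Writing each thread ID to the tasks file moves every qemu thread, not just
the main one.

```shell
# Create a cgroup directory and move every thread of one process into it.
# $1 = cgroup directory, $2 = PID of the qemu process,
# $3 = proc root (hypothetical optional argument, defaults to /proc)
move_to_cgroup() {
    mkdir -p "$1" || return 1
    for t in "${3:-/proc}/$2"/task/*; do
        # cgroup v1 accepts one thread ID per write to the tasks file
        echo "${t##*/}" > "$1/tasks"
    done
}

# Usage (as root), with the PID filled in by hand:
#   move_to_cgroup /cgroup/blkio/test1 <qemu-pid>
```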

Without throttling limits
-------------------------

# time dd if=/dev/zero of=/mnt/vdb/testfile bs=1M count=1500
1500+0 records in
1500+0 records out
1572864000 bytes (1.6 GB) copied, 6.37301 s, 247 MB/s

real	0m6.446s
user	0m0.003s
sys	0m4.074s

With throttling limit (50MB/s write limit)
-----------------------------------------
# time dd if=/dev/zero of=/mnt/vdb/testfile bs=1M count=1500
1500+0 records in
1500+0 records out
1572864000 bytes (1.6 GB) copied, 30.1553 s, 52.2 MB/s

real	0m30.422s
user	0m0.004s
sys	0m2.102s

So throttling seems to be working for me as expected. Without throttling
I got 247MB/s of bandwidth, and when I put a 50MB/s limit on the virtual
machine I got 52.2MB/s.

I used the following command to limit the virtual machine.

echo "252:2 50000000" > /cgroup/blkio/test1/blkio.throttle.write_bps_device
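
To avoid looking up the major:minor pair by hand, something like the
following sketch may help. It assumes GNU stat (whose %t and %T formats
print the device numbers in hex); dev_numbers and set_write_bps are
hypothetical helper names, not part of any tool.

```shell
# Print "major:minor" in decimal for a device node such as /dev/dm-2.
dev_numbers() {
    printf '%d:%d\n' "0x$(stat -L -c '%t' "$1")" "0x$(stat -L -c '%T' "$1")"
}

# Write a per-device write limit for one cgroup.
# $1 = cgroup directory, $2 = device node, $3 = limit in bytes per second
set_write_bps() {
    echo "$(dev_numbers "$2") $3" > "$1/blkio.throttle.write_bps_device"
}

# Usage (as root), matching the command above:
#   set_write_bps /cgroup/blkio/test1 /dev/dm-2 50000000
```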

I was also running "vmstat 2" while the throttled machine did IO and
noticed that the number of blocked processes went up to around 25-35. I
assume these are all qemu IO threads blocked waiting for throttled IO to
finish. I am not sure whether blocked processes also contribute to the
load average.
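
For anyone wanting to watch the same thing without vmstat, here is a small
sketch. It assumes a Linux /proc and a procps-style ps; blocked_procs and
list_d_state are hypothetical helper names. The first reads the kernel's
procs_blocked counter, the second lists the individual threads in
uninterruptible sleep (state D) so you can see whether they belong to
qemu.

```shell
# Print the number of tasks currently blocked waiting for I/O.
# $1 = stat file to read (defaults to /proc/stat)
blocked_procs() {
    awk '$1 == "procs_blocked" { print $2 }' "${1:-/proc/stat}"
}

# List threads in uninterruptible sleep: state, PID, thread ID, command.
list_d_state() {
    ps -eLo state=,pid=,tid=,comm= | awk '$1 ~ /^D/'
}
```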

While googling a bit, I found a wiki page which says the following.

"Most UNIX systems count only processes in the running (on CPU) or runnable
(waiting for CPU) states. However, Linux also includes processes in
uninterruptible sleep states (usually waiting for disk activity), which
can lead to markedly different results if many processes remain blocked in
I/O due to a busy or stalled I/O system."

If this is true, it explains the high system load in your testing.
Throttling is working and we have around 30-35 IO threads/processes
per qemu instance. You have 8 qemu instances running, so roughly 240-280
processes are blocked waiting for IO to finish, and that would explain
the high load. But isn't that expected, given that we are throttling IO?
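
The estimate is just the instance count times the per-instance thread
range; a throwaway sketch (blocked_estimate is a hypothetical name):

```shell
# $1 = number of qemu instances, $2 and $3 = low and high blocked
# threads per instance
blocked_estimate() {
    echo "$(( $1 * $2 ))-$(( $1 * $3 ))"
}

blocked_estimate 8 30 35   # prints 240-280
```

That range is in the same neighbourhood as the 1-minute load average of
286.51 in your uptime output.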

I also tried direct IO in the virtual machine, and that seems to fork
only 1 IO thread.

# time dd if=/dev/zero of=/mnt/vdb/testfile bs=1M count=1500 oflag=direct
1500+0 records in
1500+0 records out
1572864000 bytes (1.6 GB) copied, 31.4301 s, 50.0 MB/s

real	0m31.664s
user	0m0.003s
sys	0m0.819s

While running this I noticed that the number of blocked processes was 1
the whole time, hence the low load average. Try the oflag=direct option
in your tests.
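
As a quick sanity check on the numbers above, the expected duration of a
throttled write is simply bytes divided by the limit. A minimal sketch
(expected_seconds is a hypothetical helper name):

```shell
# Expected wall-clock time for a transfer at a given rate limit.
# $1 = bytes to write, $2 = limit in bytes per second
expected_seconds() {
    awk -v bytes="$1" -v bps="$2" 'BEGIN { printf "%.1f\n", bytes / bps }'
}

expected_seconds 1572864000 50000000   # prints 31.5
```

That is close to the 31.4 s measured for the direct IO run.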

Thanks
Vivek