[libvirt] Revisiting qemu monitor timeout on VM start

Thu Mar 9 15:16:32 UTC 2017

On Thu, Mar 09, 2017 at 09:00:47AM +0100, Michal Privoznik wrote:
> On 03/08/2017 10:19 PM, Jim Fehlig wrote:
> > Hi All,
> > 
> > Encountering a qemu monitor timeout when starting a VM has been discussed here
> > before, e.g.
> > 
> > https://www.redhat.com/archives/libvir-list/2014-January/msg00060.html
> > https://www.redhat.com/archives/libvir-list/2014-January/msg00408.html
> > 
> > Recently I've received reports of the same when starting large memory VMs backed
> > by 1G huge pages. In one of the reports, Matt timed how long it takes to
> > allocate 402GB worth of hugetlbfs pages (these are 1G pages, but the time is
> > similar for 2M):
> > 
> > real 105.47
> > user 0.05
> > sys 105.42
> > 
> > The time is spent entirely in the kernel zeroing pages and, as you can see, it
> > exceeds the 30 second monitor timeout in libvirt. Do folks have any suggestions
> > on how to avoid the timeout?
> > 
> > Obviously one solution is to introduce a knob in qemu.conf to control the
> > timeout, as was proposed in the above threads. Another solution that came to
> > mind is changing qemu to open the monitor earlier, making it available while the
> > kernel is off scrubbing pages. I'm not familiar enough with qemu code to know if
> > such a change is possible, but given the amount of initialization done in main()
> > prior to calling mon_init_func(), my confidence in this idea is low. Perhaps
> > someone more familiar with qemu initialization can comment on that. Thanks in
> > advance for comments on these ideas or alternate proposals!
> 
> As suggested in one of the threads, the ideal solution would be for
> libvirt to create the unix socket and then just pass it to qemu
> during exec(). This way there would be no need for a timeout. On the other
> hand, this approach obviously requires some work on the qemu side too, and
> I'm not sure a) how much, or b) whether anybody is working on it.
> 
> If we introduced a timeout knob now (say in qemu.conf), we would
> be unable to honour it once the approach described above gets implemented.
> 
> Another workaround might be to raise the 30 second limit we currently
> have hard-coded in our sources. However, I'm not sure whether this is
> upstream material or downstream (e.g. if a distro aims to
> support such large guests, they can carry a downstream-only patch that
> increases the timeout to, say, 2 minutes; this might be undesirable
> upstream).
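
To illustrate the socket-passing idea quoted above, here is a minimal
sketch in Python rather than libvirt's C. Everything in it is
hypothetical: the child script is only a stand-in for the emulator, the
function name is made up, and no real QEMU command-line option is
assumed. It shows only why a connect() against a socket the parent has
already bound and put into listening state does not need to poll or time
out: the connection simply waits in the backlog until the child accepts.

```python
import os
import socket
import subprocess
import sys
import tempfile

# Stand-in for the emulator (NOT QEMU): it sleeps to mimic a slow start-up,
# then accepts on the listening socket inherited from the parent.
CHILD = r"""
import socket, sys, time
time.sleep(0.5)                               # pretend to be busy zeroing pages
srv = socket.socket(fileno=int(sys.argv[1]))  # wrap the inherited listening fd
conn, _ = srv.accept()
conn.sendall(b"ready:" + conn.recv(64))
conn.close()
"""


def launch_with_preopened_socket():
    """Create the listening socket first, hand it to the child, then connect."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "monitor.sock")
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        srv.bind(path)
        srv.listen(1)
        # The child inherits the already-listening fd across exec().
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD, str(srv.fileno())],
            pass_fds=[srv.fileno()],
        )
        srv.close()  # the parent's copy of the listening fd is no longer needed

        cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        cli.connect(path)      # queued in the backlog; no retry loop needed
        cli.sendall(b"hello")
        reply = cli.recv(64)   # blocks until the child gets around to answering
        cli.close()
        child.wait()
        return reply
    finally:
        os.unlink(path)
        os.rmdir(tmpdir)
```

In this arrangement the parent never has to guess how long the child
needs before its socket exists, which is the property the quoted
proposal is after. Whether the child ever answers is a separate
question and would still need its own handling.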

I think enough different people have reported this problem that it is
something we need to address upstream.  IIUC, it only occurs when
using -mem-prealloc, and is proportional to the size of RAM. So instead
of raising the timeout for every guest, which would cause increased delay
if QEMU gets stuck on launch, we could increase the timeout only when
needed, and scale it based on RAM. A recent QEMU patch showed similar
problems - a 256 GB guest took 2 minutes to start.

e.g. add 5 seconds per 5 GB of guest RAM, to be quite conservative on
giving enough time.
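
As a rough sketch of that scaling rule (written in Python for brevity,
not actual libvirt code), the calculation could look like the following.
The function and constant names are made up for illustration. It also
assumes that the extra time is added on top of the existing 30 second
base, that partial 5 GB steps are rounded up, and that "GB" means GiB;
none of these details are settled above.

```python
BASE_TIMEOUT_SECS = 30   # the current hard-coded monitor timeout
SECS_PER_STEP = 5        # extra seconds granted per step of guest RAM
GIB_PER_STEP = 5         # size of one step (assumed to be GiB)
GIB = 1024 ** 3


def monitor_timeout_secs(guest_ram_bytes, mem_prealloc):
    """Return a monitor timeout scaled by guest RAM (hypothetical helper)."""
    if not mem_prealloc:
        # Without preallocation the start-up delay does not grow with RAM,
        # so keep the short timeout to notice a stuck QEMU quickly.
        return BASE_TIMEOUT_SECS
    step_bytes = GIB_PER_STEP * GIB
    steps = -(-guest_ram_bytes // step_bytes)  # ceiling division
    return BASE_TIMEOUT_SECS + steps * SECS_PER_STEP


if __name__ == "__main__":
    for gib in (4, 256, 402):
        print(gib, "GiB ->", monitor_timeout_secs(gib * GIB, True), "seconds")
```

Under these assumptions the 402 GB guest from the report would get
435 seconds against a measured 105, and a 256 GB guest would get 290
seconds against the reported 2 minutes, so the rule leaves a wide
margin in both cases.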

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|