[libvirt] PATCH: Fix infinite loop when QEMU quits at startup

Daniel P. Berrange berrange at redhat.com
Fri Jan 30 21:41:57 UTC 2009


On Fri, Jan 30, 2009 at 07:00:07PM +0100, Jim Meyering wrote:
> "Daniel P. Berrange" <berrange at redhat.com> wrote:
> > The recent refactoring of the QEMU startup process now reads the monitor
> > TTY from the logfile. Unfortunately in this refactoring we lost the check
> > for the 'ret == 0' scenario in the read() return value. So if QEMU quits
> > at startup, eg  due to missing disk image, we loop forever on read() == 0
> > because we hit end-of-file and QEMU has quit.
> >
> > This patch adds back handling for this scenario, and takes care to
> > propagate the contents of the log to the user as an error message
> >
> >  # start demo
> > libvir: QEMU error : internal error unable to start guest: qemu: could
> >   not open disk image /home/berrange/Fedora-9-i686-Live.iso
> > error: Failed to start domain demo
> >
> > In addition, there were a couple of other bugs
> >
> >
> >  - a memory leak where we set the 'monitorpath' variable, even
> >    though we'd just set it moments before.
> >
> >  - a missing check for whether the driver VNC password was present
> >    when initializing passwords at VM startupo
> >
> >  - missing initialization of the monitor_watch field, and missing
> >    checking for whether the watch was set before removing it.
> >
> >  - a gratuitous LOG_INFO when shutting down any VM, which could
> >    just be LOG_DEBUG.
> 
> FYI, I reproduced the infloop with the following:
> 
> cat <<\EOF > d.xml
> <domain type='qemu'>
>   <name>D</name>
>   <uuid>c7a5fdbd-cdaf-9455-926a-d65c16db1809</uuid>
>   <memory>219200</memory>
>   <currentMemory>219200</currentMemory>
>   <vcpu>2</vcpu>
>   <os>
>     <type arch='i686' machine='pc'>hvm</type>
>     <boot dev='cdrom'/>
>   </os>
>   <devices>
>     <emulator>/usr/bin/qemu-system-x86_64</emulator>
>     <disk type='file' device='cdrom'>
>       <source file='no-such-file'/>
>       <target dev='hdc'/>
>       <readonly/>
>     </disk>
>     <disk type='file' device='disk'>
>       <source file='no-such-file'/>
>       <target dev='hda'/>
>     </disk>
>     <graphics type='vnc' port='-1'/>
>   </devices>
> </domain>
> EOF
> 
> Before, this use of virsh would hang:
> 
>     qemud/libvirtd &
>     src/virsh -c qemu:///session "define d.xml; start D"
> 
> Now, it fails with a diagnostic, as you'd expect.


FWIW, any scenario where the XML points to a disk image that doesn't
exist should hit the codepath I did. That said it wasn't 100% reliable
but I put that down to the LXC corruption in valgrind i posted.

> Somehow, while testing this, I got numerous segfaults
> from libvirtd (due to dereferencing NULL doms->objs[i]->def,
> but those should never be NULL), yet when I went to set up a
> reproducer, it stopped happening altogether.

It shouldn't be possible to allocate a virDomainObjPtr instance
without having a valid virDomainDefPtr instance. That said, it
could be the case that the doms->objs[i] access is doing an array
out of bounds access, due to inadequate thread locking, or a bogus
re-allocation

This patch is applied to CVS.

Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|




More information about the libvir-list mailing list