[libvirt] Easy reproducer for multiple races and segfaults in libvirtd

Richard W.M. Jones rjones at redhat.com
Fri Nov 16 15:36:39 UTC 2012


On Fri, Nov 16, 2012 at 03:31:43PM +0000, Daniel P. Berrange wrote:
> On Fri, Nov 16, 2012 at 02:48:03PM +0000, Richard W.M. Jones wrote:
> > On Fri, Nov 16, 2012 at 02:40:31PM +0000, Daniel P. Berrange wrote:
> > > On Fri, Nov 16, 2012 at 02:16:04PM +0000, Richard W.M. Jones wrote:
> > > > 
> > > > You need to read the instructions at the top, and download the
> > > > following appliance too:
> > > > 
> > > > http://libguestfs.org/download/binaries/appliance/appliance-1.18.9.tar.xz
> > > > 
> > > > So far I've filed the following bugs:
> > > > 
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=875741
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=877110
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=877312
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=877429
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=877430
> > > 
> > > Thanks for the reproducer program, that should make life much easier.
> > > 
> > > Just to confirm, you're seeing these problems on both 1.0.0 and
> > > current GIT master ?
> > 
> > Actually I'm testing libvirt-0.10.2.1-2.fc18.x86_64 & libvirt from
> > git, and seeing roughly the same set of problems with both.  Didn't
> > try 1.0.0 at all.
> > 
> > To use libvirt from git, I'm doing:
> > 
> >   killall libvirtd lt-libvirtd
> >   ~/d/libvirt/run ./test-parallel 
> > 
> > Plus I should note a few things about my environment:
> > 
> >  - Fedora 18
> > 
> >  - 16 GB of RAM (if you don't have that, reduce NR_THREADS in the test)
> > 
> >  - baremetal with KVM on a very fast Intel Sandybridge
> >    (I doubt this is reproducible in a VM)
> > 
> >  - I've configured core_pattern and ulimit to capture coredumps in /tmp
> 
> I've run the test on several machines, and finally found one which
> would reproduce the "Operation is not valid" bug. I don't see any
> of the other BZs you list above occurring.
> 
> At least i can now investigate what's gone wrong with 877430

Of the two machines I'm using, 877430 is by far the most frequent
failure on the slower machine.  Segfaults in libvirtd also happen on
the slower machine, but much less often, and because of a
configuration error I haven't managed to capture a core dump yet.
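
For anyone trying to catch these core dumps, the core_pattern and
ulimit setup mentioned above can be sketched roughly as follows.  This
is a generic sketch and not the exact configuration on either machine:
the /tmp/core.%e.%p naming and the check_core_setup helper are
assumptions made for illustration.

```shell
# Sketch: report whether processes started from this shell can dump core.
# core_pattern is system-wide, but the core size limit is per process, so
# libvirtd only inherits it when started from a shell with the limit raised.
check_core_setup() {
    # Hypothetical helper; the optional argument overrides the file to read.
    pattern_file=${1:-/proc/sys/kernel/core_pattern}
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "core size limit is 0: run 'ulimit -c unlimited' first"
        return 1
    fi
    if [ -r "$pattern_file" ]; then
        echo "core_pattern: $(cat "$pattern_file")"
    fi
    echo "core size limit: $limit"
}

# As root, to write cores into /tmp (the file naming here is an assumption):
#   echo '/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern
#   ulimit -c unlimited
```

Run check_core_setup in the same shell that launches the test.  A
libvirtd started by the init system does not inherit that shell's
limit, so its core size limit has to be raised separately.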

875741 happens most frequently on the faster machine.

The difference might be caused by the relative speed of the machines,
or by some difference in the installed software.

In any case, it takes at least 10 minutes on the faster machine (and
usually longer) to get an error.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora



