[libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility
Ian Jackson
Ian.Jackson at eu.citrix.com
Thu Jan 30 17:12:45 UTC 2014
Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility"):
> Ok, thanks. I'm currently testing on your git branch referenced earlier
> in this thread
>
> git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1
Great. That's the one. My current version is pretty much identical -
some unused variables deleted and comments edited.
> > * You need to fix the timer deregistration arrangements in the
> > libvirt/libxl driver to avoid the crash you identified the other day.
>
> Yes, I'm testing a fix now.
Great.
> > * Something needs to be done about the 20ms slop in the libvirt event
> > loop (as it could cause libxl to lock up). If you can't get rid of
> > it in the libvirt core, then adding 20ms to every requested
> > callback time in the libvirt/libxl driver would work for now.
> >
>
> The commit msg adding the fuzz says
>
> Fix event test timer checks on kernels with HZ=100
>
> On kernels with HZ=100, the resolution of sleeps in poll() is
> quite bad. Doing a precise check on the expiry time vs the
> current time will thus often think the timer has not expired
> even though we're within 10ms of the expected expiry time. This
> then causes another pointless sleep in poll() for <10ms. Timers
> do not need to have such precise expiration, so we treat a timer
> as expired if it is within 20ms of the expected expiry time. This
> also fixes the eventtest.c test suite on kernels with HZ=100
I think this is a bug in the kernel. poll() may sleep longer, but not
shorter, than expected.
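To make the difference concrete, here is a minimal sketch of the two
checks. The helper name timer_expired and its parameters are
hypothetical, not the actual libvirt code; times are in milliseconds.
With a fuzz of 0 it is the precise check; with a fuzz of 20 it behaves
as the commit message describes, and so can report a timer as expired
up to 20ms before the requested time.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from libvirt. All times are in ms.
 *   fuzz_ms == 0   precise check: expired only at or after expires_ms
 *   fuzz_ms == 20  fuzzed check: expired if within 20ms of expires_ms
 */
bool timer_expired(long long now_ms, long long expires_ms, long long fuzz_ms)
{
    assert(fuzz_ms >= 0);
    return expires_ms <= now_ms + fuzz_ms;
}
```

With the fuzzed variant a callback may run early, which is the case a
caller relying on "not before the requested time" has to tolerate.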
> * daemon/event.c: Add 20ms fuzz when checking for timer expiry
>
> I could handle this in the libxl driver as you say, but doing so makes
> me a bit nervous. Potentially locking up libxl makes me nervous too :).
I was going to say that the code in libxl_osevent_occurred_timeout
checked the time against the requested time and would ignore the event
(thinking it was stale) if it was too early.
But now that I have read the code, that is not true. I think it
will work OK (modulo some things happening too soon). So the
upshot is that I still think this is a bug in libvirt but I don't
think it's critical to fix it.
Sorry to cause undue alarm.
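If padding the requested time in the driver is ever wanted anyway, a
minimal sketch might look like the following. Everything here is an
assumption for illustration: padded_timeout_ms and EVENT_LOOP_FUZZ_MS
are hypothetical names, and the sketch assumes the driver converts an
absolute struct timeval deadline into a relative millisecond timeout
for the event loop.

```c
#include <assert.h>
#include <sys/time.h>

/* Assumed to match the 20ms fuzz in the event loop (hypothetical name). */
#define EVENT_LOOP_FUZZ_MS 20

/*
 * Hypothetical helper: convert an absolute deadline into a relative
 * timeout in ms, padded by the fuzz so that an event loop firing up to
 * EVENT_LOOP_FUZZ_MS early still fires at or after the deadline.
 * "now" is rounded down and the deadline rounded up, so the result
 * never undershoots.
 */
long long padded_timeout_ms(struct timeval now, struct timeval deadline)
{
    assert(now.tv_usec >= 0 && now.tv_usec < 1000000);
    assert(deadline.tv_usec >= 0 && deadline.tv_usec < 1000000);

    long long now_ms = (long long)now.tv_sec * 1000 + now.tv_usec / 1000;
    long long deadline_ms = (long long)deadline.tv_sec * 1000
                            + (deadline.tv_usec + 999) / 1000;

    long long rel_ms = deadline_ms - now_ms;
    if (rel_ms < 0)
        rel_ms = 0;             /* deadline already passed */

    return rel_ms + EVENT_LOOP_FUZZ_MS;
}
```

The cost is that every timeout fires up to 20ms later than asked for,
which is harmless if only a lower bound on the firing time matters.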
> Yes. I've been running my tests for about 24 hours now with no problems
> noted. The tests include starting/stopping a persistent VM,
> creating/stopping a transient VM, rebooting a persistent VM,
> saving/restoring a transient VM, and getting info on all of these VMs.
>
> I should probably add saving/restoring a persistent VM to the mix since
> the associated libxl_ctx is never freed. Only when a persistent VM is
> undefined is the libxl_ctx freed.
Right. Great.
Thanks,
Ian.