[libvirt] [PATCH v2 1/3] daemon: keep daemon until all hypervisors drivers are cleaned up

John Ferlan jferlan at redhat.com
Thu Oct 27 14:53:58 UTC 2016



On 10/27/2016 10:04 AM, Nikolay Shirokovskiy wrote:
> 
> 
> On 27.10.2016 15:59, John Ferlan wrote:
>>
>>
>> On 10/04/2016 10:27 AM, Nikolay Shirokovskiy wrote:
>>> This fix SEGV with the backtrace [1]
>>>
>>> This happens due to race on simultaneous finishing of libvirtd and
>>> qemu process. We need to keep daemon object until all hypervisor
>>> drivers are cleaned up. The other option is to take reference to
>>> daemon in every hypervisor driver but this will take more work
>>> and we really don't need to. Drivers cleanup procedure is synchronous
>>> thus let's just remove last reference to daemon after drivers cleanup.
>>>
>>> [1] backtrace (shortened a bit)
>>>
>>> 1  0x00007fd3a791b2e6 in virCondWait (c=<optimized out>, m=<optimized out>) at util/virthread.c:154
>>> 2  0x00007fd3a791bcb0 in virThreadPoolFree (pool=0x7fd38024ee00) at util/virthreadpool.c:266
>>> 3  0x00007fd38edaa00e in qemuStateCleanup () at qemu/qemu_driver.c:1116
>>> 4  0x00007fd3a79abfeb in virStateCleanup () at libvirt.c:808
>>> 5  0x00007fd3a85f2c9e in main (argc=<optimized out>, argv=<optimized out>) at libvirtd.c:1660
>>>
>>> Thread 1 (Thread 0x7fd38722d700 (LWP 32256)):
>>> 0  0x00007fd3a7900910 in virClassIsDerivedFrom (klass=0xdfd36058d4853, parent=0x7fd3a8f394d0) at util/virobject.c:169
>>> 1  0x00007fd3a7900c4e in virObjectIsClass (anyobj=anyobj at entry=0x7fd3a8f2f850, klass=<optimized out>) at util/virobject.c:365
>>> 2  0x00007fd3a7900c74 in virObjectLock (anyobj=0x7fd3a8f2f850) at util/virobject.c:317
>>> 3  0x00007fd3a7a24d5d in virNetDaemonRemoveShutdownInhibition (dmn=0x7fd3a8f2f850) at rpc/virnetdaemon.c:547
>>> 4  0x00007fd38ed722cf in qemuProcessStop (driver=driver at entry=0x7fd380103810, vm=vm at entry=0x7fd38025b6d0, reason=reason at entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob at entry=QEMU_ASYNC_JOB_NONE,
>>>     flags=flags at entry=0) at qemu/qemu_process.c:5786
>>> 5  0x00007fd38edd9428 in processMonitorEOFEvent (vm=0x7fd38025b6d0, driver=0x7fd380103810) at qemu/qemu_driver.c:4588
>>> 6  qemuProcessEventHandler (data=<optimized out>, opaque=0x7fd380103810) at qemu/qemu_driver.c:4632
>>> 7  0x00007fd3a791bb55 in virThreadPoolWorker (opaque=opaque at entry=0x7fd3a8f1e4c0) at util/virthreadpool.c:145
>>> ---
>>>  daemon/libvirtd.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>
>> I'd like to adjust the commit message just a bit...  In particular
>> removing the "The other option..." sentence.
> 
> ok
> 
>>
>> So the "problem" discovered is that virStateInitialize requires/uses the
>> "dmn" reference as the argument to the daemonInhibitCallback and by
>> dereferencing the object too early and then calling the virStateCleanup
>> which calls the daemonInhibitCallback using the dmn we end up with the SEGV.
> 
> yep
> 
>>
>>> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
>>> index 95c1b1c..eefdefc 100644
>>> --- a/daemon/libvirtd.c
>>> +++ b/daemon/libvirtd.c
>>> @@ -1628,7 +1628,6 @@ int main(int argc, char **argv) {
>>>      virObjectUnref(qemuProgram);
>>>      virObjectUnref(adminProgram);
>>>      virNetDaemonClose(dmn);
>>> -    virObjectUnref(dmn);
>>>      virObjectUnref(srv);
>>>      virObjectUnref(srvAdm);
>>>      virNetlinkShutdown();
>>> @@ -1658,6 +1657,9 @@ int main(int argc, char **argv) {
>>>          driversInitialized = false;
>>>          virStateCleanup();
>>>      }
>>
>> Another option would have been to move the above hunk up... Since
>> setting the daemonStateInit is one of the last things done before
>> running the event loop, it stands to reason shutting down should be done
>> in the opposite order thus causing those InhibitCallbacks to be run
>> perhaps after virNetlinkEventServiceStopAll and before virNetDaemonClose.
> 
> By the above hunk you mean virStateCleanup() et all? If yes - then I don't
> think so. The thing is that start is tricky.
> 
> 1. We start the network part, but keep it in disabled state.
> In this state it creates sockets and listens on them but not accept
> connections.
> 
> 2. Initialize all the hyper drivers and then enable the network part
>  to let it accept connections.
> 
> Thus effectively the order is reverse to what is seems - 
> first hyper drivers and then network servers.
> 
> Actually there is a direct reasoning why hyper drivers shutted down after
> network part - otherwise requests can be delivered to hyper drivers in
> invalid state.
> 

Yeah - it's an intricate and delicate balance, especially that inhibit
callback stuff. Since you only have 1 domain and 1 libvirtd

I'll keep the order as is...

Tks -

John

> Nikolay
> 
>>
>> Any thoughts to that?
>>
>> This I assume works, but do we run into "other" issues because we're
>> running those callbacks after virNetDaemonClose and virNetlinkShutdown?
>>
>> Of course the "fear" of moving is that it causes "other" timing issues
>> with previous adjustments here, in particular:
>>
>> commit id '5c756e580' and 'a602e90bc1' which introduced and adjusted
>> driversInitialized
>>
>> I'm OK with leaving things as is, but perhaps someone else will see this
>> an comment as well.
>>
>> John
>>
>>
>>> +    /* unref daemon only here as hypervisor drivers can
>>> +       call shutdown inhibition functions on cleanup */
>>> +    virObjectUnref(dmn);
>>>  
>>>      return ret;
>>>  }
>>>




More information about the libvir-list mailing list