[libvirt] an AB deadlock or libvirtd crash problem in virsh console and virsh destroy

Michal Privoznik mprivozn at redhat.com
Thu Aug 11 09:02:35 UTC 2016


On 04.08.2016 14:37, weifuqiang wrote:

> The reason of this problem is that fdstream abort event or close event occured at the same time, libvirtd doesn't deal with the synchronousness well enough.
> the flows about fdStream is bellow
> 1、qemuDomainDefineXMLFlags -> virDomainObjListAdd  -> qemuDomainObjPrivateAlloc -> virChrdevAlloc -> virHashCreate
> 2、qemuDomainOpenConsole -> virChrdevOpen -> virHashAddEntry(devs->hash, path, st)
> 3、virDomainObjDispose -> privateDataFreeFunc (qemuDomainObjPrivateFree) - > virChrdevFree(*dev locked*) -> virChrdevFreeClearCallbacks - >  virFDStreamSetInternalCloseCb(*fdst locked*)
> 4、virFDStreamCloseInt (*fdst locked*) -> icbFreeOpaque(virChrdevFDStreamCloseCb (*dev locked*)) -> virHashRemoveEntry

Thank you for your very detailed description of the problem.

> 
> The AB lock problem is obviouse: in step 3, it locks chardev before fdst, and in step 4, it's the opsite way.
> The reason of libvirtd crash is that: in virFDStreamCloseInt function we set fdst to NULL, while in virFDStreamSetInternalCloseCb we use fdst->lock, note that fdst has already been freed.
> another crash problem occurs because that when virChrdevFree was earlier finished, dev->hash freed and all date of hash is freed, but fdstream event flow use fdStream after hash free. Ahha~~, libvirtd coredump.
> 
> All of those problem is because  clear vm flow and fdStream flow concurs synchronously.
> 
> then I fix this problem by modify virChrdevFree()
> 
> void virChrdevFree(virChrdevsPtr devs)
> {
>     if (!devs || !devs->hash)
>         return;
> 
>     for (;;) {
>         virMutexLock(&devs->lock);
>         if (0 == virHashSize(devs->hash)) {
>             virMutexUnlock(&devs->lock);
>             break;
>         }
>         virMutexUnlock(&devs->lock);
>         usleep(10 * 1000);
>     }
> 
>     virMutexLock(&devs->lock);
>     virHashFree(devs->hash);
>     virMutexUnlock(&devs->lock);
>     virMutexDestroy(&devs->lock);
>     VIR_FREE(devs);

Usually, a race fix that contains usleep() is not really a fix.

> }
> If the chardev is removed by fdStream close or fdStream abort when vm destroy or shutdown, the modification works well. But I'm not sure is that all chardev would be removed when we clear vm,  if not, it would always sleep here.
> 
> Another solution is as follows:
> 
> virMutexLock(vm);
> virChrdevFree();
> virMutexUnlock(vm);
> 
> virMutexLock(vm);
> virFDStreamCloseInt();
> virMutexUnlock(vm);

I like this one more.

However, what I'm thinking is:
1) looks like when qemuDomainObjPrivateFree() is called, it's a lost
game thereafter. I mean, qemuDomainObjPrivateFree() callsirChrdevFree()
which destroys the virChrdevs::mutex. Any later attempts to lock it will
result in crash (or something similarly nasty). Therefore I think in
virChrdevFree() we should unregister the stream close callbacks (icbCb
and icbFreeOpaque). We are doing the cleanup anyway.


2) I'm wondering whether it is necessary to have the close callbacks
(icbCb and icbFreeOpaque) guarded with stream mutex locked. Looks to me
like we could unlock the stream while the callback are running and then
lock it back again.

Michal




More information about the libvir-list mailing list