[libvirt] Race condition between qemuDomainCreate and qemuDomainDestroy

Marc Hartmayer mhartmay at linux.vnet.ibm.com
Mon Apr 9 14:59:32 UTC 2018


On Fri, Apr 06, 2018 at 06:27 PM +0200, John Ferlan <jferlan at redhat.com> wrote:
> On 04/03/2018 07:47 AM, Marc Hartmayer wrote:
>> On Tue, Mar 20, 2018 at 11:25 AM +0100, Marc Hartmayer <mhartmay at linux.vnet.ibm.com> wrote:
>>> Hi,
>>>
>>> there is a race condition between 'qemuDomainCreate' and
>>> 'qemuDomainDestroy' causing a NULL pointer segmentation fault when
>>> accessing priv->monConfig. The race condition can be easily reproduced
>>> using gdb.
>>>
>>> (gdb) set non-stop on
>>> # set breakpoint on line 'mon = qemuMonitorOpen(vm, …)'
>>> (gdb) b qemu_process.c:1799
>>> # Actually, this second breakpoint is optional but it’s good to see
>>> where priv->monConfig is set to NULL
>>> # set breakpoint on line priv->monConfig = NULL;
>>> (gdb) b qemu_process.c:6589
>>> (gdb) run
>>> # continue all threads - just for the case we hit a breakpoint already
>>> (gdb) c -a
>>>
>>> Now start a domain (that is using QEMU)
>>>
>>> $ virsh start domain
>>>
>>> The first breakpoint will be hit. Now run in a second shell
>>>
>>> $ virsh destroy domain
>>>
>>> The second breakpoint will be hit. Continue the thread where the second
>>> breakpoint was hit (for this example this is thread 4)
>>>
>>> (gdb) thread apply 4 continue
>>>
>>> Now continue the thread where the first breakpoint was hit.
>>>
>>> => Segmentation fault because of a NULL pointer dereference at
>>>    config->value
>>>
>>> Since I'm not very familiar with that part of the code, I wanted to ask
>>> for your advice.
>>>
>>> Thanks in advance.
>>>
>>> Beste Grüße / Kind regards
>>>    Marc Hartmayer
>>>
>>> IBM Deutschland Research & Development GmbH
>>> Vorsitzende des Aufsichtsrats: Martina Koederitz
>>> Geschäftsführung: Dirk Wittkopp
>>> Sitz der Gesellschaft: Böblingen
>>> Registergericht: Amtsgericht Stuttgart, HRB 243294
>> 
>> Any ideas?
>> 
>
> Seeing as no one else has an exact or authoritative answer...
>
> qemuDomainCreate{XML|WithFlags} (and a few others) will call
> qemuProcessBeginJob which calls qemuDomainObjBeginAsyncJob and
>         qemuDomainObjSetAsyncJobMask which IIUC allows QEMU_JOB_DESTROY
> to be run.
>
> The qemuDomainDestroyFlags calls qemuProcessBeginStopJob which calls
> qemuDomainObjBeginJob (e.g. sync job) using QEMU_JOB_DESTROY, which
> again IIUC is allowed to happen alongside the Async job because of the
> mask setting.
>
> In the code where you've broken during create, the @vm object lock is
> dropped allowing destroy to obtain it. So with the perfect timing and
> storm the window of opportunity does exist that the monConfig could be
> free'd and thus priv->monConfig set to NULL before or during the create
> processing uses it in qemuMonitorOpen.  If during the processing, then
> obviously the config-> would "appear" to be valid, but it may not
> actually be what was passed.
>
> The fix I believe involves using objects for virDomainChrSourceDef
> rather than VIR_ALLOC'd and VIR_FREE'd memory directly. I've put
> together a few patches and will post them shortly. Using the patches I
> don't see a core, but rather the (I believe) expected "error: internal
> error: qemu unexpectedly closed the monitor"
>
> John
>
>
>

Thanks for the fix!

-- 
Beste Grüße / Kind regards
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294





More information about the libvir-list mailing list