[libvirt] Found mem leak in libvirtd, need help to debug
mprivozn at redhat.com
Tue Feb 9 15:59:01 UTC 2016
On 09.02.2016 16:34, Piotr Rybicki wrote:
> On 2016-02-09 at 16:12, Michal Privoznik wrote:
>> On 09.02.2016 13:36, Piotr Rybicki wrote:
>>> Hi guys.
>>> On 2015-11-20 at 11:29, Piotr Rybicki wrote:
>>>>> I've seen some of these already. The bug is actually not in libvirt
>>>>> but in gluster's libgfapi library, so any change in libvirt won't help.
>>>>> This was tracked in gluster as:
>>>>> I suggest you update the gluster library to resolve this issue.
>>> I've tested this issue further.
>>> I have to report that the memory leak still exists in the latest versions:
>>> gluster: 3.7.6
>>> libvirt 1.3.1
>>> The leak occurs even when just starting a domain (virsh start DOMAIN) that
>>> accesses its drive via libgfapi (although the leak is much smaller than
>>> with gluster 3.5.X).
>>> When the drive is accessed as a file (gluster fuse mount), there is no
>>> memory leak when starting the domain.
>>> my drive definition (libgfapi):
>>> <disk type='network' device='disk'>
>>> <driver name='qemu' type='raw' cache='writethrough'
>>> <source protocol='gluster' name='pool/disk-sys.img'>
>>> <host name='X.X.X.X' transport='rdma'/>
>>> <blockio logical_block_size='512' physical_block_size='32768'/>
>>> <target dev='vda' bus='virtio'/>
>>> <address type='pci' domain='0x0000' bus='0x00' slot='0x04'
>>> valgrind details (libgfapi):
>>> # valgrind --leak-check=full --show-reachable=yes
>>> --child-silent-after-fork=yes libvirtd --listen 2> libvirt-gfapi.log
>>> ==6532== Memcheck, a memory error detector
>>> ==6532== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
>>> ==6532== Using Valgrind-3.11.0 and LibVEX; rerun with -h for
>>> copyright info
>>> ==6532== Command: libvirtd --listen
>>> ==6532== Warning: noted but unhandled ioctl 0x89a2 with no
>>> size/direction hints.
>>> ==6532== This could cause spurious value errors to appear.
>>> ==6532== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing
>>> a proper wrapper.
>>> 2016-02-09 12:20:26.732+0000: 6535: info : libvirt version: 1.3.1
>>> 2016-02-09 12:20:26.732+0000: 6535: info : hostname: adm-office
>>> 2016-02-09 12:20:26.732+0000: 6535: warning : qemuDomainObjTaint:2223 :
>>> Domain id=1 name='gentoo-intel'
>>> uuid=f9fd934b-cbda-af4e-cc98-0dd2c8dd6c2c is tainted: host-cpu
>>> 2016-02-09 12:21:29.924+0000: 6532: error : qemuMonitorIO:689 : internal
>>> error: End of file from monitor
>>> ==6532== HEAP SUMMARY:
>>> ==6532== in use at exit: 3,726,573 bytes in 15,324 blocks
>>> ==6532== total heap usage: 238,573 allocs, 223,249 frees,
>>> 1,020,776,752 bytes allocated
>>> ==6532== LEAK SUMMARY:
>>> ==6532== definitely lost: 19,760 bytes in 97 blocks
>>> ==6532== indirectly lost: 21,098 bytes in 122 blocks
>>> ==6532== possibly lost: 2,698,764 bytes in 67 blocks
>>> ==6532== still reachable: 986,951 bytes in 15,038 blocks
>>> ==6532== suppressed: 0 bytes in 0 blocks
>>> ==6532== For counts of detected and suppressed errors, rerun with: -v
>>> ==6532== ERROR SUMMARY: 96 errors from 96 contexts (suppressed: 0
>>> from 0)
>>> full log:
>> I still think these are libgfapi leaks; all the definitely lost bytes
>> come from the library.
>> ==6532== 3,064 (96 direct, 2,968 indirect) bytes in 1 blocks are
>> definitely lost in loss record 1,106 of 1,142
>> ==6532== at 0x4C2C0D0: calloc (vg_replace_malloc.c:711)
>> ==6532== by 0x10701279: __gf_calloc (mem-pool.c:117)
>> ==6532== by 0x106CC541: xlator_dynload (xlator.c:259)
>> ==6532== by 0xFC4E947: create_master (glfs.c:202)
>> ==6532== by 0xFC4E947: glfs_init_common (glfs.c:863)
>> ==6532== by 0xFC4EB50: glfs_init@@GFAPI_3.4.0 (glfs.c:916)
>> ==6532== by 0xF7E4A33: virStorageFileBackendGlusterInit
>> ==6532== by 0xF7D56DE: virStorageFileInitAs (storage_driver.c:2788)
>> ==6532== by 0xF7D5E39: virStorageFileGetMetadataRecurse
>> ==6532== by 0xF7D6295: virStorageFileGetMetadata
>> ==6532== by 0x1126A2B0: qemuDomainDetermineDiskChain
>> ==6532== by 0x11269AE6: qemuDomainCheckDiskPresence
>> ==6532== by 0x11292055: qemuProcessLaunch (qemu_process.c:4708)
>> Care to report it to them?
> Of course, I will.
> But are you sure there is no need to call glfs_fini() after the qemu
> process is launched? Are all of those resources still needed in libvirt?
> I understand that libvirt needs to check the presence (and other
> properties) of the storage, but after qemu is launched?
We call glfs_fini(). And that's the problem. It does not free everything
that glfs_init() allocated. Hence the leaks. Actually every time we call
glfs_init() we print a debug message from
virStorageFileBackendGlusterInit() which wraps it. And then another
debug message from virStorageFileBackendGlusterDeinit() when we call
glfs_fini(). So if you set up debug logs, you can check whether our init
and finish calls match.
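
A rough sketch of that check, in case it helps: the script below counts the
debug lines coming from the two wrapper functions named above and compares
the totals. It assumes the debug log is written to a file (for example via
log_level and log_outputs in libvirtd.conf; treat the exact settings as an
assumption for your build), that log lines follow the
"timestamp: tid: level : function:line : message" layout seen in the
qemuDomainObjTaint line earlier in this thread, and that each wrapper prints
one debug message per call. The helper names count_gluster_calls and
calls_match are made up for this example.

```python
import re
import sys
from collections import Counter

# Wrapper function names as given in the message above.
INIT_FN = "virStorageFileBackendGlusterInit"
DEINIT_FN = "virStorageFileBackendGlusterDeinit"

# Assumed layout: "... : function:line : message". Only the
# "function:line :" part is matched, so the message text is irrelevant.
_CALL_RE = re.compile(r"\b(%s|%s):\d+ :" % (INIT_FN, DEINIT_FN))


def count_gluster_calls(lines):
    """Count log lines emitted by the gluster init and deinit wrappers."""
    counts = Counter({INIT_FN: 0, DEINIT_FN: 0})
    for line in lines:
        match = _CALL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def calls_match(lines):
    """Return True when init and deinit messages occur equally often."""
    counts = count_gluster_calls(lines)
    return counts[INIT_FN] == counts[DEINIT_FN]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical path): python check_glfs.py /path/to/libvirtd.log
    with open(sys.argv[1], errors="replace") as log:
        totals = count_gluster_calls(log)
    print("init messages:  ", totals[INIT_FN])
    print("deinit messages:", totals[DEINIT_FN])
    print("balanced" if totals[INIT_FN] == totals[DEINIT_FN] else "MISMATCH")
```

If the counts are equal, libvirt is pairing its calls and whatever valgrind
still reports is left behind by glfs_fini() itself. A mismatch would instead
point at a missing deinit on the libvirt side.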