[libvirt] Entering freeze for libvirt-4.6.0

Michal Privoznik mprivozn at redhat.com
Tue Jul 31 10:42:44 UTC 2018


On 07/31/2018 12:25 PM, Daniel P. Berrangé wrote:
> On Tue, Jul 31, 2018 at 12:18:16PM +0200, Ján Tomko wrote:
>> On Tue, Jul 31, 2018 at 10:26:41AM +0200, Michal Privoznik wrote:
>>> On 07/30/2018 05:20 PM, Andrea Bolognani wrote:
>>>> On Sat, 2018-07-28 at 21:56 +0800, Daniel Veillard wrote:
>>>>>
>>>
>>>>
>>>> Unfortunately I've spotted an issue during my testing of rc1 today:
>>>> with the libvirt_guest NSS module enabled, Evolution would crash a
>>>> few seconds after being started. Here's the stack trace:
>>>>
>>>>   #0  0x00007fffe7b69ba5 in json_object_iter_next () at /lib64/libjson-glib-1.0.so.0
>>>>   #1  0x00007fffad8e757b in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2
>>>>   #2  0x00007fffad8e75d8 in virJSONValueFromJansson () at /lib64/libnss_libvirt_guest.so.2
>>>>   #3  0x00007fffad8e8994 in virJSONValueFromString () at /lib64/libnss_libvirt_guest.so.2
>>>>   #4  0x00007fffad8ecb5a in virMacMapNew () at /lib64/libnss_libvirt_guest.so.2
>>>>   #5  0x00007fffad8cc140 in findLease () at /lib64/libnss_libvirt_guest.so.2
>>>>   #6  0x00007fffad8ccb1c in _nss_libvirt_guest_gethostbyname4_r () at /lib64/libnss_libvirt_guest.so.2
>>>>   #7  0x00007fffeb2599d2 in gaih_inet.constprop () at /lib64/libc.so.6
>>>>   #8  0x00007fffeb25aab4 in getaddrinfo () at /lib64/libc.so.6
>>>>   #9  0x00007ffff1d41a04 in do_lookup_by_name () at /lib64/libgio-2.0.so.0
>>>>   #10 0x00007ffff1d3e937 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0
>>>>   #11 0x00007ffff5c39933 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
>>>>   #12 0x00007ffff5c38f2a in g_thread_proxy () at /lib64/libglib-2.0.so.0
>>>>   #13 0x00007ffff6314594 in start_thread () at /lib64/libpthread.so.0
>>>>   #14 0x00007fffeb2700df in clone () at /lib64/libc.so.6
>>>>
>>>> I've talked about it with a few colleagues and we believe the issue
>>>> to be caused by jansson and json-glib both exporting a symbol called
>>>> json_object_iter_next: Evolution itself (indirectly?) links against
>>
>> For the libraries installed on my Fedora 28, objdump shows
>> json_object_iter_next as the only conflicting symbol, maybe we can
>> get away with using a different iterator as a quick fix.
>>
>>>> the latter library, so when the libvirt_guest NSS module is loaded
>>>> and attempts to process JSON using the former, it picks up the wrong
>>>> implementation, leading to a crash. gnome-boxes also crashes with
>>>> the same stack trace.
>>>
>>> Worse. querying gentoo portage I've found some important packages
>>> requiring json-glib:
>>>
>>> x11-libs/gtk
>>> gnome-base/gnome-shell
>>>
>>> So once users of these app update to latest libvirt they will see the
>>> crashes.
>>>
>>>>
>>>> It seems like a similar issue could affect any application linking
>>>> both to libvirt and json-glib, regardless of whether or not the NSS
>>>> plugin has been enabled, which is of course pretty bad.
>>>
>>> Yes, any application can crash.
>>>
>>>>
>>>> Unfortunately, I don't have any bright ideas on how to solve this,
>>>> so anyone who might: please step forward! We're just a few days
>>>> away from the next release, and if we can't figure out a way around
>>>> this soon I'm afraid the only reasonable course of action would be
>>>> to (temporarily) revert the switch from yajl to jansson.
>>>>
>>>
>>> Well, what if we linked with jansson statically? I'm not sure if it is
>>> possible (and have no idea how to achieve that), but what if our dynamic
>>> libraries we produce already contained jansson and thus linker would not
>>> even try to resolve json_* symbols.
>>
>>
>> For the client library, we can just compile out JSON - it should not be
>> needed for anything. And we can generate the data for libvirt_nss in
>> some simpler format.
> 
> There's no such concept of 'client library', our libvirt.so library
> is used in both client and server sides, providing the shared code
> to both.
> 
> Changing the NSS format is doable, but not before release.

Actually it isn't. A small history window - JSON was used when Nehal was
implementing support for reporting domain IP addresses. He created a
small binary (leasehelper) that dnsmasq runs every time a lease is given
to a domain. The leasehelper then stores assigned IP address into a file
so that virDomainInterfaceAddresses can report it. Back then JSON was
chosen because we have a set of good APIs to work with the format from
C. The NSS plugin just made use of the stored data for NSS (9 releases
later).

Fast forward to today. So if we change NSS plugin (which is not the one
to blame) then we have to change the leasehelper too. But more
importantly, we would need a script (run at %post possibly) that reworks
JSON to the new format so that the data is preserved. It's not only NSS
plugin in play here.

> 
> In fact bearing this problem in mind, I tend to think we should perhaps
> make sure that the NSS library doesn't link to anything except glibc.
> It can be loaded into any process running on the host, so the less we
> load from it the safer we'll be.

Sure. That's why src/libvirt_nss.la is built - disabling all other
features. So far we've been quite successful with this, the NSS links
with jansson (apart from standard libs like libc).

But we are losing the big picture here. Switching format that NSS plugin
uses is not going to help, because we have apps (like gnome-boxes) that
links both GTK and libvirt. And as I stated earlier, GTK drags in
json-glib and libvirt.so drags in jansson. And I guess changing qemu
monitor format from JSON to something else is not going to happen ;-)

Michal




More information about the libvir-list mailing list