Caching qemu capabilities and KubeVirt

Roman Mohr rmohr at google.com
Wed Oct 12 10:37:46 UTC 2022


Hi Daniel,

just wanted to ask if you had a chance to check the shared file.

Let me know if you need more info.

Best regards,
Roman

On Tue, Sep 13, 2022 at 5:31 PM Roman Mohr <rmohr at google.com> wrote:

> On Thu, Sep 8, 2022 at 4:32 PM Daniel P. Berrangé <berrange at redhat.com>
> wrote:
>
>> On Thu, Sep 08, 2022 at 04:22:17PM +0200, Roman Mohr wrote:
>> > On Thu, Sep 8, 2022 at 4:04 PM Daniel P. Berrangé <berrange at redhat.com>
>> > wrote:
>> >
>> > > On Thu, Sep 08, 2022 at 02:23:31PM +0100, Daniel P. Berrangé wrote:
>> > > > On Thu, Sep 08, 2022 at 03:10:09PM +0200, Roman Mohr wrote:
>> > > > > On Thu, Sep 8, 2022 at 2:56 PM Daniel P. Berrangé <
>> berrange at redhat.com
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > On Thu, Sep 08, 2022 at 02:24:00PM +0200, Roman Mohr wrote:
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > >
>> > > > > > > I have a question regarding capability caching in the context of
>> > > > > > > KubeVirt. Since KubeVirt starts one libvirt instance per VM, libvirt
>> > > > > > > has to re-discover the qemu capabilities on every VM start, which
>> > > > > > > leads to a 1-2s+ delay in startup.
>> > > > > > >
>> > > > > > > We already discover the features in a dedicated KubeVirt pod on
>> > > > > > > each node. Therefore I tried to copy the capabilities over to see
>> > > > > > > if that would work.
>> > > > > > >
>> > > > > > > It looks like in general it could work, but libvirt seems to
>> > > > > > > detect a mismatch in the exposed KVM CPUID in every pod. Therefore
>> > > > > > > it invalidates the cache. The recreated capability cache looks
>> > > > > > > exactly like the original one though ...
>> > > > > > >
>> > > > > > > The check responsible for the invalidation is this:
>> > > > > > >
>> > > > > > > ```
>> > > > > > > Outdated capabilities for '%s': host cpuid changed
>> > > > > > > ```
>> > > > > > >
>> > > > > > > So the KVM_GET_SUPPORTED_CPUID call seems to return
>> > > > > > > slightly different values in different containers.
>> > > > > > >
>> > > > > > > After trying out the attached golang scripts in different
>> > > > > > > containers, I could indeed see differences.
>> > > > > > >
>> > > > > > > However, I cannot really judge what the differences in these KVM
>> > > > > > > function registers mean, and I am curious whether someone else
>> > > > > > > knows. The files are attached too (as JSON for easy diffing).
>> > > > > >
>> > > > > > Can you confirm whether the two attached data files were captured
>> > > > > > by containers running on the same physical host, or could each
>> > > > > > container have run on a different host?
>> > > > > >
>> > > > >
>> > > > > They are coming from the same host; that is the most surprising bit
>> > > > > for me. I am also very sure that this is the case, because I only had
>> > > > > one k8s node from where I took these.
>> > > > > The containers however differ (obviously) in namespaces and in the
>> > > > > privilege level (less obvious). The handler dump is from a fully
>> > > > > privileged container.
>> > > >
>> > > > The privilege level sounds like something that might be impactful,
>> > > > so I'll investigate that.  I'd be pretty surprised for namespaces
>> > > > to have any impact though.
>> > >
>> > > The privilege level is a red herring. Peter reminded me that we have
>> > > to filter out some parts of CPUID because the APIC IDs vary depending
>> > > on what host CPU the task executes on.
>> > >
>> > >
>> > >
>> https://gitlab.com/libvirt/libvirt/-/blob/master/src/util/virhostcpu.c#L1346
>> > >
>> > > In the 2 JSON files you provided, the differences I see should already
>> > > be matched by
>> > >
>> > >         /* filter out local apic id */
>> > >         if (entry->function == 0x01 && entry->index == 0x00)
>> > >             entry->ebx &= 0x00ffffff;
>> > >         if (entry->function == 0x0b)
>> > >             entry->edx &= 0xffffff00;
>> > >
>> > > so those differences ought not to be causing the cache to be
>> > > invalidated.
>> > >
>> >
>> > Hm, maybe I misinterpreted the logs then. The snippet I looked at was
>> > this:
>> >
>> >
>> > ```
>> > {"component":"virt-launcher","level":"info","msg":"/dev/kvm has changed (1661786802 vs 0)","pos":"virQEMUCapsKVMUsable:4850","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.936000Z"}
>> > {"component":"virt-launcher","level":"info","msg":"a=0x7f8138153ba0, b=0x7f818001c480","pos":"virCPUDataIsIdentical:1178","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.939000Z"}
>> > {"component":"virt-launcher","level":"info","msg":"Outdated capabilities for '/usr/bin/qemu-system-x86_64': host cpuid changed","pos":"virQEMUCapsIsValid:4993","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.939000Z"}
>> > {"component":"virt-launcher","level":"info","msg":"Outdated cached capabilities '/var/cache/libvirt/qemu/capabilities/926803a9278e445ec919c2b6cbd8c1c449c75b26dcb1686b774314180376c725.xml' for '/usr/bin/qemu-system-x86_64'","pos":"virFileCacheLoad:163","subcomponent":"libvirt","thread":"25","timestamp":"2022-08-29T15:26:42.939000Z"}
>> > ```
>>
>> Can you capture the
>>
>> /var/cache/libvirt/qemu/capabilities/926803a9278e445ec919c2b6cbd8c1c449c75b26dcb1686b774314180376c725.xml
>>
>> from the virt-handler and virt-launcher pods. It contains a <cpuid>
>> block that will show us the differences libvirt recorded, /after/
>> libvirt has done its filtering. This will show if there is more
>> we need to filter.
>>
>
> Done. I only attached the one from the handler since it is 100% identical
> to the launcher's.
>
> Let me know if you need more information.
>
> Thanks and best regards,
> Roman
>
>
>>
>> > I had the impression from the code that the `/dev/kvm` change (because the
>> > containers are not created at the same time) does not invalidate it either.
>> >
>> > I added the whole debug log, maybe I missed something obvious.
>> >
>> > Does it make a difference if the cache is created via `virsh
>> > domcapabilities` and `virsh capabilities` or via defining the first
>> > domain?
>>
>> They'll all end up at the same caching code so should not make any
>> difference.
>>
>> >
>> > Best regards,
>> > Roman
>> >
>> >
>> > >
>> > > With regards,
>> > > Daniel
>> > > --
>> > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
>> > > |: https://libvirt.org -o- https://fstop138.berrange.com :|
>> > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>> > >
>> > >
>>
>>
>>
>> With regards,
>> Daniel
>> --
>> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
>> |: https://libvirt.org -o- https://fstop138.berrange.com :|
>> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>>
>>