Add option to skip cpu feature and model verification in libvirt and rely on qemu 'enforce' option

Wed Nov 23 08:10:21 UTC 2022

On 22/11/22 6:18 pm, Jiri Denemark wrote:
> On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
>> On 18/11/22 5:00 pm, Jiri Denemark wrote:
>>> On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
>>>> On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
>>>>
>>>>>>     * Qemu already provide an option 'enforce' to validate if features
>>>>>>     with which vm is started is exactly same as one provided and nothing
>>>>>>     is silently dropped.
>>>>> Right, but it's not enough. In addition to removed features libvirt also
>>>>> checks for unexpectedly added features. And you really need to do both.
>>>>> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU
>>>>> says everything is fine, the guest might see more than what you asked.
>>>>> For example, if a feature is enabled only if a host supports it you may
>>>>> or may not get it without any complains from QEMU. But if you get it you
>>>>> really need to explicitly ask for it during migration, otherwise the
>>>>> feature can just silently disappear. Of course, this would be a really
>>>>> bad behavior from QEMU, but that does not mean it can't happen (I think
>>>>> SVM is a bit problematic in this way) and the whole point of libvirt's
>>>>> checks is to prevent this kind of issues.
>>>> Hi Jiri, I'm not following this very well. I think you're saying that qemu has
>>>> had bugs previously where features get silently enabled,
>>> Personally, I haven't actually witnessed any bug in this area (as far as
>>> I can remember, which is not that far :-)), but got some reports of at
>>> least one, even though without any proof.
>>>
>>> Specifically, SVM is quite strange as it is included in all AMD CPU
>>> models in QEMU and yet if you try to start it on a host without nesting
>>> enabled "enforce" does not complain. I saw the feature is enabled with
>>> older machine types, but I was told the magic behind this feature looks
>>> like not only machine type but even host configuration itself is
>>> involved. Anyway, although I saw reports of that the feature could be
>>> enabled without an explicit request even with new machine types I still
>>> haven't seen any proof of this happening. So I hope it just does not
>>> happen and users are only afraid of this possibility.
>>>
>>>> and it's libvirt's job/role to paper over those issues? Do you have
>>>> some specific cases of this?
>>> This is not about papering. It's actually the opposite, that is about
>>> detecting if something like this happens and reporting it as a failure
>>> rather than papering it and hoping everything goes well. So I think
>>> doing this is a good idea even though I don't think we actually saw any
>>> issue of this kind.
>>
>> Hi Jiri, I see now with libvirt master, with check=='full' we verify
>> both silently dropped as well as added features. But as you already
>> stated Qemu silently adding feature is a Qemu bug and libvirt just
>> reports that bug, so it should be very unlikely, i agree that is not a
>> good reasoning :). Our requirement is that we want to use CPU Models
>> and features which are defined in Qemu but not in libvirt for e.g if
>> we want to use Icelake-Server-V4 directly or newly added vmx feature,
>> libvirt does not allow. Currently we take help of qemu to do
>> validations but for cpu feature verfication and model definations we
>> still use libvirt defined definations which prevent us to use anything
>> which is not defined in libvirt. I see there are already efforts going
>> on to get all model and feature defination from qemu itself, but not
>> sure how much time it will take. Till that happens we thought safest
>> option is to have an option to remove all validations from libvirt and
>> rely on qemu 'enforce' for migration safetly. I understand
>> qemu-enforce does not check for silently added features, but that case
>> we can assume is very unlikely and Qemu should fix otherwise VMs will
>> not poweron anyway with check=='full'. Basically we want it as an
>> modification of check='none' but also skipping things like
>> virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if
>> silently adding features is that big concern we can have a check in
>> Qemu itself? I understand current qemu-enforce is not as migration
>> safe as check=='full' but probably suitable for our use case for time
>> being?
> I understand why you want this, but on the other hand adding just a
> plain pass-through for CPU model and features is quite dangerous as it
> can be used to set any -cpu option even if it's not a feature. And
> especially the parts that are configured in other areas of domain XML
> (such as hypervisor features). So I think instead of adding a new mode
> for <cpu> we should go another way. For things that are not yet
> supported by libvirt we support various elements in qemu namespace,
> e.g., setting QEMU command line options, environment variables, or even
> overriding options libvirt would normally set on the command line. So I
> guess we could have a special <qemu:cpu> element which would be used
> when a user wants full control over the -cpu option without any libvirt
> intervention.
>
> Jirka
>

Hi Jiri, Thanks, We discussed internally, yes something like that works for us. We wanted to dicusss how it will look like roughly before sending patch. Does something like below works.
<qemu:cpu> will be a sub-property within <cpu> and it will contains details only about cpu-model, cpu-features and check everything else can be as it is. So for CPU feature and model either user can define those in <cpu> itself as in example 1 or in <qemu:cpu> as in example 2. Also either example 1 or example 2 are valid, if user mixes up both, i mean define cpu feature or models in both <qemu:cpu> and <cpu> it will be a parsing error to avoid conflicts.
In feature policy we as of now need only require/disable for <qemu:cpu> case and no need of options like match, mode if <qemu:cpu> is defined. Please let us know if something like that works. It may vary a bit while working on patch, so probaly it will be more clear when patch is out, I just wanted to put our requirements roughly before starting working on patch :).

Example 1:

<cpu mode='custom' match='exact' check='partial'>
   <model fallback='forbid'>Icelake-Server</model>
   <vendor>Intel</vendor>
   <topology sockets='240' dies='1' cores='1' threads='1'/>
   <cache level='3' mode='emulate'/>
   <feature policy='require' name='smap'/>
   <feature policy='require' name='avx'/>
   <feature policy='require' name='xsaveopt'/>
   <feature policy='disable' name='tsc_adjust'/>
   <numa>
    <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/>
   </numa>
  </cpu>

Example 2:

<cpu>
   <vendor>Intel</vendor>
   <topology sockets='240' dies='1' cores='1' threads='1'/>
   <cache level='3' mode='emulate'/>
   <qemu:cpu check='enfoce'>
   <model>Icelake-Server-V4</model>
   <feature policy='require' name='smap'/>
   <feature policy='require' name='avx'/>
   <feature policy='require' name='xsaveopt'/>
   <feature policy='disable' name='tsc_adjust'/>
   <qemu:cpu />
   <numa>
    <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/>
   </numa>
  </cpu>

Thanks
Manish Mishra