[libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

Halil Pasic pasic at linux.vnet.ibm.com
Wed Oct 25 23:35:11 UTC 2017

On 10/20/2017 04:54 PM, Christian Borntraeger wrote:
> Starting a guest with
>    <os>
>     <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>   </os>
>   <cpu mode='host-model'/>
> on an IBM z14 results in
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> This is because guarded storage is fenced for compat machines that did not have
> guarded storage support, but libvirt expands the cpu model according to the
> latest available machine.
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
> all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
> Suggested-by: David Hildenbrand <david at redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger at de.ibm.com>

I've tried to review this patch. Unfortunately I don't have access to a
z14, so I could not try the most interesting scenarios out.

The idea of the patch is very clear, but I don't understand the bigger gs
feature context fully.

From what I read in the code, the attempt to enable the gs capability in
the kernel is made regardless of the cpu model. If the attempt was
successful, kvm_s390_get_gs will keep returning true. That would in turn
affect migration, as far as I can see (usages of kvm_s390_get_gs). I could
not figure out how turning gs off via the cpu model (-cpu z14,gs=off)
actually turns off gs support -- at least not the details. I wanted
to give a timely review, so I've limited myself there.
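
To make the observation above concrete, here is a minimal sketch of the
gating as it reads from the quoted diff. It is a standalone model, not
QEMU code: struct machine_flags and all function names are hypothetical,
and it assumes the 2.9 machine supports CPU models (as the commit message
implies). Note that the CPU model's gs=on/off setting is not an input
anywhere.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two machine class flags in the diff. */
struct machine_flags {
    bool cpu_model_allowed;
    bool gs_allowed;        /* the field the patch removes */
};

/* Before the patch: the enable attempt is gated on the machine's gs flag. */
static bool gs_enable_attempted_before(const struct machine_flags *m)
{
    return m->gs_allowed;
}

/* After the patch: the attempt is gated on CPU model support instead. */
static bool gs_enable_attempted_after(const struct machine_flags *m)
{
    return m->cpu_model_allowed;
}

/*
 * Models cap_gs: set only if the attempt was made and the kernel accepted
 * it. kernel_accepts stands in for kvm_vm_enable_cap(...) == 0.
 */
static bool cap_gs_result(bool attempted, bool kernel_accepts)
{
    return attempted && kernel_accepts;
}
```

With a 2.9-style machine (cpu_model_allowed = true, gs_allowed = false) on
a kernel that accepts the capability, the sketch yields cap_gs unset before
the patch and set after it, independent of what the CPU model requests.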

From what I see, this patch does what it advertises, and since I think
it's the right thing to do in the current situation, I'm going to give it an:
Acked-by: Halil Pasic <pasic at linux.vnet.ibm.com>

At the same time, I would prefer the commit message to be reworded. IMHO
this patch is a good stop-gap measure, but essentially it trades an
annoying and obvious bug for a subtle and hopefully painless one.

Let me explain this last statement. For starters, I do share some of the
concerns Boris has voiced. I won't repeat those. The same goes for the
example Christian paraphrased previously, and the fear of an implicit
requirement to support a Cartesian product of the advertised
machine-types and cpu-models (for each qemu binary).

In my eyes, a cpu isn't all that different from the other devices which
get attached to a board (represented by machine-type). So I don't see why
it should be exempt from the usual compatibility requirements tied to
machine-types (for the sake of stability and compatibility). I basically
mean: no new features are added to a device in the context of a given
(fully qualified) machine-type with new QEMU (binary) versions. As
far as I understand, all (other) devices shall respect these requirements.
Or am I wrong? If I'm not, please enlighten me: how is a CPU
fundamentally different from, let's say, a FLIC?

A related point: to implement some features indicated/controlled
via cpu-model features, we need to add capabilities to certain devices.
Now if the devices shall obey the 'no new features for the same
machine-type' rule, but the cpu-model feature shall obey our new
'retroactively introduce/enable for all machines supporting cpu-models'
rule, I think we have a conflict. An example of what I'm talking about
is zPCI, AIS and FLIC. In the case of the AIS and the FLIC, AFAIK the
conflict was resolved so that the AIS feature/code of the FLIC is not
subject to the usual compat-macro mechanism. Another example: the AP
facility is not just about the CPU instructions, but also about a device.
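
The conflict between the two rules can be sketched as follows. This is an
illustration only: the integer version encoding (209 for machine type 2.9,
210 for 2.10) and all function names are hypothetical and do not correspond
to QEMU's compat machinery.

```c
#include <stdbool.h>

/*
 * Device side, 'no new features for the same machine-type': a capability
 * is visible only from the machine version that introduced it.
 */
static bool device_cap_enabled(int machine_version, int device_since)
{
    return machine_version >= device_since;
}

/*
 * CPU side under the new rule: the feature follows the CPU model on every
 * machine type that supports CPU models at all.
 */
static bool cpu_feature_enabled(bool cpu_model_allowed, bool feature_in_model)
{
    return cpu_model_allowed && feature_in_model;
}

/*
 * The rules conflict when the guest is offered the CPU feature while the
 * device capability it depends on is still fenced by the machine type.
 */
static bool rules_conflict(int machine_version, int device_since,
                           bool cpu_model_allowed, bool feature_in_model)
{
    return cpu_feature_enabled(cpu_model_allowed, feature_in_model)
        && !device_cap_enabled(machine_version, device_since);
}
```

In this model a feature whose device part was introduced with 2.10
conflicts on a 2.9 machine as soon as the CPU model contains it, and stops
conflicting either when the machine is 2.10 or when the device part is
exempted from the machine-type fencing.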

The same caveat applies to the last paragraph: I might very well be wrong,
and if I am, please do tell me where the hole in my reasoning is. I will
try to re-check my statements tomorrow (again, I'm trying to deliver today,
along the lines of 'better a small bird today than a big one tomorrow').

And another question. If we adopt this approach of introducing features
for machine-types retroactively, how should the machine-type versions be
understood? My current understanding is that the machine-type
(version) is supposed to limit the observable changes when upgrading
the binary to the bare minimum (basically possibly modified timings -- which
can't be avoided -- and bug-fixes), so that updating the binary is as
safe as possible.

Last but not least, I have to say, I'm neither a domain expert for
cpu-models nor for libvirt and its models. For that reason, I've
personally asked Jason to do a more detailed review on this -- and am
hoping to wiggle out with a weak ack. I do intend to keep on following
the discussion (despite not feeling entitled to make any calls)
and contribute where I can.



> ---
>  hw/s390x/s390-virtio-ccw.c         | 8 --------
>  include/hw/s390x/s390-virtio-ccw.h | 3 ---
>  target/s390x/kvm.c                 | 2 +-
>  3 files changed, 1 insertion(+), 12 deletions(-)
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index fabe4a6..ae5b01a 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -414,7 +414,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>      s390mc->ri_allowed = true;
>      s390mc->cpu_model_allowed = true;
>      s390mc->css_migration_enabled = true;
> -    s390mc->gs_allowed = true;
>      mc->init = ccw_init;
>      mc->reset = s390_machine_reset;
>      mc->hot_add_cpu = s390_hot_add_cpu;
> @@ -495,12 +494,6 @@ bool cpu_model_allowed(void)
>      return get_machine_class()->cpu_model_allowed;
>  }
> -bool gs_allowed(void)
> -{
> -    /* for "none" machine this results in true */
> -    return get_machine_class()->gs_allowed;
> -}
> -
>  static char *machine_get_loadparm(Object *obj, Error **errp)
>  {
>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
> @@ -740,7 +733,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
>  {
>      S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
> -    s390mc->gs_allowed = false;
>      ccw_machine_2_10_class_options(mc);
>      s390mc->css_migration_enabled = false;
> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
> index a9a90c2..ac896e3 100644
> --- a/include/hw/s390x/s390-virtio-ccw.h
> +++ b/include/hw/s390x/s390-virtio-ccw.h
> @@ -40,15 +40,12 @@ typedef struct S390CcwMachineClass {
>      bool ri_allowed;
>      bool cpu_model_allowed;
>      bool css_migration_enabled;
> -    bool gs_allowed;
>  } S390CcwMachineClass;
>  /* runtime-instrumentation allowed by the machine */
>  bool ri_allowed(void);
>  /* cpu model allowed by the machine */
>  bool cpu_model_allowed(void);
> -/* guarded-storage allowed by the machine */
> -bool gs_allowed(void);
>  /**
>   * Returns true if (vmstate based) migration of the channel subsystem
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 4c85ed8..020a7ea 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -363,7 +363,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>              cap_ri = 1;
>          }
>      }
> -    if (gs_allowed()) {
> +    if (cpu_model_allowed()) {
>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>              cap_gs = 1;
>          }
