REST service for libvirt to simplify SEV(ES) launch measurement

Wed Feb 23 18:38:26 UTC 2022

+cc Tobin, James

On 23/02/2022 19:28, Daniel P. Berrangé wrote:
> Extending management apps using libvirt to support measured launch of
> QEMU guests with SEV/SEV-ES is unreasonably complicated today, both for
> the guest owner and for the cloud management apps. We have APIs for
> exposing info about the SEV host, the SEV guest, guest measurements
> and secret injections. This is a "bags of bits" solution. We expect
> apps to then turn this into a user-facing solution. It is possible
> but we're heading to a place where every cloud mgmt app essentially
> needs to reinvent the same wheel and the guest owner will need to
> learn custom APIs for dealing with SEV/SEV-ES for each cloud mgmt
> app. This is pretty awful.  We need to do a better job at providing
> a solution that is more general purpose IMHO.
> 
> 
> Consider a cloud mgmt app. Right now the flow to use the bag of
> bits libvirt exposes looks something like
> 
>   * Guest owner tells mgmt app they want to launch a VM
> 
>   * Mgmt app decides what host the VM will be launched on
> 
>   * Guest owner requests cert chain for the virt host from mgmt app
> 
>   * Guest owner validates cert chain for the virt host
> 
>   * Guest owner generates launch blob for the VM
> 
>   * Guest owner provides launch blob to the mgmt app
> 
>   * Management app tells libvirt to launch VM with blob,
>     with CPUs in a paused state
> 
>   * Libvirt launches QEMU with CPUs stopped
> 
>   * Guest owner requests launch measurement from mgmt app
> 
>   * Guest owner validates measurement
> 
>   * Guest owner generates secret blob
> 
>   * Guest owner sends secret blob to management app
> 
>   * Management app tells libvirt to inject secrets
> 
>   * Libvirt injects secrets to QEMU
> 
>   * Management app tells libvirt to start QEMU CPUs
> 
>   * Libvirt tells QEMU to start CPUs
> 
> 
> Compare to a non-confidential VM
> 
>   * Guest owner tells mgmt app they want to launch a VM
> 
>   * Mgmt app decides what host the VM will be launched on
> 
>   * Mgmt app tells libvirt to launch VM with CPUs in running state
> 
>   * Libvirt launches QEMU with CPUs running
> 
> Now, of course the guest owner wouldn't be manually performing the
> earlier steps, they would want some kind of software to take care
> of this. No matter what, it still involves a large number of back
> and forth operations between the guest owner & mgmt app, and between
> the mgmt app and libvirt.
> 
> 
> One of libvirt's key jobs is to isolate mgmt apps from differences
> in behaviour of underlying hypervisor technologies, and we're failing
> at that job with SEV/SEV-ES, because the mgmt app needs to go through
> a multi-stage dance on every VM start, that is different from what
> they do with non-confidential VMs.
> 
> 
> It is especially unpleasant because there needs to be a "wait state"
> between when the app selects a host to deploy a VM on, and when it
> can actually start a VM. In essence the app needs to reserve capacity
> on a host ahead of time for a VM that will be created some arbitrary
> time later. This can have significant implications for the mgmt app
> architectural design that are not necessarily easy to address, when
> they expect to just call virDomainCreate and have the VM running in one
> step.
> 
> 
> It also harms interoperability between libvirt tools. For example if
> a mgmt tool like virt-manager/OpenStack created a VM using SEV,
> and you want to start it manually using a different tool like
> 'virsh', you enter a world of complexity and pain, due to the
> multi step dance required.
> 
> 
> AFAICT, in all of this, the mgmt app is really acting as a conduit
> and is not implementing any interesting logic. The clever stuff is
> all the responsibility of the guest owner, and/or whatever software
> for attestation they are using remotely.
> 
> 
> I think there is scope for enhancing libvirt, such that usage of
> SEV/SEV-ES has little-to-no burden for the management apps, and
> much less burden for guest owners. The key to achieving this is
> to define a protocol for libvirt to connect to a remote service
> to handle the launch measurements & secret acquisition. The guest
> owner can provide the address of a service they control (or trust),
> and libvirt can take care of all the interactions with it.
> 
> This frees both the user and mgmt app from having to know much
> about SEV/SEV-ES, with the VM startup process being essentially the
> same as it has always been.
> 
> The sequence would look like
> 
>   * Guest owner tells attestation service they intend to
>     create a VM with a given UUID, policy, and any other
>     criteria such as cert of the cloud owner, valid OVMF
>     firmware hashes, and providing any needed LUKS keys.
> 
>   * Guest owner tells mgmt app they want to launch a VM,
>     using attestation service at https://somehost/and/url
> 
>   * Mgmt app decides what host the VM will be launched on
> 
>   * Mgmt app tells libvirt to launch VM with CPUs in running state
> 
> 
> The next steps involve solely libvirt & the attestation service.
> The mgmt app and guest owner have done their work.
> 
>   * Libvirt contacts the service providing certificate chain
>     for the host to be used, the UUID of the guest, and any
>     other required info about the host.
> 
>   * Attestation service validates the cert chain to ensure
>     it belongs to the cloud owner that was identified previously
> 
>   * Attestation service generates a launch blob and puts it in
>     the response back to libvirt
> 
>   * Libvirt launches QEMU with CPUs paused
> 
>   * Libvirt gets the launch measurement and sends it to the
>     attestation server, with any other required info about the
>     VM instance
> 
>   * Attestation service validates the measurement
> 
>   * Attestation service builds the secret table with LUKS keys
>     and puts it in the response back to libvirt
> 
>   * Libvirt injects the secret table to QEMU
> 
>   * Libvirt tells QEMU to start CPUs
> 
> 
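
The libvirt-side half of that sequence can be sketched roughly as
below. This is only an illustration under stated assumptions:
FakeAttestationService and FakeGuest are hypothetical stand-ins (a real
service would be reached over HTTPS, and a real guest would be a QEMU
process driven by libvirt), and the blob values are placeholders. The
point is the ordering: the CPUs are only started after the service has
accepted the measurement and returned the secrets.

```python
from dataclasses import dataclass, field


class FakeAttestationService:
    """Hypothetical in-process stand-in for the remote attestation service."""

    def __init__(self, expected_measurement, secret_table):
        self.expected_measurement = expected_measurement
        self.secret_table = secret_table

    def launch(self, vm_uuid, host_info):
        # A real service would validate the host cert chain here.
        if not host_info.get("cert-chain"):
            raise PermissionError("host certificate chain missing")
        return {"session": "session-blob", "owner-cert": "cert-blob", "policy": 3}

    def validate(self, vm_uuid, report):
        # A real service would recompute the measurement, not compare strings.
        if report["measurement"] != self.expected_measurement:
            raise PermissionError("launch measurement mismatch")
        return {"secret-header": "header-blob", "secret-table": self.secret_table}


@dataclass
class FakeGuest:
    """Hypothetical stand-in for a QEMU guest; records what was done to it."""
    measurement: str
    events: list = field(default_factory=list)

    def start_paused(self, launch_blob):
        self.events.append("launched-paused")

    def get_measurement(self):
        return self.measurement

    def inject_secret(self, header, table):
        self.events.append("secret-injected")

    def resume(self):
        self.events.append("cpus-started")


def measured_launch(service, guest, vm_uuid, host_info):
    """Run the launch / measure / validate / inject / resume sequence."""
    launch_blob = service.launch(vm_uuid, host_info)
    guest.start_paused(launch_blob)
    report = {"measurement": guest.get_measurement(),
              "policy": launch_blob["policy"]}
    # Raises before any secret is injected if the measurement is rejected.
    secrets = service.validate(vm_uuid, report)
    guest.inject_secret(secrets["secret-header"], secrets["secret-table"])
    guest.resume()
    return guest.events
```

If validate() raises, the guest is left paused with no secrets
injected, which is the failure behaviour one would want from the real
thing too.
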
> All the same exchanges of information are present, but the management
> app doesn't have to get involved. The guest owner also doesn't have
> to get involved except for a one-time setup step. The software the
> guest owner uses for attestation also doesn't have to be written to
> cope with talking to OpenStack, CNV and whatever other vendor specific
> cloud mgmt apps exist today. This will significantly reduce the burden
> of supporting SEV/SEV-ES launch measurement in libvirt based apps, and
> make SEV/SEV-ES guests more "normal" from a mgmt POV.
> 
> 
> What could this look like from the POV of an attestation server API, if
> we assume an HTTPS REST service with a simple JSON payload ...
> 
> 
>   * Guest Owner: Register a new VM to be booted:
> 
>     POST /vm/<UUID>
> 
>      Request body:
> 
>        {
>           "scheme": "amd-sev",
>           "cloud-cert": "certificate of the cloud owner that signs the PEK",
>           "policy": 3,
>           "cpu-count": 3,
>           "firmware-hashes": [
>               "xxxx",
>               "yyyy"
>           ],
>           "kernel-hash": "aaaa",
>           "initrd-hash": "bbbb",
>           "cmdline-hash": "cccc",
>           "secrets": [
>               {
>                  "type": "luks-passphrase",
>                  "passphrase": "<blah>"
>               }
>            ]
>        }
> 
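
For illustration, a guest owner tool might assemble that registration
body along these lines. This is a sketch only: build_registration is a
hypothetical helper, SHA-256 hex digests are an assumption (the
proposal does not say which hash is used), and the kernel, initrd and
cmdline hashes are left out for brevity. Note that JSON has no hex
literals, so the policy goes over the wire as a plain integer.

```python
import hashlib
import json


def build_registration(cloud_cert, policy, cpu_count, firmware_images,
                       luks_passphrase):
    """Build the request body for POST /vm/<UUID> (hypothetical helper)."""
    return {
        "scheme": "amd-sev",
        "cloud-cert": cloud_cert,
        # 0x3 in Python source is serialized as the JSON number 3.
        "policy": policy,
        "cpu-count": cpu_count,
        # Assumption: each acceptable firmware build is named by its SHA-256.
        "firmware-hashes": [hashlib.sha256(img).hexdigest()
                            for img in firmware_images],
        "secrets": [
            {"type": "luks-passphrase", "passphrase": luks_passphrase},
        ],
    }


body = json.dumps(
    build_registration("cloud-cert-pem", 0x3, 3,
                       [b"ovmf-build-1", b"ovmf-build-2"], "example-passphrase"),
    indent=2,
)
```
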
> 
> 
>   * Libvirt: Request permission to launch a VM on a host
> 
>      POST /vm/<UUID>/launch
> 
>      Request body:
> 
>       {
>          "pdh": "<blah>",
>          "cert-chain": "<blah>",
>          "cpu-id": "<CPU ID>",
>          ...other relevant bits...
>       }
> 
>      Service decides if the proposed host is acceptable
> 
>      Response body (on success)
> 
>       {
>          "session": "<blah>",
>          "owner-cert": "<blah>",
>          "policy": 3
>       }
> 
> 
> 
>   * Libvirt: Request secrets to inject to launched VM
> 
>      POST /vm/<UUID>/validate
> 
>      Request body:
> 
>        {
>           "api-minor": 1,
>           "api-major": 2,
>           "build-id": 241,
>           "policy": 3,
>           "measurement": "<blah>",
>           "firmware-hash": "xxxx",
>           "cpu-count": 3,
>           ....other relevant stuff....
>        }
> 
>      Service validates the measurement...
> 
>      Response body (on success):
> 
>        {
>            "secret-header": "<blah>",
>            "secret-table": "<blah>"
>        }
> 
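
To make it concrete why the validate request carries api-major,
api-minor, build-id and policy: as I understand the AMD SEV API spec,
the launch measurement is an HMAC-SHA256, keyed with the TIK from the
launch session, over 0x04 || API_MAJOR || API_MINOR || BUILD || POLICY
|| launch digest || MNONCE, and the blob QEMU reports is the 32 byte
HMAC followed by the 16 byte nonce, base64 encoded. A rough sketch of
the service-side check under those assumptions follows. The function
names are hypothetical, and launch_digest is assumed to have been
precomputed by the service from the registered firmware (plus the
initial VMSA state per vCPU for SEV-ES, which is where cpu-count
would come in).

```python
import base64
import hashlib
import hmac


def expected_measurement(tik, api_major, api_minor, build_id, policy,
                         launch_digest, mnonce):
    """Recompute the HMAC the firmware should have produced."""
    msg = (bytes([0x04, api_major, api_minor, build_id])
           + policy.to_bytes(4, "little")   # policy is a 32-bit LE field
           + launch_digest
           + mnonce)
    return hmac.new(tik, msg, hashlib.sha256).digest()


def validate_report(tik, report, launch_digest):
    """Return True if the reported measurement matches what we expect."""
    blob = base64.b64decode(report["measurement"])
    if len(blob) != 48:                     # 32 byte HMAC + 16 byte nonce
        return False
    reported, mnonce = blob[:32], blob[32:]
    want = expected_measurement(tik, report["api-major"], report["api-minor"],
                                report["build-id"], report["policy"],
                                launch_digest, mnonce)
    # Constant-time comparison to avoid leaking how many bytes matched.
    return hmac.compare_digest(reported, want)
```

Only if this returns True would the service go on to build the secret
header and table for the response.
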
> 
> 
> So we can see there are only a couple of REST API calls we need to be
> able to define. If we could do that then creating a SEV/SEV-ES enabled
> guest with libvirt would not involve anything more complicated for the
> mgmt app than providing the URI of the guest owner's attestation service
> and an identifier for the VM. ie. the XML config could be merely:
> 
>     <launchSecurity type="sev">
>        <attestation vmid="57f669c2-c427-4132-bc7a-26f56b6a718c"
>                     service="http://somehost/some/url"/>
>     </launchSecurity>
> 
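
As a sketch of how little a consumer would need to pull out of that
config, assuming the element and attribute names exactly as proposed
above (they are not part of any existing libvirt schema) and a
hypothetical parse_attestation helper:

```python
import uuid
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def parse_attestation(xml_text):
    """Extract (vmid, service URI) from a <launchSecurity> element."""
    root = ET.fromstring(xml_text)
    if root.tag != "launchSecurity" or root.get("type") != "sev":
        raise ValueError("not a SEV launchSecurity element")
    att = root.find("attestation")
    if att is None:
        raise ValueError("missing <attestation> element")
    # uuid.UUID raises ValueError on a missing or malformed ID.
    vmid = str(uuid.UUID(att.get("vmid", "")))
    service = att.get("service", "")
    if urlparse(service).scheme not in ("http", "https"):
        raise ValueError("service must be an http(s) URI")
    return vmid, service


def endpoint(service, vmid, action):
    """Build e.g. <service>/vm/<UUID>/launch from the parsed values."""
    return f"{service.rstrip('/')}/vm/{vmid}/{action}"
```

The two values returned are all that is needed to derive the launch
and validate URLs described earlier.
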
> And then invoke virDomainCreate as normal with any other libvirt / QEMU
> guest. No special workflow is required by the mgmt app. There is a small
> extra task for the guest owner to register existence of their VM with the
> attestation service. Aside from that the only change to the way they
> interact with the cloud mgmt app is to provide the VM ID and URI for the
> attestation service. No need to learn custom APIs for each different
> cloud vendor, for dealing with fetching launch measurements or injecting
> secrets.
> 
> 
> Finally this attestation service REST protocol doesn't have to be something
> controlled or defined by libvirt. I feel like it could be a protocol that
> is defined anywhere and libvirt merely be one consumer of it. Other apps
> that directly use QEMU may also wish to avail themselves of it.
> 
> 
> All that really matters from libvirt POV is:
> 
>    - The protocol definition exists to enable the above workflow,
>      with a long term API stability guarantee that it isn't going to
>      be changed in incompatible ways
> 
>    - There exists a fully open source reference implementation of sufficient
>      quality to deploy in the real world
> 
> I know https://github.com/slp/sev-attestation-server exists, but its current
> design has assumptions about it being used with libkrun AFAICT. I have heard
> of others interested in writing similar servers, but I've not seen code.
> 

Tobin has just released kbs-rs, which has similar properties to what
you're proposing above and aims to solve similar issues. It would be
worth talking with him before rushing into building yet another
attestation server.

-Dov

> We are at a crucial stage where mgmt apps are looking to support measured
> boot with SEV/SEV-ES and if we delay they'll all go off and do their own
> thing, and it'll be too late, leading to https://xkcd.com/927/.
> 
> Especially for apps using libvirt to manage QEMU, I feel we have got a
> few months window of opportunity to get such a service available, before
> they all end up building out APIs for the tedious manual workflow,
> reinventing the wheel.
> 
> Regards,
> Daniel