[Date Prev][Date Next] [Thread Prev][Thread Next]
Re: [edk2-rfc] [edk2-devel] RFC: design review for TDVF in OVMF
- From: "Laszlo Ersek" <lersek redhat com>
- To: "Yao, Jiewen" <jiewen yao intel com>, "rfc edk2 groups io" <rfc edk2 groups io>, "devel edk2 groups io" <devel edk2 groups io>
- Cc: "jejb linux ibm com" <jejb linux ibm com>, Brijesh Singh <brijesh singh amd com>, Tom Lendacky <thomas lendacky amd com>, "erdemaktas google com" <erdemaktas google com>, "cho microsoft com" <cho microsoft com>, "bret barkelew microsoft com" <bret barkelew microsoft com>, Jon Lange <jlange microsoft com>, Karen Noel <knoel redhat com>, Paolo Bonzini <pbonzini redhat com>, Nathaniel McCallum <npmccallum redhat com>, "Dr. David Alan Gilbert" <dgilbert redhat com>, "Ademar de Souza Reis Jr." <areis redhat com>
- Subject: Re: [edk2-rfc] [edk2-devel] RFC: design review for TDVF in OVMF
- Date: Thu, 3 Jun 2021 18:11:55 +0200
On 06/03/21 15:51, Yao, Jiewen wrote:
> Hi, All
> We plan to do a design review for TDVF in OVMF package.
> The TDVF Design slides for TinaoCore Design Review Meeting (Jun 11) is
> now available in blow link:
> The Bugzilla is https://bugzilla.tianocore.org/show_bug.cgi?id=3429
> You can have an offline review first. You comments will be warmly
> welcomed and we will continuously update the slides based on the
Resending my earlier comments in this mailing list thread, with the
feedback inserted at the proper spots that has been given in the
off-list thread since then:
*** Slides 4, 6, 7: the "one binary requirement".
(1) The presentation refers to "OvmfPkgX64.dsc" as "the one" on slide#4,
but then the explanation for the requirement, given on slide 7, speaks
about "common attestation interface".
I think we have a misunderstanding here. The "OvmfPkgX64.dsc" platform
indeed contains SEV, SEV-ES, and (in the future, will contain) SEV-SNP
support. In that sense, adding TDX support to the same platform should
be (hopefully) possible, at the cost of ugly gymnastics in the reset
But "OvmfPkgX64.dsc" is *already* different from the remotely attested
OVMF platform, namely "OvmfPkg/AmdSev/AmdSevX64.dsc".
The latter has some additional modules (secret PEIM and DXE driver), it
has the Grub binary built in, and -- probably most importantly -- it
trusts host-originated information less than "OvmfPkgX64.dsc".
For example, "OvmfPkg/AmdSev/AmdSevX64.dsc" has a dedicated
PlatformBootManagerLib instance, one that ignores the non-volatile UEFI
variables Boot#### and BootOrder, and ignores (thus far) the fw_cfg
originated kernel/initrd/cmdline as well.
It remains an "area of research" to see what else should be removed from
the traditional host-guest integration (which integration is usually
desirable for management and convenience), in the remotely-attested boot
scenario. See e.g.
My point is that the "one binary" that the slide deck refers to (i.e.,
OvmfPkgX64.dsc) might prove OK for utilizing the Intel TDX *technology*
in itself. Simply enabling OVMF + a guest OS to boot in a TDX domain.
But "OvmfPkgX64.dsc" is *not* the "one binary" that is suitable for
remote attestation, which has a much broader scope, involving multiple
computers, networking, deployment, guest-owner/host-owner separation,
whatever. For the latter, we needed a separate platform anyway, even
with only SEV in mind; that's why "OvmfPkg/AmdSev/AmdSevX64.dsc" exists.
*** Slides 8-9: general boot flow -- TDVF; TDVF Flow
I'm likely missing a big bunch of background here, so just a few
(2) Not sure what RTMR is, but it's associated with "Enable TrustedBoot"
-- so is a virtual TPM a hard requirement?
... Back from slide 10: "TCG measurement and event log framework w/o
TPM" -- that's curious.
[response from Dave Gilbert:]
> My reading of this is that the RTMR (and another set of similar
> registers) are a TDX thing that is like the PCRs from a TPM but
> without the rest of the TPM features; so you can do the one-way
> measurement into the RTMRs just like you do into a TPM PCR, and the
> measurements pop out somewhere in the TDX quote. Just like a TPM you
> need the event log to make any sense of how the final hashed value
> supposedly got to where it did.
[response from Erdem Aktas:]
> +1 to David on this. TDX provides 2 kinds of measurement registers:
> MRTDs and RTMRs
> section 10.1.2) . MRTDs are build-time measurement registers which are
> updated when TDX is being created. Once TDX is finalized (before the
> first run), the MRTDs are finalized and cannot be updated anymore. On
> the other hand, while the TD is running, TD can extend RTMRs through
> TDCALLs which will provide TPM PCR kind of capabilities.
... Back from slide 43: feel free to skip this now; I will comment in
more detail below.
(3) Prepare AcpiTable -- OVMF fetches ACPI tables from QEMU; is this a
new (firmware originated) table that is installed in addition, or does
it replace QEMU's tables?
... Ignore for now, will comment on the MADT stuff later.
(4) Does DMA management mean a different IOMMU protocol? That is going
to conflict with the SEV IOMMU protocol. Depexes in OVMF expect one or
zero IOMMU protocols to be present.
... Back from slide 40: feel free to skip this now; I'll comment on this
(5) Enumerate VirtIO -- virtio enumeration is PCI based on x86. But I
see no mention of PCI. If you mean VirtioMmioDeviceLib, that's no good,
because it does not support virtio-1.0, only virtio-0.9.5.
... Back from slide 42: I got my answer to this on slide 42, so don't
worry about this point.
(6) The PEI phase is skipped as a whole. I don't see how that can be
reasonably brought together with either "OvmfPkgX64.dsc" or
"OvmfPkg/AmdSev/AmdSevX64.dsc". I guess you can always modify SEC to
jump into DXE directly, but then why keep the PEI core and a bunch of
PEIMs in the firmware binary?
Also touched on by slide 9, TDVF Flow -- PEI is eliminated but SEC
becomes more heavy-weight.
Wouldn't this deserve a dedicated, separate platform DSC? The
8-bit/32-bit branching at the front of the reset vector is a smaller
complication in comparison.
Slide 6 references the mailing list archive:
and in that message, I wrote:
I'm doubtful that this is a unique problem ("just fix the reset
vector") the likes of which will supposedly never return during the
integration of SEV and TDX
where I said:
It's not lost on me that we're talking about ~3 instructions.
Let's keep a close eye on further "polymorphisms" that would require
The first 9 slides in the presentation introduce much-much more
intrusive problems than the "polymorphism" of the reset vector. Would I
be correct to say that my concern in the above messages was right? I
think I was only given a fraction of the necessary information when I
was asked "do you agree 'one binary' is viable".
[response from Erdem Aktas:]
> Let's not worry about this for now. We want the one binary solution
> for practical reasons and also for simplicity. In the end, we want to
> do what is right and good for everyone.
> Those are legit concerns and I think what Intel is trying to do (sorry
> for mind reading) is to discuss all those concerns and questions to
> make the right decision. I really appreciate their effort on preparing
> those slides and bringing it to the community to review.
> I will also read your comments more carefully and provide my thoughts
> on them.
> Sorry for being a little slow on this.
*** Slide 10 -- Key impact to firmware
(7) "CPUs are left running in long mode after exiting firmware" -- what
kind of "pen" are we talking about? Does a HLT loop still work?
(8) "I/O, MMIO, MSR accesses are different" -- those are implemented by
low-level libraries in edk2; how can they be configured dynamically?
... Back from slide 53: I'll comment on slide 53 separately; ignore
*** Slide 11 -- TDVF Image (1)
(9) CFV -- Configuration Firmware Volume (VarStore.fdf.inc),
containing SB keys -- how is this firmware volume populated (at build
time)? Is this a hexdump?
... Back from slide 16: it seems like CFV is a raw hexdump indeed; how
is that managed when keys change (at build time)?
(10) This slide (slide 11) basically describes an intrusive
reorganization of "OvmfPkgX64.fdf". I don't think I can agree to that.
While confidential computing is important, it is not relevant for many
users. Even if we don't cause outright regressions for existent setups,
the maintenance cost of the traditional OVMF platform will skyrocket.
The big bunch of areas that SEV-ES introduced to MEMFD is already a big
complication. I'd feel much better if we could isolate all that to a
dedicated "remote attested boot" firmware platform, and not risk the
functionality and maintenance of the traditional platform. I think this
ties in with my comment (1).
For example, seeing a configuration firmware volume (CFV) with secure
boot keys embedded, in the "usual" FDF, will confuse lots of people, me
included. In the traditional OVMF use case, we use a different method:
namely OvmfPkg/EnrollDefaultKeys, for "priming" a variable store
template, in the build environment.
Edk2 (and PI and UEFI) are definitely flexible enough for accommodating
TDX, but the existent, traditional OVMF platforms are a bad fit. In my
*** Slide 12: TDVF Image (2)
(11) "Page table should support both 4-level and 5-level page table"
As a general development strategy, I would suggest building TDX support
in small, well-isolated layers. 5-level paging is not enabled (has never
been tested, to my knowledge) with OVMF on QEMU/KVM, regardless of
confidential computing, for starters. If 5-level paging is a strict
requirement for TDX, then it arguably needs to be implemented
independently of TDX, at first. So that the common edk2 architecture be
at least testable on QEMU/KVM with 5-level paging enabled.
*** Slide 13:
(12) My comment is that the GUID-ed structure chain already starts at a
fixed GPA (the "AMD SEV option"). Ordering between GUID-ed structures is
irrelevant, so any TDX-specific structures should be eligible for
appending to, prepending to, or intermixing with, other (possibly
SEV-specific) structures. There need not be a separate entry point, just
(13) Regarding "4G-0x20[0x10] is OVMF AP reset vector (used in OVMF
implementation)" -- I think this is a typo: this "AP reset vector" is
*not* used in OVMF. To my knowledge, it is a vestige from the UefiCpuPkg
reset vector. In OVMF, APs are booted via MpInitLib (in multiple
firmware phases), using INIT-SIPI-SIPI, and the reset vector for the
APs, posited through those IPIs, is prepared in low RAM.
*** Slides 14 through 16:
I consider these TDVF firmware image internals, implementation details
-- do whatever you need to do, just don't interfere with existing
platforms / use cases. See my comment (10) above.
*** Slides 17-21:
(14) Again, a very big difference from traditional OVMF: APs never enter
SEC in traditional OVMF. I assume this new functionality is part of
TdxStartupLib (from slide 18) -- will there be a Null instance of that?
Last week I posted a 43-part patch series to edk2-devel, for separating
out the dynamic Xen enlightenments from the IA32, IA32X64, X64
platforms, in favor of the dedicated OvmfXen platform. TDX seems to
bring in incomparably more complications than Xen, and the OvmfPkg
maintainers have found even the Xen complications troublesome in the
If I had had access to all this information when we first discussed "one
binary" on the mailing list, I'd have never agreed to "one binary". I'm
OK with attempting one firmware binary for "confidential computing", but
that "one platform" cannot be "OvmfPkgX64.dsc".
Even if I make a comparison with just the "technology" (not the
remotely-attested deployment) of SEV and SEV-ES, as it is included in
"OvmfPkgX64.dsc", TDX is hugely more complicated and intrusive than
that. SEV proved possible to integrate into existing modules, into the
existing boot flow, maybe through the addition of some new drivers (such
as a new IOMMU protocol implementation, and some "clever" depexes). But
we never had to restructure the FD layout, eliminate whole firmware
phases, or think about multiprocessing in the reset vector or the SEC
In order to bring an example from the ARM world, please note that
platforms that use a PEI phase, and platforms that don't, are distinct
platforms. In ArmVirtPkg, two examples are ArmVirtQemu and
ArmVirtQemuKernel. The latter does not include the PEI Core.
*** Slides 22 through 34:
(15) All these extra tasks and complications are perfectly fine, as long
as they exist peacefully *separately* from the traditional ("legacy")
Honestly, in the virtual world, picking your firmware binary is easy.
The approach here reminds me of a physical firmware binary that includes
everything possible from "edk2-platforms/Features/Intel", just so it can
be deployed to any physical board imaginable. That's not how Intel
builds physical firmware, right? We have "edk2-platforms/Platform/Intel"
and "edk2-platforms/Silicon/Intel" with many-many separate DSC files.
*** Slide 35-36: DXE phase
(16) "Some DXE Drivers not allowed to load/start in Td guest -- Network
stack, RNG, ..."
Same comment as (several times) above. The Linuxboot project is a good
example for eliminating cruft from DXEFV (in fact, for eliminating most
of the DXE phase). In a TDX environment, why include drivers in the
firmware binary that are never used? Meanwhile, DXEFV in OVMF grows by a
MB every 1.5 years or so. Again, remove these drivers from the DSC/FDF
then, and it needs to be a separate platform.
(17) "Other DXE Phase drivers -- [...] AcpiPlatformDxe"
I'm not sure what this section is supposed to mean. Other DXE phase
drivers included, or excluded? Without AcpiPlatformDxe, the guest OS
will not see QEMU's ACPI content, and will almost certainly malfunction.
... Back from slide 48: ignore this for now, I'll comment in more detail
*** Slide 37: DXE Core
(18) says "SMM is not supported in Td guest" -- how is the variable
store protected from direct hardware (pflash) access from the guest OS?
Without SMM, the guest OS need not go through gRT->SetVariable() to
update authenticated non-volatile UEFI variables, and that undermines
Note that, while SEV-ES has the same limitation wrt. SMM, the
"OvmfPkg/AmdSev/AmdSevX64.dsc" platform doesn't even include the Secure
Boot firmware feature. For another example, the OVMF firmware binary in
RHEL that's going to support SEV-ES is such a build of "OvmfPkgX64.dsc"
that does not include the Secure Boot feature either.
But in this TDX slide deck, Secure Boot certificates are embedded in the
CFV (configuration firmware volume) -- see slide 11 and slide 16 --,
which suggests that this platform does want secure boot.
... Back from slide 48: I'm going to make *additional* comments on this,
when I'm at slide 48, too.
The rest of this slide (slide 37) looks reasonable (generic DXE Core
changes -- possibly PI spec changes too).
*** Slides 38 through 39:
These seem reasonable (TdxDxe assumes some responsibilities of
*** Slides 40 through 42:
*If* you really can implement TDX support for the IOMMU driver *this*
surgically, then I'm OK with it. The general logic in the IOMMU driver
was truly difficult to write, and I'd be seriously concerned if those
parts would have to be modified. Customizing just the page
encryption/decryption primitives for TDX vs. SEV is OK.
*** Slides 43 through 47:
(19) Slide 46 and slide 47 are almost identical. Please consolidate them
into a single slide.
(20) the TPM2 infrastructure in edk2 is baroque (over-engineered), in my
opinion. It has so many layers that I can never keep them in mind. When
we added TPM support to OVMF, I required commit messages that would help
us recall the layering. In particular, please refer to commit
0c0a50d6b3ff ("OvmfPkg: include Tcg2Dxe module", 2018-03-09). Here's an
TPM 2 consumer driver
Tpm2DeviceLib class: Tpm2DeviceLibTcg2 instance
TCG2 protocol interface
TCG2 protocol provider: Tcg2Dxe.inf driver
Tpm2DeviceLib class: Tpm2DeviceLibRouter instance
NULL class: Tpm2InstanceLibDTpm instance
(via earlier registration)
TPM2 chip (actual hardware)
The slide deck says that EFI_TD_PROTOCOL is supposed to reuse the
EFI_TCG2_PROTOCOL definition. If that's the case, then why don't we push
the TDX specifics (more or less: the replacement of PCRs with RTMR) down
to the lowest possible level?
Can we have "Tpm2InstanceLibTdxRtmr", plugged into the same Tcg2Dxe.inf
If not, can we have a new TdTcg2Dxe.inf driver, but make it so that it
install the same protocol as before (EFI_TCG2_PROTOCOL -- same old
protocol GUID)? Then DxeTpmMeasurementLib doesn't have to change.
As long as there is *at most* one EFI_TCG2_PROTOCOL instance published
in the protocol database, DxeTpmMeasurementLib should be fine. In SEV*
guests, the standard Tcg2Dxe driver provides that protocol. In TDX
guests, TdTcg2Dxe.inf should provide the protocol. Arbitration between
the two can be implemented with the pattern seen in the following
1 05db0948cc60 EmbeddedPkg: introduce EDKII Platform Has ACPI GUID
2 786f476323a6 EmbeddedPkg: introduce PlatformHasAcpiLib
3 65a69b214840 EmbeddedPkg: introduce EDKII Platform Has Device Tree
4 2558bfe3e907 ArmVirtPkg: add PlatformHasAcpiDtDxe
The basic idea is that Tcg2Dxe can have a depex on a new "running in
SEV" protocol GUID, and TdTcg2Dxe can have a depex on a new "running in
TDX" protocol GUID. A separate platform driver can install the proper
GUID -- possibly *neither* of those GUIDs.
And, we don't have to change the depex section of
"SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.inf" for this; we can implement a
library instance with an empty constructor, but a non-empty depex, and
then hook this lib instance into Tcg2Dxe.inf via module scope NULL lib
class override in the DSC file. Basically we could forcibly restrict
Tcg2Dxe's DEPEX by making it inherit the new DEPEX from the library.
*** Slide 48: DXE Phase -- Other Modules
Regarding IncompatiblePciDeviceSupportDxe: the proposed update sounds
plausible and simple enough.
(21) "AcpiPlatformDxe: Support for MADT/ACPI addition to report Td
Firmware-owned tables must not be installed from this driver.
Please refer to my "Xen removal" patch set again, for
<https://bugzilla.tianocore.org/show_bug.cgi?id=2122>, which I mention
above in point (14). As part of the Xen removal, the AcpiPlatformDxe
driver in OvmfPkg is significantly trimmed: all unused (dead) cruft is
removed, including any ACPI table templates that are built into the
OvmfPkg/AcpiPlatformDxe is responsible *solely* for implementing the
client side of the QEMU ACPI linker/loader.
If you need to prepare & install different ACPI tables, please do it
elsewhere, in another DXE driver. A bunch of other firmware modules do
that (NFIT, IBFT, BGRT, ...).
For example, the OvmfPkg/TdxDxe DXE_DRIVER is supposed to be launched
early in the DXE phase, via APRIORI section -- please consider
registering a protocol notify in that driver, for
EFI_ACPI_TABLE_PROTOCOL, and when it becomes available, install whatever
*extra* tables you need.
Note that if you need to *customize* an ACPI table that QEMU already
provides, then you will have to modify the ACPI generator on the QEMU
side. It is a design tenet between QEMU and OVMF that OVMF include no
business logic for either parsing or fixing up ACPI tables that QEMU
provides. AcpiPlatformDxe contains the minimum (which is already a whole
lot, unfortunately) that's necessary for implementing the QEMU ACPI
linker/loader client in the UEFI environment.
The slide deck mentions MADT, which is also known as the "APIC" table --
and indeed, QEMU does provide that. (See acpi_build_madt()
[hw/i386/acpi-common.c].) So if TDX needs MADT customizations, that
should go into QEMU.
Ouch, this is an unpleasant surprise.
First, if you know for a fact that pflash is not part of the *board* in
any TDX setup, then pulling
into the firmware platform is useless, as it is mutually exclusive with
(via dynamic means -- a dynamic PCD).
Note that the FDF file places QemuFlashFvbServicesRuntimeDxe in APRIORI
DXE when SMM_REQUIRE is FALSE. This driver checks for pflash presence,
and lets EmuVariableFvbRuntimeDxe perform its in-RAM flash emulation
only in case pflash is not found.
So this is again in favor of a separate platform -- if we know pflash is
never part of the board, then QemuFlashFvbServicesRuntimeDxe is never
needed, but you cannot remove it from the traditional DSC/FDF files.
Second, EmuVariableFvbRuntimeDxe consumes the PlatformFvbLib class, for
the PlatformFvbDataWritten() API (among other things). This lib class is
implemented by two instances in OvmfPkg, PlatformFvbLibNull and
EmuVariableFvbLib. The latter instance allows Platform BDS to hook an
event (for signaling) via "PcdEmuVariableEvent" into the
In old (very old) QEMU board configurations, namely those without
pflash, this (mis)feature is used by OVMF's PlatformBootManagerLib to
write out all variables to the EFI system partition in a regular file
called \NvVars, with the help of NvVarsFileLib, whenever
EmuVariableFvbRuntimeDxe writes out an emulated "flash" block. For this
purpose, the traditional OVMF DSC files link EmuVariableFvbLib into
But it counts as an absolute disaster nowadays, and should not be
revived in any platform. If you don't have pflash in TDX guests, just
accept that you won't have non-volatile variables. And link
PlatformFvbLibNull into EmuVariableFvbRuntimeDxe. You're going to need a
separate PlatformBootManagerLib instance anyway.
(We should have removed EmuVariableFvbRuntimeDxe a long time ago from
the traditional OVMF platforms, i.e. made pflash a hard requirement,
even when SMM is not built into the platform -- but whenever I tried
that, Jordan always shot me down.)
My point is: using EmuVariableFvbRuntimeDxe in the TDX platform may be
defensible per se, but we must be very clear that it will never provide
a standards-conformant service for non-volatile UEFI variables, and we
must keep as much of the \NvVars mess out of EmuVariableFvbRuntimeDxe as
possible. This will require a separate PlatformBootManagerLib instance
for TDX anyway (or maybe share PlatformBootManagerLibGrub with
Apart from the volatility aspect, let's assume we have this in-RAM
emulated "flash device", storing authenticated UEFI variables for Secure
Boot purposes. And we don't have SMM.
What protects this in-RAM variable store from tampering by the guest OS?
It's all guest RAM only, after all. What provides the privilege barrier
between the guest firmware and the guest OS?
*** Slide 50: Library
(23) Should we introduce Null instances for all (or most) new lib
classes here? Code size is a concern (static linking). If we extend a
common OvmfPkg module with new hook points, it's one thing to return
from that hook point early *dynamically*, but it's even better (given
separate platforms) to allow the traditional platform firmware to use a
Null lib instance, to cut out all the dead code statically.
*** Slides 51 through 52
*** Slide 53:
(24) It might be worth noting that BaseIoLibIntrinsic already has some
SEV enlightenment, as the FIFO IO port operations (which normally use
the REP prefix) are not handled on SEV. I don't have an immediate idea
why this might matter, we should just minimize code duplication if
*** Slides 54-56:
No comments, this stuff seems reasonable.
*** Slide 57: MpInitLib
I don't know enough to give a summary judgement.
All in all, I see the controversial / messy parts in the platform
bringup, and how all that differs from the traditional ("legacy") OVMF
platforms. I admit I *may* be biased in favor of SEV, possibly because
SEV landed first -- if you find signs of such a bias in my comments,
please don't hesitate to debunk those points. Yet my general impression
is that the early bringup stuff is significantly different from
everything before, and because of this, a separate platform is
Definitely separate from the traditional OVMF IA32, IA32X64, and X64
platforms, and *possibly* separate from the "remote attestation"
AmdSevX64.dsc platform. I would approach the TDX feature-set in complete
isolation (exactly how Intel commenced the work, if I understand
correctly), modulo obviously shareable / reusable parts, and then slowly
& gradually work on extracting / refactoring commonalities.
(But, given my stance on Xen for example, I could disagree even with the
latter, retroactive kind of unification -- it all boils down to shared
developer and user base. Component sharing should reflect the community
structure, otherwise maintenance will be a nightmare.)
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#76023): https://edk2.groups.io/g/devel/message/76023
Mute This Topic: https://groups.io/mt/83283616/1813853
Group Owner: devel+owner edk2 groups io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive redhat com]
[Date Prev][Date Next] [Thread Prev][Thread Next]