[edk2-devel] [PATCH V1 0/5] UEFI Variable SMI Reduction

Kubacki, Michael A michael.a.kubacki at intel.com
Thu Sep 26 22:35:36 UTC 2019


After thinking a bit more, this test is based on a side effect of implementation. I'd prefer to avoid this being the sole cause of additional complexity in an already complex driver. Are one of these two options acceptable?

1. Use the Runtime Services QueryVariableInfo () API

In this patch series, this is still implemented to trigger an SMI as invocations occur very rarely outside mainstream GetVariable () and GetNextVariableName () usage so performance is not a major concern. This allows it to continue to use the validated path to the converged implementation in Variable.c

2. Include a SMM driver in OVMF specifically for exercising the edk2 SMM flow.

A SMM driver in OVMF (included if SMM_REQUIRE is TRUE) could register a handler that simply returns or has something like a fixed 1ms delay.

Thanks,
Michael

> -----Original Message-----
> From: Kubacki, Michael A
> Sent: Thursday, September 26, 2019 1:29 PM
> To: Laszlo Ersek <lersek at redhat.com>; devel at edk2.groups.io
> Cc: Bi, Dandan <dandan.bi at intel.com>; Ard Biesheuvel
> <ard.biesheuvel at linaro.org>; Dong, Eric <eric.dong at intel.com>; Gao, Liming
> <liming.gao at intel.com>; Kinney, Michael D <michael.d.kinney at intel.com>;
> Ni, Ray <ray.ni at intel.com>; Wang, Jian J <jian.j.wang at intel.com>; Wu, Hao
> A <hao.a.wu at intel.com>; Yao, Jiewen <jiewen.yao at intel.com>
> Subject: RE: [PATCH V1 0/5] UEFI Variable SMI Reduction
> 
> Hi Laszlo,
> 
> In short, it would not complicate the code beyond a reasonable level to
> support the runtime cache with a FeaturePCD.
> 
> I had considered this but dismissed it with lack of a practical use case (which
> you provided) that could justify adding another configuration option to the
> variable driver. The SMM communication buffer preparation code and SMI
> handler code for GetVariable () and GetNextVariableName () would have to
> be added back which I'd prefer eliminating maintenance of if possible.
> 
> The SMM cache is used to serve SMM callers.
> 
> Do others have an opinion or suggestion? If not, I'll proceed with adding a
> FeaturePCD.
> 
> Thanks,
> Michael
> 
> > -----Original Message-----
> > From: Laszlo Ersek <lersek at redhat.com>
> > Sent: Thursday, September 26, 2019 11:24 AM
> > To: Kubacki, Michael A <michael.a.kubacki at intel.com>;
> > devel at edk2.groups.io
> > Cc: Bi, Dandan <dandan.bi at intel.com>; Ard Biesheuvel
> > <ard.biesheuvel at linaro.org>; Dong, Eric <eric.dong at intel.com>; Gao,
> > Liming <liming.gao at intel.com>; Kinney, Michael D
> > <michael.d.kinney at intel.com>; Ni, Ray <ray.ni at intel.com>; Wang, Jian J
> > <jian.j.wang at intel.com>; Wu, Hao A <hao.a.wu at intel.com>; Yao, Jiewen
> > <jiewen.yao at intel.com>
> > Subject: Re: [PATCH V1 0/5] UEFI Variable SMI Reduction
> >
> > Hello Michael,
> >
> > On 09/26/19 06:50, Michael Kubacki wrote:
> > > REF:https://bugzilla.tianocore.org/show_bug.cgi?id=2220
> > >
> > > Overview
> > > ---------
> > > This patch series reduces SMM usage when using
> > VariableSmmRuntimeDxe
> > > with VariableSmm. It does so by eliminating SMM usage for runtime
> > > service GetVariable () and GetNextVariableName () invocations. Most
> > > UEFI variable usage in typical systems after the variable store is
> > > initialized (e.g. manufacturing boots) is due to GetVariable ( ) and
> > > GetNextVariableName () not SetVariable (). GetVariable () calls can
> > > regularly exceed 100 per boot while SetVariable () calls typically
> > > remain less than 10 per boot. By focusing on the common case, the
> > > majority of overhead associated with SMM can be avoided while still
> > > using existing and proven code for operations such as variable
> > > authentication that require an isolated execution environment.
> > >
> > >  * Advantage: Reduces overall system SMM usage
> > >  * Disadvantage: Requires more Runtime data memory usage
> >
> > In a perfect world, I would carefully scrutinize this patch set, and
> > respond with comments. In the real world, I have hardly enough time to
> > read the blurb :/ So I'll have to defer to the other reviewers on this patch.
> >
> > I'd like to spell out another "disadvantage" however. Admittedly it's
> > quite a corner case.
> >
> > The disadvantage in my case is that, by eliminating SMM from variable
> > *reads*, OVMF will lose its simplest method to exercise the SMM driver
> > stack. Namely, right now, if you boot OVMF (built with -D
> > SMM_REQUIRE), and at root prompt in the Linux guest, you run:
> >
> > # taskset -c 0 efibootmgr
> > # taskset -c 1 efibootmgr
> >
> > then you very easily test the SMM machinery (through Boot####,
> > BootOrder, BootNext variable *reads*).
> >
> > In addition, the "taskset" commands above force the guest Linux kernel
> > to initiate the GetVariable runtime service call -- and therefore
> > entering SMM -- on CPU#0 (BSP) vs. CPU#1 (AP).
> >
> > This difference (that is, BSP vs. AP being used for the runtime
> > service
> > call) used to expose *extreme* timing and stability differences in the
> > edk2 SMM stack, dependent on the SMI delivery method used.
> >
> > To this day, the above two commands remain part of our stock
> > regression- tests for the health of the SMM stack -- the commands are
> > executed in the guest OS both after normal boot, and after S3 resume.
> >
> > https://github.com/tianocore/tianocore.github.io/wiki/Testing-SMM-with
> > - QEMU,-KVM-and-libvirt#uefi-variable-access-test
> >
> > Therefore, would it be possible to make the feature dependent on a new
> > FeaturePCD?
> >
> > Or, if that would complicate the code too much, perhaps new module INF
> > files could be introduced (library instances, or even drivers) that a
> > platform could choose to select in DSC files, perhaps dependent on a -D
> build flag.
> > There could be a set of INF files for the current behavior, and
> > another set of INF files for the "read cache" behavior. And the
> > related C source files would not have to be littered with
> >
> >   if (FeaturePcdGet (...)) {
> >     //
> >     // read cache
> >     //
> >   } else {
> >     //
> >     // traditional behavior
> >     //
> >   }
> >
> > Of course, with the read cache feature, SMM entry could still be
> > forced in OVMF through non-volatile variable *writes* -- but writes
> > are not without side-effects on the varstore, and they depend on extra
> > conditions relative to reads.
> >
> > You mention "SMM cache" below, and I'm not entirely clear when exactly
> > that would be used, in favor of the "runtime cache". It seems that the
> > "SMM cache" would primarily serve SMM callers. If the FeaturePCD could
> > be used for forcing the use of "SMM cache" for the normal GetVariable
> > runtime service too, I think that might cover my use case. I don't
> > need GetVariable to access flash, i.e. caching per se is fine; I'd
> > just need GetVariable to continue exercising the SMM stack *in OVMF*
> > -- even if that path is slower than desirable, for physical platforms.
> >
> > Thanks!
> > Laszlo
> >
> > > Initial Performance Observations
> > > ---------------------------------
> > >  * With these proposed changes, an Intel Atom based SoC saw
> > > GetVariable
> > ( )
> > >    time for an existing variable reduce from ~220us to ~5us.
> > >
> > > Major Changes
> > > --------------
> > >  1. Two UEFI variable caches will be maintained.
> > >      a. "Runtime Cache" - Maintained in VariableSmmRuntimeDxe. Used
> > > to
> > serve
> > >          runtime service GetVariable () and GetNextVariableName () callers.
> > >      b. "SMM cache" - Maintained in VariableSmm to service SMM
> > GetVariable ()
> > >          and GetNextVariableName () callers.
> > >          i. A cache in SMRAM is retained so SMM modules do not
> > > operate on
> > data
> > >             outside SMRAM.
> > >  2. A new UEFI variable read and write flow will be used as described
> below.
> > >
> > > At any given time, the two caches would be coherent. On a variable
> > > write, the runtime cache is only updated after validation in SMM
> > > and, in the case of a non-volatile UEFI variable, the variable must
> > > also be successfully written to non-volatile storage.
> > >
> > > Prior RFC Feedback Addressed
> > > -----------------------------
> > > RFC sent Sept. 5, 2019: https://edk2.groups.io/g/devel/message/46939
> > >
> > > 1. UEFI variable data retrieval from a ring 0 buffer
> > >
> > >    A common concern with this proposed set of changes is the
> > > potential
> > security
> > >    threat presented by serving runtime services callers from a ring 0
> memory
> > >    buffer of EfiRuntimeServicesData type. This conclusion was that
> > > this
> > change
> > >    does not fundamentally alter the attack surface. The UEFI
> > > variable
> > Runtime
> > >    Services are invoked from ring 0 and the data already travels through
> ring
> > >    0 buffers (such as the SMM communicate buffer) to reach the caller.
> Even
> > >    today if ring 0 is assumed to be malicious, the malicious code
> > > may keep
> > one
> > >    AP in a loop to monitor the communication data, when the BSP gets an
> > >    (authenticated) variable. When the communication buffer is
> > > updated and
> > the
> > >    status is set to EFI_SUCCESS, the AP may modify the communication
> > buffer
> > >    contents such the tampered data is returned to the BSP caller. Or an
> > >    interrupt handler on the BSP may alter the communication buffer
> > contents
> > >    before the data is returned to the caller. In summary, this was
> > > not found
> > to
> > >    introduce any attack not possible today.
> > >
> > > 2. VarCheckLib impact
> > >
> > >    VarCheckLib plays a role in SetVariable () calls. This patch series only
> > >    changes GetVariable () behavior. Therefore, VarCheckLib is expected to
> > >    have no impact due to these changes.
> > >
> > > Testing Performed
> > > ------------------
> > > This code was tested with the master branch of edk2 on an Intel Kaby
> > > Lake U Intel Whiskey Lake U reference validation platform. The set
> > > of tests performed
> > > included:
> > >
> > > 1. Boot from S5 to Windows 10 OS with SMM variables enabled.
> > > 2. Boot from S5 to Ubuntu 18.04.1 LTS with SMM variable enabled.
> > > 3. Boot from S5 to EFI shell with DXE variables enabled.
> > > 4. Dump UEFI variable store at shell with dmpstore to verify contents.
> > > 5. Dump NvStorage FV from SPI flash after boot to verify contents
> written.
> > > 6. Dump UEFI variable statistics with VariableInfo at shell.
> > > 7. Boot with emulated variables enabled.
> > > 8. Cycles of adding and deleting a UEFI variable to verify cache results.
> > > 9. Set OsIndications to stop at FW UI to verify cache load of non-volatile
> > >    contents.
> > >
> > > Why Keep SMM on Variable Writes
> > > --------------------------------
> > >  * SMM provides a ubiquitous isolated execution environment in x86 for
> > >    authenticated UEFI variables.
> > >  * BIOS region SPI flash write restrictions to SMM in platforms today can
> > >    be retained.
> > >
> > > Today's UEFI Variable Cache (for reference)
> > > --------------------------------------------
> > >  * Maintained in SMRAM via VariableSmm.
> > >  * A "write-through" cache of variable data in the form of a UEFI variable
> > >    store.
> > >  * Non-volatile and volatile variables are maintained in separate buffers
> > >   (variable stores).
> > >
> > > Runtime & SMM Cache Coherency
> > > ------------------------------
> > > The non-volatile cache should always accurately reflect non-volatile
> > > storage contents (done today) and the "SMM cache" and "Runtime
> cache"
> > > should always be coherent on access. The runtime cache is updated by
> > VariableSmm.
> > >
> > > Updating both caches from within a SMM SetVariable () operation is
> > > fairly straightforward but a race condition can occur if an SMI
> > > occurs during the execution of runtime code reading from the runtime
> cache.
> > > To handle this case, a runtime cache read lock is introduced that
> > > explicitly moves pending updates from SMM to the runtime cache if an
> > > SMM update occurs while the runtime cache is locked. Note that it is
> > > not expected a Runtime services call will interrupt SMM processing
> > > since all
> > CPU cores rendezvous in SMM.
> > >
> > > New Key Elements for Coherence
> > > -------------------------------
> > > Runtime DXE (VariableSmmRuntimeDxe)
> > >  1. RuntimeCacheReadLock - A global lock used to lock read access to the
> > >                            runtime cache.
> > >  2. RuntimeCachePendingUpdate - A global flag used to notify runtime
> > > code
> > of a
> > >                                 pending cache update in SMM.
> > >
> > > SMM (VariableSmm)
> > >  1. FlushRuntimeCachePendingUpdate SMI - A SW SMI handler that
> > synchronizes
> > >                                          the runtime cache buffer with the SMM
> > >                                          cache buffer.
> > >
> > > Proposed Runtime DXE Read Flow
> > > -------------------------------
> > >  1. Wait for RuntimeCacheReadLock to be free  2. Acquire
> > > RuntimeCacheReadLock  3. If RuntimeCachePendingUpdate flag (rare) is
> > > set then:
> > >      3.a. Trigger FlushRuntimeCachePendingUpdate SMI
> > >      3.b. Verify RuntimeCachePendingUpdate flag is cleared  4.
> > > Perform read from RuntimeCache  5. Release RuntimeCacheReadLock
> > >
> > > Proposed FlushRuntimeCachePendingUpdate SMI
> > > --------------------------------------------
> > >  1. If RuntimeCachePendingUpdate flag is not set:
> > >      1.a. Return
> > >  2. Copy the data at RuntimeCachePendingOffset of
> > RuntimeCachePendingLength to
> > >     RuntimeCache
> > >  3. Clear the RuntimeCachePendingUpdate flag
> > >
> > > Proposed SMM Write Flow
> > > ------------------------
> > >  1. Perform variable authentication and non-volatile write. If either fail,
> > >     return an error to the caller.
> > >  2. If RuntimeCacheReadLock is set then:
> > >      2.a. Set RuntimeCachePendingUpdate flag
> > >      2.b. Update RuntimeCachePendingOffset and
> > RuntimeCachePendingLength to
> > >           cover the a superset of the pending chunk (for simplicity, the
> > >           entire variable store is currently synchronized).
> > > 3. Else:
> > >      3.a. Update RuntimeCache
> > > 4. Update SmmCache
> > >      - Note: RT read cannot occur during SMI processing since all cores are
> > >              locked in SMM.
> > >
> > > Cc: Dandan Bi <dandan.bi at intel.com>
> > > Cc: Ard Biesheuvel <ard.biesheuvel at linaro.org>
> > > Cc: Eric Dong <eric.dong at intel.com>
> > > Cc: Laszlo Ersek <lersek at redhat.com>
> > > Cc: Liming Gao <liming.gao at intel.com>
> > > Cc: Michael D Kinney <michael.d.kinney at intel.com>
> > > Cc: Ray Ni <ray.ni at intel.com>
> > > Cc: Jian J Wang <jian.j.wang at intel.com>
> > > Cc: Hao A Wu <hao.a.wu at intel.com>
> > > Cc: Jiewen Yao <jiewen.yao at intel.com>
> > > Signed-off-by: Michael Kubacki <michael.a.kubacki at intel.com>
> > >
> > > Michael Kubacki (5):
> > >   MdeModulePkg/Variable: Consolidate common parsing functions
> > >   MdeModulePkg VariableInfo: Always consider RT DXE and SMM stats
> > >   MdeModulePkg/Variable: Add RT GetVariable() cache support
> > >   MdeModulePkg/Variable: Add RT GetNextVariableName() cache
> support
> > >   MdeModulePkg/VariableSmm: Remove unused SMI handler functions
> > >
> > >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf
> > |   6 +
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.inf
> |
> > 6 +
> > >
> >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.i
> > nf
> > > |  32 +-
> >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableStandaloneMm.inf
> > |  11 +
> > >  MdeModulePkg/Include/Guid/SmmVariableCommon.h                        |  33
> +-
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.h                |
> 158
> > +---
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableNonVolatile.h
> > |  25 +
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableParsing.h
> |
> > 342 ++++++++
> > >
> > MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeCache.h
> > |  47 ++
> > >  MdeModulePkg/Application/VariableInfo/VariableInfo.c                 |  37 +-
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.c                | 828
> > ++------------------
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableExLib.c           |
> > 11 +-
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableNonVolatile.c
> > |  28 +
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableParsing.c         |
> > 816 +++++++++++++++++++
> > >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeCache.c
> > | 153 ++++
> > >  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.c             |
> > 213 +++--
> > >
> >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.
> > c   | 726 +++++++++++++----
> > >  17 files changed, 2298 insertions(+), 1174 deletions(-)  create
> > > mode
> > > 100644
> > > MdeModulePkg/Universal/Variable/RuntimeDxe/VariableNonVolatile.h
> > >  create mode 100644
> > > MdeModulePkg/Universal/Variable/RuntimeDxe/VariableParsing.h
> > >  create mode 100644
> > >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeCache.h
> > >  create mode 100644
> > > MdeModulePkg/Universal/Variable/RuntimeDxe/VariableNonVolatile.c
> > >  create mode 100644
> > > MdeModulePkg/Universal/Variable/RuntimeDxe/VariableParsing.c
> > >  create mode 100644
> > >
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeCache.c
> > >


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48141): https://edk2.groups.io/g/devel/message/48141
Mute This Topic: https://groups.io/mt/34295308/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-





More information about the edk2-devel-archive mailing list