[EXTERNAL] Re: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
Bret Barkelew via groups.io
bret.barkelew=microsoft.com at groups.io
Wed Oct 7 17:48:18 UTC 2020
Laszlo,
Thanks! We did see your proposal and discussed it internal, but concluded it was too complex a pattern to replicate throughout the code and would rather settle on a consistent solution. We’d rather keep pursuing this TPL discussion and it sounds like there’s some appetite outside of our team, and a possible path forward with PCDs.
I do really appreciate the detailed proposal and write-up, though!
Mike,
Regarding your PCD approach, is that something I could help you write up a PoC for?
- Bret
From: Laszlo Ersek<mailto:lersek at redhat.com>
Sent: Wednesday, October 7, 2020 6:05 AM
To: Bret Barkelew<mailto:Bret.Barkelew at microsoft.com>
Cc: devel at edk2.groups.io<mailto:devel at edk2.groups.io>; Kinney, Michael D<mailto:michael.d.kinney at intel.com>; Andrew Fish<mailto:afish at apple.com>
Subject: [EXTERNAL] Re: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
Hi Bret,
On 10/07/20 02:23, Bret Barkelew via groups.io wrote:
> As many will be aware, Im in the final stages of having Variable
> Policy ready for commit. However, after moving the
Lock event
> back to EndOfDxe, this exposed a race condition in variable services.
>
> A quick synopsis of the problem:
>
> * Previously, MorLock abused a privileged position by being
> tightly coupled to Variable Services, and its lock callback was
> called directly so that it could be strongly ordered with the
> internal property lock that disables future RequestToLock calls.
>
> * VariablePolicy attempted to decouple this (without realizing it
> was a problem) and go through formalized interfaces that could
> ultimately be broken out of Variable Services altogether.
>
> * However, doing so triggered the race condition, causing an
> ASSERT when MorLock attempted to lock its variables.
>
> * I current have a reimplementation of the strong ordering
> workaround that can be previewed at the link below. I have
> tested that it works the same as the OLD system.
>
> * Take a stab at solving the lock ordering problem ·
> corthon/edk2 at e7d0164
> (github.com)<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcorthon%2Fedk2%2Fcommit%2Fe7d0164c8263b1fbfb8b4e289851fbedaa8997f1&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C3ab943e1b943407df03308d86ac1a467%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637376727204151506%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=oslzgrrWjmtYotlmSnTgiB74JVogjttS3Rvmn0s7%2BGY%3D&reserved=0>
>
> However, replacing one bad design with another is not what this
> community is about (when we can help it), so wed like to take a
> moment to revisit a conversation that has come up before: expanding
> the number of supported TPL levels.
Have you considered my suggestion at
<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fedk2.groups.io%2Fg%2Fdevel%2Fmessage%2F65576&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C3ab943e1b943407df03308d86ac1a467%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637376727204151506%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=%2Fhgd6pDv69ME2QFzqECVwU6utTzVbqXaVrWs2DOZJTw%3D&reserved=0> (alternate link:
<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmid.mail-archive.com%2F645bb333-c3f4-0acd-f0e8-b783149bec18%40redhat.com&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C3ab943e1b943407df03308d86ac1a467%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637376727204151506%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=RuymdfGtLkMcp%2FGq31VFRg9IePLTlrB4ZC3AAzMlUfo%3D&reserved=0>)?
The idea needs no services or guarantees other than what are already
specified in UEFI.
Originally, I attempted to write up my proposal in generic terms. But
here it is again, with the specifics substituted:
(1) The driver that provides the Variable Policy Engine -- that is, the
Variable Services Driver, AIUI -- defines (standardizes) a new,
well-known protocol GUID.
If the protocol is present in the protocol database, that means that
LockVariablePolicy() must not be performed by the Variable Services
Driver in its EndOfDxe handler.
Let's call this protocol
EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL.
There is no need for an actual protocol structure; it can be a null
protocol.
(2) The driver that intends to lock down the
"MemoryOverwriteRequestControlLock" variable at EndOfDxe, through
the Variable Policy Engine, produces an instance of
EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL, in its entry point
function.
(Note that this "MorLock owner" driver may be different from, or
identical to, the driver that implements the Variable Policy Engine
itself, i.e., the Variable Services Driver.)
The protocol instance can be the sole protocol on a new handle, or
it can be installed on a long-term handle (such as "gImageHandle" of
the MorLock owner).
(3) In its EndOfDxe handler, the MorLock owner uninstalls
EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL (potentially destroying
the handle as well).
This uninstallation is safe because the highest TPL that said
EndOfDxe handler may run at is TPL_NOTIFY, and uninstalling
protocols is safe at <=TPL_NOTIFY. Memory allocation services are
also explicitly safe at <=TPL_NOTIFY.
(4) The Variable Services Driver factors the LockVariablePolicy()
invocation out of its EndOfDxe handler to a new function.
The new function -- consisting of locking the variable policy only
-- has the function signature of a typical UEFI event notification
function.
Let's call this function LockVariablePolicyWhenSignaled().
(5) In its entry point function, the Variable Services Driver creates a
new event (outside of any event group), for which
LockVariablePolicyWhenSignaled() is the notification function. The
TPL is TPL_CALLBACK. Let's call the event LockVariablePolicyEvent.
(6) In the Variable Services Driver's EndOfDxe handler, the original,
open-coded LockVariablePolicy() call is replaced with the following
logic:
- If EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL exists in the
protocol database (according to gBS->LocateProtocol()), then the
Variable Services Driver's EndOfDxe handler *signals*
LockVariablePolicyEvent.
(This is safe because gBS->LocateProtocol() is safe to call at up
to and including TPL_NOTIFY, and gBS->SignalEvent() is safe even
at TPL_HIGH_LEVEL.)
- Otherwise, the Variable Services Driver's EndOfDxe *calls*
LockVariablePolicyWhenSignaled() with a normal function call.
It's supposed to work like this:
- If the MorLock owner is not built into the firmware at all, it won't
install EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL. Thus, the
Variable Services Driver directly calls LockVariablePolicy() from its
original EndOfDxe handler. That is, no change in behavior.
- If the MorLock owner's EndOfDxe handler completes first (locking down
the MorLock variable through the variable policy, and then
uninstalling EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL), then the
Variable Services Driver's EndOfDxe handler will again not see
EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL. So goto previous point.
- Multiple drivers similar to the MorLock owner may coexist; multiple
EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL instances may coexist
(all NULL interfaces, but on different handles). If there is at least
one such protocol instance, gBS->LocateProtocol() in the Variable
Services Driver's EndOfDxe handler will find the first one, and
postpone the LockVariablePolicy() call -- it will signal
LockVariablePolicyEvent.
- When the Variable Services Driver's EndOfDxe handler signals
LockVariablePolicyEvent, LockVariablePolicyWhenSignaled() will be
enqueued at TPL_CALLBACK. Note that, given that the Variable Services
Driver's EndOfDxe handler is running at the moment, the EndOfDxe event
group has been signaled previously. And that means that *all* events
in the EndOfDxe event group have been signaled, and *all* their
notification functions have been enqueued too (in some unspecified
order, except for the specified dispatch ordering between TPL_NOTIFY
and TPL_CALLBACK).
So when LockVariablePolicyEvent is signaled,
LockVariablePolicyWhenSignaled() is enqueued *after* all the other --
already enqueued -- EndOfDxe notification functions, *including* the
EndOfDxe handler of the MorLock owner.
That's because
(a) LockVariablePolicyEvent uses TPL_CALLBACK, so the already pending
TPL_NOTIFY notifications will be dispatched before it of course,
and
(b) regarding notify functions enqueued previously at the same TPL
(=TPL_CALLBACK), functions in any given TPL queue are queued /
dispatched in FIFO order.
This means that LockVariablePolicyWhenSignaled() will be dispatched
*after* the MorLock owner's EndOfDxe handler, *regardless* of the
relative dispatch order between the Variable Services Driver's
EndOfDxe handler and the MorLock owner's EndOfDxe handler.
- By the time LockVariablePolicyWhenSignaled() runs, it could (and
should) ASSERT() that EDKII_KEEP_VARIABLE_POLICY_UNLOCKED_PROTOCOL
does not exist.
Thanks
Laszlo
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#65995): https://edk2.groups.io/g/devel/message/65995
Mute This Topic: https://groups.io/mt/77367547/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/edk2-devel-archive/attachments/20201007/9116c160/attachment.htm>
More information about the edk2-devel-archive
mailing list