[edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering

Laszlo Ersek lersek at redhat.com
Mon Oct 19 18:44:29 UTC 2020


On 10/15/20 19:03, Bret Barkelew via groups.io wrote:
> I looked over it and I think that DPC solves the mirror image to the problem we’re trying to solve (and behaves in a fashion very similar to DelayedDispatch for PEI).
> 
> However, while studying it, I thought of a proposal that may be a balance between the complexity of Laszlo’s and the current “fixed” ordering.
> 
> What if we register (and encourage anyone who needs it to register) the MorLock callback on both EndOfDxe AND a new VarPolReadyToLock event. VarPolReadyToLock is signaled in the EndOfDxe callback which locks the VarPolicy. In the MorLock callback, we close both even registrations whenever the first one completes. In this pattern, we may have to register the MorLock as TPL_NOTIFY for the VarPolReadyToLock callback.

Right; unconditionally deferring the policy locking (rather than
conditionally, per my proposal) does the same job, and it's simpler to code.

You shouldn't even need to register MorLock for VarPolReadyToLock:
- At EndOfDxe, lock the MorLock variable and signal VarPolReadyToLock
(should be queued at TPL_CALLBACK),
- in the VarPolReadyToLock callback, only lock the policy.

AIUI, this is basically a special case / simplification of my proposal,
with the "delay the locking" protocol instance always *assumed* present
at EndOfDxe (and therefore never actually *produced* in the protocol
database).

Thanks
Laszlo


> 
> Or, we could formalize the registration with an official protocol interface and even pass a pointer to the VarPol interface in the callback when invoked.
> 
> Upon further review, I could also be open to Laszlo’s approach if the whole design could be reduced to a library for reuse. It’s not something I would want to be copying and pasting around the code.
> 
> - Bret
> 
> From: Bret Barkelew via groups.io<mailto:bret.barkelew=microsoft.com at groups.io>
> Sent: Tuesday, October 13, 2020 11:37 AM
> To: Kinney, Michael D<mailto:michael.d.kinney at intel.com>; devel at edk2.groups.io<mailto:devel at edk2.groups.io>; Andrew Fish<mailto:afish at apple.com>
> Subject: [EXTERNAL] Re: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> I’ll try to take a look at that today. Thanks!
> 
> - Bret
> 
> From: Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Sent: Tuesday, October 13, 2020 11:22 AM
> To: Bret Barkelew<mailto:Bret.Barkelew at microsoft.com>; devel at edk2.groups.io<mailto:devel at edk2.groups.io>; Andrew Fish<mailto:afish at apple.com>; Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Subject: [EXTERNAL] RE: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> Hi Bret,
> 
> The concept of making the platform lock point configurable seems like a good idea from a platform porting perspective.  If we had that feature would that make this better?
> 
> Have you studied the features DPC provides?  It is functionally similar to having NOTIFY-1 and CALLBACK-1 TPL levels.
> 
> https://github.com/tianocore/edk2/blob/master/NetworkPkg/DpcDxe/Dpc.c<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftianocore%2Fedk2%2Fblob%2Fmaster%2FNetworkPkg%2FDpcDxe%2FDpc.c&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C22b77b20547945404a0208d86fa707f6%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637382110480463333%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=CZHMkcqc81%2BnNovzWaS3uiUiv5i7HVWUDt3dc88%2Fkbw%3D&reserved=0>
> 
> Thanks,
> 
> Mike
> 
> 
> From: Bret Barkelew <Bret.Barkelew at microsoft.com>
> Sent: Tuesday, October 13, 2020 10:57 AM
> To: devel at edk2.groups.io; Kinney, Michael D <michael.d.kinney at intel.com>; Andrew Fish <afish at apple.com>
> Subject: RE: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> The problem with the other solution is that it doesn’t give the platform any flexibility to move which event the lock is attached to. For EDK2 default, we’re making it EndOfDxe, because that matches most platforms’ threat models. For Mu, we use ReadyToBoot.
> 
> The current “hack” makes that more difficult without multiple patches that we have to carry to remove the “fixed” ordering of the events.
> 
> That said, I do think I’ll clean that up and submit it, unless Sean reads this and stops me. 😉
> 
> - Bret
> 
> From: Michael D Kinney via groups.io<mailto:michael.d.kinney=intel.com at groups.io>
> Sent: Tuesday, October 13, 2020 8:05 AM
> To: Bret Barkelew<mailto:Bret.Barkelew at microsoft.com>; Andrew Fish<mailto:afish at apple.com>; edk2-devel-groups-io<mailto:devel at edk2.groups.io>; Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Subject: [EXTERNAL] Re: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> Hi Bret,
> 
> If the other solution more directly expresses the dependency and guarantees ordering, then it may not be considered a hack.
> 
> I prefer solutions that minimize the chances of failures and the need for platform developers to do extra work to integrate a feature or do extra work when they add an unrelated feature.
> 
> I see Laszlo’s proposal to add a new tag GUID protocol to detect and enforce ordering for this specific use case.
> 
> The network stack added a DPC (Deferred Procedure Call) Protocol to help with ordering issues and remove the introduction of extra TPL levels in an earlier version of the network stack.  I am wondering of there is a more generic solution to this specific problem through the use of DPC.
> 
> Mike
> 
> From: Bret Barkelew <Bret.Barkelew at microsoft.com<mailto:Bret.Barkelew at microsoft.com>>
> Sent: Monday, October 12, 2020 3:14 PM
> To: Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>; Andrew Fish <afish at apple.com<mailto:afish at apple.com>>; edk2-devel-groups-io <devel at edk2.groups.io<mailto:devel at edk2.groups.io>>
> Subject: RE: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> Like I said, I’m also happy to go with the lesser solution of replacing the hack that was already in the code. The last person didn’t care to solve this problem, and I’m good to not solve it, too. I mean, I think it’s turtles all the way down no matter what.
> 
> It was actually the ASSERT in the code that highlighted this problem to being with, so I would say that it’s doing its job. It’s incumbent upon the code author to determine what resource they’re trying to access and whether they’ve accessed it successfully, and I agree that it seems like an appropriate use of ASSERTs so long as it’s backed up with some OTHER appropriate action (even if that action is ignoring it).
> 
> - Bret
> 
> From: Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Sent: Monday, October 12, 2020 9:44 AM
> To: Bret Barkelew<mailto:Bret.Barkelew at microsoft.com>; Andrew Fish<mailto:afish at apple.com>; edk2-devel-groups-io<mailto:devel at edk2.groups.io>; Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Subject: [EXTERNAL] RE: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> Bret,
> 
> How to platform creators know for the complete set of drivers if there is anything then need to worry about and why and what they need to address the concern?  This is about order that events are signaled for a given event trigger.  When a platform adds more driver that may use the same event triggers, how do they know if there is a potential for a race condition or not?  If event notification functions are design to be independent of signaling order, then there is no issue.  As soon as there are requirements for event notification functions to be executed in a specific order at a specific event trigger, we have to make sure the platform creator knows and preferably, the FW can tell them if they got it wrong.
> 
> Can your data/device manipulators and data/device protectors use case generate an ASSERT() if they are signaled in the wrong order?
> 
> Mike
> 
> From: Bret Barkelew <Bret.Barkelew at microsoft.com<mailto:Bret.Barkelew at microsoft.com>>
> Sent: Wednesday, October 7, 2020 9:39 AM
> To: Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>; Andrew Fish <afish at apple.com<mailto:afish at apple.com>>; edk2-devel-groups-io <devel at edk2.groups.io<mailto:devel at edk2.groups.io>>
> Subject: RE: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> Agreed with your concern, Mike. This mechanism (and we can document it as such) should NOT be used to accomplish an explicit ordering (a la the “apriori list”). It’s just to provide a little separation for two patterns that we’ve seen time and again in our code: data/device manipulators and data/device protectors. It does not eliminate the necessity for platform creators to understand things like driver ordering if they have one driver that requires a protocol be installed or a bus connected.
> 
> - Bret
> 
> From: Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Sent: Wednesday, October 7, 2020 9:21 AM
> To: Andrew Fish<mailto:afish at apple.com>; edk2-devel-groups-io<mailto:devel at edk2.groups.io>; Kinney, Michael D<mailto:michael.d.kinney at intel.com>
> Cc: Bret Barkelew<mailto:Bret.Barkelew at microsoft.com>
> Subject: [EXTERNAL] RE: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> Hi Andrew,
> 
> I agree DXE drivers could use a PCD to make it configurable and prevent collisions with UEFI defined TPL levels.
> 
> Bret’s suggestion of adding a DXE scoped services to create events using non-UEFI defined TPL levels could be used with these TPL levels from PCDs.  Would also allow DXE drivers to use TPL levels associated with the firmware interrupts in the range 17..30.  Perhaps extensions to the DXE Services Table?
> 
> Still does not address my concern that many DXE drivers using these extra TPL levels may run into race conditions if more than one DXE driver selects the same TPL level.  Platform integrators will need to understand the relative priorities of all DXE drivers that use extra TPL levels so they can assign values that both avoid collisions with future UEFI specs and prevent race conditions among DXE drivers.
> 
> Mike
> 
> From: Andrew Fish <afish at apple.com<mailto:afish at apple.com>>
> Sent: Tuesday, October 6, 2020 7:18 PM
> To: edk2-devel-groups-io <devel at edk2.groups.io<mailto:devel at edk2.groups.io>>; Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>
> Cc: bret.barkelew at microsoft.com<mailto:bret.barkelew at microsoft.com>
> Subject: Re: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> Mike,
> 
> When I’ve done things like this in the past I think of making them configurable like via a PCD.
> 
> In terms of the #defines I think it makes more sense to just do math on the spec defined values. So TPL_CALLBACK + 1 or TPL_CALLBACK - 1 etc.  I’ve got an lldb type formatter for TPL and it prints out <UEFI Spec TPL> [+ <extra>] as I think this is the clearest way to do it.
> 
> Thanks,
> 
> Andrew Fish
> 
> On Oct 6, 2020, at 6:54 PM, Michael D Kinney <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>> wrote:
> 
> Bret,
> 
> It is likely best to go with the first approach.  The discussion on TPL levels can continue and you could adopt it in the future if a general solution is identified.
> 
> TPL 17..30 are reserved by the UEFI Spec for firmware interrupts.  So TPL_NOTIFY_HIGH as defined would not be allowed.
> 
> I agree that the use of TPL values other than those defined by the UEFI Spec can potentially be used by DXE.  However, that DXE usage must be flexible enough to handle a future extension to the UEFI Spec for new TPL levels without a collision.
> 
> Instead of defining specific TPL values, you could add a DXE scoped service to allocate the use of a new TPL level that is not being used by UEFI or other DXE drivers.  I will point out that these approaches (defining new TPL levels or allocating unused TPL levels) just moves the same problem.  You can solve it for the first driver that needs something special.  As soon as there is more than one driver that need something special at the same TPL level, the potential for a race condition for ordering will show up again.  So I do not consider adding TPL levels to be a good general solution to this problem.
> 
> Best regards,
> 
> Mike
> 
> From: devel at edk2.groups.io<mailto:devel at edk2.groups.io> <devel at edk2.groups.io<mailto:devel at edk2.groups.io>> On Behalf Of Bret Barkelew via groups.io<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgroups.io%2F&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C22b77b20547945404a0208d86fa707f6%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637382110480473324%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=K5uUtMvoqgI3j61HWeYnmLXre3%2FcTnfIoN0rnivaRVI%3D&reserved=0>
> Sent: Tuesday, October 6, 2020 5:24 PM
> To: devel at edk2.groups.io<mailto:devel at edk2.groups.io>
> Subject: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
> 
> As many will be aware, I’m in the final stages of having Variable Policy ready for commit. However, after moving the “Lock” event back to EndOfDxe, this exposed a race condition in variable services.
> 
> A quick synopsis of the problem:
> 
>   *   Previously, MorLock abused a privileged position by being tightly coupled to Variable Services, and its lock callback was called directly so that it could be strongly ordered with the internal property lock that disables future RequestToLock calls.
>   *   VariablePolicy attempted to decouple this (without realizing it was a problem) and go through formalized interfaces that could ultimately be broken out of Variable Services altogether.
>   *   However, doing so triggered the race condition, causing an ASSERT when MorLock attempted to lock its variables.
>   *   I current have a reimplementation of the strong ordering workaround that can be previewed at the link below. I have tested that it works the same as the OLD system.
> 
>      *   Take a stab at solving the lock ordering problem · corthon/edk2 at e7d0164 (github.com)<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcorthon%2Fedk2%2Fcommit%2Fe7d0164c8263b1fbfb8b4e289851fbedaa8997f1&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C22b77b20547945404a0208d86fa707f6%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637382110480483324%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=WZeCatbRVhpZZWx%2FuACtpUes00OjXWFEoQZsmSDXNAc%3D&reserved=0>
> 
> However, replacing one bad design with another is not what this community is about (when we can help it), so we’d like to take a moment to revisit a conversation that has come up before: expanding the number of supported TPL levels.
> 
> Now, I know that the current TPL levels are defined in the UEFI spec and we don’t want to have to change those, but there’s no reason that we can come up with not to add some more granularity in the PI spec, dedicated to platform and implementation ordering (the UEFI spec events will have to remain on UEFI spec TPLs). Basically there would be a set of DXE Services that allow WaitForEvent, CheckEvent, Event registration at TPLs other than notify/callback.  The UEFI system table versions of the functions would still have this restriction but code built with the platform could use the DXE Services. Right now, any attempt to use a non-UEFI TPL will ASSERT, so we can keep that in place on the SystemTable interface, but allow the platform to go around it with DXE Services. Similar functionality would have to be provided by the Mmst, but that’s already platform-specific and can probably allow it in all cases.
> 
> We’re suggesting something like the below:
> 
> //
> // Task priority level
> //
> #define TPL_APPLICATION       4
> #define TPL_CALLBACK_LOW      7
> #define TPL_CALLBACK          8
> #define TPL_CALLBACK_HIGH     9
> #define TPL_NOTIFY_LOW        15
> #define TPL_NOTIFY            16
> #define TPL_NOTIFY_HIGH       17
> #define TPL_HIGH_LEVEL        31
> 
> There’s already a long-in-the-tooth bug tracking a similar issue:
> https://bugzilla.tianocore.org/show_bug.cgi?id=1676<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.tianocore.org%2Fshow_bug.cgi%3Fid%3D1676&data=04%7C01%7Cbret.barkelew%40microsoft.com%7C22b77b20547945404a0208d86fa707f6%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637382110480483324%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=746gH1KiXtZNbqmvfS3ZDfIa1m7q67dBfv%2F345F%2FH6c%3D&reserved=0>
> 
> This proposal is simpler than what’s in that bug, and would greatly simplify some of our event ordering (and code).
> 
> Thoughts?
> 
> If this conversation takes too long, I will publish a set of patches that just go with the lesser solution posted above, but I’d much rather work the problem this way.
> 
> - Bret
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#66396): https://edk2.groups.io/g/devel/message/66396
Mute This Topic: https://groups.io/mt/77353474/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-





More information about the edk2-devel-archive mailing list