[edk2-devel] [RFC] Incorrect memory ordering in ReleaseSpinLock()

Bin, Sung-Uk (Bin) via groups.io sunguk-bin=hp.com at groups.io
Thu Jan 7 00:02:56 UTC 2021


According to Ard's explanation, it seems that we do not need to worry about multi-core issues. However, this assumes that spinlocks are not shared with DMA masters.

Maugan, Sean,
If you have any other comments or concerns about this, please leave a comment.

Following Ard's suggestion, we may improve the code as below if needed.

SPIN_LOCK *
EFIAPI
ReleaseSpinLock (
  IN OUT  SPIN_LOCK                 *SpinLock
  )
{
  SPIN_LOCK    LockValue;

  ASSERT (SpinLock != NULL);

  LockValue = *SpinLock;
  ASSERT (SPIN_LOCK_ACQUIRED == LockValue || SPIN_LOCK_RELEASED == LockValue);

  InterlockedCompareExchangePointer (
    (VOID**)SpinLock,
    (VOID*)SPIN_LOCK_ACQUIRED,
    (VOID*)SPIN_LOCK_RELEASED
    );

  return SpinLock;
}
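
For illustration only, here is a minimal sketch of the same idea in portable C11 atomics rather than the BaseLib primitives. It is not the edk2 implementation. All sketch_* names and the SKETCH_LOCK_* constants are hypothetical stand-ins. The release side uses a compare-exchange with release ordering and reports whether the lock was actually held, so a caller can assert on the result and catch a double release.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for SPIN_LOCK_RELEASED / SPIN_LOCK_ACQUIRED. */
#define SKETCH_LOCK_RELEASED ((uintptr_t)1)
#define SKETCH_LOCK_ACQUIRED ((uintptr_t)2)

typedef _Atomic uintptr_t sketch_spin_lock;

/* Try once to take the lock; returns true on success. Acquire ordering
   keeps the critical section from being reordered before the lock is taken. */
static bool sketch_try_acquire(sketch_spin_lock *lock)
{
    uintptr_t expected = SKETCH_LOCK_RELEASED;
    return atomic_compare_exchange_strong_explicit(
        lock, &expected, SKETCH_LOCK_ACQUIRED,
        memory_order_acquire, memory_order_relaxed);
}

/* Release through compare-exchange. Release ordering makes the stores of
   the critical section visible before the lock is observed as free.
   Returns false if the lock was not held (for example, a double release). */
static bool sketch_release(sketch_spin_lock *lock)
{
    uintptr_t expected = SKETCH_LOCK_ACQUIRED;
    return atomic_compare_exchange_strong_explicit(
        lock, &expected, SKETCH_LOCK_RELEASED,
        memory_order_release, memory_order_relaxed);
}

/* The "a = *b; a = a + 1; *b = a" scenario from the thread, under the lock.
   Returns false if the lock could not be taken on the first attempt. */
static bool sketch_locked_increment(sketch_spin_lock *lock, int *b)
{
    if (!sketch_try_acquire(lock)) {
        return false;
    }
    int a = *b;
    a = a + 1;
    *b = a;
    bool was_held = sketch_release(lock);
    assert(was_held);   /* diagnose a release of a lock that was not held */
    (void)was_held;
    return true;
}
```

With this shape, releasing an already released lock returns false instead of silently succeeding, which is the kind of check an ASSERT() on the compare-exchange output would provide.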

--Bin

From: Ard Biesheuvel <ard.biesheuvel at arm.com>
Sent: Wednesday, January 6, 2021 10:28 PM
To: Bin, Sung-Uk (Bin) <sunguk-bin at hp.com>; devel at edk2.groups.io
Cc: gaoliming at byosoft.com.cn; Villatel, Maugan <maugan.villatel at hp.com>; Collison, Sean <scollison at hp.com>
Subject: Re: [RFC] Incorrect memory ordering in ReleaseSpinLock()

On 1/6/21 12:29 PM, Bin, Sung-Uk (Bin) wrote:
> Dear Ard and maintainers,
>
> We are concerned that ReleaseSpinLock() does not have a memory barrier.
> This is reported at https://bugzilla.tianocore.org/show_bug.cgi?id=3005.
> We'd like to hear from you whether the current implementation needs
> improvement.
>

I think you are correct that the current implementation is insufficient.
However, I would prefer for someone to do a comprehensive audit of all
the locking primitives for concurrency problems.


>
> The concern comes from weak memory ordering on multi-core systems (we are
> using AARCH64). The scenario we are concerned about is as follows:
>

When does UEFI run multi-core on an AArch64 system? The UEFI spec does
not permit SMP at boot time, and at runtime, the runtime services are
not reentrant, in which case we should be able to rely on barriers in
the OS's critical section code to ensure visibility when several cores
compete for the UEFI runtime services from the OS.


>
> AcquireSpinLock(); // contains 'dmb sy' and prevents "a = *b" from
>                    // moving up (and unnecessarily prevents other things
>                    // from moving down)
> a = *b;
> a = a + 1;
> *b = a;
> ReleaseSpinLock(); // No write barrier here, so "*b = a" can move down.
>                    // Another core acquires the spinlock and can read
>                    // stale data.
>
> Please let me know if it would be helpful to add MemoryFence like below:
>

For symmetry, I'd prefer it if we could simply implement the release
side in terms of InterlockedCompareExchangePointer(), and ASSERT() on
the output.

*However*, looking at the current code, there seems to be something
seriously wrong: ReleaseSpinLock() has

ASSERT (SPIN_LOCK_ACQUIRED == LockValue || SPIN_LOCK_RELEASED == LockValue);


which means you can release a released spinlock even on a DEBUG build
without a diagnostic being printed - that seems like a bug to me.

>
> SPIN_LOCK *
> EFIAPI
> ReleaseSpinLock (
>   IN OUT  SPIN_LOCK                 *SpinLock
>   )
> {
>   SPIN_LOCK    LockValue;
>
>   ASSERT (SpinLock != NULL);
>
>   MemoryFence ();  // proposed addition
>
>   LockValue = *SpinLock;
>   ASSERT (SPIN_LOCK_ACQUIRED == LockValue || SPIN_LOCK_RELEASED == LockValue);
>
>   *SpinLock = SPIN_LOCK_RELEASED;
>   return SpinLock;
> }
>
> MemoryFence is implemented with 'dmb', but I wonder whether it is okay
> not to implement it with 'dsb'.
>

DSB is for cache and TLB maintenance, not for memory ordering. DMB
should be sufficient here. In fact, we don't need a system-wide DMB
here; an inner shareable DMB should be sufficient (given that we don't
share spinlocks with DMA masters).
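
As a rough sketch of what an inner shareable barrier on the release path could look like (this is not the edk2 MemoryFence() implementation, and every name below is hypothetical), the helper issues "dmb ish" when built for AArch64 with a GCC or Clang style compiler and falls back to a C11 fence elsewhere so the example stays buildable on other targets:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for SPIN_LOCK_RELEASED / SPIN_LOCK_ACQUIRED. */
#define FENCE_SKETCH_RELEASED ((uintptr_t)1)
#define FENCE_SKETCH_ACQUIRED ((uintptr_t)2)

/* Barrier scoped for CPU-to-CPU ordering. Assumes GNU-style inline asm
   on AArch64; other targets use a portable C11 fence instead. */
static inline void sketch_smp_fence(void)
{
#if defined(__aarch64__)
    /* DMB ISH: data memory barrier limited to the inner shareable domain. */
    __asm__ volatile("dmb ish" ::: "memory");
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Release path with an explicit fence before the unlocking store, so that
   earlier loads and stores are ordered before the lock is seen as free. */
static void sketch_release_with_fence(_Atomic uintptr_t *lock)
{
    sketch_smp_fence();
    atomic_store_explicit(lock, FENCE_SKETCH_RELEASED, memory_order_relaxed);
}
```

The inner shareable scope only orders accesses among the CPUs in that domain, which is why it relies on the stated assumption that spinlocks are not shared with DMA masters.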




>
> Attaching the Linux documentation describing SMP barrier pairing:
>
> https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt
>
> SMP BARRIER PAIRING
> -------------------
>
> When dealing with CPU-CPU interactions, certain types of memory barrier should
> always be paired.  A lack of appropriate pairing is almost certainly an error.
>
> General barriers pair with each other, though they also pair with most
> other types of barriers, albeit without multicopy atomicity.  An acquire
> barrier pairs with a release barrier, but both may also pair with other
> barriers, including of course general barriers.  A write barrier pairs
> with a data dependency barrier, a control dependency, an acquire barrier,
> a release barrier, a read barrier, or a general barrier.  Similarly a
> read barrier, control dependency, or a data dependency barrier pairs
> with a write barrier, an acquire barrier, a release barrier, or a
> general barrier:
>
>        CPU 1                 CPU 2
>        ===============       ===============
>        WRITE_ONCE(a, 1);
>        <write barrier>
>        WRITE_ONCE(b, 2);     x = READ_ONCE(b);
>                              <read barrier>
>                              y = READ_ONCE(a);
>
> Or:
>
>        CPU 1                 CPU 2
>        ===============       ===============================
>        a = 1;
>        <write barrier>
>        WRITE_ONCE(b, &a);    x = READ_ONCE(b);
>                              <data dependency barrier>
>                              y = *x;
>
> Or even:
>
>        CPU 1                 CPU 2
>        ===============       ===============================
>        r1 = READ_ONCE(y);
>        <general barrier>
>        WRITE_ONCE(x, 1);     if (r2 = READ_ONCE(x)) {
>                                 <implicit control dependency>
>                                 WRITE_ONCE(y, 1);
>                              }
>
>        assert(r1 == 0 || r2 == 0);
>
> Basically, the read barrier always has to be there, even though it can be of
> the "weaker" type.
>
> [!] Note that the stores before the write barrier would normally be expected to
> match the loads after the read barrier or the data dependency barrier, and vice
> versa:
>
>        CPU 1                               CPU 2
>        ===================                 ===================
>        WRITE_ONCE(a, 1);    }----   --->{  v = READ_ONCE(c);
>        WRITE_ONCE(b, 2);    }    \ /    {  w = READ_ONCE(d);
>        <write barrier>            \        <read barrier>
>        WRITE_ONCE(c, 3);    }    / \    {  x = READ_ONCE(a);
>        WRITE_ONCE(d, 4);    }----   --->{  y = READ_ONCE(b);
>
> Thanks,
> Bin
>
> From: bugzilla-daemon at bugzilla.tianocore.org
> Sent: Wednesday, November 4, 2020 10:44 AM
> To: Bin, Sung-Uk (Bin) <sunguk-bin at hp.com>
> Subject: [Bug 3005] ReleaseSpinLock() requires a barrier at the beginning
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=3005
>
> gaoliming at byosoft.com.cn changed:
>
> What           |Removed                    |Added
> ----------------------------------------------------------------------------
> Priority       |Lowest                     |Normal
> Status         |UNCONFIRMED                |CONFIRMED
> CC             |                           |leif at nuviainc.com
> Assignee       |unassigned at tianocore.org |ard.biesheuvel at arm.com
> Ever confirmed |0                          |1
>
> --- Comment #5 from gaoliming at byosoft.com.cn ---
> Ard: can you help check it? This issue in AARCH64.
>
> --
> You are receiving this mail because:
> You reported the bug.
>



