[edk2-devel] [Patch V2] UefiCpuPkg/CpuExceptionHandlerLib: Fix #AC split lock

Fri Sep 13 22:24:11 UTC 2019

Hi Laszlo,
2. Yes, I can change commit message/comments to separate split lock and #AC.
3. Yes it’s a close platform that is enabling #AC which hits double fault because split lock inside CpuExceptionHandlerLib.

Code:	
I was wondering same thing, why are they using locking mechanism. I wasn’t sure so that’s why keep xchg instead of changing to mov.  I have yet to think of any reasons.  I think it's a good idea to use mov instead it accomplishes the same thing and easier to understand.

Thank you,
John

>-----Original Message-----
>From: Laszlo Ersek [mailto:lersek at redhat.com]
>Sent: Friday, September 13, 2019 9:32 AM
>To: devel at edk2.groups.io; Lofgren, John E <john.e.lofgren at intel.com>
>Cc: Ni, Ray <ray.ni at intel.com>; Gao, Liming <liming.gao at intel.com>; Dong,
>Eric <eric.dong at intel.com>
>Subject: Re: [edk2-devel] [Patch V2] UefiCpuPkg/CpuExceptionHandlerLib: Fix
>#AC split lock
>
>On 09/09/19 20:40, John E Lofgren wrote:
>> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2150
>>
>> V2 changes:
>> Add xchg 16 bit instructions to handle sgdt and sidt base
>> 63:48 bits and 47:32 bits.
>> Add comment to explain why xchg 64bit isnt being used
>>
>> Fix #AC split lock's caused by seperating base and limit from sgdt and
>> sidt by changing xchg operands to 32-bit to stop from crossing
>> cacheline.
>>
>> Signed-off-by: John E Lofgren <john.e.lofgren at intel.com>
>> ---
>>
>>
>UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nas
>m
>> | 20 ++++++++++++++------
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>
>The commit message (and the bug report) are very difficult to understand.
>
>This is the first time I hear about "split lock" and "alignment check exception",
>so please bear with me. This is my take on the problem.
>
>(1) "split lock" is explained well here:
>
>https://lwn.net/Articles/784864/
>
>In short, we can consider it a performance anti-pattern. A locking instruction
>(such as XCHG) is invoked on incorrectly aligned data, which causes
>performance degradation for the whole system. In some cases this can be a
>security issue even, because less privileged code can block (slow down) more
>privileged code running on a different CPU.
>
>(2) Alignment Check Exception is a way to detect the occurrence of "split
>lock". It must be configured explicitly, when the system software wishes to be
>informed explicitly about a split lock event.
>
>Therefore, the "#AC split lock" expression in the commit message is very
>confusing. Those are two different things. One is the problem ("split lock"),
>and the other is the (kind of) solution ("alignment check exception").
>
>We don't care about #AC (the exception) here. My understanding is that the
>open source edk2 tree does not enable #AC. The following include file:
>
>  MdePkg/Include/Register/Intel/Msr/P6Msr.h
>
>defines MSR_P6_TEST_CTL (value 0x33), which the above LWN article
>references as MSR_TEST_CTL. The article refers to bit 29, but that seems to be
>part of the "Reserved1" bit-field in the MSR_P6_TEST_CTL_REGISTER.
>
>There seems to be a bit field that could be related (Disable_LOCK, bit#31), but
>again, there is no reference to Disable_LOCK in the edk2 codebase.
>
>So my whole point here is that we should clearly separate "#AC" from "split
>lock" in the commit message (and in the code comment). Those are separate
>things. And "split lock" may apply to edk2, but "#AC" does not (to the open
>source tree anyway).
>
>(3) OK, assuming this code indeed triggers a "split lock" event. Why is that a
>problem? Yes, it may degrade performance for the system. Why do we care?
>We are *already* handling an exception. That should not be a very frequent
>event, for any processor (BSP or AP) in the system.
>
>Is the problem that a closed source platform enables #AC, and then the fault
>handler in CpuExceptionHandlerLib -- which gets invoked due to an
>independent fault -- runs into a *double* fault, due to the split lock raising
>#AC?
>
>This should be clearly explained in the commit message. I'm not prying at
>proprietary platform details, but the circumstances / symptoms of the issue
>should be clearly described.
>
>More comments below, regarding the original code:
>
>>
>> diff --git
>>
>a/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.na
>> sm
>>
>b/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.na
>> sm
>> index 4db1a09f28..7b7642b290 100644
>> ---
>>
>a/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.na
>> sm
>> +++
>b/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAs
>> +++ m.nasm
>> @@ -180,21 +180,29 @@ HasErrorCode:
>>      push    qword [rbp + 24]
>>
>>  ;; UINT64  Gdtr[2], Idtr[2];
>> +    ; sidt and sgdt saves 10 bytes to memory, 8 bytes = base and 2 bytes =
>limit.
>> +    ; To avoid #AC split lock when separating base and limit into their
>> +    ; own separate 64 bit memory, we can’t use 64 bit xchg since base [63:48]
>bits
>> +    ; may cross the cache line.
>>      xor     rax, rax
>>      push    rax
>>      push    rax
>
>So, the contents of RAX is 0 now, and we've made 16 bytes (filled with
>0) room on the stack.
>
>>      sidt    [rsp]
>
>This instruction writes 10 bytes at the base of that "room" on the stack.
>Offsets 0 and 1 contain the "limit", offsets 2-9 (inclusive) contain the "base
>address".
>
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |L|L|B|B|B|B|B|B|B|B|0|0|0|0|0|0|
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
>> -    xchg    rax, [rsp + 2]
>
>(I guess this is the instruction that splits the lock.)
>
>Now RAX has the base address, and the stack looks like:
>
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |L|L|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
>> -    xchg    rax, [rsp]
>
>Now RAX has the limit, and the stack is:
>
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |B|B|B|B|B|B|B|B|0|0|0|0|0|0|0|0|
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
>> -    xchg    rax, [rsp + 8]
>
>Now RAX is zero again, and the stack is:
>
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |B|B|B|B|B|B|B|B|L|L|0|0|0|0|0|0|
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
>This is very "clever" and all, but why do we have to use a locking instruction?
>We save RBX on the stack, earlier in CommonInterruptEntry, and we restore it
>in the end. So what's wrong with:
>
>  sidt    [rsp]
>  mov     bx, word [rsp]       ; read limit into bx
>  mov     rax, qword [rsp + 2] ; read base address into rax
>  mov     qword [rsp], rax     ; write base address to qword#0
>  mov     word [rsp + 8], bx   ; write limit to qword#1
>
>The second MOV instruction may straddle a cache line, yes, but there is no
>locking at all.
>
>Thanks,
>Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#47232): https://edk2.groups.io/g/devel/message/47232
Mute This Topic: https://groups.io/mt/34083544/1813853
Mute #ac: https://groups.io/mk?hashtag=ac&subid=3943202
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-