[edk2-devel] [PATCH v8 02/10] OvmfPkg/CpuHotplugSmm: collect hot-unplug events

Ankur Arora ankur.a.arora at oracle.com
Mon Feb 22 22:03:02 UTC 2021


On 2021-02-22 4:27 a.m., Laszlo Ersek wrote:
> On 02/22/21 08:19, Ankur Arora wrote:
>> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
>> corresponding APIC IDs for CPUs that are being hot-unplugged.
> 
> (1) We also collect selectors for those; please mention the fact here.
> 
> 
>>
>> In addition, we now ignore CPUs which only have remove set. These
>> CPUs haven't been processed by OSPM yet.
>>
>> This is based on the QEMU hot-unplug protocol documented here:
>>    https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>>
>> Cc: Laszlo Ersek <lersek at redhat.com>
>> Cc: Jordan Justen <jordan.l.justen at intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel at arm.com>
>> Cc: Igor Mammedov <imammedo at redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky at oracle.com>
>> Cc: Aaron Young <aaron.young at oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora at oracle.com>
>> ---
>>
>> Notes:
>>      Addresses the following review comments from v6:
>>       (1,4) Move (and also rename) QEMU_CPUHP_STAT_EJECTED to patch 8,
>>        where we actually use it.
>>       (2) Downgrade debug mask from DEBUG_INFO to DEBUG_VERBOSE.
>>       (3a,3b,3c) Keep the CurrentSelector increment operation at
>>        the tail of the loop.
>>       () As discussed elsewhere we also need to get the CpuSelector while
>>        collecting ApicIds in QemuCpuhpCollectApicIds(). This patch adds a
>>        separate parameter for the CpuSelector values, because that works
>>        better alongside the hotplug ExtendIds logic.
>>
>>   OvmfPkg/CpuHotplugSmm/QemuCpuhp.h                 |  1 +
>>   OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  1 +
>>   OvmfPkg/CpuHotplugSmm/CpuHotplug.c                | 21 +++++-
>>   OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 84 ++++++++++++++++-------
>>   4 files changed, 79 insertions(+), 28 deletions(-)
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
>> index 8adaa0ad91f0..1e23b150910e 100644
>> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
>> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
>> @@ -55,6 +55,7 @@ QemuCpuhpCollectApicIds (
>>     OUT APIC_ID                      *PluggedApicIds,
>>     OUT UINT32                       *PluggedCount,
>>     OUT APIC_ID                      *ToUnplugApicIds,
>> +  OUT UINT32                       *ToUnplugSelector,
>>     OUT UINT32                       *ToUnplugCount
>>     );
>>   
> 
> (2) For consistency with the parameter names "PluggedApicIds" (plural)
> and "ToUnplugApicIds" (plural), please rename this parameter to
> "ToUnplugSelectors".
> 
> 
>> diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> index a34a6d3fae61..2ec7a107a64d 100644
>> --- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> +++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> @@ -34,6 +34,7 @@
>>   #define QEMU_CPUHP_STAT_ENABLED                BIT0
>>   #define QEMU_CPUHP_STAT_INSERT                 BIT1
>>   #define QEMU_CPUHP_STAT_REMOVE                 BIT2
>> +#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
>>   
>>   #define QEMU_CPUHP_RW_CMD_DATA               0x8
>>   
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> index bf68fcd42914..3192bfea1f15 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> @@ -52,6 +52,7 @@ STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
>>   //
>>   STATIC APIC_ID *mPluggedApicIds;
>>   STATIC APIC_ID *mToUnplugApicIds;
>> +STATIC UINT32  *mToUnplugSelector;
> 
> (3) For consistency with the other two global variable identifiers,
> please call this "mToUnplugSelector" (plural), too.
> 
> (4) The comment above these global variables only talks about APIC IDs;
> please update the comment with a reference / explanation of the selectors.
> 
> 
>>   //
>>   // Address of the non-SMRAM reserved memory page that contains the Post-SMM Pen
>>   // for hot-added CPUs.
>> @@ -289,6 +290,7 @@ CpuHotplugMmi (
>>                mPluggedApicIds,
>>                &PluggedCount,
>>                mToUnplugApicIds,
>> +             mToUnplugSelector,
>>                &ToUnplugCount
>>                );
>>     if (EFI_ERROR (Status)) {
>> @@ -333,7 +335,9 @@ CpuHotplugEntry (
>>     )
>>   {
>>     EFI_STATUS Status;
>> +  UINTN      Len;
>>     UINTN      Size;
>> +  UINTN      SizeSel;
>>   
>>     //
>>     // This module should only be included when SMM support is required.
>> @@ -387,8 +391,9 @@ CpuHotplugEntry (
>>     //
>>     // Allocate the data structures that depend on the possible CPU count.
>>     //
>> -  if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Size)) ||
>> -      RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Size, &Size))) {
>> +  if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Len)) ||
>> +      RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Len, &Size))||
> 
> (5) Missing space character before "||".
> 
> 
>> +      RETURN_ERROR (SafeUintnMult (sizeof (UINT32), Len, &SizeSel))) {
>>       Status = EFI_ABORTED;
>>       DEBUG ((DEBUG_ERROR, "%a: invalid CPU_HOT_PLUG_DATA\n", __FUNCTION__));
>>       goto Fatal;
>> @@ -405,6 +410,12 @@ CpuHotplugEntry (
>>       DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
>>       goto ReleasePluggedApicIds;
>>     }
>> +  Status = gMmst->MmAllocatePool (EfiRuntimeServicesData, SizeSel,
>> +                    (VOID **)&mToUnplugSelector);
>> +  if (EFI_ERROR (Status)) {
>> +    DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
>> +    goto ReleaseToUnplugApicIds;
>> +  }
>>   
>>     //
>>     // Allocate the Post-SMM Pen for hot-added CPUs.
>> @@ -412,7 +423,7 @@ CpuHotplugEntry (
>>     Status = SmbaseAllocatePostSmmPen (&mPostSmmPenAddress,
>>                SystemTable->BootServices);
>>     if (EFI_ERROR (Status)) {
>> -    goto ReleaseToUnplugApicIds;
>> +    goto ReleaseToUnplugSelector;
>>     }
>>   
>>     //
>> @@ -472,6 +483,10 @@ ReleasePostSmmPen:
>>     SmbaseReleasePostSmmPen (mPostSmmPenAddress, SystemTable->BootServices);
>>     mPostSmmPenAddress = 0;
>>   
>> +ReleaseToUnplugSelector:
> 
> (6) Please call this label "ReleaseToUnplugSelectors" (plural).
> 
> 
>> +  gMmst->MmFreePool (mToUnplugSelector);
>> +  mToUnplugSelector = NULL;
>> +
>>   ReleaseToUnplugApicIds:
>>     gMmst->MmFreePool (mToUnplugApicIds);
>>     mToUnplugApicIds = NULL;
>> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> index 8d4a6693c8d6..36372a5e6193 100644
>> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> @@ -164,6 +164,9 @@ QemuCpuhpWriteCommand (
>>     @param[out] ToUnplugApicIds  The APIC IDs of the CPUs that are about to be
>>                                  hot-unplugged.
>>   
>> +  @param[out] ToUnplugSelector The QEMU Selectors of the CPUs that are about to
>> +                               be hot-unplugged.
>> +
>>     @param[out] ToUnplugCount    The number of filled-in APIC IDs in
>>                                  ToUnplugApicIds.
>>   
> 
Acking the comments above.

> (7) Please (a) call the parameter "ToUnplugSelectors" (plural), and (b)
> make sure that there are two space characters between the variable name
> "column" and the documentation "column". (All in all, please move the
> RHS column to the right by two spaces.)

That would make the RHS of ToUnplugSelectors not line up with the other
two out params. (Even though this mail does not seem to show that, they
do line up in the code.) Is that okay?

> 
>> @@ -187,6 +190,7 @@ QemuCpuhpCollectApicIds (
>>     OUT APIC_ID                      *PluggedApicIds,
>>     OUT UINT32                       *PluggedCount,
>>     OUT APIC_ID                      *ToUnplugApicIds,
>> +  OUT UINT32                       *ToUnplugSelector,
>>     OUT UINT32                       *ToUnplugCount
>>     )
>>   {
>> @@ -204,6 +208,7 @@ QemuCpuhpCollectApicIds (
>>       UINT32  PendingSelector;
>>       UINT8   CpuStatus;
>>       APIC_ID *ExtendIds;
>> +    UINT32  *ExtendSel;
> 
> (8) Please use plural -- "ExtendSels" --, consistently with "ExtendIds".
> 
> 
>>       UINT32  *ExtendCount;
>>       APIC_ID NewApicId;
>>   
>> @@ -245,10 +250,10 @@ QemuCpuhpCollectApicIds (
>>       if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>>         //
>>         // The "insert" event guarantees the "enabled" status; plus it excludes
>> -      // the "remove" event.
>> +      // the "fw_remove" event.
>>         //
>>         if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>> -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>>           DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>>             "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>>             CpuStatus));
>> @@ -259,40 +264,69 @@ QemuCpuhpCollectApicIds (
>>           CurrentSelector));
>>   
>>         ExtendIds   = PluggedApicIds;
>> +      ExtendSel   = NULL;
>>         ExtendCount = PluggedCount;
>> -    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> -      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
>> -        CurrentSelector));
>> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>> +      //
>> +      // "fw_remove" event guarantees "enabled".
>> +      //
>> +      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
>> +        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>> +          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>> +          CpuStatus));
>> +        return EFI_PROTOCOL_ERROR;
>> +      }
>> +
>> +      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
>> +        __FUNCTION__, CurrentSelector));
>>   
>>         ExtendIds   = ToUnplugApicIds;
>> +      ExtendSel   = ToUnplugSelector;
>>         ExtendCount = ToUnplugCount;
>> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> +      //
>> +      // Let the OSPM deal with the "remove" event.
>> +      //
>> +      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove (ignored)\n",
>> +        __FUNCTION__, CurrentSelector));
>> +
>> +      ExtendIds   = NULL;
>> +      ExtendSel   = NULL;
>> +      ExtendCount = NULL;
>>       } else {
>>         DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
>>           __FUNCTION__, CurrentSelector));
>>         break;
>>       }
>>   
>> -    //
>> -    // Save the APIC ID of the CPU with the pending event, to the corresponding
>> -    // APIC ID array.
>> -    //
>> -    if (*ExtendCount == ApicIdCount) {
>> -      DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
>> -      return EFI_BUFFER_TOO_SMALL;
>> -    }
>> -    QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
>> -    NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
>> -    DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
>> -      NewApicId));
>> -    ExtendIds[(*ExtendCount)++] = NewApicId;
>> +    ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
> 
> (9) Please follow this ASSERT with another ASSERT(), on a separate line:
> 
>      ASSERT (ExtendSels == NULL || ExtendIds != NULL);
> 
> Explanation:
> 
> - I'd like us to capture: if ExtendSels is not NULL, then ExtendIds is
> also not NULL.
> 
> - Furthermore, express the A-->B implication using its equivalent
> formulation (!A)||B.

This is a good check. Will add.

> 
> 
>> +    if (ExtendIds != NULL) {
>> +      //
>> +      // Save the APIC ID of the CPU with the pending event, to the
>> +      // corresponding APIC ID array.
>> +      // For unplug events, also save the CurrentSelector.
>> +      //
>> +      if (*ExtendCount == ApicIdCount) {
>> +        DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
>> +        return EFI_BUFFER_TOO_SMALL;
>> +      }
>> +      QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
>> +      NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
>> +      DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
>> +        NewApicId));
>> +      if (ExtendSel != NULL) {
>> +        ExtendSel[(*ExtendCount)] = CurrentSelector;
>> +      }
>> +      ExtendIds[(*ExtendCount)++] = NewApicId;
> 
> OK thus far...
> 
>>   
>> -    //
>> -    // We've processed the CPU with (known) pending events, but we must never
>> -    // clear events. Therefore we need to advance past this CPU manually;
>> -    // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
>> -    // selected CPU.
>> -    //
>> -    CurrentSelector++;
>> +      //
>> +      // We've processed the CPU with (known) pending events, but we must never
>> +      // clear events. Therefore we need to advance past this CPU manually;
>> +      // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
>> +      // selected CPU.
>> +      //
>> +      CurrentSelector++;
>> +    }
> 
> (10) ... but this is a logic bug: the comment and the CurrentSelector
> increment must not depend on (ExtendIds != NULL).
> 
> As written, if we ever see a QEMU_CPUHP_STAT_REMOVE event (which we're
> supposed to ignore), we'll never move past that CPU, but will be stuck
> in an infinite loop. The QEMU_CPUHP_CMD_GET_PENDING command will keep
> returning the CPU with the QEMU_CPUHP_STAT_REMOVE event pending.
> 
> If this was unclear from my previous feedback, then I apologize, I
> should have been clearer.

No, my bad. Don't quite know how I missed that -- that was the whole
reason why I had a CurrentSelector++ before the continue statement
in v6.

Will fix.

Thanks
Ankur

> 
> Thanks,
> Laszlo
> 
>>     } while (CurrentSelector < PossibleCpuCount);
>>   
>>     DEBUG ((DEBUG_VERBOSE, "%a: PluggedCount=%u ToUnplugCount=%u\n",
>>
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#71998): https://edk2.groups.io/g/devel/message/71998
Mute This Topic: https://groups.io/mt/80819858/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-





More information about the edk2-devel-archive mailing list