[edk2-devel] 1024 VCPU limitation
Paweł Poławski
ppolawsk at redhat.com
Wed Nov 30 00:00:18 UTC 2022
Hi all,
I have already run a quick experiment, and it looks like removing the
Itanium data structure solves my issue. Later this week I should have
access to a machine with 200 physical CPUs, so I will be able to run a
test with 1600 vCPUs using Qemu.
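As a rough sanity check of why the removal helps, the following sketch tests whether the per-CPU records for a given number of processors fit in one PEI pool allocation. The limit used here (0xFFF8 minus an 8-byte HOB header) and the per-CPU sizes passed in (64 bytes with the Itanium member, 8 bytes without) are assumptions for illustration, not values taken from the edk2 sources; the helper name is hypothetical and the small fixed header in front of the array is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASSUMPTION: usable payload of one PEI pool allocation. A 16-bit HOB
 * length rounded down to a multiple of 8 (0xFFF8), minus an assumed
 * 8-byte HOB header. This only illustrates the "~64KB" limit. */
#define ASSUMED_MAX_POOL_BYTES ((size_t)0xFFF8 - 8)

/* Hypothetical helper: returns true if `cpus` per-CPU records of
 * `bytes_per_cpu` bytes each fit into a single pool allocation. */
static bool bist_buffer_fits(size_t cpus, size_t bytes_per_cpu)
{
    assert(bytes_per_cpu > 0);
    /* Divide instead of multiplying to avoid overflow on large inputs. */
    return cpus <= ASSUMED_MAX_POOL_BYTES / bytes_per_cpu;
}
```

Under these assumptions, `bist_buffer_fits(1024, 64)` is false while `bist_buffer_fits(1600, 8)` is true, which would be consistent with the behaviour seen in the experiment.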
To reproduce the test, three things are needed:
1) Change in edk2: remove the Itanium data structure in
MdePkg/Include/Ppi/SecPlatformInformation.
2) Change in Qemu: the default limit is 288 / 255 in
hw/i386/pc_q35.c and hw/i386/pc.c
3) Change in KVM: the default limit is 1024 in
arch/x86/include/asm/kvm_host.h
Finally, I use Qemu to run the modified firmware:
$QEMU \
-accel kvm \
-m 4G -M q35,kernel-irqchip=on,smm=on \
-smp cpus=1024,maxcpus=1024 \
-global mch.extended-tseg-mbytes=128 \
\
-drive if=pflash,format=raw,file=${CODE},readonly=on \
-drive if=pflash,format=raw,file=${VARS} \
\
-chardev stdio,id=fwlog \
-device isa-debugcon,iobase=0x402,chardev=fwlog \
\
"$@"
On 15.11.2022 at 01:29, Kinney, Michael D wrote:
> Hi Pedro,
>
> After Pawel runs an experiment, we can think about how to address.
>
> 1. Code First process for spec change to remove ItaniumHealthFlags field
> 2. Code First process for spec change for a new GUID HOB value that
> does not have ItaniumHealthFlags field
> 3. Code First process to allow multiple instances of the GUIDed HOB.
After checking the code, it looks like ITANIUM_HANDOFF_STATUS is
only referenced in this header file
(MdePkg/Include/Ppi/SecPlatformInformation.h) and nowhere else.
I would like to prepare a pull request for the mailing list with this
data structure removed. From what I saw, Itanium support as a whole was
already removed back in 2019.
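To illustrate the size effect of dropping the Itanium member, here is a minimal sketch using stand-in types. The stand-ins only reproduce the sizes quoted in this thread (4, 4 and 56 bytes); they are not the real edk2 definitions, and the per-CPU entry layout (a 32-bit location followed by the union) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types (hypothetical): only their sizes match the thread. */
typedef struct { uint32_t Flags; }    Ia32StatusStandIn;    /* 4 bytes  */
typedef struct { uint32_t Flags; }    X64StatusStandIn;     /* 4 bytes  */
typedef struct { uint64_t Words[7]; } ItaniumStatusStandIn; /* 56 bytes */

static_assert(sizeof(ItaniumStatusStandIn) == 56, "stand-in must be 56 bytes");

/* Union as it looks today, including the Itanium member. */
typedef union {
    Ia32StatusStandIn    IA32HealthFlags;
    X64StatusStandIn     x64HealthFlags;
    ItaniumStatusStandIn ItaniumHealthFlags;
} RecordWithItanium;

/* Union with the Itanium member removed. */
typedef union {
    Ia32StatusStandIn IA32HealthFlags;
    X64StatusStandIn  x64HealthFlags;
} RecordWithoutItanium;

/* ASSUMPTION: one per-CPU entry is a 32-bit location plus the union. */
typedef struct { uint32_t CpuLocation; RecordWithItanium    InfoRecord; } CpuEntryWithItanium;
typedef struct { uint32_t CpuLocation; RecordWithoutItanium InfoRecord; } CpuEntryWithoutItanium;

static size_t entry_size_with_itanium(void)    { return sizeof(CpuEntryWithItanium); }
static size_t entry_size_without_itanium(void) { return sizeof(CpuEntryWithoutItanium); }
```

On an ABI where 64-bit integers are 8-byte aligned, the entry with the Itanium member is padded beyond the 56-byte union, while the entry without it is 8 bytes, matching the per-CPU overhead Mike mentioned.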
You mentioned the code-first approach, so can I just remove the unused
part and submit a patch? Or should I introduce an intermediate data
structure with a new GUID so as not to break compatibility, and then
remove the old one at some point in the future?
Please advise.
Best,
Pawel
>
> I prefer option (1) or (2) because it reduces the temp RAM usage in HOBs
> as the number of IA32/X64 CPUs increases.
>
> (3) may be required to scale above 64KB HOB size limit even with the
> reduced per CPU size.
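Option (3) can be pictured with a small calculation: given a per-CPU entry size and a per-HOB payload limit, how many instances of the GUIDed HOB would be needed? This is only a sketch; the function name is hypothetical, and the payload limit is passed in as a parameter because the exact usable size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of HOB instances needed to carry `cpus`
 * per-CPU entries when each instance holds at most `max_payload` bytes
 * of entries. Any fixed header inside each HOB is ignored. */
static size_t hob_instances_needed(size_t cpus, size_t bytes_per_cpu,
                                   size_t max_payload)
{
    assert(bytes_per_cpu > 0 && bytes_per_cpu <= max_payload);
    size_t per_hob = max_payload / bytes_per_cpu; /* whole entries per HOB */
    return (cpus + per_hob - 1) / per_hob;        /* round up */
}
```

For example, with an assumed payload limit of 65520 bytes, 1024 CPUs at 64 bytes each would need two instances, while 1600 CPUs at 8 bytes each would still fit in one.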
>
> Mike
>
> *From:* devel at edk2.groups.io <devel at edk2.groups.io> *On Behalf Of* Pedro
> Falcato
> *Sent:* Monday, November 14, 2022 3:30 PM
> *To:* devel at edk2.groups.io; Kinney, Michael D <michael.d.kinney at intel.com>
> *Cc:* Pawel Polawski <ppolawsk at redhat.com>; Dong, Eric
> <eric.dong at intel.com>; Laszlo Ersek <lersek at redhat.com>; Ni, Ray
> <ray.ni at intel.com>; Kumar, Rahul R <rahul.r.kumar at intel.com>
> *Subject:* Re: [edk2-devel] 1024 VCPU limitation
>
> On Mon, Nov 7, 2022 at 5:28 PM Michael D Kinney
> <michael.d.kinney at intel.com> wrote:
>
> Hi Pawel,
>
> I see the following union involved in the size of this structure.
>
> typedef union {
>
> IA32_HANDOFF_STATUS IA32HealthFlags;
>
> X64_HANDOFF_STATUS x64HealthFlags;
>
> ITANIUM_HANDOFF_STATUS ItaniumHealthFlags;
>
> } EFI_SEC_PLATFORM_INFORMATION_RECORD;
>
> IA32 is 4 bytes per CPU
>
> X64 is 4 bytes per CPU
>
> Itanium is 56 bytes per CPU
>
> We have removed the Itanium content from the edk2 repo, and it looks
> like we missed this union.
>
> Hi Mike,
>
> I just want to note that I don't think you can remove
> ITANIUM_HANDOFF_STATUS (in an upstreamable way at least) since it's
> specified in the EFI PI spec, and it would also break any sort of ABI.
>
> Maybe once you update the spec? Or maybe we could find a way to pass
> these handoff statuses in multiple HOBs, or have v2 HOBs with UINT32
> lengths (with an appropriate spec update).
>
> If you comment out the following line from the union does it resolve
> the issue?
>
> https://github.com/tianocore/edk2/blob/7c0ad2c33810ead45b7919f8f8d0e282dae52e71/MdePkg/Include/Ppi/SecPlatformInformation.h#L137
>
> I know this only increases the total number of CPUs that can be
> handled by a single 64KB HOB, so we would run into it again at a
> higher number of CPUs. However, I think this gets the overhead per
> CPU down to 8 bytes, which should scale to about 8091 CPUs.
>
> Thanks,
>
> Mike
>
> *From:* Pawel Polawski <ppolawsk at redhat.com>
> *Sent:* Monday, November 7, 2022 3:52 AM
> *To:* devel at edk2.groups.io
> *Cc:* Dong, Eric <eric.dong at intel.com>; Laszlo Ersek
> <lersek at redhat.com>; Ni, Ray <ray.ni at intel.com>; Kumar, Rahul R
> <rahul.r.kumar at intel.com>; Kinney, Michael D
> <michael.d.kinney at intel.com>
> *Subject:* [edk2-devel] 1024 VCPU limitation
>
> Hi All,
>
> I am trying to run edk2 with more than 1024 VCPUs. It looks like this
> is not possible at the moment and results in an ASSERT being triggered.
>
> In the past the topic has been analyzed by Laszlo Ersek [1]. It
> turns out that the limit is a result of the default HOB allocation
> being limited to ~64KB. Quoting the original email thread:
>
> """
>
> If "NumberOfProcessors" is large enough, such as ~1024, then
> "BistInformationSize" will exceed ~64KB, and PeiServicesAllocatePool()
> will fail with EFI_OUT_OF_RESOURCES. The reason is that pool allocations
> in PEI are implemented with memory allocation HOBs, and HOBs can't be
> larger than ~64KB. (See PeiAllocatePool() in
> "MdeModulePkg/Core/Pei/Memory/MemoryServices.c".)
>
> """
>
> Even if the HOB allocation is changed, I am afraid it may break
> compatibility at the DXE level. This is the reason I am looking for
> a more universal solution.
>
> I believe the same limitation exists for physical x86 platforms
> with more than 1024 CPUs.
>
> Has anyone encountered the same issue, or does anyone know whether a
> workaround or solution for this already exists or is being developed?
>
> [1]
> https://listman.redhat.com/archives/edk2-devel-archive/2021-June/msg01493
>
> Best regards,
>
> Pawel
>
>
> --
>
> *Paweł Poławski*
>
> Red Hat <https://www.redhat.com/> Virtualization
>
> ppolawsk at redhat.com
>
>
>
> --
>
> Pedro Falcato
>
>
>
View/Reply Online (#96717): https://edk2.groups.io/g/devel/message/96717