[edk2-devel] [Patch 0/9] Enable multiple process AutoGen
Steven Shi
steven.shi at intel.com
Sun Jul 21 15:55:17 UTC 2019
> This is the problem: the checksum does not match the one from
> (3c). And, the binary from (3f) is in fact mis-built, it
> crashes immediately when launched on QEMU/KVM, with "emulation
> failure".
It looks an incremental build error. This multi-process build tool fails to re-generate the new autogen code (CreateCodeFile) and makefile (CreateMakeFile). The new build tool wrongly skip these steps in incremental build.
Thanks
Steven Shi
> -----Original Message-----
> From: devel at edk2.groups.io [mailto:devel at edk2.groups.io] On Behalf Of
> Laszlo Ersek
> Sent: Sunday, July 21, 2019 8:26 PM
> To: Feng, Bob C <bob.c.feng at intel.com>
> Cc: devel at edk2.groups.io; Philippe Mathieu-Daudé <philmd at redhat.com>
> Subject: Re: [edk2-devel] [Patch 0/9] Enable multiple process AutoGen
>
> Hi Bob,
>
> On 07/18/19 08:14, Bob Feng wrote:
> > BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1875
> >
> > In order to improve the build performance, we implemented
> > multiple-processes AutoGen. This change will reduce 20% time
> > for AutoGen phase.
> >
> > The design document can be got from:
> > https://edk2.groups.io/g/devel/files/Designs/2019/0627/Multiple-thread-
> AutoGen.pdf
> >
> > This patch serial pass the build of Ovmf, MinKabylake, MinPurley,
> > packages under Edk2 repository and intel client and server platforms.
>
> I've done some basic regression-testing with this set, applied on top of
> current master (= commit 8ff68cd5e4c9, "ShellPkg: acpiview: DBG2: Remove
> redundant forward declarations", 2019-07-19).
>
> (1) The build process now seems to produce files called
>
> GlobalVar_37d7e986-f7e9-45c2-8067-e371421a626c_AARCH64.bin
> GlobalVar_37d7e986-f7e9-45c2-8067-e371421a626c_ARM.bin
> GlobalVar_5a9e7754-d81b-49ea-85ad-69eaa7b1539b_IA32.bin
> GlobalVar_5a9e7754-d81b-49ea-85ad-69eaa7b1539b_X64.bin
>
> in the project root directory. (The GUIDs match the PLATFORM_GUID
> values from ArmVirtQemu.dsc and OvmfPkg*.dsc.)
>
> They seem to come from patch "BaseTools: Enable Multiple Process
> AutoGen".
>
> Can we place these files under the Build/ directory somewhere,
> instead?
>
> (2) The QEMU project now bundles edk2 (as a git submodule for the edk2
> tree, and some firmware binaries built from that). The build scripts
> carried by QEMU set PYTHON_COMMAND to python2, in order to work
> around TianoCore BZ#1607.
>
> For regression-testing this set, I ran
>
> $ make -C roms efi
>
> with QEMU's edk2 submodule advanced to commit 8ff68cd5e4c9, plus
> this set applied (see above), and with QEMU itself being at
> e2b47666fe15.
>
> The build itself passes fine, so that's good.
>
> But, some (not all) of the firmware binaries are *misbuilt*. I'll
> elaborate below (independently of QEMU).
>
> (3) Consider the following test. (Note that this is independent of the
> QEMU project entirely.)
>
> (3a) Set up a pristine build environment:
>
> # Open a new shell, then:
> $ git clean -ffdx
> $ git reset --hard
> $ git submodule deinit --force --all
> $ git checkout master
> $ git submodule update --init
> $ export PYTHON_COMMAND=python2
> $ source edksetup.sh
> $ nice make -C "$EDK_TOOLS_PATH" \
> -j $(getconf _NPROCESSORS_ONLN)
>
> (3b) Build OvmfPkgIA32 with one set of features, and capture a
> checksum of the resultant firmware binary:
>
> $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
> -t GCC48
> $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
> 8d56575dd3b5c89ac6f1829ae4f41b55
>
> (3c) *Re*build the same platform with a different set of features,
> and capture a checksum again:
>
> $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
> -t GCC48 -D SECURE_BOOT_ENABLE -D SMM_REQUIRE
> $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
> ed79052e9add242174ccefc78a5bddb3
>
> Okay, now repeat all three steps from zero, but with the present
> feature patch set applied:
>
> (3d) Repeat (3a), but rather than checking out the master branch
> with "git checkout", check out the topic branch that is the
> same "master" commit, *plus* the present series applied on top.
>
> Don't forget to start a new shell first!
>
> (3e) Repeat (3b):
>
> $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
> -t GCC48
> $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
> 8d56575dd3b5c89ac6f1829ae4f41b55
>
> Notice that the checksum is identical to that from (3b), so
> all's good.
>
> (3f) Repeat (3c):
>
> $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
> -t GCC48 -D SECURE_BOOT_ENABLE -D SMM_REQUIRE
> $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
> baf08c5a9ecb0adc754f5b48410b6476
>
> This is the problem: the checksum does not match the one from
> (3c). And, the binary from (3f) is in fact mis-built, it
> crashes immediately when launched on QEMU/KVM, with "emulation
> failure".
>
> Thus:
>
> - Without the patch set, if I build a platform, then *rebuild* it
> with different -D flags, then both builds produce proper outputs.
>
> - With the patch series applied however, only the initial build
> produces good results. The *rebuild*, with different -D flags, is
> "poisoned" by the artifacts still laying around from the first
> build.
>
> In other words, this is a "stale cache" problem.
>
> (4) Not a show stopper, but a divergence from previous behavior:
>
> When one presses Ctrl-S (the "stop" character) in a terminal, the
> terminal output / autoscrolling is suspended, until Ctrl-Q (the
> "start" character) is pressed. This is generally used for two
> things: scrolling back the terminal history manually, for reading
> something that scrolled off the screen, and for briefly pausing
> whatever processing the program is doing.
>
> The idea being, if the program is blocked in a terminal write
> operation, it cannot do anything else.
>
> The present patch set changes the lastly mentioned behavior, because
> the logging is now decoupled to a separate thread ("BaseTools: Add
> LogAgent to support multiple process Autogen"). As a result, if the
> *output* of LogAgent is blocked, that does not block the *input* of
> LogAgent, and so the queue / buffer in LogAgent grows without bound,
> and the build continues even though the terminal output is stopped.
>
> This is a common initial symptom of programs with internal queues --
> it's usually good to implement internal flow control, under which
> any given queue has a limit, and if that limit is reached, the queue
> blocks the producer threads feeding it.
>
> Again, this is not a show stopper, just a divergence from previous
> behavior which we should be aware of, and possibly document
> somewhere.
>
> Importantly, Ctrl-Z (the "suspend" character) works just fine, for
> *both* purposes described above. (And then people can resume the
> build with "fg", like always.)
>
> (5) Two questions out of interest:
>
> - Has this series been tested for restarting builds that were
> interrupted with Ctrl-C? Can the new build "pick up" where the
> previous build was interrupted?
>
> - Do we intend to offer a flag for disabling this feature? I'm not
> implying that the old code should be preserved as an alternative,
> but perhaps some internal constants related to parallelization can
> be tweaked such that the ultimate behavior becomes serial. Is the
> "-n 1" option supposed to have that effect?
>
> The second question is relevant because people might want to disable
> the new feature until it stabilizes (see issue (3) for example).
>
> Thanks,
> Laszlo
>
>
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#44074): https://edk2.groups.io/g/devel/message/44074
Mute This Topic: https://groups.io/mt/32512451/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-
More information about the edk2-devel-archive
mailing list