[edk2-devel] [Patch 0/9] Enable multiple process AutoGen

Laszlo Ersek lersek at redhat.com
Mon Jul 22 19:50:08 UTC 2019


On 07/22/19 11:11, Feng, Bob C wrote:
> Hi Laszlo,
> 
> Thanks for your detailed testing and comments.
> 
> I send out patch serial V2 to resolve comment 1#.  V2 also fixed some issues found by others and was generated based on master latest version.
> 
> For 3#, I can reproduce the issue, I'll fix it in patch serial V3.  My understanding is that when I fix 3#, 2# will be resolved accordingly. Right? 

Yes. I first discovered the problem with (2), but then I wanted to
reproduce the issue in isolation from QEMU. That is section (3). So if
you fix (3), I expect the build will work just fine as part of the QEMU
tree (2) as well.

> For 5#-2,  When using "-n 1", Autogen should start one worker subprocess to generate autogen files that will be the same as run in serial.
>                     I'll make the change in patches V3

OK, sounds good.

Thank you,
Laszlo

> 
> For the keyboard interrupt handling mentioned in 4# and 5#-1, I'll do more testing on it. And do the fix in V3.  
> Before sending the patches, I tested Ctrl+C to stop the build but not the "pick up".
> 
> Thanks,
> Bob
> 
> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek at redhat.com] 
> Sent: Sunday, July 21, 2019 8:26 PM
> To: Feng, Bob C <bob.c.feng at intel.com>
> Cc: devel at edk2.groups.io; Philippe Mathieu-Daudé <philmd at redhat.com>
> Subject: Re: [edk2-devel] [Patch 0/9] Enable multiple process AutoGen
> 
> Hi Bob,
> 
> On 07/18/19 08:14, Bob Feng wrote:
>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1875
>>
>> In order to improve the build performance, we implemented 
>> multiple-processes AutoGen. This change will reduce 20% time for 
>> AutoGen phase.
>>
>> The design document can be got from:
>> https://edk2.groups.io/g/devel/files/Designs/2019/0627/Multiple-thread
>> -AutoGen.pdf
>>
>> This patch serial pass the build of Ovmf, MinKabylake, MinPurley, 
>> packages under Edk2 repository and intel client and server platforms.
> 
> I've done some basic regression-testing with this set, applied on top of current master (= commit 8ff68cd5e4c9, "ShellPkg: acpiview: DBG2: Remove redundant forward declarations", 2019-07-19).
> 
> (1) The build process now seems to produce files called
> 
>       GlobalVar_37d7e986-f7e9-45c2-8067-e371421a626c_AARCH64.bin
>       GlobalVar_37d7e986-f7e9-45c2-8067-e371421a626c_ARM.bin
>       GlobalVar_5a9e7754-d81b-49ea-85ad-69eaa7b1539b_IA32.bin
>       GlobalVar_5a9e7754-d81b-49ea-85ad-69eaa7b1539b_X64.bin
> 
>     in the project root directory. (The GUIDs match the PLATFORM_GUID
>     values from ArmVirtQemu.dsc and OvmfPkg*.dsc.)
> 
>     They seem to come from patch "BaseTools: Enable Multiple Process
>     AutoGen".
> 
>     Can we place these files under the Build/ directory somewhere,
>     instead?
> 
> (2) The QEMU project now bundles edk2 (as a git submodule for the edk2
>     tree, and some firmware binaries built from that). The build scripts
>     carried by QEMU set PYTHON_COMMAND to python2, in order to work
>     around TianoCore BZ#1607.
> 
>     For regression-testing this set, I ran
> 
>     $ make -C roms efi
> 
>     with QEMU's edk2 submodule advanced to commit 8ff68cd5e4c9, plus
>     this set applied (see above), and with QEMU itself being at
>     e2b47666fe15.
> 
>     The build itself passes fine, so that's good.
> 
>     But, some (not all) of the firmware binaries are *misbuilt*. I'll
>     elaborate below (independently of QEMU).
> 
> (3) Consider the following test. (Note that this is independent of the
>     QEMU project entirely.)
> 
>     (3a) Set up a pristine build environment:
> 
>          # Open a new shell, then:
>          $ git clean -ffdx
>          $ git reset --hard
>          $ git submodule deinit --force --all
>          $ git checkout master
>          $ git submodule update --init
>          $ export PYTHON_COMMAND=python2
>          $ source edksetup.sh
>          $ nice make -C "$EDK_TOOLS_PATH" \
>              -j $(getconf _NPROCESSORS_ONLN)
> 
>     (3b) Build OvmfPkgIA32 with one set of features, and capture a
>          checksum of the resultant firmware binary:
> 
>          $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
>              -t GCC48
>          $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
>          8d56575dd3b5c89ac6f1829ae4f41b55
> 
>     (3c) *Re*build the same platform with a different set of features,
>          and capture a checksum again:
> 
>          $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
>              -t GCC48 -D SECURE_BOOT_ENABLE -D SMM_REQUIRE
>          $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
>          ed79052e9add242174ccefc78a5bddb3
> 
>     Okay, now repeat all three steps from zero, but with the present
>     feature patch set applied:
> 
>     (3d) Repeat (3a), but rather than checking out the master branch
>          with "git checkout", check out the topic branch that is the
>          same "master" commit, *plus* the present series applied on top.
> 
>          Don't forget to start a new shell first!
> 
>     (3e) Repeat (3b):
> 
>          $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
>              -t GCC48
>          $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
>          8d56575dd3b5c89ac6f1829ae4f41b55
> 
>          Notice that the checksum is identical to that from (3b), so
>          all's good.
> 
>     (3f) Repeat (3c):
> 
>          $ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
>              -t GCC48 -D SECURE_BOOT_ENABLE -D SMM_REQUIRE
>          $ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
>          baf08c5a9ecb0adc754f5b48410b6476
> 
>          This is the problem: the checksum does not match the one from
>          (3c). And, the binary from (3f) is in fact mis-built, it
>          crashes immediately when launched on QEMU/KVM, with "emulation
>          failure".
> 
>     Thus:
> 
>     - Without the patch set, if I build a platform, then *rebuild* it
>       with different -D flags, then both builds produce proper outputs.
> 
>     - With the patch series applied however, only the initial build
>       produces good results. The *rebuild*, with different -D flags, is
>       "poisoned" by the artifacts still laying around from the first
>       build.
> 
>     In other words, this is a "stale cache" problem.
> 
> (4) Not a show stopper, but a divergence from previous behavior:
> 
>     When one presses Ctrl-S (the "stop" character) in a terminal, the
>     terminal output / autoscrolling is suspended, until Ctrl-Q (the
>     "start" character) is pressed. This is generally used for two
>     things: scrolling back the terminal history manually, for reading
>     something that scrolled off the screen, and for briefly pausing
>     whatever processing the program is doing.
> 
>     The idea being, if the program is blocked in a terminal write
>     operation, it cannot do anything else.
> 
>     The present patch set changes the lastly mentioned behavior, because
>     the logging is now decoupled to a separate thread ("BaseTools: Add
>     LogAgent to support multiple process Autogen"). As a result, if the
>     *output* of LogAgent is blocked, that does not block the *input* of
>     LogAgent, and so the queue / buffer in LogAgent grows without bound,
>     and the build continues even though the terminal output is stopped.
> 
>     This is a common initial symptom of programs with internal queues --
>     it's usually good to implement internal flow control, under which
>     any given queue has a limit, and if that limit is reached, the queue
>     blocks the producer threads feeding it.
> 
>     Again, this is not a show stopper, just a divergence from previous
>     behavior which we should be aware of, and possibly document
>     somewhere.
> 
>     Importantly, Ctrl-Z (the "suspend" character) works just fine, for
>     *both* purposes described above. (And then people can resume the
>     build with "fg", like always.)
> 
> (5) Two questions out of interest:
> 
>     - Has this series been tested for restarting builds that were
>       interrupted with Ctrl-C? Can the new build "pick up" where the
>       previous build was interrupted?
> 
>     - Do we intend to offer a flag for disabling this feature? I'm not
>       implying that the old code should be preserved as an alternative,
>       but perhaps some internal constants related to parallelization can
>       be tweaked such that the ultimate behavior becomes serial. Is the
>       "-n 1" option supposed to have that effect?
> 
>     The second question is relevant because people might want to disable
>     the new feature until it stabilizes (see issue (3) for example).
> 
> Thanks,
> Laszlo
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#44150): https://edk2.groups.io/g/devel/message/44150
Mute This Topic: https://groups.io/mt/32512451/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-





More information about the edk2-devel-archive mailing list