[edk2-devel] [PATCH EDK2 v2 1/1] SecurityPkg/DxeImageVerificationLib:Enhanced verification of Offset

Tue Aug 18 12:21:36 UTC 2020

As a follow-up to Marvin’s e-mail and in the reference of the thread. I should add that to make the result more reliable we not only hope for the reviewer experience and testing, but also write formal proofs for several code properties. This is the primary reason the C code has not yet left the facility, despite being mostly written. If you feel interested, we use an opensource inhouse-written tool, AstraVer[1][2], which is an extension of Frama-C[3].

Best regards,
Vitaly

[1] https://arxiv.org/pdf/1809.00626.pdf <https://arxiv.org/pdf/1809.00626.pdf>
[2] https://www.isprasopen.ru/2018/docs/Volkov.pdf <https://www.isprasopen.ru/2018/docs/Volkov.pdf>
[3] https://frama-c.com <https://frama-c.com/>

> 18 авг. 2020 г., в 13:24, Marvin Häuser <mhaeuser at posteo.de> написал(а):
> 
> 
> Good day everyone,
> 
> First off, for your information, I'm sending from my new e-mail address
> from now on.
> 
> Please excuse me, I cannot read your entire thread right now, I will
> definitely make sure to catch up as soon as time permits, but I just
> wanted to confirm we are indeed working on a reimplementation of the PE
> loader.
> It involves correcting several security issues (which will be detailed
> as the patches are sent as anything else would be too much work for us
> right now), reducing code duplication (how often is the hashing
> algorithm duplicated across edk2? :) ) and a more or less experimental
> approach to formal verification. We plan to submit it this year, however
> please note that this is a low priority project and is not being worked
> on on a full-time basis.
> 
> Please let us know about your own plans so we do not end up duplicating
> work.
> 
> Best regards,
> Marvin
> 
> Am 18.08.2020 um 12:17 schrieb Laszlo Ersek:
>> Hi Jiewen,
>> 
>> (+Marvin, +Vitaly)
>> 
>> On 08/18/20 01:23, Yao, Jiewen wrote:
>>>> -----Original Message-----
>>>> From: devel at edk2.groups.io <devel at edk2.groups.io> On Behalf Of Laszlo
>>>> Ersek
>>>> Sent: Tuesday, August 18, 2020 12:53 AM
>>>> To: Yao, Jiewen <jiewen.yao at intel.com>; devel at edk2.groups.io;
>>>> xiewenyi2 at huawei.com; Wang, Jian J <jian.j.wang at intel.com>
>>>> Cc: huangming23 at huawei.com; songdongkuang at huawei.com
>>>> Subject: Re: [edk2-devel] [PATCH EDK2 v2 1/1]
>>>> SecurityPkg/DxeImageVerificationLib:Enhanced verification of Offset
>> 
>> [...]
>> 
>>> However, I do think the producer is mandatory for a fix or at least a
>>> security fix.
>>> The owner to fix the issue should guarantee the patch is good.
>>> The owner shall never rely on the code reviewer to figure out if the
>>> patch is good and complete.
>>> 
>>> I have some bad experience that bug owner just wrote a patch and tried
>>> to fix a problem, without any test.
>>> And it happened passed code review from someone who does not well
>>> understand the problem, but give rb based upon the time pressure.
>>> Later, the fix was approved to be useless.
>>> 
>>> In my memory, at least 3 cases were security fix. They are found, just
>>> because they are sensitive, more people took a look later.
>>>     It was simple. It was one-line change.
>>>    But it has not test, and it was wrong.
>>> "It was ridiculous" -- commented by the people who find the so-called
>>> security fix does not fix the issue.
>> 
>> Just because sloppy/rushed reviews exist, and just because reviewers
>> operate under time pressure, we should not automatically reject security
>> fixes that come without a reproducer.
>> 
>> Some organizations do develop reproducers, but they never share them
>> publicly (for fear of abuse by others).
>> 
>> But more importantly, in an open development project, a developer could
>> have time and expertise to contribute a fix, but not to create a
>> reproducer.
>> 
>> - If we make contributing harder, fewer people will upstream their
>>   fixes.
>> 
>> - If we make contributing harder, then contributions that do make it to
>>   the tree will be of higher quality.
>> 
>> Both statements ring true to me -- so it's a tradeoff.
>> 
>> (By "we", I mean the edk2 community.)
>> 
>>>> Additionally, the exact statement that the bug report does make,
>>>> namely
>>>> 
>>>>   it's possible to overflow Offset back to 0 causing an endless loop
>>>> 
>>>> is wrong (as far as I can tell anyway). It is not "OffSet" that can
>>>> be overflowed to zero, but the *addend* that is added to OffSet can
>>>> be overflowed to zero. Therefore the infinite loop will arise because
>>>> OffSet remains stuck at its present value, and not because OffSet
>>>> will be re-set to zero.
>>>> 
>>>> For the reasons, we can only speculate as to what the actual problem
>>>> is, unless Jian decides to join the discussion and clarifies what he
>>>> had in mind originally.
>>> 
>>> [Jiewen] Would you please clarify what do you mean "we" here?
>>> If "we" means the bug dispatcher, it is totally OK. The dispatcher
>>> just assign the bug.
>>> If "we" means the developer assigned to fix the bug, it is NOT OK. The
>>> developer should take the responsibility to understand the problem.
>> 
>> By "we", I mean the edk2 community.
>> 
>>>> We can write a patch based on code analysis. It's possible to
>>>> identify integer overflows based on code analysis, and it's possible
>>>> to verify the correctness of fixes by code review. Obviously testing
>>>> is always good, but many times, constructing reproducers for such
>>>> issues that were found by code review, is difficult and time
>>>> consuming. We can say that we don't fix vulnerabilities without
>>>> reproducers, or we can say that we make an effort to fix them even if
>>>> all we have is code analysis (and not a reproducer).
>>> 
>>> [Jiewen] I would say: yes and no.
>>> Yes, I agree with you that it might be difficult and time consuming to
>>> construct the reproducer.
>>> However, "obviously" is a subject term. Someone may think something is
>>> obvious, but other people does not.
>>> We should be clear the responsibility of the patch provider is to
>>> provide high quality patch.
>>> Having basic unit test is the best way to prove that the fix is good.
>>> 
>>> I have seen bad cases when I ask for the test for patch, then the
>>> answer I got is: "I test the windows boot".
>>> But the test - windows boot - has nothing related to the patch. It
>>> only proves no regression, but cannot prove the issue described is
>>> resolved.
>> 
>> Right. It would be ideal if every patch came with a unit test. But that
>> also means some folks will contribute less.
>> 
>> Consider normal (not security) patches. We require that all function
>> return values be checked (unless it really doesn't matter if a function
>> call fails). If a function call fails, we need to roll back the actions
>> taken thus far. Release resources and so on. This is why we have the
>> "cascade of error handling labels" pattern.
>> 
>> But, of course, we don't test every possible error path in the code. So
>> what's the solution there:
>> 
>> - reject such patches that carefully construct the error paths, but do
>>   not provide unit tests with complete error path coverage?
>> 
>> - say that we don't care about thorough error paths, so let's just hang,
>>   or leak resources, whenever something fails?
>> 
>> Personally I prefer the detailed error paths. They need to be written
>> and reviewed carefully. And they can be accepted even if they are not
>> tested with complete coverage.
>> 
>> Some people think otherwise; they say no untested (untestable) code
>> should ever be merged.
>> 
>> Back to security patches -- creating reproducers usually requires a
>> setup (tools, expertise, time allocation etc) that is different from a
>> "normal" setup. It may require specialized binary format editors,
>> expertise in "penetration testing", and so on.
>> 
>> I mostly know the C language rules that pertain to integer and buffer
>> overflows, so I can usually spot their violations in C code, and propose
>> fixes for them too. But I'm not a security researcher, so I don't write
>> exploits as a norm -- I don't even investigate, generally speaking, the
>> potential practical impact of "undefined behavior". When there's a
>> buffer overflow or integer overflow in the code, that's the *end* of the
>> story for me, while it's the *start* of the work for a security
>> researcher.
>> 
>> When you require reproducers for all security patches, you restrict the
>> potential contributor pool to security researchers.
>> 
>>> Let's think again in this case, if the patch provider does some basic
>>> unit test, he/she may find out the problem by himself/herself.
>>> That can save other people's time to review.
>>> 
>>> I don't prefer to move the responsibility from patch provider to the
>>> code reviewer to check if the fix is good.
>>> Otherwise, the code reviewer may be overwhelmed.
>>> 
>>> We may clarify and document the role and responsibility in EDKII
>>> clearly. Once that is ready, we can follow the rule.
>>> Before that is ready, in this particular case, I still prefer we have
>>> producer to prove the patch is good.
>> 
>> OK, thanks for explaining.
>> 
>> Given that I'm unable to create such a PE file (from scratch or by
>> modifying another one), I won't post the patches stand-alone.
>> 
>>>> So the above paragraph concerns "correctness". Regarding
>>>> "completeness", I guarantee you that this patch does not fix *all*
>>>> problems related to PE parsing. (See the other BZ tickets.) It does
>>>> fix *one* issue with PE parsing. We can say that we try to fix such
>>>> issues gradually (give different CVE numbers to different issues, and
>>>> address them one at a time), or we can say that we rewrite PE parsing
>>>> from the ground up. (BTW: I have seriously attempted that in the
>>>> past, and I gave up, because the PE format is FUBAR.)
>>> 
>>> [Jiewen] Maybe there is misunderstanding.
>>> I do not mean to let patch provider to fix all issue in PE parsing.
>>> Just like we cannot file one Bugzilla to fix all issue in EDKII - it
>>> is unfair.
>>> 
>>> What I mean is that the patch provider should guarantee the
>>> correctness and completeness of the issue in the bug report.
>>> 
>>> One faked bad example of correctness is:
>>>     A bug report is file to say: the code has overflow class A.
>>>     The factor is: the code has overflow class A at line X and line Y.
>>>     The patch only modified some code at line X, but the overflow
>>>     class A at line X still exists.
>>> 
>>> One faked bad example of completeness is:
>>>     A bug report is file to say: the code has overflow class A.
>>>     The factor is: the code has overflow class A at line X and line Y.
>>>     The patch only fixed the overflow class A at line X but not line
>>>     Y.
>>> 
>>> The patch provider should take responsibility to do that work
>>> seriously to find out issue in line X and line Y and fix them.
>>> He/she may choose to just fix line X and line Y. Rewrite is whole
>>> module is NOT required.
>> 
>> I agree completely.
>> 
>> My point was that we need the bug report to be precise, in the first
>> place. If the bug report doesn't clearly identify lines X and Y, we will
>> likely not get the completeness part right.
>> 
>> "Clearly identify" may mean spelling out lines X and Y specifically. Or
>> it may mean defining "class A" sufficiently clearly that someone else
>> reading the affected function can find X and Y themselves.
>> 
>>> If I can give some comment, I would think about the provide the fix in
>>> BasePeCoffLib.
>> 
>> From a software design perspective, you are 100% right.
>> 
>> Unfortunately, I can't do it.
>> 
>> That's what I mentioned before -- I had tried rewriting BasePeCoffLib,
>> because in my opinion, BasePeCoffLib is unsalvageable in its current
>> form. And I gave up on the rewrite.
>> 
>> Please see the following email. I sent it to some people off-list, on
>> 2020-Feb-14:
>> 
>>> There are currently four (4) TianoCore security BZs (1957, 1990, 1993,
>>> 2215), embargoed, that describe various ways in which cunningly
>>> crafted PE images can evade Secure Boot verification.
>>> 
>>> [...]
>>> 
>>> Primarily, I just couldn't find my peace with the idea that fixing
>>> such PE/COFF parsing mistakes (integer overflows, buffer overflows)
>>> *must* be a one-by-one fixing game. I wanted an approach that would
>>> fix these *classes* of vulnerabilities, in PE/COFF parsing.
>>> 
>>> So, beginnning of this February I returned to this topic, and spent
>>> two days on prototyping / researching a container / interval based
>>> approach. Here's one of the commit messages, as a way of explaining:
>>> 
>>>     OvmfPkg/DxePeCoffValidatorLib: introduce CONTAINER type and helper funcs
>>> 
>>>     For validating the well-formedness of a PE/COFF file, introduce the
>>>     CONTAINER type, and some workhorse functions. (The functions added in this
>>>     patch will not be called directly from the code that will process PE/COFF
>>>     structures.)
>>> 
>>>     The CONTAINER type describes a contiguous non-empty interval in a PE/COFF
>>>     file (on-disk representation, or in-memory representation). Containers can
>>>     be nested. The data from scalar-sized containers can be read out, as part
>>>     of their creation. For on-disk representations of PE/COFF files, scalar
>>>     reads are permitted; for in-memory representations, no data access is
>>>     permitted (only CONTAINER tracking / nesting).
>>> 
>>>     The goals of CONTAINER are the following:
>>> 
>>>     - enforce the proper nesting of PE/COFF structures (structure boundaries
>>>       must not be crossed by runs of data);
>>> 
>>>     - prevent integer overflows and buffer overflows;
>>> 
>>>     - prevent zero-size structures;
>>> 
>>>     - prevent infinite nesting by requiring proper sub-intervals;
>>> 
>>>     - prevent internal PE/COFF pointers from aliasing each other (unless they
>>>       point at container and containee structures);
>>> 
>>>     - terminate nesting at scalar-sized containers;
>>> 
>>>     - assuming an array of pointers is processed in increasing element order,
>>>       enforce that the pointed-to objects are located at increasing offsets
>>>       too;
>>> 
>>>     - assign human-readable names to PE/COFF structures and fields, for
>>>       debugging PE/COFF malformations.
>>> 
>>> Because, several of the vulnerabilities exploited cross-directed and
>>> aliased internal pointers in PE/COFF files.
>>> 
>>> Two days of delirious spec reading and coding later, and 2000+ lines
>>> later, I decided that my idea was unviable. The PE/COFF spec was so
>>> incredibly mis-designed and crufty that enforcing the *internal*
>>> consistency of all the size fields and the internal pointers would
>>> inevitably fall into one of the following categories:
>>> 
>>> - the checks wouldn't be strict enough, and some nasty images would
>>>   slip through,
>>> 
>>> - the checks would be too strict, and some quirky, but valid, images
>>>   would be unjustifiedly caught.
>>> 
>>> So I gave up and I've accepted that it remains a whack-a-mole game.
>>> [...]
>>> 
>>> (NB: I don't claim that ELF is not similarly brain-damaged.)
>> 
>> So now, I've only considered contributing patches for bug#2215 because
>> the code in question resides in DxeImageVerificationLib, and *not* in
>> BasePeCoffLib. I'm not going to touch BasePeCoffLib -- in my opinion,
>> BasePeCoffLib is unfixable without a complete rewrite.
>> 
>> I would *like* if BasePeCoffLib were fixable incrementally, but I just
>> don't see how that's possible.
>> 
>> In support of my opinion, please open the following bugzilla ticket:
>> 
>>   https://bugzilla.tianocore.org/show_bug.cgi?id=2643
>> 
>> and search the comments (with the browser's in-page search feature, such
>> as Ctrl+F) for the following expression:
>> 
>>   new PE loader
>> 
>> I understand exactly what Vitaly and Marvin meant in those comments. :(
>> 
>> Thanks,
>> Laszlo
>> 

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64377): https://edk2.groups.io/g/devel/message/64377
Mute This Topic: https://groups.io/mt/76165658/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/edk2-devel-archive/attachments/20200818/a98d8290/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: <http://listman.redhat.com/archives/edk2-devel-archive/attachments/20200818/a98d8290/attachment.sig>