[edk2-devel] [PATCH v1 2/2] EmbeddedPkg: Add cast from (void*) for VS2017 build

Thu Jul 23 10:52:22 UTC 2020

On Wed, Jul 22, 2020 at 23:11:40 +0200, Laszlo Ersek wrote:
> >> diff --git a/EmbeddedPkg/Library/AndroidBootImgLib/AndroidBootImgLib.c b/EmbeddedPkg/Library/AndroidBootImgLib/AndroidBootImgLib.c
> >> index e1036954ee586dfc30266eec2897d71bfc949038..bbe0d41018b3d5665c72ee61efe737ae57b1b2eb 100644
> >> --- a/EmbeddedPkg/Library/AndroidBootImgLib/AndroidBootImgLib.c
> >> +++ b/EmbeddedPkg/Library/AndroidBootImgLib/AndroidBootImgLib.c
> >> @@ -1,6 +1,6 @@
> >>  /** @file
> >>
> >> -  Copyright (c) 2013-2014, ARM Ltd. All rights reserved.<BR>
> >> +  Copyright (c) 2013-2020, ARM Ltd. All rights reserved.<BR>
> >>    Copyright (c) 2017, Linaro. All rights reserved.
> >>
> >>    SPDX-License-Identifier: BSD-2-Clause-Patent
> >> @@ -97,7 +97,7 @@ AndroidBootImgGetKernelInfo (
> >>    ASSERT (IS_VALID_ANDROID_PAGE_SIZE (Header->PageSize));
> >>
> >>    *KernelSize = Header->KernelSize;
> >> -  *Kernel = BootImg + Header->PageSize;
> >> +  *Kernel = (UINT8*)BootImg + Header->PageSize;
> >
> > The reason I prefer my version, although this one would also solve the
> > compilation error, is that I really don't like casts to char * (which
> > this effectively is) as a workaround.
> >
> > The problem I have with it is that a cast is a signal of intent (this
> > thing that we have been viewing as an X should now be seen as a Y) -
> > and the intent here is simply to get the side effect that a char has a
> > known size of 1 (whereas a void doesn't).
> > I will admit it is the first time I have seen it used for this purpose
> > :)
> 
> Personally I'd be OK with either fix (yours or Pierre's); to me both
> express the exact same thing -- move Header->PageSize bytes past
> BootImg.
> 
> Viewed from a portability perspective, Pierre's patch is actually more
> portable (per C standard); as UINT8 stands for "unsigned char", and that
> is called "object representation" by the C standard:

Because there was a time when C didn't have void pointers.

> > 6.2.6 Representations of types
> > 6.2.6.1 General
> >
> > 3 Values stored in unsigned bit-fields and objects of type unsigned
> >   char shall be represented using a pure binary notation. (Footnote
> >   40.)
> >
> > 4 Values stored in non-bit-field objects of any other object type
> >   consist of n * CHAR_BIT bits, where n is the size of an object of
> >   that type, in bytes. The value may be copied into an object of type
> >   unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> >   called the object representation of the value.
> >
> > Footnote 40: [...] A byte contains CHAR_BIT bits, and the values of
> >              type unsigned char range from 0 to (2^CHAR_BIT)-1.
> 
> Further references:
> 
> > 6.2.5 Types
> >
> > 27 A pointer to void shall have the same representation and alignment
> >    requirements as a pointer to a character type. Footnote 39 [...]
> >
> > Footnote 39: The same representation and alignment requirements are
> >              meant to imply interchangeability as arguments to
> >              functions, return values from functions, and members of
> >              unions.
> 
> > 6.3.2.3 Pointers
> >
> > 7 [...] When a pointer to an object is converted to a pointer to a
> >   character type, the result points to the lowest addressed byte of
> >   the object. Successive increments of the result, up to the size of
> >   the object, yield pointers to the remaining bytes of the object.
> 
> > 6.5 Expressions
> >
> > 6 The effective type of an object for an access to its stored value is
> >   the declared type of the object, if any. (Footnote 75.) [...] If a
> >   value is copied into an object having no declared type using memcpy
> >   or memmove, or is copied as an array of character type, then the
> >   effective type of the modified object for that access and for
> >   subsequent accesses that do not modify the value is the effective
> >   type of the object from which the value is copied, if it has one.
> 
> > Footnote 75: Allocated objects have no declared type.
> 
> (This is one of the most exciting parts of the standard BTW: it means
> that, if you malloc() 4 bytes, and copy a local uint32_t variable into
> it, with an open-coded (unsigned char*) loop, then the effective type of
> the dynamically allocated object becomes uint32_t for subsequent reads!
> Which means for example, considering the effective type rules strictly,
> that reading the dynamically allocated object post-copy, even via
> *correctly aligned* (uint16_t*) pointers, is undefined behavior.)

Which is only made funnier by the fact that you cannot implement a
compliant malloc that returns a pointer with less than your largest
data type alignment requirement.

> > 7 An object shall have its stored value accessed only by an lvalue
> >   expression that has one of the following types (Footnote 76):
> >   - [...]
> >   - a character type.
> >
> > Footnote 76: The intent of this list is to specify those circumstances
> >              in which an object may or may not be aliased.
> 
> Whereas converting a (VOID*) to UINTN (and maybe back) is only covered
> here (if I remember correctly):
> 
> > 6.3 Conversions
> > 6.3.2 Other operands
> > 6.3.2.3 Pointers
> >
> > 5 An integer may be converted to any pointer type. Except as
> >   previously specified, the result is implementation-defined, might
> >   not be correctly aligned, might not point to an entity of the
> >   referenced type, and might be a trap representation. (Footnote 56)
> >
> > 6 Any pointer type may be converted to an integer type. Except as
> >   previously specified, the result is implementation-defined. If the
> >   result cannot be represented in the integer type, the behavior is
> >   undefined. The result need not be in the range of values of any
> >   integer type.
> >
> > Footnote 56: The mapping functions for converting a pointer to an
> >              integer or an integer to a pointer are intended to be
> >              consistent with the addressing structure of the execution
> >              environment.
> 
> So my only point is that (void*) --> (unsigned char *) has no
> "implementation-defined" dependencies, while (VOID*) --> (UINTN) is
> implementation-defined.

Sure. But (unsigned char *) still reads as "pointer to region
containing stuff with 1-byte alignment requirements", whereas (void *)
reads as pointer to region containing
...well-let's-not-worry-about-that-for now...

See also https://martinfowler.com/bliki/CodeSmell.html

> (The dependency exploited in edk2 is that UINTN
> is always just as wide as (VOID*).)

Yes. I wouldn't like to to try to port EDK2 to CHERI.

I actually find most of these things easier to reason about in the
context of fat pointers. But that case also emphasises the danger
presented by the special treatment of unsigned char * (for legacy
purposes) in the standard is.
In a system where pointers contained information about the
size/alignment of the element pointed to, that would now in hardware
encode "1-byte alignment" instead of "unknown alignment". And would be
dereferenced without raising an exception.

> But, given both patches are for edk2, where UINTN <-> (VOID*)
> interchangeability is guaranteed, I'm equally fine with both patches.

Thanks!

/
    Leif

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#63187): https://edk2.groups.io/g/devel/message/63187
Mute This Topic: https://groups.io/mt/75211231/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-