[edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier

Michael D Kinney michael.d.kinney at intel.com
Fri Jul 15 16:30:26 UTC 2022


Hi Pedro,

I would like to see feedback from Andrew Fish on this topic too.

Thanks,

Mike

From: Pedro Falcato <pedro.falcato at gmail.com>
Sent: Tuesday, July 12, 2022 3:30 PM
To: edk2-devel-groups-io <devel at edk2.groups.io>; Kinney, Michael D <michael.d.kinney at intel.com>
Cc: Andrew Fish (afish at apple.com) <afish at apple.com>; Gao, Liming <gaoliming at byosoft.com.cn>; Liu, Zhiguang <zhiguang.liu at intel.com>
Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier



On Tue, Jul 12, 2022 at 11:02 PM Michael D Kinney <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>> wrote:
Hi Pedro,

All good points. I agree that there are some assumptions in implementation code that sizeof(size_t) == sizeof(UINTN) == sizeof(VOID *).

When the PrintLib does use a format specifier defined by the ANSI C Spec and the behavior is not identical to ANSI C, then we state that in the format string description in PrintLib.h.  Since we do not define or use size_t, we are not strictly conformant.
So, do you agree on %z for UINTN? Stating that %z prints UINTN and not size_t in the description could work (and indeed, my original patch does that), even though it's a bit redundant since we have no size_t. Strict conformance is not something I'm looking for (and indeed, I believe that ship has sailed for UEFI), I just want an intuitive interface that resembles ANSI C as best as it can.

Mike


From: devel at edk2.groups.io<mailto:devel at edk2.groups.io> <devel at edk2.groups.io<mailto:devel at edk2.groups.io>> On Behalf Of Pedro Falcato
Sent: Tuesday, July 12, 2022 2:33 PM
To: edk2-devel-groups-io <devel at edk2.groups.io<mailto:devel at edk2.groups.io>>; Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>
Cc: Andrew Fish (afish at apple.com<mailto:afish at apple.com>) <afish at apple.com<mailto:afish at apple.com>>; Gao, Liming <gaoliming at byosoft.com.cn<mailto:gaoliming at byosoft.com.cn>>; Liu, Zhiguang <zhiguang.liu at intel.com<mailto:zhiguang.liu at intel.com>>
Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier



On Tue, Jul 12, 2022 at 9:46 PM Michael D Kinney <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>> wrote:
Hi Pedro,

There is a good brief description of size_t on this page:
https://en.wikipedia.org/wiki/C_data_types

sizeof() is built into the C compiler, but size_t is defined by the std C lib being used.  It can be as small as 16-bits.

We do not use the std C lib, so size_t is not defined when we do EDK II builds.  This means we do not know how large the value sizeof() returns is.
Hi Mike,

The C standard library and the compiler usually work together and need to agree on how large each type is (uint8_t, uint16_t, uint32_t, uint64_t, size_t), just like EDK2 does for its types. The definition of "implementation" (which encompasses both the compiler and the C standard library) is intentionally vague (as is a good chunk of the C standard) but essentially maps to "whatever makes your program run on a standard C environment".

However, your request is to support printing values of type UINTN in the PrintLib.  I agree this feature is not present today.  Adding in a new format specifier would be required.

Using %z may be confusing because size_t defined by a C lib we are not using and UINTN defined in the UEFI Spec are not the same.

Using %z makes sense because in practice (and almost by definition, although not technically) UINTN will hold the native word size, and so will size_t. UINTN is used all over the place in UEFI to represent lengths of things in memory, and so is size_t in standard C codebases. Even if we go by both standards' definition of size_t and UINTN ( "unsigned integer type of the result of the sizeof operator" vs "Unsigned value of native width" ), since the two types in every platform we will ever support[1] map to the same concept and the same type, I would defend the use of %z for UINTN.

Do you have a suggestion for a different specifier that perhaps is not used in the ANSI C printf() format string?
I believe we have quite a good amount of specifiers that are not reserved but I think those would be better used for really non-standard types instead of UINTN (which is effectively a size_t).

If we do want a 100% ANSI C conformant version of a print library, then I agree this would require a new library class and we would need to use different API names so the caller knows which format string style to use.
Agreed.

[1] In fact, I would dare to say that if there ever was a modern machine that we ported UEFI to, that had addresses/sizeof() > native word size, a lot of code would implicitly break as UINTN takes the role of a size_t in UEFI. As a quick example, all interfaces in boot services and runtime services that deal with memory length/size take a UINTN. This should be clarified in the spec (unless it's written down somewhere and I don't see it). My interpretation of both standards leads me to conclude that both size_t and UINTN are intended to be used to represent addresses and lengths in memory.

All the best,
Pedro


There is a long history associated with the format string defined in PrintLib today.  Mostly related to size optimizations of the PrintLib when FLASH devices were much, much smaller.

Mike

From: devel at edk2.groups.io<mailto:devel at edk2.groups.io> <devel at edk2.groups.io<mailto:devel at edk2.groups.io>> On Behalf Of Pedro Falcato
Sent: Wednesday, July 6, 2022 3:27 PM
To: Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>
Cc: devel at edk2.groups.io<mailto:devel at edk2.groups.io>; Andrew Fish (afish at apple.com<mailto:afish at apple.com>) <afish at apple.com<mailto:afish at apple.com>>; Gao, Liming <gaoliming at byosoft.com.cn<mailto:gaoliming at byosoft.com.cn>>; Liu, Zhiguang <zhiguang.liu at intel.com<mailto:zhiguang.liu at intel.com>>
Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier

On Wed, Jul 6, 2022 at 7:22 PM Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>> wrote:
Hi Pedro,

This is an interesting feature.

It is backwards compatible since you are adding a format specifier to the PrintLib class.

There is a 2nd lib instance that needs to be updated, and that is DxePrintLibPrint2Protocol in MdeModulePkg.

I think using ANSI C %z specifier syntax assumes that sizeof(size_t) == sizeof(UINTN) for all supported compilers/CPU archs.
The ANSI C 99 specification does not state that this is always true.  If may be true in practice for the compiler/CPU archs
currently supported by Tianocore.

It would be good to add a STATIC_ASSERT() to check that this is true and break the build if a compiler+CPU combination
is used where that assumption is false.  Not sure if this should go in Base.h with other STATIC_ASSERT()s for
types.  Or if it should go in PrintLib.h where this assumption is being made.


Hi Mike,
How can sizeof(size_t) != sizeof(UINTN)? It seems to be quite a stretch to interpret the standard in a way that the native word size will not be able to store "the maximum size of a theoretically possible object of any type". In any case, getting the size_t type is non-trivial and as far as I can tell would either depend on a C library or C extension trickery to try to get the type of sizeof(Type).

I would like to see some more background on the motivation for adding this feature.
I think the current workaround would be to typecast and integer of type size_t to UINT64 and use %Ld or
typecase to UINT32 and use %d.

Ok, some background: I essentially starting looking at DebugLib and PrintLib and I found myself very surprised by a good chunk of decisions. The API is very similar to standard printf() but it lacks support for a good portion of specifiers; some specifiers it implements don't have the C99 standard meaning, like the ones it implements for integers (%x and %lx for hex, for instance) which were non-obviously overridden to mean print UINT32 and print UINT64. Because of that, I feel that maybe reworking this library (creating PrintLib2 or something, since we can't break the existing uses) would be a good idea to have a "principle of least surprise" compliant API and would help smoothen the confusion curve between userspace code and EDK2 firmware.

Since reworking the whole of PrintLib is a non-trivial amount of work, I settled for this very tiny feature that doesn't break anyone's code and is very handy to avoid odd workarounds like casting everything to a UINT64 so you can print it with %lx. Obviously, if there's interest in getting a more standard environment[1] in EDK2 (and I personally, although possibly naiively, don't see the downside), going down a PrintLib2 in the long haul is a good idea. But this patch took me around 10 minutes to write up and is probably useful in its current form.

Thanks,

Pedro

[1] Why do I want to get a more standardized environment? Well, I simply believe it smoothens the learning curve between userspace -> kernel -> bootloader -> firmware. A good chunk of kernels and bootloaders already have relatively standard C APIs, but firmware, at least in our case, is still lacking. Also, makes you less prone to mistakes.

Thanks,

Mike


> -----Original Message-----
> From: devel at edk2.groups.io<mailto:devel at edk2.groups.io> <devel at edk2.groups.io<mailto:devel at edk2.groups.io>> On Behalf Of Pedro Falcato
> Sent: Monday, July 4, 2022 6:16 PM
> To: devel at edk2.groups.io<mailto:devel at edk2.groups.io>
> Cc: Kinney, Michael D <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>; Gao, Liming <gaoliming at byosoft.com.cn<mailto:gaoliming at byosoft.com.cn>>; Liu, Zhiguang
> <zhiguang.liu at intel.com<mailto:zhiguang.liu at intel.com>>
> Subject: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier
>
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3977
>
> %z is used in standard C99 as the printf specifier for size_t types.
> Add support for it so we can portably print UINTN.
>
> Cc: Michael D Kinney <michael.d.kinney at intel.com<mailto:michael.d.kinney at intel.com>>
> Cc: Liming Gao <gaoliming at byosoft.com.cn<mailto:gaoliming at byosoft.com.cn>>
> Cc: Zhiguang Liu <zhiguang.liu at intel.com<mailto:zhiguang.liu at intel.com>>
> Signed-off-by: Pedro Falcato <pedro.falcato at gmail.com<mailto:pedro.falcato at gmail.com>>
> ---
>  MdePkg/Include/Library/PrintLib.h              | 13 ++++++++-----
>  MdePkg/Library/BasePrintLib/PrintLibInternal.c |  9 +++++++++
>  MdePkg/Library/BasePrintLib/PrintLibInternal.h |  1 +
>  3 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/MdePkg/Include/Library/PrintLib.h b/MdePkg/Include/Library/PrintLib.h
> index 8d523cac52..0d67f62d3f 100644
> --- a/MdePkg/Include/Library/PrintLib.h
> +++ b/MdePkg/Include/Library/PrintLib.h
> @@ -42,6 +42,9 @@ SPDX-License-Identifier: BSD-2-Clause-Patent
>      - L, l
>        - The number being printed is size UINT64.  Only valid for types X, x, and d.
>          If this flag is not specified, then the number being printed is size int.
> +    - z
> +      - The number being printed is of size UINTN. Only valid for types X, x and d.
> +        If this flag is not specified, then the number being printed is size int.
>      - NOTE: All invalid flags are ignored.
>
>    [width]:
> @@ -73,18 +76,18 @@ SPDX-License-Identifier: BSD-2-Clause-Patent
>          using this type too by making sure bits 8..15 of the argument are set to 0.
>      - x
>        - The argument is an unsigned hexadecimal number.  The characters used are 0..9 and
> -        A..F.  If the flag 'L' is not specified, then the argument is assumed
> +        A..F.  If the flags 'L', 'z' are not specified, then the argument is assumed
>          to be size int.  This does not follow ANSI C.
>      - X
>        - The argument is an unsigned hexadecimal number and the number is padded with
> -        zeros.  This is equivalent to a format string of "0x". If the flag
> -        'L' is not specified, then the argument is assumed to be size int.
> +        zeros.  This is equivalent to a format string of "0x". If the flags
> +        'L', 'z' are not specified, then the argument is assumed to be size int.
>          This does not follow ANSI C.
>      - d
> -      - The argument is a signed decimal number.  If the flag 'L' is not specified,
> +      - The argument is a signed decimal number.  If the flags 'L', 'z' are not specified,
>          then the argument is assumed to be size int.
>      - u
> -      - The argument is a unsigned decimal number.  If the flag 'L' is not specified,
> +      - The argument is a unsigned decimal number.  If the flags 'L'. 'z' are not specified,
>          then the argument is assumed to be size int.
>      - p
>        - The argument is a pointer that is a (VOID *), and it is printed as an
> diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.c b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> index 42b598a432..1cd99b2213 100644
> --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> @@ -720,6 +720,9 @@ BasePrintLibSPrintMarker (
>              case 'l':
>                Flags |= LONG_TYPE;
>                break;
> +            case 'z':
> +              Flags |= SIZET_TYPE;
> +              break;
>              case '*':
>                if ((Flags & PRECISION) == 0) {
>                  Flags |= PAD_TO_WIDTH;
> @@ -833,6 +836,12 @@ BasePrintLibSPrintMarker (
>                } else {
>                  Value = BASE_ARG (BaseListMarker, int);
>                }
> +            } else if ((Flags & SIZET_TYPE) != 0) {
> +              if (BaseListMarker == NULL) {
> +                Value = VA_ARG (VaListMarker, UINTN);
> +              } else {
> +                Value = BASE_ARG (BaseListMarker, UINTN);
> +              }
>              } else {
>                if (BaseListMarker == NULL) {
>                  Value = VA_ARG (VaListMarker, INT64);
> diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.h b/MdePkg/Library/BasePrintLib/PrintLibInternal.h
> index 34d591c6fc..9193e6192b 100644
> --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.h
> +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.h
> @@ -29,6 +29,7 @@
>  #define ARGUMENT_REVERSED    BIT12
>  #define COUNT_ONLY_NO_PRINT  BIT13
>  #define UNSIGNED_TYPE        BIT14
> +#define SIZET_TYPE           BIT15
>
>  //
>  // Record date and time information
> --
> 2.37.0
>
>
>
>
>


--
Pedro Falcato


--
Pedro Falcato



--
Pedro Falcato


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#91419): https://edk2.groups.io/g/devel/message/91419
Mute This Topic: https://groups.io/mt/92177290/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/edk2-devel-archive/attachments/20220715/abf9a088/attachment-0001.htm>


More information about the edk2-devel-archive mailing list