[Libguestfs] [libnbd PATCH v4 1/4] states: Document our reliance on type overlaps
Laszlo Ersek
lersek at redhat.com
Mon Jun 19 11:49:28 UTC 2023
On 6/12/23 20:10, Richard W.M. Jones wrote:
> On Fri, Jun 09, 2023 at 03:39:19PM -0500, Eric Blake wrote:
>> [Bah - I typed up a longer response, but lost it when accidentally
>> trying to send through the wrong SMTP server, so now I have to
>> remember what I had...]
>>
>> On Fri, Jun 09, 2023 at 02:45:56PM +0200, Laszlo Ersek wrote:
>>> On 6/9/23 04:17, Eric Blake wrote:
>>>> When I added structured replies to the NBD spec, I intentionally chose
>>>> a wire layout where the magic number and cookie overlap, even while
>>>> the middle member changes from uint32_t error to the pair uint16_t
>>>> flags and type. Based only on a strict reading of C rules on
>>>> effective types and compatible type prefixes, it's probably
>>>> questionable on whether my reliance on type aliasing to reuse cookie
>>>> from the same offset of a union, or even the fact that a structured
>>>> reply is built by first reading bytes into sbuf.simple_reply then
>>>> following up with only bytes into the tail of sbuf.sr.structured_reply
>>>> is strictly portable. But since it works in practice, it's worth at
>>>> least adding some compile- and run-time assertions that our (ab)use of
>>>> aliasing is accessing the bytes we want under the types we expect.
>>>> Upcoming patches will restructure part of the sbuf layout to hopefully
>>>> be a little easier to tie back to strict C standards.
>>>>
>>>> Suggested-by: Laszlo Ersek <lersek at redhat.com>
>>>> Signed-off-by: Eric Blake <eblake at redhat.com>
>>>> ---
>>>> generator/states-reply.c | 17 +++++++++++++----
>>>> generator/states-reply-structured.c | 13 +++++++++----
>>>> 2 files changed, 22 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/generator/states-reply.c b/generator/states-reply.c
>>>> index 511e5cb1..2c77658b 100644
>>>> --- a/generator/states-reply.c
>>>> +++ b/generator/states-reply.c
>>>> @@ -17,6 +17,7 @@
>>>> */
>>>>
>>>> #include <assert.h>
>>>> +#include <stddef.h>
>>>>
>>>> /* State machine for receiving reply messages from the server.
>>>> *
>>>> @@ -63,9 +64,15 @@ REPLY.START:
>>>> ssize_t r;
>>>>
>>>> /* We read all replies initially as if they are simple replies, but
>>>> - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below.
>>>> - * This works because the structured_reply header is larger.
>>>> + * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. This
>>>> + * works because the structured_reply header is larger, and because
>>>> + * the last member of a simple reply, cookie, is coincident between
>>>> + * the two structs (an intentional design decision in the NBD spec
>>>> + * when structured replies were added).
>>>> */
>>>> + STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) ==
>>>> + offsetof (struct nbd_handle, sbuf.sr.structured_reply.cookie),
>>>> + cookie_aliasing);
>>>
>>> Can you perhaps append
>>>
>>> ... &&
>>> sizeof h->sbuf.simple_reply.cookie ==
>>> sizeof h->sbuf.sr.structured_reply.cookie
>>>
>>> (if you agree)?
>>
>> Yes, that makes sense, and I did so for what got pushed as 29342fedb53
>>
>>>
>>> Also, the commit message and the comment talk about the magic number as
>>> well, not just the cookie, and the static assertion ignores magic.
>>> However, I can see the magic handling changes in the next patch.
>>
>> I was a bit less concerned about magic (it is easy to see that it is
>> at offset 0 in both types and could satisfy the common prefix rules,
>> while seeing cookie's location and a non-common prefix makes the
>> latter more imporant to assert). But checking two members instead of
>> one shouldn't hurt, and in fact, once extended types are in (plus
>> patch 4/4 of this series also adds an anonymous sub-struct in 'union
>> reply_header' which is also worth validating), it may make sense to do
>> a followup patch that adds:
>>
>> #define ASSERT_MEMBER_OVERLAP(TypeA, memberA, TypeB, memberB) \
>> STATIC_ASSERT (offsetof (TypeA, memberA) == offsetof (TypeB, memberB) && \
>> sizeof ((TypeA *)NULL)->memberA == sizeof ((TypeB *)NULL)->memberB, \
>> member_overlap)
>>
>> to be used either as:
>>
>> ASSERT_MEMBER_OVERLAP (struct nbd_simple_reply, cookie,
>> struct nbd_structured_reply, cookie);
>>
>> or as
>>
>> ASSERT_MEMBER_OVERLAP (struct nbd_handle, sbuf.simple_reply.magic,
>> struct nbd_handle, sbuf.sr.structured_reply.magic);
>
> This is a nice idea!
>
>> Would it make sense to have the macro take only three arguments (since
>> both of those invocations repeat an argument); if so, is it better to
>> share the common type name, or the common member name?
>
> We can always start with the 3 arg version and change it if we need to
> later. At the moment I can't think of a reason to check that fields
> in two unrelated types overlap, since you'd presumably always want to
> use them through an actual union type, but I suppose it could happen.
That's a good point!
>
>> I also note that our "static-assert.h" file defines STATIC_ASSERT() as
>> a do/while statement (that is, it MUST appear inside a function body,
>> so we can't use it easily in .h files); contrast that with C11's
>> _Static_assert() or qemu's QEMU_BUILD_BUG_ON() that behave more as a
>> type declaration (and can therefore appear outside of a function body;
>> C23 will take it one step further by adding static_assert(expr)
>> alongside static_assert(expr, msg). I consider myself too tainted,
>> not only by helping with qemu's implementation, but also by reviewing
>> gnulib's implementation (which uses __VA_ARGS__ to emulate C23
>> semantics of an optional message), to be able to feel comfortable
>> trying to improve our static-assert.h for sharing back to nbdkit, but
>> I don't mind reviewing anyone else's attempts.
>
> Additionally, we currently only support GCC and Clang, so anything
> that works for those only is fine.
Sure, but where is it beneficial to put a static assertion outside of a
function body?
I can imagine using sizeofs and offsetofs in declarations, and we might
want to assert various overlaps (or other predicates) right there
(before those dependent declarations). I can see this being useful in
theory, but I don't see a practical need for it in libnbd / nbdkit at
the moment. Do it when it becomes a plus?
Laszlo
More information about the Libguestfs
mailing list