[Libguestfs] [PATCH libnbd 0/5] generator: Add attribute((nonnull)) annotations

Tue Sep 27 18:59:27 UTC 2022

On Tue, Sep 27, 2022 at 06:24:17PM +0200, Laszlo Ersek wrote:
> On 09/27/22 16:46, Richard W.M. Jones wrote:
> > This patch series adds nonnull annotations for parameters which should
> > be non-NULL.
> > 
> > There was much discussion on IRC about whether this is a good idea,
> > pointing in particular to the bug below which is still present in
> > modern GCC.  It's better to have these discussions on list so they're
> > archived.
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1041336
> > 
> > There's a possible follow-up patch which *removes* all the pointer ==
> > NULL tests added in the final patch, again something for discussion.
> > See my view on this topic here (and Eric's follow up):
> > https://listman.redhat.com/archives/libguestfs/2022-September/029966.html
> 
> I think it boils down to the permitted multiplicity of a paramater:
> 
> - exactly 1 (mandatory parameter)
> - 0 or 1 (optional parameter)
> - 1 or more (mandatory (non-empty) list)

argv[] for nbd_connect_command

> - 0 or more (optional list)

queries[] for nbd_opt_list_meta_context_queries (in my v3 series)

> 
> First I think we should figure out what parameter has what multiplicity.
> 
> Then, it should be documented for the end user (this can be generated,
> but either way, the documentation should be clear about the decisions).

Yes, we need to document whether NULL triggers undefined behavior,
regardless of how we then decide to further deal with it (either
adding attributes to the .h, adding in explicit NULL checks, or
letting the code segv are all okay if we have first documented that
NULL gives unspecified results).

> 
> Considering "optional list" in particular, I see no semantic difference
> between vec==NULL and vec[0]==NULL. If an optional list is expected,
> both should be tolerated; if a mandatory (non-empty) list is expected,
> both are invalid.
> 
> Once we decided / documented what parameters were valid, I think the
> most practical way to enfoce mandatory parameters (in case they are
> taken by address) and mandatory (non-empty) lists would be with assert().

Your argument of multiplicity is interesting.  Extrapolating it a bit
more, I think you are arguing that in Python,

h.connect_command([]) - error, since list must be non-empty
h.connect_command(None) - error should be same as [], rather than
complaining that 'None' is not a list type

h.opt_list_meta_context_queries([], func) - success, since empty list
makes sense
h.opt_list_meta_context_queries(None, func) - same effect (whereas in
my v3 patches, it complains that 'None' is not a list type)

> 
> (We should also make sure that NDEBUG is never defined -- some parts of
> libnbd and nbdkit already "#undef NDEBUG"; I'd go farther and just
> forbid building libnbd and nbdkit without assertions. Assertions cost a
> few CPU cycles, and I don't expect nbdkit to be CPU-bound ever.
> Assertions are worth the CPU costs.)

We undef it during unit tests, but I don't know if we have been brave
enough to declare that we mandate that assertions remain live in the
library itself.

> 
> assert() is good because:
> 
> - it crashes (and yes, once we document the expectations, crashing a
> program from a library is fine),
> 
> - static analyzers such as coverity understand it (to my knowledge),
> 
> - gcc will not remove it (in the absence of NDEBUG, but for that, see
> above).

Agree with all of those points.

> 
> I think the nonnull attribute is not worth it. It might catch statically
> provable NULL arguments, but cannot catch such that are not statically
> provable. By weakening the function's internall NULL checks, it
> introduces new problems for those statically-not-provable-but-still-NULL
> code paths.
> 
> In fact, assert() *in combination* with attribute nonnull would be the
> best: issue warnings at build time in case the NULL arg is statically
> provable, use assert() to catch anything that might slip through
> dynamically. Unfortunately, attribute nonnull may not live up to the
> build-time-warning expectation (dependent on the gcc version), but it
> *does* eviscerate assert() -- if I understand correctly. So attribute
> nonnull does more harm than good, apparently.

That was the conclusion libvirt had several years ago - avoiding
attribute nonnull was better than trying to use it, at least for
public interfaces (it is still useful for Coverity analysis, though).
I don't know if the state of the art gcc is doing better at it now.

> 
> If *all* NULL args were statically provable, then attribute nonnull,
> with gcc 12+, would clearly win over assert() -- but not all such args
> are statically provable. Therefore we need assert(), and because
> attribute nonnull actually weakens assert(), we should *only* use assert().
> 
> ... And my argument would end here, in case we didn't generate the
> python bindings. If the python bindings were a separate project, I'd say
> that the symptom
> 
> $ nbdsh -c 'h.connect_command([])'
> nbdsh: generator/states-connect.c:247:
> enter_STATE_CONNECT_COMMAND_START: Assertion `h->argv.ptr[0]' failed.
> Aborted (core dumped)

This assertion is NOT because we violated the non-NULL parameter, but
because we weren't checking for non-empty list soon enough.  The
python equivalent to allowing a NULL pointer would be accepting
h.connect_command(None) (right now, that generates a python
TypeError).

> 
> was entirely valid (expected), and that it was up to the implementors of
> the python bindings to catch an empty list here, turn it into a python
> exception, lest the C function's invariants be violated. So, end-to-end,
> that would result in an assert() in the C function, and an "if" in the
> Python code.
> 
> *But*. Given that we generate the python bindings... we might as well
> just move the "if" into the most deeply lying C code, in place of the
> assert()s, and then let the generator turn that error into a Python
> exception higher up, as usual.
> 
> So, purely because of this centralized code generation, I guess I'm
> arguing for explicit "ifs" in the deepest (generated, or hand-written) C
> code, and avoiding the nonnull attribute. Again, I don't see a semantic
> difference here between vec==NULL and (vec != NULL && vec[0] == NULL).
> 
> (There *is* a difference between "setting something to an empty list"
> vs. "not setting something at all", but we express that differently
> already, I hope! If we do distinguish "nonexistent" (~optional) from
> "empty", then I get to redo my whole argument...)

So far, none of our list arguments have been optional, and none of our
optional arguments have been lists.  The addition of
nbd_opt_list_meta_context_queries, where it is often desirable to pass
an empty list of queries, may be the first case where we do want to
represent the queries as an optional list (in my v3 series, it was a
mandatory argument).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org