[Libvir] RFC: safer memory allocation APIs with compile time checking

Mon Apr 28 07:39:41 UTC 2008

On Sun, Apr 27, 2008 at 08:29:33PM +0100, Daniel P. Berrange wrote:
> After updating the virBuffer APIs to protect against improper usage I have
> been thinking about how we might provider safer memory allocation APIs 
> with protection against common usage errors and compile time validation of
> checks for failure.
> 
> The standard C library malloc/realloc/calloc/free APIs have just about the
> worst possible design, positively encouraging serious application coding 
> errors. There are at least 7 common problems I can think of at the moment,
> perhaps more....
> 
>  1. malloc()  - pass incorrect number of bytes
>  2. malloc()  - fail to check the return value
>  3. malloc()  - use uninitialized memory
>  4. free()    - double free by not setting pointer to NULL
>  5. realloc() - fail to check the return value
>  6. realloc() - fail to re-assign the pointer with new address
>  7. realloc() - accidental overwrite of original pointer with NULL on
>                 failure, leaking memory
> 
> Many applications / libraries wrap these in their own functions, but typically
> fail to address one or more these problems. 
> 
> eg glib re-definitions try to address point 2 by having g_malloc() call
> abort() on failure, but then re-introduce the problem by adding g_try_malloc()

  Calling abort() in a library is a major NO-NO and one of the reasons
I avoided glib in most of the code I developped. You just can't exit()/abort()
from a library.

> All 7 of the errors listed early could be avoided and/or checked at compile 
> time if the APIs used a different calling convention.
> 
>  1. For any given pointer the compiler knows how many bytes need to be
>     allocated for a single instance of it. ie  sizeof(*(ptr)). Since C
>     has no data type representing a data type, this cannot be done with
>     a simple function - it needs a (trivial) macro.
> 
>  2. The __attribute__((__warn_unused_result__)) annotation causes the
>     compiler to warn if an application doesn't do something with a return
>     value. With the standard malloc() API this doesn't help because you
>     always assign the return value to a pointer. The core problem here
>     is that the API encodes the error condition as a special case of
>     the actual data returned.  This can be addressed by separating the
>     two. The return value should simply be bi-state success or fail code
>     and the return data passed back via an pointer to a pointer parameter.
> 
>  3. The memory allocated by malloc() is not initialized to zero. For
>     that a separate calloc() function is provided. This leaves open the
>     possiblity of an app mistakenly using the wrong variant. Or consider
>     an app using malloc() and explicitly initializing all struct members.
>     Some time later a new member is added and now the developer needs to
>     find all malloc() calls wrt to that struct and explicitly initilize
>     the new member. It would be safer to always have malloc zero all 
>     memory, even though it has a small performance impact, the net win
>     in safety is probably worth it.
> 
>  4. Since free() takes the pointer to be freed directly it cannot reset
>     the caller's original pointer back to NULL. This can be addressed if
>     a pointer to the pointer is passed instead. The computational cost
>     of the often redundant assignment to NULL is negligable given the 
>     safety benefit
> 
>  5, 6 & 7. As with problem 2, these problems are all caused by the fact 
>     that the error condition is special case of the return data value.
>     The impact of these is typically far worse because it often results
>     in a buffer overflow and the complex rules for calling realloc() are
>     hard to remember.
> 
> So the core requirements of a safer API for allocation are:
> 
>  - Use return values *only* for the success/fail error condition
>  - Annotate all the return values with __warn_unused_result__
>  - Pass a pointer to the pointer into all functions
>  - Define macros around all functions to automatically do the sizeof(*(ptr))
> 
> Passing a pointer to a pointer gets a little tedious because of the need
> to write '&(foo)' instead of just 'foo', and is actually introduces a
> new problem of the caller accidentally passing merely a pointer, instead
> of a pointer to a pointer. This is a minor problem though because it will
> be identified on the very first run of the code. In addition, since we're
> defining macros around every function we can have the macro also convert
> from ptr to &(ptr) for us, avoiding this potential error:
> 
> So the primary application usage would be via a set of macros:
> 
>     VIR_ALLOC(ptr)
>     VIR_ALLOC_N(ptr, count)
>     VIR_REALLOC(ptr)

  you will need some size here.

>     VIR_REALLOC_N(ptr, count)
>     VIR_FREE(ptr)
> 
> These call through to the underlying APIs:
> 
>     int virAlloc(void *, size_t bytes)
>     int virAllocN(void *, size_t bytes, size_t count)
>     int virRealloc(void *, size_t bytes)
>     int virReallocN(void *, size_t bytes, size_t count)
>     int virFree(void *)
> 
> Note that although we're passing a pointer to a pointer into all these, the
> first param is still just 'void *' and not 'void **'. This is because 'void *'
> is defined to hold any type of pointer, and using 'void **' causes gcc to
> complain bitterly about strict aliasing violations. Internally the impls of
> these methods will safely cast to 'void **' when deferencing the pointer.

  One of the problem this introduce is what uses to be a common mistake
freeing the a non-pointer object which is normally immediately pointed out
by the compiler whicll be lost because your macro will turn this into 
a void * to the data, and you will get a runtime problem instead.

> All 4 of Alloc/Realloc functions will have __warn_unused_result__ annotation
> so the caller is forced to check the return value for failure, validated at
> compile time generating a warning (or fatal compile error with -Werror).
> 
> So to wire up the macros to the APIs:
> 
>   #define VIR_ALLOC(ptr)            virAlloc(&(ptr), sizeof(*(ptr)))
>   #define VIR_ALLOC_N(ptr, count)   virAllocN(&(ptr), sizeof(*(ptr)), (count))
>   #define VIR_REALLOC(ptr)          virRealloc(&(ptr), sizeof(*(ptr)))

  That i really don't understand. How do you expect to use that realloc
without passing a new size.

>   #define VIR_REALLOC_N(ptr, count) virReallocN(&(ptr), sizeof(*(ptr)), (count))

  That I can understand , but the previous one i can't.

>   #define VIR_FREE(ptr)             virFree(&(ptr))
[...]
> Much less ugly:
> 
>       if (VIR_ALLOC_N(guest->arch.defaultInfo.machines, nmachines) < 0)
>          return -1;
> 
>       if (VIR_REALLOC(migrateTrans, caps->host.nmigrateTrans+1) < 0)
>          return -1;

  how does sizeof(*(caps->host.nmigrateTrans+1)) increases the size ? 
Doesn't make sense to me, you take a pointer, increment it, so basically just
pointing to the next element in the array, but the size of the pointed object
would be identical and realloc() becomes a noop.

  The proposal may help clean a lot of things, but VIR_REALLOC I don't
understand, what did i missed ?

> and produced easier to read code too. The cleaner realloc() API is particularly
> appealing.

  Well ... i don't understand it !

> This does all seem a little bit too good to be true - I'm wondering why other
> apps/libraries don't use this style of allocation functions already ? Perhaps
> someone can find nasty flaws in this approach, but hopefuly not...

  I would be tempted to hook into the error reporting directly within
memery.c , even if we don't have a good context there, basically if we
have a memory shortage even reporting to stderr is sensible , at least once.

> +int virAlloc(void *ptrptr, size_t size)
> +{
> +    if (size == 0) {
> +        *(void **)ptrptr = NULL;
> +        return 0;
> +    }
> +
> +    *(void **)ptrptr = calloc(1, size);
> +    if (*(void **)ptrptr == NULL)
> +        return -1;
> +    return 0;
> +}
[...]
> Index: hash.c
> ===================================================================
> RCS file: /data/cvs/libvirt/src/hash.c,v
> retrieving revision 1.37
> diff -u -p -r1.37 hash.c
> --- hash.c	18 Apr 2008 08:33:23 -0000	1.37
> +++ hash.c	27 Apr 2008 19:04:28 -0000

   The hash example is interesting, but it doesn't use REALLOC ...

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard at redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/