[Libguestfs] sscanf/stroul (was: Re: [PATCH nbdkit v3 2/3] Add new retry filter.)

Eric Blake eblake at redhat.com
Fri Sep 20 13:31:48 UTC 2019


On 9/20/19 7:44 AM, Richard W.M. Jones wrote:

>> sscanf("%d") cannot detect overflow; should this use strtol with errno
>> checking instead?
> 
> AIUI glibc does detect overflow, but it's the C17/POSIX standards
> which don't mandate it?

Proof that glibc does not detect overflow:
$ cat foo.c
#include <stdio.h>
#include <string.h>

void examine(const char *in)
{
  int i, r, n = -1;
  r = sscanf (in, "%i%n", &i, &n);
  printf ("parsed '%s', result %d, got '%d' after %d characters\n",
          in, r, i, n);
  if (n >= 0 && n < strlen (in))
    printf (" but did not consume %s\n", in + n);
}

int main(void)
{
  examine ("");
  examine ("  ");
  examine ("0");
  examine ("-0");
  examine ("1x");
  examine ("2147483647");
  examine ("2147483648");
  examine ("-2147483647");
  examine ("-2147483648");
  examine ("-2147483649");
  examine ("4294967295");
  examine ("4294967296");
  examine ("9999999999");
  examine ("10000000000");
}
$ gcc -o foo -Wall foo.c
$ ./foo
parsed '', result -1, got '0' after 0 characters
parsed '  ', result -1, got '0' after 0 characters
parsed '0', result 1, got '0' after 1 characters
parsed '-0', result 1, got '0' after 2 characters
parsed '1x', result 1, got '1' after 1 characters
 but did not consume x
parsed '2147483647', result 1, got '2147483647' after 10 characters
parsed '2147483648', result 1, got '-2147483648' after 10 characters
parsed '-2147483647', result 1, got '-2147483647' after 11 characters
parsed '-2147483648', result 1, got '-2147483648' after 11 characters
parsed '-2147483649', result 1, got '2147483647' after 11 characters
parsed '4294967295', result 1, got '-1' after 10 characters
parsed '4294967296', result 1, got '0' after 10 characters
parsed '9999999999', result 1, got '1410065407' after 10 characters
parsed '10000000000', result 1, got '1410065408' after 11 characters

glibc happily performs 2's-complement overflow, rather than flagging it
in any discernable way.  And since C17 does not require integer overflow
detection, and no existing libc provides it, it is unlikely that POSIX
would mandate it either.

Yes, parsing integers from untrusted sources is a bear.  I overlook uses
of sscanf on kernel /proc/... files (if that would ever overflow, either
you've been attacked where someone is overriding the mount such that you
are not reading actual kernel procfs files but an attacker's overlay, or
the kernel is hosed - but either way, if the attacker can mount an
attack like that, you're hosed anyways).  But for parsing other strings,
if overflow detection is important, then *scanf() is unusable.

Parsing fixed-width (or max-width) strings using sscanf is fine.  And
even when parsing integers, you can use tricks such as "%5i" when
parsing a 16-bit value to guarantee that you don't overflow an int (but
you still have to check whether you overflowed a uint16_t, and it is
unspecified whether any leading whitespace or '-' counted against the 5
byte limit or was silently skipped during the scanf parse).

> 
> I had a longer email here where I was going to suggest replacing all
> use of sscanf/strtoul with new nbdkit_parse_* functions, but we would
> need a lot of them and they still wouldn't cover all the clever uses
> we make of sscanf, eg:
> 
>   https://github.com/libguestfs/nbdkit/blob/cfbbe17ee21a9a47a96a5ac7f4321d20ed90cad9/filters/error/error.c#L134
> 

Unfortunately, such a scrub is the only way I know to be safe, if we
care about detecting integer overflow.

> So I still think the right way to solve this is to fix C2x / POSIX so
> it mandates what is possibly already the glibc behaviour.

POSIX won't change unless glibc actually implements overflow detection
first; but glibc probably has no incentive to change as it might slow
*scanf() down and you already have fine-grained control over integer
parsing via strtol().

And even if POSIX were to change, it would be years before all the
various libc will catch up.  So in the meantime, your best bet is to
avoid sscanf() when detecting input errors is important, on the grounds
that it is more portable that way than to rely on whether libraries have
been changed to make detection via sscanf() feasible.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190920/dd2a378e/attachment.sig>


More information about the Libguestfs mailing list