[Libguestfs] RFC: *scanf vs. overflow

Rich Felker dalias at libc.org
Sat May 23 16:45:01 UTC 2020


On Sat, May 23, 2020 at 09:28:26AM -0700, Paul Eggert wrote:
> On 5/23/20 9:11 AM, Rich Felker wrote:
> 
> > stopping on an initial prefix ... does not admit easily sharing a backend with strto*.
> 
> I don't see why. If the backend has a "stop scanning on integer overflow" flag
> (which it would need to have anyway, to support the proposed behavior), then
> *scanf can use the flag and strto* can not use it.
> 
> Anyway, this is not an issue for glibc, which has no such backend.

It's relevant because you want to propose this for standardization.
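
For concreteness, here is a toy sketch of the kind of flag being discussed. It is not musl or glibc code; the names scan_udigits, scan_result and stop_on_overflow are hypothetical. With the flag set, the scanner stops before the first digit that would overflow (the proposed *scanf behavior); with it clear, it consumes every digit and clamps, as strtoul does. It deliberately ignores signs, base prefixes, and stream pushback, so it says nothing about how hard this is in a real implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical result type: value read, digits consumed, overflow seen. */
struct scan_result {
    unsigned long value;
    size_t consumed;
    int overflowed;
};

/* Hypothetical backend: scan unsigned decimal digits from s.
 * stop_on_overflow != 0: stop before the digit that would overflow.
 * stop_on_overflow == 0: consume all digits, clamp to ULONG_MAX. */
struct scan_result scan_udigits(const char *s, int stop_on_overflow)
{
    struct scan_result r = {0, 0, 0};

    assert(s != NULL);
    while (s[r.consumed] >= '0' && s[r.consumed] <= '9') {
        unsigned long d = (unsigned long)(s[r.consumed] - '0');

        /* value * 10 + d fits iff value <= (ULONG_MAX - d) / 10 */
        if (r.overflowed || r.value > (ULONG_MAX - d) / 10) {
            if (stop_on_overflow)
                break;              /* leave this digit unread */
            r.overflowed = 1;       /* strtoul-style: keep consuming */
            r.value = ULONG_MAX;
        } else {
            r.value = r.value * 10 + d;
        }
        r.consumed++;
    }
    return r;
}
```

Note that in the stop-on-overflow mode the unread tail still consists of digits, so a following conversion would pick up where this one stopped.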

> > that's contrary to the abstract behavior defined for scanf
> > (matching fields syntactically then value conversion)
> 
> That's not really a problem. The abstract behavior already provides for matching
> that is not purely syntactic. For example, string conversion specifiers can
> impose length limits on the match, which means the matching does not rely purely
> on the syntax of the input. It would be easy to say that integer conversion
> specifiers can also impose limits related to integer overflow.

Sure, that's syntax. It's /[^ ]{1,n}/.

Of course for integers you can define a syntax that matches every
non-overflowing value (this is always true for finite matching sets),
but that's nothing like how the function is specified and I don't
think anyone reasonable would classify non-overflow as a syntactic
property.
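
To illustrate the length-limit case, here is a minimal sketch, assuming a hypothetical helper named split_with_width. The field width on %3s bounds the match at three non-whitespace characters purely by counting characters, without looking at what value they represent, and the next conversion resumes at the fourth.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: match at most 3 non-whitespace characters into
 * head, then at most 31 more into rest. Returns the sscanf result,
 * i.e. the number of fields successfully stored. */
int split_with_width(const char *input, char head[4], char rest[32])
{
    assert(input != NULL && head != NULL && rest != NULL);
    return sscanf(input, "%3s%31s", head, rest);
}
```

Given the input "abcdef", this stores "abc" in head and "def" in rest and returns 2.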

> > It's also even *more
> > likely* to break programs that don't expect the behavior than just
> > storing a wrapped or clamped value
> 
> That's not true of the code that I looked at (see the URLs earlier in this
> thread). That code was pretty carefully written and yet still vulnerable to the
> integer-overflow issue.

I don't follow. *Any* use of scanf on untrusted input is "vulnerable
to the integer-overflow issue" in the sense that overflow is UB. This
is not something subtle.

If you mean actually using overflowed values in an unsafe way
(assuming no ballooning effects of UB, just wrong values), I don't see
how it's subtle either. Any value that could be produced via overflow
could also be produced via non-overflowing input, and you have to
validate data either way.
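
As a sketch of what validating either way can look like without *scanf, the following uses strtol, whose overflow behavior is defined (it returns LONG_MAX or LONG_MIN and sets errno to ERANGE). The helper name parse_int_in_range and its return convention are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal integer from s and check that it
 * lies in [min, max]. Returns 0 and stores the value on success,
 * -1 on any failure. Leading whitespace is accepted, as with strtol. */
int parse_int_in_range(const char *s, long min, long max, long *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)
        return -1;              /* no digits at all */
    if (*end != '\0')
        return -1;              /* trailing characters after the number */
    if (errno == ERANGE)
        return -1;              /* out of range for long */
    if (v < min || v > max)
        return -1;              /* application-level range check */
    *out = v;
    return 0;
}
```

The final range check is needed regardless of the overflow check: "101" parses without overflow but is still rejected for a field limited to 0..100.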

> > I'm pretty sure the real answer here is just "don't use *scanf for
> > that."
> 
> Absolutely true right now. We are merely talking about (a) what sort of
> implementation behavior is more useful for programs that are currently relying
> on undefined behavior, and (b) what might be the cleanest addition to POSIX
> later, to help improve this mess so that future programmers can use *scanf
> safely in more situations.

This is absolutely not "clean" and I am opposed to it.

Rich



