rpm hashes

Adam Jackson ajax at redhat.com
Thu May 14 14:52:22 UTC 2009


On Thu, 2009-05-14 at 10:46 +0300, Panu Matilainen wrote:
> On Wed, 13 May 2009, Adam Jackson wrote:
> > It would have been really, _really_ nice if sha256 was merely another
> > hash that could be in the payload, instead of forcing you to pick one or
> > the other.  For that matter, it would still be really really nice.
> 
> Could it have been done that way? Yes, and if it were just per-package 
> hash then certainly it would've been done that way. But remember this is 
> per-file data, storing two (and when the day comes when sha256 is 
> considered insufficient, three etc) hashes per file adds a non-trivial 
> amount of header bloat.

32 bytes per file, plus another four for the header tag, unless I have
my math wildly wrong and/or I'm misremembering how hashes are stored.
My F11 machine has 430910 files over 2167 packages, so that extra
metadata comes to a massive 14.8M, compared to 11.6G of actual payload.
I have trouble getting worked up over this.

The point about having to store arbitrarily many hashes is certainly
fair, but a) sha512 is only twice as large as sha256, and 0.2% overhead
is still not a lot, b) that seems like a distro policy question.
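
The overhead estimate is easy to sanity-check. Here is a minimal Python sketch, assuming raw binary digests (32 bytes for sha256, 64 for sha512), the four-byte per-file tag cost guessed at above, and binary units for "M" and "G". The function names are made up for illustration and have nothing to do with rpm's own code.

```python
# Back-of-the-envelope check of the per-file digest overhead.
FILES = 430910        # files on the F11 machine, from the message
PAYLOAD_GIB = 11.6    # installed payload, from the message (assumed GiB)
TAG_BYTES = 4         # per-file tag overhead, as estimated in the message


def overhead(digest_bytes, files=FILES, tag_bytes=TAG_BYTES):
    """Return the extra metadata, in bytes, for one more per-file digest."""
    return files * (digest_bytes + tag_bytes)


def report(name, digest_bytes):
    """Format the overhead as MiB and as a percentage of the payload."""
    extra = overhead(digest_bytes)
    mib = extra / 2**20
    pct = 100 * extra / (PAYLOAD_GIB * 2**30)
    return f"{name}: {mib:.1f} MiB extra, {pct:.2f}% of payload"


print(report("sha256", 32))
print(report("sha512", 64))
```

Under these assumptions sha256 comes out at about 14.8 MiB, roughly 0.12% of payload, and sha512 lands a little above 0.2%. If the digests were stored hex-encoded rather than raw, the per-file cost would roughly double.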

> Having the md5 hashes too would've been nice for backwards compatibility 
> but actually using them for file conflict calculations would mean (in 
> addition to the header bloat):
> - considerable increase in memory use

I just don't buy this at all.  The checksums are computed as part of the
stdio stream, and any competent implementation of a SHA-like algorithm
requires storage that's O(n) on the size of the hash, not on the size of
the file.  So you'd need whatever the overhead is for the additional
metadata on the package you're currently inspecting, plus no more than a
page for the additional work area for the second hash.  (I assume here
that fileconflict checks are done one package at a time, not by loading
all packages into memory and then checking them for conflicts, since the
latter would be unusable.)
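
To make the memory argument concrete, here is a small sketch of computing two digests in a single pass over a stream using Python's hashlib. It is not how rpm does it; `digest_stream` and the chunk size are assumptions for illustration. The point is only that the working state is one read buffer plus a fixed-size context per hash, whatever the size of the file.

```python
import hashlib
import io


def digest_stream(stream, algorithms=("md5", "sha256"), chunk_size=65536):
    """Compute several digests in one pass over a binary stream.

    Memory use is one chunk plus one fixed-size hash context per
    algorithm, independent of how much data the stream holds.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Feed the same chunk to every hash before reading the next one.
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Usage: any object with a binary read() works, e.g. an open file.
digests = digest_stream(io.BytesIO(b"example payload"))
print(sorted(digests))
```

Adding a third algorithm costs one more entry in the tuple and one more small context, not another pass over the data.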

Oh, I guess there's also a case where you have to check for
fileconflicts among multiple packages in the same transaction laying
down the same files.  Handwave, same problem really.

> - falling back to md5 for conflict resolution would void the supposed
>    extra security of the better hash

So there are two cases, if rpm would let you carry both hashes.

1 is where the file on disk has both MD5 and SHA256 sums, and the new
package has only MD5.  You already trust the package on disk, because
you already installed it; so compute the SHA256 of the file you're about
to lay down!  Now you have both hashes, and you can compare them both.
The odds of defeating this are the odds of finding a payload that
collides for both MD5 and SHA256, which can't possibly be higher than the
odds of finding a collision for just SHA256 itself.

2 is where the file on disk has only MD5, and the package you're about
to install has both.  If you have an rpm that only understands MD5, then
whatever, you just ignore the SHA256 hash.  If you have an rpm that
understands both, then you have options.  If you're being sensible, you
do the same thing as for case 1, which is to generate the SHA256 of the
disk file that's implicitly already trusted and compare both sums, and
presumably you only got to this point because you trust the GPG key that
signed the package you're about to install, so, good enough.  (There's a
flaw here if the file on disk is modified.  I could see arguments here
for any of rpmnew/rpmsave/fileconflict as the "right thing", which I
leave to someone more detail-oriented than I am.)

If you're in FIPS mode - that is, if you're _not_ being sensible - then
you fail the transaction, which you ought rightly do anyway since oh no
the package on disk is only hashed with MD5, you're already in trouble.
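
The two cases and the FIPS fallback can be sketched roughly as follows. This is a toy model, not rpm's fileconflict code: `fill_missing`, `same_file`, the dict-of-digests representation, and the `fips` flag are all hypothetical. It also assumes the on-disk file is unmodified, so it sidesteps the rpmnew/rpmsave/fileconflict question raised above.

```python
import hashlib


def fill_missing(content, recorded, algorithms):
    """Return recorded digests plus freshly computed ones for any gaps."""
    digests = dict(recorded)
    for algo in algorithms:
        if algo not in digests:
            digests[algo] = hashlib.new(algo, content).hexdigest()
    return digests


def same_file(disk_content, disk_recorded, new_content, new_recorded,
              fips=False):
    """Compare an on-disk file with an incoming one on every known hash.

    *_recorded map algorithm name -> hex digest from package metadata.
    Whichever side lacks a digest gets it computed from its content.
    """
    if fips and set(disk_recorded) == {"md5"}:
        # Not being sensible: refuse outright if only MD5 vouches for
        # the file that is already installed.
        raise RuntimeError("on-disk file is only hashed with MD5")
    algorithms = sorted(set(disk_recorded) | set(new_recorded))
    disk = fill_missing(disk_content, disk_recorded, algorithms)
    new = fill_missing(new_content, new_recorded, algorithms)
    # The files count as identical only if every digest agrees.
    return all(disk[algo] == new[algo] for algo in algorithms)
```

Case 1 is the call where `new_recorded` has only `md5`; case 2 is the call where `disk_recorded` has only `md5`. Either way the comparison ends up covering both algorithms.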

For the case where there are N hashes and not just the two, I think this
has a straightforward generalization, assuming you have an internal
strict ordering of hash strength.  Margin too small to contain it, etc.

- ajax


More information about the fedora-devel-list mailing list