Gzip better than Bz2 ? Normal?

Deron Meranda deron.meranda at gmail.com
Tue Jan 3 16:51:38 UTC 2006


Thanks for clarifying Mike, although Donald was correct
in pointing out that I didn't make the claim for *all* algorithms.
But I wasn't as explicit as perhaps I should have been.  Given any
lossless algorithm A, it's always possible to trivially derive an
algorithm A2 which adds one bit but prevents bloating.

Of course the 1-bit maximum expansion claim assumes an
O(n)-SPACE algorithm.  Meaning that it's not always practical
when compressing very large data sets, or for stream-compressors,
as the program will have to attempt to compress the entire
data (or most of it) before it sees it's going to bloat it.

For streaming compressors, it's still possible to get an
O(1)-SPACE (constant space) by using a blocking method.
You divide the input into blocks of K-bits, and then you
only need to add one bit for each block.  So the potential
extra bloat is no longer a minimal 1-bit, but it's still a very
small constant percentage of the total data size.
There are surely other ways to do this too.

In a practical sense though, what all this means is that
file bloating should not be a concern for any modern
compression program.  Unlike for instance the old
Unix compress(1) program which can bloat quite badly.
--
Deron Meranda




More information about the fedora-list mailing list