Request for Comments: updating RPMs using binary deltas.

Lamar Owen lowen at pari.edu
Fri Jan 9 00:56:36 UTC 2004


On Thursday 08 January 2004 12:31 pm, Jef Spaleta wrote:
> Lamar Owen wrote:
> > The rpmdiff would be generated by the build process, and then the
> > rpmdiff would be uploaded.

> I don't think that was really Seth's point. So let me say what i think
> his point was. How much extra space do mirrors have to supply to mirror
> the rpmdiffs as an OPTION as well as the fully cooked update rpms.

In the ultimate realization there would be no 'fully cooked' updates.  The 
mirror would keep the original distribution files (just in case someone 
needed them) and the update patches.

> the accumulated lifetime of a core release. You can not get away from
> providing full rpm update packages.

Why?

> I'm certainly not prepared to
> entertain the idea that if I have a custom or alternative package
> version install that I need to downgrade all the way back to the iso
> release version then apply 7 updates. All those extra steps are bound to
> add extra fragility.

No, you misunderstand.  There is no downgrade involved.  The steps, again:
1.)	The buildsystem generates the updated rpm (just like it does now);
2.)	The buildsystem then does the 'rpmdiff' against the original distributed 
packages and uploads it to the update server (and the mirrors then get it);
3.)	You download the update patch (just like you do now, but quicker);
4.)	The update tool asks you where to find the original binary RPM, and you 
provide the original binary RPM package, which may or may not be the version 
you have installed;
5.)	The update tool applies the patch TO THE ORIGINAL BINARY RPM (not to your 
installed system);
6.)	The update tool then updates your system with the reconstructed update 
RPM, which is exactly the same bits as it would have been if you had 
downloaded the 'fully cooked' update.
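
To make steps 2, 5 and 6 a bit more concrete, here is a minimal Python sketch.  
It is not how any real rpmdiff tool works: the delta format (a list of 
copy/data operations), the names make_delta and apply_delta, and the use of 
difflib in place of a proper binary diff algorithm are all assumptions for 
illustration, and the "packages" are just byte strings.

```python
import hashlib
from difflib import SequenceMatcher

def make_delta(original: bytes, updated: bytes) -> list:
    """Build side (hypothetical): describe `updated` in terms of `original`."""
    ops = []
    matcher = SequenceMatcher(None, original, updated, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes the user already has: send only the offsets.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Bytes that are new in the update: send them literally.
            ops.append(("data", updated[j1:j2]))
        # "delete" needs no operation: those bytes are simply not copied.
    return ops

def apply_delta(original: bytes, ops: list) -> bytes:
    """Client side: rebuild the update from the ORIGINAL package, not the installed system."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.append(original[op[1]:op[2]])
        else:
            out.append(op[1])
    return b"".join(out)

# Toy stand-ins for the original and the updated package.
original = b"header|libfoo 1.0|" + b"A" * 200 + b"|trailer"
updated = b"header|libfoo 1.1|" + b"A" * 200 + b"|trailer+fix"

delta = make_delta(original, updated)
rebuilt = apply_delta(original, delta)

# How many bytes of new data the patch has to carry.
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "data")
# The reconstructed package should be the same bits as the full update.
same_bits = hashlib.sha256(rebuilt).digest() == hashlib.sha256(updated).digest()
```

The point of the sketch is only that the client never touches the installed 
files: it combines the original package with the patch and checks that the 
result matches the full update.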

The mirrors have no extra processing overhead (they're just serving files).  
The mirrors have much less to transfer (since it's a patch).
If this is applied distribution-wide the mirrors have fewer bits to mirror, 
since there would be NO fully cooked updates (I'm not talking about something 
that is optional; I'm proposing a replacement).  That last bit might be too 
much for people to swallow, but whether the mirrors want to provide fully 
cooked updates in addition to the diff updates would be up to them.  The 
mirrors transfer much less data, which might reduce their bandwidth costs 
(or make them able to serve more people).

> If there is ANY chance of getting bit because you are using custom rpms
> or Fedora Alternatives or 3rd party alternatives...that's pretty much
> NOT inline with the project structure and objectives as presented so
> far. What is the POINT exactly of having rpmdiffs to confuse things when
> trying to use Fedora Alternatives and 3rd party repositories.

You would only get bitten in the same cases as you would now.  The idea is 
that we're only transmitting the changes of the update versus the original 
ISO RPM, and assume the user has the original (which we will check for using 
cryptographic checksums).  The user does not have to have the original RPM 
installed; they just need the original RPM _package_.  You are installing the 
same identical bits, just generating them a different way that is still 
secure.
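
The checksum check on the original could look roughly like the following 
Python sketch.  The metadata layout (a dict with a "base_sha256" key), the 
choice of SHA-256, and the function names are assumptions made for 
illustration, not part of any existing tool.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def is_expected_base(candidate: bytes, patch_meta: dict) -> bool:
    """True only if `candidate` is bit-for-bit the package the patch was built against."""
    return sha256_hex(candidate) == patch_meta["base_sha256"]

# Toy stand-ins for package files.
original_rpm = b"original ISO package bytes"
custom_rebuild = b"locally rebuilt package bytes"

# Hypothetical metadata shipped alongside the patch.
patch_meta = {"base_sha256": sha256_hex(original_rpm)}

accepted = is_expected_base(original_rpm, patch_meta)
rejected = not is_expected_base(custom_rebuild, patch_meta)
```

A custom or third-party package simply fails the check, so the tool would ask 
for the original package file instead of patching the wrong bits.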

And if you need the originals, then you can download the original plus the 
diff, and the update tool will still reconstruct the update for you.  

Like I said, let's see how it's working for SuSE users.

Elliot Lee mentioned having built such a tool; I'd like to see it myself.

But, again, to achieve the best savings in storage and bandwidth we're not 
talking about a full delta of the entire package (or the entire payload); 
we're talking about file-by-file deltas: if the file didn't change, it 
doesn't get transmitted.  If it did change, transmit the delta.
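
A small Python sketch of the file-by-file idea, with a package payload modeled 
as a dict of path to contents.  The names build_update and apply_update and 
the update layout are hypothetical, and for brevity changed files are carried 
whole here, where a real tool would carry a binary delta for each one.

```python
def build_update(old_files: dict, new_files: dict) -> dict:
    """Build side: keep only what differs from the original payload."""
    changed = {path: data for path, data in new_files.items()
               if old_files.get(path) != data}
    removed = sorted(path for path in old_files if path not in new_files)
    return {"changed": changed, "removed": removed}

def apply_update(old_files: dict, update: dict) -> dict:
    """Client side: original payload plus update gives the new payload."""
    result = {path: data for path, data in old_files.items()
              if path not in update["removed"]}
    result.update(update["changed"])
    return result

old_payload = {
    "/usr/bin/foo": b"binary v1",
    "/usr/share/doc/foo/README": b"readme text",
    "/etc/foo.conf": b"option=1\n",
}
new_payload = {
    "/usr/bin/foo": b"binary v2",                   # changed
    "/usr/share/doc/foo/README": b"readme text",    # unchanged, not transmitted
    "/usr/share/doc/foo/NEWS": b"what changed\n",   # new file
}

update = build_update(old_payload, new_payload)
rebuilt_payload = apply_update(old_payload, update)
```

Here the unchanged README never appears in the update at all, which is where 
most of the savings would come from for a typical bugfix release.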

We do the same thing with source code patches now, particularly if you use 
CVS, BitKeeper, or Subversion.  The savings in bandwidth have proven 
substantial in those areas.

And I'm talking just as much about saving server bandwidth, too, which is not 
cheap.  I know; I pay for bandwidth.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu




