Request for Comments: updating RPMs using binary deltas.
Lamar Owen
lowen at pari.edu
Fri Jan 9 01:08:56 UTC 2004
On Thursday 08 January 2004 04:19 pm, Alan Cox wrote:
> Problem 1: Gzip files don't rsync well. bzip2 files I'm not sure of the
> situation. Rusty did work with rsyncable dpkg files and along with Tridge
> hacked up a gzip library that generated slightly larger rsyncable files.
> That change was tested ages ago in rpm and broke stuff so went back out.
> I don't know if anyone ever sat down and debugged it in full
The difference that I'm proposing is to generate the diff on the build server,
not the update server. The build server (or even a dedicated diff server) would
have a repository of originals, and as each fresh update package is produced the
diff against the unpacked original is generated. The resulting diff is signed
and checksummed. Then it's uploaded to the update server, prediffed and presigned.
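
To make the build-time step concrete, here is a minimal sketch in Python, assuming both the original and the update package have already been unpacked into two directory trees. The function names and the manifest layout are hypothetical, not part of rpm or any existing tool, and signing is left out because it depends on whatever tooling the build system already uses.

```python
import hashlib
import os


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = sha256_of(full)
    return digests


def build_delta_manifest(original_root, update_root):
    """List what changed between two unpacked package trees.

    "changed" holds files that are new or whose content differs, with the
    digest of the updated version; "removed" holds files that are gone.
    """
    old = tree_digests(original_root)
    new = tree_digests(update_root)
    return {
        "changed": {p: d for p, d in sorted(new.items()) if old.get(p) != d},
        "removed": sorted(p for p in old if p not in new),
    }
```

The point of the sketch is only that this work happens once per update on the build side; the update server never computes anything and just serves the finished files.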
> Problem 2: Where do you get the original package from ? The CD has been
> one suggestion but JBJ pointed out that you can assemble an approximation
> of the original package from the on disk data in most cases. The config
> files might be a little different but most of the content is basically the
> same.
Then you run afoul of the problem Jef brought up. We can't make the user
downgrade to upgrade; thus, we must have the original RPM available (or
struggle with an unmanageable plethora of permutations of packages).
Available can mean the install media; it could mean a few GB of space on the
user's HD, if they chose to keep a local repository of the RPMs that they
installed (which would have to be updated as new RPMs are installed, or else
be a full copy). It could mean a download of the original off the update
server if the user simply cannot find the original RPM, in which case
the advantage is negated. They should learn to keep the CD or other local
repository around anyway to be able to roll back errant updates.
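
The lookup order described above (install media, then a local repository, then a download as the last resort) could be sketched roughly as follows. The function name and the idea of passing an ordered list of directories are assumptions for illustration only; nothing here is an existing rpm or updater interface.

```python
import os


def find_original(rpm_filename, search_dirs):
    """Return the first local copy of the original RPM, or None.

    search_dirs is tried in order, e.g. the mounted install media first
    and a local repository directory second. None means no local copy
    was found, so the original would have to be downloaded, which
    negates the saving from using a diff.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, rpm_filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```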
> Problem 3: Server resources. The rsync computation clobbers the server
> compared to the overhead of just spewing bits. Given people are running
> 3000 vsftp sessions in parallel off big servers that is a concern.
The diff would not be done real-time. That would blow out all the advantages
of doing the diffs in the first place. The diff would be done at build time,
not server time, for the updates.
The reason I mentioned rsync at all is because it can produce an incremental
local backup of changed files very easily, which then can be packaged and
uploaded to the server. I was not intending or proposing the use of rsync to
be the wire protocol between the update server and each user. Sorry if I
misled there.
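
As a rough illustration of "collect the changed files, then package them", here is a sketch in Python. It is a stand-in for the rsync step, not a description of how rsync works, and the function name and tarball format are assumptions. It expects the archive path to lie outside both trees.

```python
import filecmp
import os
import tarfile


def package_changed_files(original_root, update_root, archive_path):
    """Write files that are new or differ in update_root to a gzip tarball.

    Returns the sorted list of relative paths that were packed.
    """
    packed = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for dirpath, _dirnames, filenames in os.walk(update_root):
            for name in sorted(filenames):
                new_file = os.path.join(dirpath, name)
                rel = os.path.relpath(new_file, update_root)
                old_file = os.path.join(original_root, rel)
                # shallow=False forces a byte-for-byte comparison
                unchanged = os.path.isfile(old_file) and filecmp.cmp(
                    old_file, new_file, shallow=False
                )
                if not unchanged:
                    archive.add(new_file, arcname=rel)
                    packed.append(rel)
    return sorted(packed)
```

The resulting tarball is what would get signed and uploaded; the user's machine only ever fetches it as a static file.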
--
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC 28772
(828)862-5554
www.pari.edu