Heads-up: brand new RPM version about to hit rawhide

Wed Jul 16 08:05:57 UTC 2008

Andreas Ericsson <ae <at> op5.se> writes:
> > Are you trying to imply that KDE has "extremely poor project policy"?
> 
> If the code corresponding to a particular version can be changed once
> released publicly, then yes.

What's your definition of "publicly"? KDE's definition of a public release is 
when the tarballs are synced to official KDE mirrors. But we packagers get 
access to the tarballs a few days earlier, and of course the releases are 
tagged in SVN before the first tarballs are produced. Any respins happen in the 
time period when the tarballs are available to us packagers, but not the 
general public (in principle... of course with packagers working in public SCMs 
like our CVS, source and binary packages tend to leak out early, but that's 
another can of worms). So as far as KDE is concerned, the versions are 
not "released publicly" when the changes happen, but as they _are_ released to 
us (packagers), we have to be able to handle the possibility of a respin.

> Otherwise you can have kde-4.0.0.tar.gz with one particular bug and
> kde-4.0.0.tar.gz next week where that bug is missing (but something
> else is broken).

In KDE's case, you won't get the old tarball, at least not from KDE's mirrors.

As for packages, they can already contain patches, so whether 4.0.0-1 and 
4.0.0-2 differ by a patch or by a respun tarball doesn't make much of a 
difference in practice.

That said, there are other upstreams (usually small projects) which respin 
tarballs even after they were uploaded, hoping nobody noticed the broken 
version. ;-)

> Calling the first package kde-4.0.0rc1.tar.gz would make sense though,

Except there was actually an RC1, and it went through the same "prerelease to 
packagers, respin, release" process.

> If you have a history looking like this:
> 
> A--B--C--D
>        \
>         E

You can't have such a history in a centralized SCM. You can branch off C, but 
the branch then lives at a different location (= URL in SVN's case, where 
branches are just directories) to which C is copied (= how SVN branches). At a 
given SVN URL, you only see A-B-C-D or A-B-C-E as the history. That's kinda the 
definition of "centralized". Therefore, URL+revision uniquely identifies an SVN 
fileset.

In addition, SVN assigns revision numbers that are unique across the entire 
repository, so the actual commits for D and E will have different commit IDs. 
The IDs are totally ordered by the time they were committed to the central 
repository, which also implies that the relevant IDs for a branch are totally 
ordered. For example, if you specify the branch as the URL and any revision 
from C up to (but not including) D, you'll get the same files as if you specify 
the revision number for C.
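
To make the URL+revision argument concrete, here is a toy model in Python. It 
is only a sketch and not how SVN is implemented: the `Repo` class, its methods 
and the paths "trunk" and "branches/x" are made up for illustration, and a 
copy here simply starts a fresh history instead of keeping the source's 
history as real SVN does.

```python
from bisect import bisect_right


class Repo:
    """Toy centralized repository with repository-wide revision numbers."""

    def __init__(self):
        self.head = 0       # last revision number handed out
        self.history = {}   # path -> list of (revision, fileset label)

    def commit(self, path, fileset):
        """Record a new fileset at `path` under the next global revision."""
        self.head += 1
        self.history.setdefault(path, []).append((self.head, fileset))
        return self.head

    def copy(self, src, dst):
        """Branching = copying: `dst` starts out with the latest fileset of `src`."""
        _, fileset = self.history[src][-1]
        return self.commit(dst, fileset)

    def lookup(self, path, revision):
        """Return the fileset at `path` as of `revision` (last change <= revision)."""
        entries = self.history.get(path, [])
        revs = [r for r, _ in entries]
        i = bisect_right(revs, revision)
        if i == 0:
            raise KeyError(f"{path} does not exist at r{revision}")
        return entries[i - 1][1]


repo = Repo()
a = repo.commit("trunk", "A")              # r1
b = repo.commit("trunk", "B")              # r2
c = repo.commit("trunk", "C")              # r3
branch = repo.copy("trunk", "branches/x")  # r4, branch created from C
e = repo.commit("branches/x", "E")         # r5
d = repo.commit("trunk", "D")              # r6

print(repo.lookup("trunk", 5))       # C  (r4 and r5 did not touch trunk)
print(repo.lookup("trunk", d))       # D
print(repo.lookup("branches/x", e))  # E
```

D and E get different numbers (r6 and r5), and asking for trunk at r4 or r5 
gives the same files as asking for r3.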

> True that. Current maths suggests that with the current commit-tempo to the
> kernel (10487 commits between 2.6.25 and 2.6.26, most of them merges), we'll
> run into the first SHA1 collision a mere 16 billion years after the
> calculated end of the universe. I can see how that's a real problem....

All this is probabilistic, so nothing guarantees we won't run into a collision 
today, as unlikely as it is. I consider this a major design flaw in the concept 
of distributed SCMs.
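
For anyone who wants to put a number on "unlikely", here is a rough sketch 
using the standard birthday-bound approximation. It assumes SHA-1 behaves like 
a uniformly random 160-bit function and only covers accidental collisions; the 
function name is made up, and the 10487 figure from the quote counts commits 
only, not every object in a repository.

```python
import math


def collision_probability(n_objects, hash_bits=160):
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes hash values are uniformly distributed (an idealization).
    """
    exponent = -n_objects * (n_objects - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)


# Commit count quoted above for 2.6.25..2.6.26 (an assumption about scope:
# trees and blobs are hashed too, so the real object count is larger).
p = collision_probability(10487)
print(p)  # tiny, but strictly greater than zero
```

The result is astronomically small but never exactly zero for two or more 
objects, which is the whole point: it is a probability, not a guarantee.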

        Kevin Kofler