Supporting EPEL Builds in Koji

Thu Jul 10 17:12:16 UTC 2008

Mike Bonnet wrote:
> Hi.  I've written up a proposal for a way to support EPEL builds in
> Koji.  It's not the only way we could do this, but I think it's doable
> with a reasonable amount of effort, and has the side-effect of greatly
> simplifying the Koji setup process for a lot of people (by removing the
> need to bootstrap/import an entire distro of packages into your private
> Koji instance).  You can view the proposal here:
> 
> http://fedoraproject.org/wiki/Koji/EPELSupport
> 
> It's fairly detailed regarding the data model changes necessary, so if
> you're not familiar with the Koji codebase you can skip those parts.
> Questions and comments welcome.
> 

Hi Mike,

good to see you've spent some time on this, whereas I have been lazy in
Littleton (holiday).

I'd like to share a few thoughts on the Wiki page, which is a great start:

From the Wiki page: "There is a strong feeling that if a package exists
in the Koji-managed local repo (whose contents the Koji admin has full 
control over) it should always be preferred over the external repo 
(whose contents the Koji admin may have little or no control over)."

The preference Koji gives to local packages when populating the
buildroot might introduce a problem: if a custom-built package foo-1.0
is used in the buildroot and upstream updates to foo-1.1, the running
nodes would update to foo-1.1 whereas the buildroot would still use the
custom foo-1.0...

The point being that these updates have to be managed as they are
released. They need to be managed either on the side where said packages
are being mashed into a repository (infra side) or on the side where
they are applied (client side).

You can see the duplicate effort when the updates are managed on either 
side (infra or client), _and_ in koji, separately.

I would like to suggest that the Koji development team make the package
preference Koji is going to use a configurable item. Compared to the
bigger picture this isn't all that much of a priority, just something to
think about.
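
To illustrate what I mean by a configurable preference, here is a
minimal Python sketch. It is not Koji code: the Candidate class, the
pick() function and the "prefer" setting are all hypothetical names I
made up for this mail, and it only models the two policies "local wins"
and "external wins".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A package that could end up in the buildroot (hypothetical model)."""
    name: str
    version: str
    origin: str  # "local" (Koji-managed) or "external" (remote repo)


def pick(candidates, prefer="local"):
    """Return one candidate per package name, honouring the preference.

    `prefer` stands in for the configurable item suggested above; it is
    an assumption of this sketch, not an existing Koji option.
    """
    chosen = {}
    for cand in candidates:
        current = chosen.get(cand.name)
        # Take the first candidate seen, and replace it only when the new
        # one comes from the preferred origin and the current one does not.
        if current is None or (cand.origin == prefer and current.origin != prefer):
            chosen[cand.name] = cand
    return chosen


candidates = [
    Candidate("foo", "1.1", "external"),
    Candidate("foo", "1.0", "local"),
    Candidate("bar", "2.0", "external"),
]
```

With prefer="local" the custom foo-1.0 ends up in the buildroot; with
prefer="external" the upstream foo-1.1 does. A package that exists in
only one place (bar here) is used regardless of the setting.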

Additionally, I'd like to comment on / ask about the proposed database
changes for the tag_config table. In an attempt to show you what I was
thinking, here are a number of questions:

From the Wiki page: "At repo creation time, the repodata will be
retrieved from the processed url and merged with the local repodata as 
described above. This single repo will then be used for subsequent 
builds against the tag"

Do I understand correctly that one can only give a single repository URL
to a given tag? Does this mean that a tag is created for (example)
"dist-el5" with a remote repository URL, and then "dist-el5-updates"
with another remote repository URL? That would mean the build target has
to use dist-el5-updates, which in turn inherits dist-el5, right? Which
then implies that either metadata needs to be imported for
dist-el5-updates or inheritance can only be applied at build time...
right?

I guess the question is basically: how does Koji handle tags with a
combination of remote URLs and inheritance?
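
To make the question concrete, this is one behaviour I could imagine,
sketched in Python. Everything here is an assumption for the sake of
discussion: the tags dictionary, the single "parent" per tag (real
inheritance may well be richer than that), the repo_urls() function and
the example.com URLs are all made up, and the proposal may intend
something quite different.

```python
def repo_urls(tag, tags):
    """Collect remote repo URLs along the inheritance chain, child first.

    `tags` maps a tag name to {"parent": <tag or None>, "url": <str or None>}.
    This layout is hypothetical and only serves to phrase the question.
    """
    urls = []
    seen = set()
    while tag is not None and tag not in seen:  # guard against loops
        seen.add(tag)
        info = tags[tag]
        if info.get("url"):
            urls.append(info["url"])
        tag = info.get("parent")
    return urls


tags = {
    "dist-el5": {
        "parent": None,
        "url": "http://mirror.example.com/el5/os/",
    },
    "dist-el5-updates": {
        "parent": "dist-el5",
        "url": "http://mirror.example.com/el5/updates/",
    },
}
```

In this reading, a build against dist-el5-updates would merge the
updates repodata first and the base repodata second, while a build
against dist-el5 would only see the base repository.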

From the Wiki page: "Right now that (rpminfo) table enforces uniqueness
of (name, version, release, arch)."

I see that Koji does not use the complete package NEVRA for uniqueness.
This may become a problem when duplicate NVRAs occur, which is quite
likely: a local rebuild with the release number bumped might collide
with upstream doing the same release bump. That is where the epoch is
often used, as upstream has clear guidelines for epoch bumps which,
hopefully, make them occur in special circumstances only and thus
greatly reduce the chance of a colliding NEVRA. I like the proposed
uniqueness of NVRA-namespaces as well, don't get me wrong ;-)
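
A small Python sketch of the collision I have in mind. The package
values and the "namespace" field are invented for illustration, and the
three key functions are hypothetical; they only show which choice of
key tells the two packages apart.

```python
# Two different packages that happen to share name, version, release, arch.
local_pkg = {
    "name": "foo", "epoch": None, "version": "1.0",
    "release": "2.el5", "arch": "x86_64", "namespace": "local",
}
external_pkg = {
    "name": "foo", "epoch": 1, "version": "1.0",
    "release": "2.el5", "arch": "x86_64", "namespace": "external",
}


def nvra(pkg):
    """Key matching the uniqueness quoted from the Wiki page."""
    return (pkg["name"], pkg["version"], pkg["release"], pkg["arch"])


def nevra(pkg):
    """Same key with the epoch included."""
    return (pkg["name"], pkg["epoch"], pkg["version"], pkg["release"], pkg["arch"])


def namespaced_nvra(pkg):
    """NVRA made unique per namespace, as I read the proposal."""
    return (pkg["namespace"],) + nvra(pkg)


collides_on_nvra = nvra(local_pkg) == nvra(external_pkg)
collides_on_nevra = nevra(local_pkg) == nevra(external_pkg)
collides_in_namespace = namespaced_nvra(local_pkg) == namespaced_nvra(external_pkg)
```

Only the plain NVRA key treats the two as the same package. Including
the epoch helps only when the epochs actually differ, whereas the
namespace keeps them apart in every case.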

The other thing (and probably the last thing for now) I'd like to ask
is: for reproducibility purposes, how viable would it be to have Koji
automatically import the remote RPM (the file and all the data) when it
is used from the remote repository? This may or may not be a
configurable option. It saves work for admins compared to the current
situation and preserves reproducibility under all circumstances: the
automatically imported RPMs are added to the appropriate tags and kept,
whereas upstream only keeps two versions in the repository... Though I
understand it 1) consumes space and 2) isn't helpful for the EPEL case,
I think this is particularly useful for long-term supported appliance
software. Just wondering here ;-)
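
Roughly what I picture, as a Python sketch using only the standard
library. The archive_rpm() function, the storage layout, the in-memory
index and the tag name are all hypothetical, and the "RPM" below is
just a placeholder file, so this says nothing about how Koji would
really import a package.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def archive_rpm(fetched: Path, store: Path, tag: str, index: dict) -> Path:
    """Keep a copy of an RPM that was pulled from a remote repo.

    The file is stored under a directory named after the first two hex
    digits of its SHA-256 digest and recorded under `tag` in `index`.
    """
    digest = hashlib.sha256(fetched.read_bytes()).hexdigest()
    dest = store / digest[:2] / fetched.name
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():  # do not copy again on a repeated import
        shutil.copy2(fetched, dest)
    index.setdefault(tag, {})[fetched.name] = digest
    return dest


# Demonstration with a placeholder file standing in for a fetched RPM.
workdir = Path(tempfile.mkdtemp())
fetched = workdir / "foo-1.0-1.el5.x86_64.rpm"
fetched.write_bytes(b"placeholder payload, not a real RPM")
index = {}
stored = archive_rpm(fetched, workdir / "store", "dist-el5-build", index)
```

The recorded digest would let a later rebuild check that it is using
exactly the file the original build used, even after upstream has
dropped that version from its repository.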

Let me know what you think,

Kind regards,

Jeroen van Meeuwen
-kanarip