thoughts about secondary architectures

David Woodhouse dwmw2 at infradead.org
Sat Jun 2 10:45:06 UTC 2007


On Sat, 2007-06-02 at 05:42 +0200, Lennert Buytenhek wrote:
> My view is that it's clear that most of the people hacking on Fedora
> and using Fedora care only about x86/x86_64 systems, and that I (and
> the other people who are interested in secondary architectures)
> should try as much as possible to avoid making the lives of the x86
> people difficult, if we ever want to have a chance of getting our
> patches merged without pissing everyone else off.

I think you do us a disservice here. I hope that getting sensible
patches merged should always be easy, regardless of how the build system
works. There aren't many of us who'll start refusing your patches just
because your build system pissed us off, hopefully :)

We're trying to make it easier for people to build Fedora for new
architectures. As you've so capably demonstrated, it's already
_possible_ to do that -- but we want to make it less painful, and in
particular we want to help you keep in sync with Fedora during the
development cycle. (At least, that's what I _think_ we're trying to
achieve -- Spot's document itself lacked any explicit rationale.)

Yes, we want to keep the impact on the package maintainers minimal. But
that doesn't mean it has to be entirely zero. If we want _zero_ impact,
then you might just as well keep building it for yourself as you already
are.

> While it is very well possible that there is some bug in a package
> that does not surface on x86, 99.9% of the Fedora developers are
> unlikely to care about that 

(Actually, I think there are many more than 1 in 1000 who are more
conscientious than that and do care about portability. We're
not _that_ lackadaisical, as a rule.)

> if the package builds OK on x86 and no ill effects are seen on x86.

Nevertheless, in the case where a package _used_ to work on ARM and the
updated version suddenly doesn't build, don't you think that warrants at
least a _glance_ from the package maintainer to see if it's actually a
_generic_ issue which just happens to bite on ARM today for some
timing-related or other not 100% repeatable reason?

That glance is _all_ that should be required -- package maintainers
should _definitely_ have the option of just pushing a button to say they
don't care, and shipping the package anyway on all the architectures for
which it _did_ build. We don't want to make life hard for them; you're
right. But we don't necessarily want them to ignore failures which could
show up a real problem, either.

We could see the builds on other architectures as free testing. They
often _do_ show up issues which are generic, and not just arch-specific.
Especially in the cases where the package in question _used_ to build OK
on that architecture -- which is all we'd be expecting the package
maintainers to notice in the general case.
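
To make "used to build OK" concrete, here is a minimal sketch in Python of
the check I have in mind. The function name and the per-architecture result
dictionaries are made up for illustration; this is not how koji or any
existing build system represents things. It only flags architectures where
the previous build succeeded and the new one failed, so an architecture
which never built the package generates no noise for the maintainer.

```python
def regressions(previous, current):
    """Return the architectures that built before but fail now.

    Both arguments are hypothetical {arch: built_ok} mappings; an
    architecture missing from `previous` is treated as never having built.
    """
    return sorted(
        arch
        for arch, built_ok in current.items()
        if not built_ok and previous.get(arch, False)
    )


if __name__ == "__main__":
    previous = {"i386": True, "x86_64": True, "ppc": True, "arm": True}
    current = {"i386": True, "x86_64": True, "ppc": True, "arm": False}
    # Only "arm" is reported: it used to build and now does not.
    print(regressions(previous, current))
```

A failure on an architecture that never built the package is deliberately
not reported here; that would be the porters' problem, not the package
maintainer's.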

> From a purely technical point of view I would advocate that a build
> failure on any architecture fails the package build, but there will
> only have to be 3 or 4 cases where some gcc ICE causes some package
> to fail to build on some secondary architecture but build fine on
> x86 and the x86 people will hate us forever afterwards, and will
> eventually start clamoring that getting all these secondary
> architectures on board was a bad idea to begin with.  (Which, of
> course, would be totally understandable at that point.)

I don't think it would be understandable at all. It's not as if it would
take them long to glance at the failure and click the "don't care"
button. (Well, OK, we'd want a bug filed, but there can be automated
assistance with templates for that, even though it's a bad idea to have
it all done _completely_ automatically).

If we think GCC is going to be unstable on some platforms, then perhaps
the 'complete rebuild in mock' process which Matt Domsch has been doing
should be made mandatory? That would generally help to catch such
compiler-related problems before they affect package maintainers.

> There is a similar issue with build speed.  While my fastest ARM box
> (800 MHz with 512K L2 cache) is quite snappy as far as ARM systems go,
> it is probably no match for even the crappiest of x86 boxes.  The
> fastest ARM CPU I know of is a dual core 1.2GHz, which is still no
> match for x86.
> 
> This doesn't mean, IMHO, that it makes no sense to run Fedora on ARM
> systems.
> 
> But it does mean that if the building of packages on primary
> architectures is throttled at the speed of building packages on ARM,
> we're going to make a lot of (x86) Fedora developers very sad, angry,
> frustrated, or all of the above.

Not really. The builds for the architecture they _care_ about would be
available in koji from the moment they finish, and with the
'chain-build' option they'd be able to build subsequent dependent
packages immediately too. The only thing that would be waiting is the
push to the main repository... which also in practice waits for mirrors
to sync and other stuff like that. It's hardly a fast-path.

If there are architectures which are _really_ slow, and it really does
start to cause problems, then _perhaps_ we'd need to stop waiting for
those architectures. 

I think we should try to avoid that unless it's really necessary though.
Not only would we be pushing partially-failed packages without any
investigation, but you'd also start getting build failures and
inconsistencies on that architecture even when there wasn't actually a
real problem -- if a developer doesn't use chain-build but just submits
jobs one after the other, before the first job has finished on all
architectures. The repository for that architecture wouldn't really be
in sync with Fedora at all -- you wouldn't have gained much over the
current situation where you're entirely on your own.

Perhaps one way to deal with this potential problem is to allow the
package maintainer to push the 'go ahead and push it anyway' button in
the build system even _before_ the build has run to completion on every
architecture?

That way, the build would _normally_ wait for everyone to finish and the
repositories would remain in sync, and potential bugs would get at least
a cursory glance before the package is shipped -- but in the fairly rare
case where there's an urgent need for it in the actual repositories, the
package maintainer could speed things up.
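
As a rough sketch of that policy (in Python; the state names, the `waived`
set standing in for the "don't care" button and the `push_early` flag are
all invented for illustration, not taken from any real build system):

```python
from enum import Enum


class State(Enum):
    BUILDING = "building"
    OK = "ok"
    FAILED = "failed"


def may_push(results, waived=frozenset(), push_early=False):
    """Decide whether a build may be pushed to the repository.

    results    -- hypothetical {arch: State} mapping for one build
    waived     -- arches where the maintainer pressed "don't care"
    push_early -- maintainer asked not to wait for unfinished arches
    """
    # There must be at least one architecture to ship the package on.
    if not any(state is State.OK for state in results.values()):
        return False
    for arch, state in results.items():
        # A failure blocks the push until the maintainer has waived it.
        if state is State.FAILED and arch not in waived:
            return False
        # By default, wait for every architecture to finish.
        if state is State.BUILDING and not push_early:
            return False
    return True
```

The default path waits for everyone and requires an explicit decision on
each failure; the early push is only taken when the maintainer asks for it.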

(Presumably they'd need to have a way to force the mirrors to sync up
immediately too, if they're in this much of a rush? Something which has
never been brought up as an issue before, AIUI.)

> So, IMHO, ideally, the existence of secondary architectures should
> not significantly affect the typical workflow of an x86 Fedora
> developer, and secondary architectures should not negatively affect
> development on x86.

This is true, but taken to extremes it means we may as well not bother
trying to make life easier for you at all.

I think the whole point of the proposal is that there _are_ things we
can do, which are simple enough for us, which will help you a lot.
Should we refuse even to lift a finger to merge your patches, just
because we're too lazy? Despite the fact that you then have to work "two
or three times as hard" because you have to work around our
recalcitrance? If so, then why are we bothering with this proposal at
all? You might as well just keep doing it on your own, surely?

I think that if we're going to bother doing _anything_, the least we
should do is merge your patches and keep the build system in sync in the
_default_ case. Yes, package maintainers should have the option not to
care about builds which fail on ARM -- but any competent maintainer
should at least be taking a _cursory_ look at any new failure.

If we _really_ have to, we could have an option not to wait for the
build to complete -- but using that should be discouraged except in very
special cases. As I said, it's not as if packages making it to the
mirrors is a fast path.

-- 
dwmw2



