RFC: Page size on PPC/PPC64 builders

David Woodhouse dwmw2 at infradead.org
Mon Mar 3 12:30:51 UTC 2008


On Mon, 2008-03-03 at 00:31 -0500, Tom Lane wrote:
> Random people occasionally offering machines by means of the mailing
> list is not what I call an organized infrastructure.  At minimum there
> needs to be easily-findable information on the fedoraproject wiki about
> how to obtain access to test machines.

Yes, there should. For _all_ architectures, not just secondary
architectures. (In fact, the whole distinction between primary and
secondary architectures is IMHO a bad idea and should be dropped.)

> As for
> 
> > That's horseshit.  Complete and utter horseshit.  If the primary
> > package maintainer doesn't care about a particular secondary
> > architecture then it's no skin off their nose to simply ignore it.
> 
> I'm going to call horseshit on you.  Portability problems are frequently
> deep enough to require the skills of the primary package maintainer,
> or even the key upstream developers. 

You assume that the primary package maintainer has the skills necessary
to maintain and debug the package, and will apply them. That isn't
necessarily true, although thankfully it is _usually_ true.

So you're both right. The package maintainer _can_ simply ignore a
failure, add an ExcludeArch (with associated bug on the relevant
tracker), and move on. Or they can deal with it properly. What actually
happens will vary from package to package -- or even from day to day.

Some packagers are very conscientious and we can leave really hairy
problems to them even when they _are_ arch-specific. And that's great.

But some packagers don't even understand the language their package is
written in -- or even if they do, they don't have the time or the
inclination to maintain the package properly.

So occasionally we find an ExcludeArch which is due to a trivial bug in
the code, but the packager doesn't even seem to have _looked_ at it, and
has just excluded the offending architecture from the build¹.

If _every_ packager fell into the latter category, the arch SIG would
have an impossible task; I agree. But if every packager fell into that
category, then Fedora wouldn't be worth bothering with anyway.

As it is, there are enough of the 'good' maintainers out there that it
seems to work quite well. Even the packagers who don't understand the
language are usually very good at finding help when they need it; either
from upstream or from random Fedora hackers including the arch SIG where
necessary.

> I think that a secondary arch's SIG can probably be expected to detect
> portability issues, but asking them to take full responsibility for
> solving them is a project design doomed to failure. 

I wouldn't say we take 'full responsibility'. I'd maybe say we take the
'final responsibility' for it. That is -- if other people don't do it,
it falls to us. But thankfully they don't _all_ leave it to us. That
wouldn't work out very well; you're right.

So we do the 'fun' parts like porting OCaml/Modula-3/etc. which really
are arch-specific tasks. We also help out by giving pointers like "look
at how 64KiB pages affect stack allocation" to maintainers who are
having problems. And when maintainers give up on a problem and add an
ExcludeArch, we pick up the pieces -- usually discovering in the end
that it's actually a generic bug.
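
To illustrate the kind of pointer I mean, here is a minimal sketch in C. It
assumes the failure comes from code with 4096 baked in as the page size,
which is only one of the ways a 64KiB-page builder can trip a package up.
The helper names runtime_page_size() and round_up_to_page() are made up for
this example; sysconf(_SC_PAGESIZE) is the standard POSIX call.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ask the system for the page size at run time
 * instead of assuming 4096. */
static size_t runtime_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0);
    return (size_t)sz;
}

/* Hypothetical helper: round len up to a whole number of pages.
 * page must be a non-zero power of two. */
static size_t round_up_to_page(size_t len, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (len + page - 1) & ~(page - 1);
}
```

With 64KiB pages, round_up_to_page(1, runtime_page_size()) comes out as
65536 rather than 4096, so a stack or guard region sized this way stays
page-aligned on either kind of builder.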

I do most of that work for PowerPC, and it takes a relatively small
proportion of my time to keep the FE-ExcludeArch-ppc bug mostly empty.
There just _aren't_ that many problems in decent software. And most of
what we ship is decent software.

I believe that the situation we have at the moment is working quite
well. I'm concerned that if we ever let builds succeed overall when they
fail on PPC, as has been discussed, then we will tip the balance closer
to your 'doomed to failure' scenario. But we'll deal with that if/when
we come to it.

Getting back to $SUBJECT, one of my concerns about continuing to use
64KiB pages on the PPC builders is that we'll make more people _want_
that situation to come about, where a failure on PPC doesn't even stop
their i386/x86_64 build and they don't even have to _look_ at it. (Not
that they really _do_ have to look at it right now anyway; if they're
_really_ lazy they can just add an ExcludeArch and move on without
bothering to look.)

> Even more to the point, portability bugs are usually "real" bugs, as
> was already noted upthread.  Any self-respecting package maintainer
> *should* be expected to take an interest in them. 

That is, unfortunately, not universally agreed. Some people object to
the idea that we might expect packagers to actually look at bugs in
their packages. They think we'll frighten away the potential volunteers
if we make unreasonable demands like expecting them to actually maintain
the code they want us to ship.

I'm concerned by that, but thankfully we don't have many packagers who
really are that lax. 

> But how can she, without access to test machines? 
> A secondary arch that can't provide developers with test machines
> is not worth being taken seriously.

I am not aware of anyone else having problems with lack of access to
test machines, and I apologise that you had such a problem. We should be
putting a less ad-hoc arrangement in place some time very soon --
although as a Red Hat employee, you already _had_ access to a system
which even gave you full root access to such test boxes, instead of just
an account with mock privileges which is what's being set up.

-- 
dwmw2


¹ We have one package in Fedora/i386 which is excluded from x86_64, ppc
and ppc64 because it fails its own self-tests on all three. I fully
expect it to stop working on i386 too if the wind changes, but still we
ship it. And it scares me.



