Noarch subpackage problem
Toshio Kuratomi
a.badger at gmail.com
Thu Feb 26 06:45:22 UTC 2009
Toshio Kuratomi wrote:
> Florian Festi wrote:
>>> But building things as arch specific subpackages when they could be
>>> noarch is a feature that costs us what exactly? A bit of space?
>> I think you underestimate the amount of noarch data in the distribution.
>> From approximately 40 GB of content (in files) about 29 GB are not arch
>> dependent while there are only 10.7GB of binaries and libraries. Even if
>> you assume that the noarch content is compressed twice as well as the
>> binaries the larger part of the distribution is still packaged noarch
>> content. Right now about half of the noarch content already is in
>> regular noarch packages.
>>
>> So even if we only save about 10GB of mirror space for one release for
>> now it sums up over time for updates and new releases. I even expect
>> that the percentage of noarch content is increasing in the future and
>> every new supported architecture will automatically gain from the work
>> done (if there are going to be any).
>>
>
> Ah... but I'm not talking about turning noarch subpackages off. I'm
> talking about increasing the level of checking that
> happens before a noarch subpackage is allowed. So how much of the
> content that you list in the 29GB is %doc? How much is scripting
> languages that we could decide to select on via path and filename
> extension? How much of those are headers that do not have timestamps or
> build hosts embedded into them? We have the capability to noarch
> subpackage all of those if we turn on an rpmdiff that does md5sum
> checking but can exclude those properties.
>
> I think we'll see substantial savings from allowing through things that
> meet a heuristic while still placing the burden of checking this onto an
> automated tool instead of a human.
>
And so that you can run a test...
I took the rpmdiff that is currently in the koji repo and modified it to
have a --lenient-hash option. --lenient-hash currently compares a hash
of things that are:
* not %doc (won't matter for program execution)
* not *.pyc or *.pyo (incompatible changes between builds of a noarch
sub-package should be caught by differences in the *.py file)
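To make the filtering idea concrete, here is a rough sketch in Python. It is not the code from the attached rpmdiff; the function name, the (path, is_doc, content) tuple shape, and the choice of SHA-256 are all assumptions made for illustration. It only shows the principle: skip %doc and Python bytecode, then hash whatever is left.

```python
import fnmatch
import hashlib

# Only *.pyc and *.pyo are named in the post; this tuple is where other
# patterns would go if more classes of files turn out to be safe to skip.
SKIP_PATTERNS = ("*.pyc", "*.pyo")


def lenient_digest(files):
    """Return one digest over the files that should match across arches.

    `files` is an iterable of (path, is_doc, content_bytes) tuples. That
    shape is hypothetical; a real tool would read the per-file flags and
    digests from the rpm header instead.
    """
    digest = hashlib.sha256()  # any stable hash works for this sketch
    for path, is_doc, content in sorted(files):
        if is_doc:
            continue  # %doc does not matter for program execution
        if any(fnmatch.fnmatch(path, pat) for pat in SKIP_PATTERNS):
            continue  # bytecode is skipped; the matching *.py is still hashed
        digest.update(path.encode("utf-8"))
        digest.update(hashlib.sha256(content).digest())
    return digest.hexdigest()
```

Two builds whose only differences are in %doc files or bytecode produce the same digest, while a change to any *.py file still shows up.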
So to test the effect, compare the difference between:
  rpmdiff -iT -iS -i5 [noarch package built on x86_64] [noarch package built on i386]
  rpmdiff -iT -iS --lenient-hash [noarch package built on x86_64] [noarch package built on i386]
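If you want to run both comparisons over a batch of package pairs, a small wrapper along these lines might help. This is only a sketch: it assumes the modified rpmdiff is on your PATH under the name "rpmdiff", and it assumes each reported difference is printed as one line of output, which I have not checked against every case. The helper names are made up.

```python
import subprocess

# Flags shared by both test commands.
BASE_FLAGS = ["-iT", "-iS"]


def rpmdiff_cmd(pkg_a, pkg_b, lenient, rpmdiff="rpmdiff"):
    """Build the argument list for one of the two test commands."""
    mode = ["--lenient-hash"] if lenient else ["-i5"]
    return [rpmdiff] + BASE_FLAGS + mode + [pkg_a, pkg_b]


def count_reported(output):
    """Count non-empty output lines (assumed: one line per difference)."""
    return sum(1 for line in output.splitlines() if line.strip())


def compare(pkg_a, pkg_b, rpmdiff="rpmdiff"):
    """Run both commands and return the number of lines each one reports."""
    results = {}
    for lenient in (False, True):
        proc = subprocess.run(
            rpmdiff_cmd(pkg_a, pkg_b, lenient, rpmdiff),
            capture_output=True,
            text=True,
        )
        key = "lenient-hash" if lenient else "ignore-md5"
        results[key] = count_reported(proc.stdout)
    return results
```

Calling compare() with the x86_64 and i386 builds of the same noarch subpackage returns a dict with one count per mode, so the two runs can be compared side by side.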
If we can identify other classes of files that create false positives
and can be filtered safely, we can add a heuristic for them and see
about getting this down even more.
-Toshio
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rpmdiff.mine
URL: <http://listman.redhat.com/archives/fedora-devel-list/attachments/20090225/733cab2d/attachment.ksh>