[Fedora-packaging] Re: UTF-8 package names

Toshio Kuratomi a.badger at gmail.com
Wed Feb 27 19:04:31 UTC 2008


Axel Thimm wrote:
> On Tue, Feb 26, 2008 at 06:52:24PM -0800, Toshio Kuratomi wrote:
>> Package names should follow upstream since attempting to transliterate or 
>> translate upstream names can't be done sanely on our side.  For things that 
>> map easily into the ASCII set (diacritic/accented characters, for instance, 
>> as found in Latin-1) a transliterated Provides can be added to make 
>> installation easier for ASCII-conditioned users but carrying this on to 
>> other scripts is a losing proposition.
> 
> We violate this rule with capitalization,
> python/perl/php/.... packages and what not and we'll stick with an
> upstream's name that seems to want to ship its project only in certain
> locales?
> 
You raise some good points.  Why do we change upstream WRT 
capitalization?  Probably usability.  What will we do if capitalization 
matters (i.e., foobar and FooBar are separate projects)?  We would not 
approve a package that has a conflicting name and would try to get either 
or both projects to rename upstream.  If the packages weren't renamed 
upstream (because, says upstream, FooBar and foobar are plainly different 
names), what would we do?  This is an unknown that seems very relevant here.

OTOH, there is a single straightforward, non-controversial mapping from 
uppercase ASCII to lowercase ASCII.  Transliteration and transcription 
from other scripts are not so blessed.  So the rationale for wanting to 
ban Unicode could be the same as for wanting to ban capitalization, but 
the ramifications are very different.
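
To make the contrast concrete, here is a minimal Python 3 sketch.  The 
helper names are made up for this example and are not part of any Fedora 
tooling.  Lowercasing ASCII has exactly one answer for every input; 
stripping diacritics by Unicode decomposition happens to work for 
Latin-1 style names but yields nothing usable for other scripts.

```python
import unicodedata


def ascii_lower(name):
    """Lowercase an ASCII-only name; every input has exactly one answer."""
    name.encode("ascii")  # raises UnicodeEncodeError if not pure ASCII
    return name.lower()


def strip_diacritics(name):
    """Hypothetical helper: decompose, then drop anything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_lower("FooBar"))           # foobar
print(strip_diacritics("café"))        # cafe
print(repr(strip_diacritics("北京")))   # '' (no ASCII form falls out)
```

Getting a real ASCII name for the last case requires choosing a 
romanization scheme, which is exactly the decision that is not ours to make.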

Changing module names is interesting.  Speaking for Python, we do two things:
1) We use "import foo" to determine what the package's upstream name is 
rather than the name of the tarball.  I don't consider this changing the 
name, as there are several equally valid ways to determine what the 
upstream name is, so we've just standardized on one.

2) We prepend "python-" to the name in most cases.  This is partially 
categorization and partially a namespacing issue.  Categorization is 
changing the name to make it easier for users to recognize the purpose 
of a package.  python-turboflot doesn't need the "python-" for anything 
other than making *people* aware that turboflot is a Python module. 
Namespacing is a valid issue, though.  python-json != php-json != 
perl-json even if they all use the equivalent of "import json" and 
distribute their binaries in tarballs called json-1.0.tar.gz.
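
As a rough illustration of the namespacing point, here is a small Python 
sketch.  The package_name function is hypothetical and only mirrors the 
prefixing convention described above; it is not how any packaging tool 
computes names.

```python
def package_name(language, module):
    """Hypothetical naming rule: prefix the importable module name with its language."""
    return "%s-%s" % (language, module)


# Three upstream projects that all call themselves "json".
names = [package_name(lang, "json") for lang in ("python", "php", "perl")]

print(names)            # ['python-json', 'php-json', 'perl-json']
print(len(set(names)))  # 3
```

The prefix adds information and keeps the three names distinct, whereas 
dropping it would collapse them into one.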

I think justifying banning Unicode with the module namespace as a 
precedent is illogical.  If anything, converting Unicode characters in a 
user's chosen script to an ASCII-ization is removing a means of 
categorizing the package by sight.  It will also lead to more namespace 
collisions rather than fewer.
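
The collision problem can be sketched in a few lines of Python 3.  The 
asciiize function below is a deliberately naive, hypothetical conversion, 
and the upstream names are invented purely to show the effect.

```python
import unicodedata
from collections import defaultdict


def asciiize(name):
    """Hypothetical lossy conversion: drop diacritics and any non-ASCII characters."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def collisions(names):
    """Group names by their ASCII form; keep only groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[asciiize(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Made-up upstream names, chosen only to show the effect.
upstream = ["rose", "rosé", "resume", "résumé", "foo"]
print(collisions(upstream))
# {'rose': ['rose', 'rosé'], 'resume': ['resume', 'résumé']}
```

Names that were distinct upstream end up competing for the same package 
name once the extra characters are thrown away.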

> I would instead propose a rule that says "if the transliteration to
> ASCII can't be done by the packager, he should contact upstream to
> provide one and use that".
>
If you said that upstream should always be in charge of transliterating, 
I think this rule would be better.  To use an example that people might 
know from non-computer life: what if upstream named their package 北京? 
One distribution might feel perfectly confident transliterating that as 
beijing while another one uses peking.  Having upstream manage 
transliteration pushes the decision to the correct level to coordinate 
and avoid confusion.
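
One way to picture "upstream manages transliteration" is the following 
Python sketch (it needs Python 3.7 or later for str.isascii).  The table 
and the ascii_provides function are hypothetical, and "beijing" is only 
an assumed upstream choice for the sake of the example.

```python
# Hypothetical table of ASCII names that upstream itself has published.
# "beijing" is an assumed choice for illustration only.
UPSTREAM_ASCII_NAMES = {"北京": "beijing"}


def ascii_provides(name, table=UPSTREAM_ASCII_NAMES):
    """Return upstream's ASCII name for a package, or None if there is none.

    The packager never invents a spelling; a missing entry means
    "ask upstream" rather than "guess".
    """
    if name.isascii():
        return name
    return table.get(name)


print(ascii_provides("北京"))  # beijing
print(ascii_provides("東京"))  # None
```

Every distribution consulting the same upstream answer would end up with 
the same ASCII name, which is the coordination I am after.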

-Toshio
