[Fedora-packaging] UTF-8 package names

Toshio Kuratomi a.badger at gmail.com
Tue Feb 26 18:25:45 UTC 2008


Today the Packaging Committee began a discussion of whether package 
names should be allowed to contain the full range of Unicode characters 
(encoded as UTF-8) or be restricted to the ASCII subset.

This is a somewhat contentious issue. The Packaging Committee members 
are split, but nothing is set in stone yet about how to proceed.

The main arguments seem to be:
Pro Unicode:
* Upstream knows best what name is most appropriate for their users. 
For us to change it locally in Fedora doesn't make much sense.
* We allow Unicode in every other piece of the spec so why not in the 
package name?
* We should be shaking out bugs in the handling of Unicode in our 
software rather than hiding issues with it.

Pro ASCII:
* Unicode package names are hard to type, which is a usability problem.
* Is there a limit?  Even if European letters are fine what about Kanji 
or Sanskrit?
* Some pieces of software won't handle Unicode package names and will 
need to be fixed.

One package has been submitted for review with a non-ASCII package 
name, and the review has some applicable comments:
   https://bugzilla.redhat.com/show_bug.cgi?id=261881
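
To make the "some software will need to be fixed" argument concrete, 
here is a minimal Python sketch of two things a tool has to cope with 
once names are no longer pure ASCII. It uses the name from the review 
above, écolier-court-fonts. The helper is_ascii_name is hypothetical and 
only for illustration; it is not taken from yum, rpm, or any Fedora 
tool. The sketch shows that the same visible name can be stored as two 
different code point sequences (Unicode NFC and NFD), which compare 
unequal unless the tool normalizes them first. Whether this is the 
cause of any specific bug mentioned in this thread is not established.

```python
import unicodedata

# Package name from the review request, written with an escape so the
# literal is unambiguously the precomposed form (U+00E9, "e with acute").
name = "\u00e9colier-court-fonts"


def is_ascii_name(pkg: str) -> bool:
    """Hypothetical check: True if the name uses only ASCII characters."""
    try:
        pkg.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# The same visible name in composed (NFC) and decomposed (NFD) form.
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

print("ASCII only:", is_ascii_name(name))
print("NFC == NFD:", nfc == nfd)
print("code points:", len(nfc), "vs", len(nfd))
print("UTF-8 bytes:", len(nfc.encode("utf-8")), "vs", len(nfd.encode("utf-8")))

# Normalizing both sides to one form before comparing makes them match.
print("equal after normalizing:", unicodedata.normalize("NFC", nfd) == nfc)
```

A tool that compares or looks up package names byte-for-byte would 
treat the two forms as different packages, so picking one normalization 
form for names would be one possible way to avoid that.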

= Section of Packaging Committee Logs about Unicode Package Names =

(09:32:18 AM) racor: banning non-ascii chars from package names
(09:32:29 AM) spot: racor: eww. is someone actually doing that?
(09:32:39 AM) f13: somebody did a + in a version I thought
(09:32:40 AM) abadger1999: ivazquez sent something to the list.  No 
actual draft but it was very simple.
(09:32:47 AM) tibbs|h: spot: There's a review submitted.
(09:32:49 AM) svahl left the room (quit: ).
(09:32:51 AM) f13: but that doesn't count.
(09:32:51 AM) rdieter: f13: inkscape, yeah, but fixed now.
(09:32:59 AM) abadger1999: racor: What is the rationale?
(09:33:01 AM) f13: tibbs|h: I bet it's from nim-nim isn't it?
(09:33:06 AM) racor: yes, there is a package under review ... <digging>
(09:33:07 AM) tibbs|h: Yes, some fonts thing.
(09:33:13 AM) spot: that seems like a no-brainer to me.
(09:33:19 AM) abadger1999: spot: Why?
(09:33:22 AM) tibbs|h: 
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=261881
(09:33:24 AM) buggbot: Bug 261881: medium, medium, ---, Nobody's working 
on this, feel free to take it, NEW , Review Request: 
écolier-court-fonts - Écolier court fonts
(09:33:37 AM) spot: because xchat rendered that two different ways for me?
(09:33:45 AM) spot: i don't even want to think about yum.
(09:34:02 AM) abadger1999: spot: yum does need a fix.  There's an open bug.
(09:34:05 AM) f13: well, it would be good to test our infrastructure.
(09:34:08 AM) tibbs|h: I don't really have an issue with non-ascii 
package names; we can't keep falling back on "our infrastructure sucks" 
forever.
(09:34:11 AM) abadger1999: But I'm wondering why it's a bad thing.
(09:34:21 AM) abadger1999: Shouldn't we consider places where utf-8 
fails to be bugs?
(09:34:44 AM) f13: non-ascii packages may have to be renamed if they 
ever show up in RHEL
(09:34:53 AM) f13: I'm not sure how the RHN beast will handle it.
(09:35:07 AM) spot: well, that is RHEL's problem.
(09:35:10 AM) f13: yep
(09:35:17 AM) f13: I'm not saying it should have much bearing on our 
decision
(09:35:27 AM) spot: racor: are you against it for aesthetic reasons or 
technical ones?
(09:35:31 AM) racor: IMO, the technical issues are minor, the real issue 
is usability
(09:35:44 AM) racor: spot: neither, usability
(09:35:51 AM) tibbs|h: I always said that I won't review what I can't type.
(09:36:05 AM) tibbs|h: But my inability to type that is my dysfunction.
(09:36:11 AM) spot: well, i suppose that it would make it more difficult 
for english typists to install/use a package
(09:36:23 AM) spot: but not impossible
(09:36:25 AM) racor: consider: 90% of US users are not even able to type 
accented chars
(09:36:27 AM) tibbs|h: But is that a reason to ban something?
(09:36:47 AM) tibbs|h: I can't read German either, so let's kick out 
German translations.
(09:36:53 AM) racor: 99.89% of Western folks are not able to type east 
asian chars
(09:37:07 AM) f13: and 40% of all statistics are made up on the spot
(09:37:20 AM) spot: my concern is when people start naming packages in 
kanji.
(09:37:40 AM) racor: tibbs: right type the char ß (this is not a beta 
it's a sharp ss)
(09:37:46 AM) spot: we already mandate that the spec file must be 
written in american english
(09:38:11 AM) tibbs|h: racor: I already said I can't type those 
characters; I don't see what point your question serves.
(09:38:28 AM) rdieter: spot: +1, I think that covers the case here then, no?
(09:38:48 AM) tibbs|h: But I don't see the practical difference between 
ideograms and accented vowels, either.
(09:38:57 AM) racor: tibbs: my point: anything outside of ascii not 
universal enough
(09:38:59 AM) abadger1999: Yum bug: 
https://bugzilla.redhat.com/show_bug.cgi?id=261961
(09:39:01 AM) buggbot: Bug 261961: low, medium, ---, Jeremy Katz, NEW , 
Yum does not like non-ascii package names
(09:39:14 AM) spot: well, technically we only say summary and description
(09:39:18 AM) spot: "Please put personal preferences aside and use 
American English spelling in the summary and description."
(09:39:18 AM) tibbs|h: Either we say "ASCII" or "UTF-8"; there's no 
point in anything in the middle.
(09:39:39 AM) tibbs|h: We do not mandate that the entire spec be in English.
(09:39:42 AM) f13: we already support non-ascii file names
(09:39:56 AM) f13: so long as they are UTF-8
(09:40:00 AM) tibbs|h: We permit translated descriptions.
(09:40:28 AM) rdieter: Can we at least agree that ASCII *SHOULD* be 
used?  Not sure if it warrants a MUST.
(09:40:40 AM) spot: yeah, i can support that ASCII should be used
(09:40:43 AM) abadger1999: rdieter: I'm not sure I would agree with that.
(09:40:51 AM) tibbs|h: I don't know if there's really a point.
(09:41:05 AM) racor: package names!!! descr, etc. are legacy, convenience,
(09:41:21 AM) tibbs|h: But we have rules about naming packages after 
things like the upstream tarball.
(09:41:27 AM) abadger1999: I mean, nim-nim's point is also valid -- if 
the upstream name is non-ascii who are we to differ from upstream?
(09:41:40 AM) rdieter: shrug, I'm ok with pushing the envelope here too.
(09:41:58 AM) spot: if only to encourage people not to be stupid and 
spell things like this: ƁƎƗȂ
(09:42:11 AM) f13: can we try this package as a trial run and see what 
all it breaks?
(09:42:18 AM) racor: abadger1999: would you say the same if the package 
name was in Cyrillic or Turkish?
(09:42:28 AM) ***spot is getting pulled away
(09:42:40 AM) spot: we'll have to pick this up next meeting
(09:42:41 AM) spot: sorry. :(
(09:42:52 AM) rdieter: f13: +1 :)
(09:42:53 AM) abadger1999: racor: What does the package do?  Is it a 
cyrillic font package?  Is it something specific to Russian language 
speakers?
(09:43:26 AM) abadger1999: spot: I think that needs to be addressed 
upstream.
(09:43:35 AM) racor: abadger1999: not necessarily. Just author preference.
(09:44:09 AM) abadger1999: racor: So that's the gray line for me.  I 
think I'd say that we can try to influence upstream but it is an 
upstream decision.
(09:44:09 AM) tibbs|h: We have a practical issue in any case, because 
our infrastructure doesn't properly support non-ASCII package names 
currently.
(09:44:27 AM) racor: I could call a package: bärensößchen ...
(09:44:33 AM) tibbs|h: Why not?
(09:45:00 AM) tibbs|h: Looks like as good a name as anything else.
(09:45:54 AM) racor: tibbs: except that most people would not be able to 
type it .... yum install <package-name>
(09:46:09 AM) abadger1999: Copy and paste....
(09:46:19 AM) tibbs|h: Graphical interface.
(09:46:25 AM) tibbs|h: Tab completion.
(09:46:34 AM) tibbs|h: Learn to type German.
(09:47:06 AM) racor: abadger1999, tibbs: that's laughable.
(09:47:16 AM) tibbs|h: It's weird to see this argument backwards; 
usually the Americans are arguing for "nothing I can't understand" while 
the Europeans and Asians just laugh at the idiot luddites.
(09:47:46 AM) tibbs|h: "Nothing in Fedora should use the metric system."
(09:48:26 AM) racor: tibbs: learn to type Thai
(09:49:15 AM) tibbs|h: I personally have no interest in doing so.
(09:49:24 AM) tibbs|h: I don't see how that has any bearing on anything, 
though.

-Toshio
