[Fedora-packaging] UTF-8 package names

Toshio Kuratomi a.badger at gmail.com
Wed Feb 27 02:52:24 UTC 2008

Peter Gordon wrote:
> On Tue, 2008-02-26 at 10:25 -0800, Toshio Kuratomi wrote:
>> Pro ASCII:
>> * Hard to type unicode package names, therefore it is a usability problem.
>> * Is there a limit?  Even if European letters are fine what about Kanji 
>> or Sanskrit?
> Japanese package names would really be odd here. Would we spell the
> package name with its kanji or its phonetic (e.g., hiragana) reading?
> For example, say there were a package called 「勉強」 (Rōmaji:
> "benkyoo", English: study) which had flash-cards or some helpful
> studying software. Would we name this package by this Kanji, or its
> hiragana equivalent
> 「べんきょう」? Would we require the package to have Provides for this
> kana reading if named in Kanji, and vice-versa? What about
> transliterations (so-called "Rōmaji"): What transliteration system [1]
> should we use? 
> If we do require the Provides, what if two packages end up being
> different kanji names that are homophones (read the same, phonetically)?
> One example that comes to mind is 花 and 華 (both flower) and 鼻
> (nose), all read as "hana" (hiragana: はな). For even more fun, 神
> (god), 紙 (paper), and 髮 (hair) all have readings of "kami" (かみ). And
> extending this to kanji compounds will yield even further enjoyment:
> 明日 (tomorrow) can be read as "asu" (あす) or "ashita" (あした), and
> 昨日 (yesterday) can be read as "kinoo" (きのう) or "sakujitsu"
> (さくじつ) depending on formality.
> I suppose it would be similar for other languages based on both phonetic
> and logographic scripts, but I use Japanese as my example since that's
> what I'm attempting to learn currently. :)
> What about misc technical characters too - arrows (← → ↑ ↓) or the like?
> This can get quite overwhelming if we're not very careful.
Well, there was ☠ for a while but that looks to be pretty dead upstream.
I'm sure that there will be more at some point, though.

> In closing, I think it would be best to limit this to diacritic/accented
> characters. With an additional transliterated Provides, the easy case
> would be satisfied, without the complexities introduced by such writing
> systems as above.
You do a wonderful job of explaining what's wrong with us trying to 
adjust upstream's name to be ASCII but I just want to be certain we're 
on the same page by the end:

Package names should follow upstream, since attempting to transliterate 
or translate upstream names can't be done sanely on our side.  For 
things that map easily into the ASCII set (diacritic/accented 
characters, for instance, as found in Latin-1), a transliterated Provides 
can be added to make installation easier for ASCII-conditioned users, but 
carrying this on to other scripts is a losing proposition.
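
To make the "maps easily into ASCII" test concrete, here is a rough 
Python sketch of how a packager might derive the transliterated name 
for such a Provides.  The function name ascii_provides and the sample 
names are made up for illustration; nothing like this exists in rpm or 
in the guidelines.  It assumes that "maps easily" means "a base letter 
plus combining marks under Unicode NFKD decomposition", which is only 
one possible reading.

```python
import unicodedata


def ascii_provides(name):
    """Return an ASCII form of `name`, or None if there is no easy mapping.

    Hypothetical helper: only handles characters that decompose into an
    ASCII base letter plus combining marks.
    """
    # NFKD splits an accented letter into its base letter followed by
    # one or more combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks and keep every base character.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    try:
        stripped.encode("ascii")
    except UnicodeEncodeError:
        # Something non-ASCII survived (kanji, kana, symbols, ...):
        # refuse to guess rather than invent a transliteration.
        return None
    return stripped


print(ascii_provides("café-tools"))  # accented Latin letter
print(ascii_provides("勉強"))         # kanji: no easy mapping
```

Letters such as ø or ß have no such decomposition, so this sketch 
returns None for them as well; those would need a hand-written 
mapping if we wanted to cover them.  A name that is already ASCII 
comes back unchanged, in which case no extra Provides is needed.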

