[Fedora-packaging] UTF-8 package names

Patrice Dumas pertusus at free.fr
Wed Feb 27 09:24:24 UTC 2008

On Tue, Feb 26, 2008 at 08:05:01PM -0800, Peter Gordon wrote:
> In some languages, though, the diacritic differentiates the character
> from the "plain" form. For example, a Spanish package name for a similar
> studying software could be "¡Estudiará!" (third-person indicative future
> tense; literally, "You will study!"). However, we would need to be
> careful here because, without that accent, this changes the conjugation
> to "estudiara," which is the first- and third-person imperfect
> subjunctive (which really makes no sense on its own, since the
> subjunctive tense is meant to be used in a subjective or predictive
> clause of a sentence, such as referring to one's wants and desires for
> the future).

If Wikipedia is right in http://en.wikipedia.org/wiki/Transliteration
what you are trying to do is not transliteration (in a narrow sense),
but transcription. Transliterated words are not necessarily pronounced
the same as the original. Transliteration is an automatic mapping
between characters only, irrespective of the correctness of the result
in the original language (for transliteration to 7-bit ASCII, the
mapping is in general based on character similarity or on sounds in
English).

I think that we should not permit letters outside 7-bit ASCII in package
names, but the choice of transliteration or transcription scheme should
be left to the packager.

We can still provide aids for packagers who have no idea how to
transliterate an upstream name. For example, in texi2html/texinfo we
transliterate non-ASCII characters in file names. There are tables in
texinfo for some characters, and in texi2html I use Text::Unidecode to
cover the rest. That way the file names are 7-bit ASCII and are unlikely
to be problematic on any platform (the portability issue is more severe
there than in Fedora, since we want these file names to be usable
everywhere). This scheme seems to work for a lot of characters, though
better schemes could perhaps be devised if needed. But I don't think
that is necessary; the choice should instead be left to the packager.
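
To give an idea of what such an aid could look like, here is a minimal
sketch in Python. It is not the texinfo/texi2html code and it does not
use Text::Unidecode; it swaps in a simpler technique, Unicode NFKD
decomposition with the combining marks dropped, plus a small fallback
table. The function name to_ascii, the FALLBACK table and the "_"
placeholder are all assumptions made up for this illustration.

```python
import unicodedata

# Hypothetical fallback table for characters that NFKD does not split
# into an ASCII base letter plus combining marks. Illustrative only.
FALLBACK = {
    "ß": "ss", "æ": "ae", "Æ": "AE", "ø": "o", "Ø": "O",
    "¡": "", "¿": "",
}

def to_ascii(name: str) -> str:
    """Map a name to 7-bit ASCII, character by character."""
    out = []
    for ch in name:
        if ch in FALLBACK:
            out.append(FALLBACK[ch])
            continue
        # NFKD turns "á" into "a" followed by U+0301 (combining acute).
        for part in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(part):
                continue  # drop the diacritic itself
            # Anything still outside ASCII gets a placeholder (assumption).
            out.append(part if ord(part) < 128 else "_")
    return "".join(out)

print(to_ascii("¡Estudiará!"))  # Estudiara!
```

As in the quoted example, the result "Estudiara!" is a pure character
mapping and is not a correct word form in Spanish, which is one more
reason to let the packager pick the final name.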

