RPM submission script

Nicolas Mailhot Nicolas.Mailhot at laPoste.net
Sat Nov 8 09:48:10 UTC 2003


On Sat, 2003-11-08 at 08:08, Michael Schwendt wrote:
> On Fri, 07 Nov 2003 20:30:00 -0500, seth vidal wrote:
> 
> > On Fri, 2003-11-07 at 19:18, Alan Cox wrote:
> > > > so mandating utf-8 is great, if you also do it in rpm spec files, and
> > > > filenames, etc, etc, etc.
> > > 
> > > Linux filenames are utf-8 and defined that way. Gives the nautilus people
> > > something to do ;)
> > > 
> > 
> > and yet if you look at packages from europe you often find file names
> > that are not.
> 
> It's sort of a virus, e.g. in Germany. One of the first things the average
> user does after a distribution upgrade is to edit /etc/sysconfig/i18n and
> change from UTF-8 to @euro or ISO 8859-1. 

Yeah, sure. A sure way to kill the € symbol.

> The reason is that the effects
> of this change are not understood. It just "seems to work" with old
> Latin-1 file names, file contents and non-Unicode-aware applications.

So you provide a stupid shell script that recodes filenames to UTF-8 if
the old locale was something else (you can do that because you still
have access to the old locale). And/or a Glade app that recodes files to
Unicode.
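
A minimal sketch of such a recoding pass, written in Python rather than
shell for readability. The function names and the dry-run default are
hypothetical, and the check is only a heuristic: a legacy name that
happens to be valid UTF-8 is left untouched.

```python
import os


def recode_name(raw: bytes, old_encoding: str) -> bytes:
    """Return the UTF-8 form of a filename that was stored in old_encoding."""
    try:
        raw.decode("utf-8")
        return raw  # plain ASCII or already valid UTF-8: leave it alone
    except UnicodeDecodeError:
        return raw.decode(old_encoding).encode("utf-8")


def recode_tree(root: bytes, old_encoding: str, dry_run: bool = True):
    """Rename every entry under root whose name is not valid UTF-8.

    Returns the list of (old_path, new_path) pairs. Nothing is renamed
    unless dry_run is False.
    """
    renames = []
    # Bottom-up walk, so children are renamed before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = recode_name(name, old_encoding)
            if new == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new)
            renames.append((src, dst))
            # Never overwrite an existing entry with the recoded name.
            if not dry_run and not os.path.exists(dst):
                os.rename(src, dst)
    return renames
```

Passing the root as bytes (for example `recode_tree(b"/home/user", "iso8859-15")`)
makes `os.walk` return raw byte names, which is what you want when the
names are not decodable in the current locale.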

As for file contents, either they have a sane encoding header and
they'll be fine, or they don't and you're in a lot of trouble
anyway. (In case you haven't noticed, one of Java's top bugs is zip
encoding problems: Java uses zip-based formats everywhere, but their
filename encoding is unspecified, so Java creates Unicode zip filenames,
Windows cp437/cp850 ones, Linux ISO 8859-1 and now UTF-8 ones, and
unzipping stuff can be... interesting.)
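
To illustrate why an unspecified filename encoding hurts, here is a small
Python sketch that decodes the same raw bytes under the encodings named
above. The helper name and the sample filename are made up for the
example.

```python
def interpretations(raw: bytes) -> dict:
    """Decode the same filename bytes under several plausible encodings."""
    codecs = ("utf-8", "cp437", "cp850", "iso8859-1")
    return {codec: raw.decode(codec, errors="replace") for codec in codecs}


# The bytes a UTF-8 system would store for this (hypothetical) filename.
raw_name = "résumé.txt".encode("utf-8")

for codec, text in interpretations(raw_name).items():
    print(f"{codec:10} {text}")
```

Only one of the four readings gives back the original name; an unzip tool
that guesses wrong has no way to tell from the archive alone.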

Spec files do not declare an encoding.
People edit them in their default text editor, which means a spec file
can be in any of the various encodings available, most notably UTF-8
for all recent setups and for people who regularly use something other
than base Latin characters (and that includes a large part of even
western Europe).

Current specs are already not ASCII (not that it means anything anyway,
since few people use 7-bit encodings and there are languages where you
need the upper 128 chars regularly). A growing number are in UTF-8, so
you cannot say "hold on, specs have always been ASCII". Plus, in the
near future specs will have to be UTF-8 across the board, since that will
be the default mode of every single text editor available.

There is no easy solution. You have to require UTF-8 specs at some
point, and treat non-UTF-8 as a (minor) bug. In fact, it's probably high
time to do it, i.e. require all spec files for FC2 to be converted as
they are updated, so that by the time FC2 is out everything will be
nicely UTF-8 (FC1 staying the encoding mess it already is).
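
A check like that is cheap to automate. The following Python sketch is
one possible way to flag and convert a non-UTF-8 spec; the function names
are hypothetical and the ISO 8859-1 fallback is an assumption, since the
real legacy encoding has to come from whoever wrote the file.

```python
def is_utf8(data: bytes) -> bool:
    """True if data decodes cleanly as UTF-8 (pure ASCII counts too)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def spec_to_utf8(data: bytes, legacy_encoding: str = "iso8859-1") -> bytes:
    """Return spec contents as UTF-8, converting from legacy_encoding if needed.

    legacy_encoding is a guess supplied by the caller; it cannot be
    detected reliably from the bytes themselves.
    """
    if is_utf8(data):
        return data
    return data.decode(legacy_encoding).encode("utf-8")
```

A submission script could read each spec in binary mode, reject or
convert it when `is_utf8` returns False, and write the result back.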

Cheers,

-- 
Nicolas Mailhot