UTF-8 and filenames

Toshio Kuratomi a.badger at gmail.com
Wed Mar 14 07:01:18 UTC 2007


On Wed, 2007-03-14 at 02:07 -0400, Matthias Clasen wrote:
> On Tue, 2007-03-13 at 22:57 -0700, Toshio Kuratomi wrote:
> 
> > 
> > The tools that we're building (package database, koji, etc) currently
> > assume that we'll only encounter UTF-8 filenames.  We've found at least
> > one package (aspell-is) which currently has a non-UTF-8 filename so we
> > want to decide if these cases should be considered packaging bugs or if
> > we need to build some sort of support for this into our tools.  Does
> > this need to be a packaging guideline?  Perhaps not but where else does
> > it fit?  We could tuck it in as one of the things rpmlint reports and
> > not list it explicitly but it is something that we are going to always
> > want fixed (whereas we allow people to dispute many of the other errors
> > and warnings reported by rpmlint.)
> > 
> 
> While in practise 99.9% of all filenames will be UTF-8 or even ASCII,
> it seems misguided to let tools make assumptions about that. The only
> assumption that can be safely made is that '/' and '\0' don't occur
> inside the byte sequence that makes up a filename...

The thing is, we control the filenames to some extent.  If we decide that
every filename in our packages has to be UTF-8, then no non-UTF-8 filename
will ever enter the database.  If we decide that it's okay for Fedora
packages to contain files whose names are not encoded in UTF-8, then the
tools will have to cope with it.
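
To make the first option concrete, here is a rough sketch of the kind of
check a tool could run over a package's file list.  It is only an
illustration: the function names and the sample filenames are made up for
this example and are not taken from rpmlint, koji, or the package
database.  It assumes filenames arrive as raw bytes, which is all the
filesystem actually guarantees.

```python
def is_utf8_filename(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_paths(paths):
    """Return the byte paths from `paths` that are not valid UTF-8.

    `paths` is any iterable of bytes objects, e.g. a package's file list
    (hypothetical input; how the list is obtained is up to the caller).
    """
    return [p for p in paths if not is_utf8_filename(p)]


if __name__ == "__main__":
    # Hypothetical sample names: one ASCII, one UTF-8, one Latin-1 encoded.
    sample = [
        b"README",
        "íslenska.dat".encode("utf-8"),
        "íslenska.dat".encode("latin-1"),
    ]
    for path in non_utf8_paths(sample):
        # Print the repr so undecodable bytes are shown unambiguously.
        print("non-UTF-8 filename:", repr(path))
```

Working on bytes rather than decoded strings means the check itself never
fails on the filenames it is supposed to flag.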

-Toshio


More information about the Fedora-maintainers mailing list