[FC2] Character set problem, still there :(

Björn Persson listor1.rombobeorn at comhem.se
Mon May 24 16:59:38 UTC 2004


Coume - Lubox.com wrote:

> - I created files under Win 95,98,ME and XP, which include french
> characters in their names such as é,è,à, etc...
> - When I was using MDK or Knoppix I was able to see them correctly as
> they were still on a Win partition of my HDD.
> - For the last 14-15 months, I deleted all the Windows partitions I was
> having to use ONLY linux and I consequently copied all those files from
> a Windows partition to a linux one ext3.
> 
> Now when I try to see a file which has french characters , the displayed
> name is: (for instance) 
>      Mon beau p?re et moi.avi (invalid Unicode)
> When it should be
>      Mon beau père et moi.avi

The file names are encoded in one of the 8-bit encodings, probably Latin 
1 (ISO 8859-1) or Latin 9 (ISO 8859-15), but Fedora assumes that they 
are in UTF-8. You can change Fedora's system-wide character encoding in 
/etc/sysconfig/i18n. I hear Latin 9 is best for French; unlike Latin 1 
it includes the euro sign and the œ ligature.
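
To illustrate why such a name shows up as "invalid Unicode", here is a 
minimal Python sketch. It assumes the name was stored as Latin 1 bytes, 
which is only a guess about this particular disk, and the helper name 
decode_or_none is made up for the example.

```python
# "p\u00e8re" is "père"; in Latin 1 and Latin 9 the "è" is the single byte 0xE8.
raw_name = "Mon beau p\u00e8re et moi.avi".encode("iso-8859-1")


def decode_or_none(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# 0xE8 announces a multi-byte UTF-8 sequence, but the next byte ("r") is not
# a continuation byte, so a UTF-8 reader has to reject the name.
as_utf8 = decode_or_none(raw_name, "utf-8")
as_latin1 = decode_or_none(raw_name, "iso-8859-1")
as_latin9 = decode_or_none(raw_name, "iso-8859-15")

print(as_utf8, as_latin1, as_latin9)
```

The same bytes decode cleanly as Latin 1 or Latin 9, so the name itself 
is intact; only the assumed encoding is wrong.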

> Then I had a mad idea, rename all those problematic files to get them
> working... Sounds great, doesn't it?

File names can be converted. The content of the files is harder. Plain 
text files must be converted, while many other file formats must not be 
touched. Some files, like XML, need to be converted if you want them to 
be readable in text editors, but they may or may not already be in a 
different encoding, and they may or may not contain encoding 
information that has to be updated if they are converted.
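
Converting the names could look roughly like the Python sketch below. 
The function names, the Latin 9 default and the never-overwrite policy 
are all assumptions made for illustration, not an existing tool. Note 
that a few 8-bit names happen to be valid UTF-8 by coincidence and would 
be skipped, so review the plan before applying it.

```python
import os


def recode_name(raw: bytes, source_encoding: str = "iso-8859-15") -> bytes:
    """Return the UTF-8 form of a file name given as raw bytes.

    Names that are already valid UTF-8 (including plain ASCII) come back
    unchanged, so running the conversion twice does no harm.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


def plan_renames(directory: bytes, source_encoding: str = "iso-8859-15"):
    """List (old_path, new_path) pairs for entries whose names need converting."""
    plan = []
    # Passing bytes to os.listdir returns the names as raw bytes, untouched.
    for name in os.listdir(directory):
        new_name = recode_name(name, source_encoding)
        if new_name != name:
            plan.append((os.path.join(directory, name),
                         os.path.join(directory, new_name)))
    return plan


def apply_renames(plan):
    """Rename each pair, skipping any target that already exists."""
    for old, new in plan:
        if os.path.exists(new):
            continue  # assumed policy: never overwrite an existing file
        os.rename(old, new)
```

A cautious way to use it is to print plan_renames(b"/some/directory") 
first and call apply_renames only once the list looks right. This only 
touches names, never file contents.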

> erm.... No! Because if I try to
> upload those files on my portable MP3 player an iRiver, I got characters
> problem...

Does Linux view the iRiver as a hard disk, a file server or something 
else? If it's viewed as a disk, I bet it's mounted as VFAT. Long file 
names in VFAT are in a 16-bit encoding (UTF-16 or UCS-2), and the driver 
converts them. It seems that for some reason you have to give parameters 
to mount to tell it which encoding to translate to. You'd think it could 
have used the system-wide encoding automatically ...
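
As a rough picture of the translation the driver performs, here is a 
small Python sketch. It is not the driver's code, and the function name 
present_as is made up. As far as I know, the vfat driver's mount options 
for this are iocharset= and utf8, but check the mount man page for your 
kernel.

```python
# "p\u00e8re.avi" is "père.avi".
name = "p\u00e8re.avi"

# A VFAT long file name stores two bytes per character (UTF-16, little endian).
on_disk = name.encode("utf-16-le")


def present_as(stored: bytes, target_encoding: str) -> bytes:
    """Translate a stored UTF-16 name into the bytes programs will see."""
    return stored.decode("utf-16-le").encode(target_encoding)


as_utf8 = present_as(on_disk, "utf-8")          # what a UTF-8 system expects
as_latin9 = present_as(on_disk, "iso-8859-15")  # what a Latin 9 system expects

print(len(on_disk), as_utf8, as_latin9)
```

If the encoding the driver translates to does not match the one the 
rest of the system uses, the same name comes out as different bytes and 
the accented characters break.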

If it's viewed as a file server it's almost certainly an SMB server. 
Again, it looks like you have to give parameters to smbmount.

If it talks some other protocol you have to find out if the driver can 
do character encoding conversion. If not, you could always work around 
it by making Fedora use the same character encoding as the iRiver uses.

Björn Persson




