[FC2] Character set problem, still there :(

Björn Persson listor1.rombobeorn at comhem.se
Wed May 26 10:55:54 UTC 2004


Coume - Lubox.com wrote:

> On Mon, 2004-05-24 at 18:59, Björn Persson wrote:
> 
>>File names can be converted. The content of the files is worse, because 
>>plain text files must be converted while many other file formats must 
>>not be touched, and some files like XML need to be converted if you want 
>>them to be readable in text editors, but may or may not be in a 
>>different encoding and may or may not contain encoding information that 
>>has to be updated if they are converted.
> 
> It's great that file names can be converted. I will have to look into
> that, for converting all my old files to UTF-8.

I hope you didn't misunderstand me. I didn't mean that I know of a tool 
that converts file names for you, just that it can be done.
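
To illustrate that it can be done, here is a minimal sketch in Python, not 
a tested tool. It assumes the old names are in Latin 1 and that any name 
which already decodes as UTF-8 should be left alone. That check is only a 
heuristic, since some Latin 1 byte sequences happen to be valid UTF-8. The 
function names convert_name and rename_tree are made up for this example.

```python
import os

def convert_name(raw: bytes, src: str = "latin-1", dst: str = "utf-8") -> bytes:
    """Re-encode a raw file name from src to dst unless it is already valid dst."""
    try:
        raw.decode(dst)
        return raw  # already valid UTF-8 (this includes pure ASCII): keep as-is
    except UnicodeDecodeError:
        return raw.decode(src).encode(dst)

def rename_tree(root: bytes, dry_run: bool = True) -> list:
    """Walk root bottom-up and collect (old, new) paths for names that need converting.

    With dry_run=False the renames are actually performed, skipping any
    entry whose new name already exists.
    """
    planned = []
    # A bytes root makes os.walk yield raw bytes names, so nothing is decoded for us.
    # Bottom-up order means a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new = convert_name(name)
            if new != name:
                old_path = os.path.join(dirpath, name)
                new_path = os.path.join(dirpath, new)
                planned.append((old_path, new_path))
                if not dry_run and not os.path.exists(new_path):
                    os.rename(old_path, new_path)
    return planned
```

Calling rename_tree(b"/some/dir") with the default dry_run=True only 
lists what would be renamed, which is worth checking before changing 
anything on disk.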

> But just wondering, how is UTF-8 displayed under Windows? Is it
> displayed correctly?

File names? As far as I know, none of the Windows filesystems allows 
file names in UTF-8.

The Unix-oriented filesystems – at least ext2/ext3 – apparently don't 
specify how file names are encoded. Instead the operating system assumes 
that they are encoded in the system-wide encoding. As you discovered, 
this causes problems if more than one OS accesses the same disk, for 
example when Mandrake writes file names in Latin 1 and Fedora reads them 
as UTF-8.
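
The mismatch can be reproduced in a few lines of Python. This is only an 
illustration of the byte-level effect, with an example name I made up, not 
a description of what any particular file manager does with such names.

```python
name = "Björn.txt"

# What a system using Latin 1 writes to disk: ö becomes the single byte 0xF6.
on_disk = name.encode("latin-1")

# A system assuming UTF-8 cannot decode that byte sequence.
try:
    on_disk.decode("utf-8")
    readable = True
except UnicodeDecodeError:
    readable = False

# A lenient reader substitutes U+FFFD (the replacement character) instead.
shown = on_disk.decode("utf-8", errors="replace")

# The opposite direction does not fail, it just shows two wrong characters.
mojibake = name.encode("utf-8").decode("latin-1")
```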

In the Windows filesystems – at least FAT but I think NTFS too – long 
names are stored as UTF-16 (or UCS-2, I'm not sure). I don't think 
Windows has a setting to use UTF-8 instead. In FAT, every file also has 
a short name – the DOS-style 8.3 name – which is encoded in one of the 
DOS "codepages". I'd be very surprised if UTF-8 were allowed there.
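
To make the difference concrete, here is how the same name looks in the 
encodings mentioned, using Python's codecs. Codepage 850 is just my 
assumption of a typical Western European DOS codepage. Which one a given 
system uses depends on its configuration, and this sketch says nothing 
about the on-disk layout of the directory entries themselves.

```python
name = "Björn"

# UTF-16 (little-endian, no byte order mark): two bytes per character here.
as_utf16 = name.encode("utf-16-le")

# A DOS codepage (cp850 assumed): one byte per character, ö is 0x94.
as_cp850 = name.encode("cp850")

# UTF-8 for comparison: ASCII stays one byte, ö takes two (0xC3 0xB6).
as_utf8 = name.encode("utf-8")

lengths = (len(as_utf16), len(as_cp850), len(as_utf8))
```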

Björn Persson




