
Re: [FC2] Character set problem, still there :(

Coume - Lubox.com wrote:

On Mon, 2004-05-24 at 18:59, Björn Persson wrote:

File names can be converted. The content of the files is worse: plain text files must be converted, while many other file formats must not be touched. Some files, like XML, need to be converted if you want them to be readable in text editors, but they may or may not be in a different encoding, and may or may not contain encoding information that has to be updated if they are converted.

That's great if file names can be converted. I will have to look at that, for example to convert all my old files to UTF-8.

I hope you didn't misunderstand me. I didn't mean that I know of a tool that converts file names for you, just that it can be done.
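To illustrate that it can be done, here is a minimal Python sketch, not a finished tool. It assumes the old names are in Latin-1 and the target is UTF-8. The function names `reencode_name` and `convert_tree` are made up for this example. Names that already decode as valid UTF-8 are left alone, which is only a heuristic: a Latin-1 name can happen to be valid UTF-8 by accident.

```python
import os

def reencode_name(raw: bytes, source: str = "latin-1", target: str = "utf-8") -> bytes:
    """Re-encode one file name, given as raw bytes, from source to target.

    Names that are already valid in the target encoding (pure ASCII
    included) are returned unchanged. This is a heuristic, not a proof.
    """
    try:
        raw.decode(target)
        return raw
    except UnicodeDecodeError:
        return raw.decode(source).encode(target)

def convert_tree(root: bytes, dry_run: bool = True):
    """Walk root (a bytes path) and collect (old_path, new_path) pairs.

    With dry_run=False the renames are also carried out. Walking
    bottom-up means entries are renamed before their parent directory.
    """
    renames = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = reencode_name(name)
            if new == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new)
            renames.append((old_path, new_path))
            # Never overwrite an existing entry.
            if not dry_run and not os.path.exists(new_path):
                os.rename(old_path, new_path)
    return renames
```

Running `convert_tree(b"/some/dir")` with the default `dry_run=True` only lists what would be renamed, which is worth checking before touching anything. The path here is a placeholder.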

But just wondering, how are displayed UTF-8 under windows? are they well

File names? As far as I know, none of the Windows filesystems allows file names in UTF-8.

The Unix-oriented filesystems, at least ext2/ext3, apparently don't specify how file names are encoded. Instead the operating system assumes that they are encoded in the system-wide encoding. As you discovered, this causes problems if more than one OS accesses the same disk, and for example Mandrake writes file names in Latin-1 and Fedora reads them as UTF-8.
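A small Python sketch of that mismatch, using a made-up file name: the filesystem only stores bytes, so the same bytes come out differently depending on which encoding the reader assumes.

```python
name = "Björn.txt"  # hypothetical file name, for illustration only

# What a Latin-1 system writes to disk: one byte (0xF6) for the ö.
latin1_bytes = name.encode("latin-1")

# A UTF-8 system reading those bytes: 0xF6 is not valid UTF-8,
# so the character is shown as a replacement mark (U+FFFD).
seen_by_utf8_system = latin1_bytes.decode("utf-8", errors="replace")

# The opposite direction: UTF-8 bytes read as Latin-1 never fail,
# but the two bytes of the ö turn into two unrelated characters.
utf8_bytes = name.encode("utf-8")
seen_by_latin1_system = utf8_bytes.decode("latin-1")

print(latin1_bytes)           # b'Bj\xf6rn.txt'
print(seen_by_utf8_system)    # Bj\ufffdrn.txt, shown as a replacement mark
print(seen_by_latin1_system)  # BjÃ¶rn.txt
```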

In the Windows filesystems, at least FAT but I think NTFS too, long names are stored as UTF-16 (or UCS-2, I'm not sure). I don't think Windows has a setting to use UTF-8 instead. In FAT, every file also has a short name, the DOS-style 8.3 name, which is encoded in one of the DOS "code pages". I'd be very surprised if UTF-8 were allowed there.
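To make the difference concrete, here is how one made-up name looks as bytes in the encodings mentioned. This only shows the encodings themselves, not the actual on-disk directory layout, and picking code page 850 for the DOS side is an assumption; the real code page depends on how the system is set up.

```python
name = "Björn.txt"  # hypothetical file name, for illustration only

utf16_bytes = name.encode("utf-16-le")  # two bytes per character here
cp850_bytes = name.encode("cp850")      # one byte per character (assumed code page)
utf8_bytes = name.encode("utf-8")       # one byte each, except two for the ö

for label, data in [("UTF-16LE", utf16_bytes),
                    ("CP850", cp850_bytes),
                    ("UTF-8", utf8_bytes)]:
    print(f"{label:9} {len(data):2d} bytes  {data.hex(' ')}")
```

The three byte strings differ, so a name written under one of these conventions cannot simply be read back under another without conversion.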

Björn Persson
