Character encoding
Anders Karlsson
anders at trudheim.co.uk
Sun Sep 7 18:07:33 UTC 2008
* Adil Drissi <adil.drissi at yahoo.com> [20080907 18:15]:
> Hi François,
>
> Thank you for your answer.
> the result of echo $LANG is the following: en_CA.UTF-8
In that case, you are most likely writing your files out UTF-8 encoded
already. Your posting made me curious, so I did a bit of testing
myself.
> I want the web pages I develop to fully support French. The tag in the HTML is a must, of course, but I also want the files themselves to be encoded in UTF-8.
>
> Before, when I was using Windows, I was using an editor that could save in UTF-8. Now, after modifying some files using vi, vim or kate, I am finding that some files are encoded in us-ascii and others don't show any encoding, so I'm really lost.
>
> I can write a bash script that converts all the files of my website from us-ascii to UTF-8, but I don't know what to do with the files that don't show their current encoding.
>
> Please help
I wrote a small HTML file to test with (yes, I know it's not valid
HTML, but it serves the purpose) and ran the following on it:
$ cat test.html
<html>
åäö
</html>
$ file --mime test.html
test.html: text/html
$ od -x -c test.html
0000000    683c    6d74    3e6c    c30a    c3a5    c3a4    0ab6    2f3c
          <   h   t   m   l   >  \n 303 245 303 244 303 266  \n   <   /
0000020    7468    6c6d    0a3e
          h   t   m   l   >  \n
0000026
$
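To see where the 303 245 pair in that dump comes from, here is a
rough sketch of the two-byte UTF-8 arithmetic in plain shell. The
function name utf8_two_byte_octal is made up for this illustration,
and it only covers code points from U+0080 to U+07FF, which is where
å, ä and ö live.

```shell
# Hypothetical helper (not part of any tool): print the two UTF-8
# bytes, in octal, for a code point in the range U+0080..U+07FF.
utf8_two_byte_octal() {
    cp=$(( $1 ))
    # Lead byte: bit pattern 110xxxxx, carrying the bits above the low six.
    lead=$(( 0xC0 | (cp >> 6) ))
    # Continuation byte: bit pattern 10xxxxxx, carrying the low six bits.
    cont=$(( 0x80 | (cp & 0x3F) ))
    printf '%o %o\n' "$lead" "$cont"
}

utf8_two_byte_octal 0xE5   # å (U+00E5) prints: 303 245
```

Feeding it 0xE4 (ä) and 0xF6 (ö) gives 303 244 and 303 266, matching
the od output.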
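If the od output for your own files shows the octal byte 303, with
two bytes encoding a single character, the file is most likely UTF-8
already: 303 (hex c3) is the UTF-8 lead byte for the accented letters
in the U+00C0 to U+00FF range. A file reported as us-ascii simply
contains no such characters, and plain ASCII is already valid UTF-8,
so it needs no conversion. I also had a quick read of
http://www.phpwact.org/php/i18n/charsets, which may help you.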
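For the files that don't report an encoding, one possible approach is
to let iconv try to decode them and look at the exit status. The
sketch below is only that, a sketch: classify_encoding is a name I
made up, and it assumes an iconv that knows the ASCII, UTF-8 and
UTF-16 charset names (glibc's does).

```shell
# Hypothetical helper: report us-ascii, utf-8 or unknown for one file,
# based purely on whether iconv can decode it.
classify_encoding() {
    f=$1
    # Only bytes below 0x80: plain ASCII, which is also valid UTF-8.
    if iconv -f ASCII -t ASCII "$f" >/dev/null 2>&1; then
        echo "us-ascii"
    # Decodes cleanly as UTF-8 (UTF-16 target forces a real decode).
    elif iconv -f UTF-8 -t UTF-16 "$f" >/dev/null 2>&1; then
        echo "utf-8"
    else
        # Something else; iconv cannot tell us which 8-bit charset.
        echo "unknown"
    fi
}
```

Calling classify_encoding on each file of the site, for example in a
for loop over *.html, shows which ones actually need attention. Files
that come back as unknown and were saved by a Windows editor may well
be ISO-8859-1 or Windows-1252, in which case something like
iconv -f ISO-8859-1 -t UTF-8 would convert them, but that is a guess
and worth checking by eye before overwriting anything.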
Hope this helps a bit at least.
/Anders