Character encoding
Björn Persson
bjorn at xn--rombobjrn-67a.se
Sun Sep 7 17:23:02 UTC 2008
Adil Drissi wrote:
> the result of echo $LANG is the following: en_CA.UTF-8
Then you don't need to change the locale.
> Before, when I was using Windows, I used an editor that could save in
> UTF-8. Now, after modifying some files using vi, vim or kate, I am finding
> that some files are encoded in us-ascii, and some others don't show the
> type of encoding, so I'm really lost.
Do they look wrong if you read them as UTF-8? If all the characters are right
then there is no problem.
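You should know that the program "file" can't really know the character
encoding of a file. It reads the file and guesses the encoding from the bytes
it finds, so a file that contains only ASCII characters is reported as
us-ascii even if you saved it as UTF-8.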
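
To illustrate why such a guess is limited, here is a rough Python sketch of
content-based guessing. It is not how "file" is actually implemented; the
function name guess_encoding and the three labels it returns are made up for
this example.

```python
def guess_encoding(data: bytes) -> str:
    """Guess an encoding label from raw bytes (illustrative only)."""
    # Only bytes below 0x80: indistinguishable from plain ASCII,
    # even if the author meant the file to be UTF-8.
    if all(b < 0x80 for b in data):
        return "us-ascii"
    # High bytes present: see whether they form valid UTF-8 sequences.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Some 8-bit encoding, but the bytes alone don't say which one.
        return "unknown-8bit"
    return "utf-8"


if __name__ == "__main__":
    print(guess_encoding(b"plain text"))
    print(guess_encoding("Björn".encode("utf-8")))
    print(guess_encoding("Björn".encode("iso-8859-1")))
```

Note that the last case can only be reported as "some 8-bit encoding". Telling
ISO 8859-1 from ISO 8859-15 takes a human who looks at the text.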
> I can code a bash script that can convert from us-ascii to utf-8 for all
> the files of my website
Converting from ASCII to UTF-8 is very simple: Just declare that it is UTF-8.
UTF-8 is designed so that all ASCII characters are encoded the same way in
ASCII and UTF-8, so you can take any ASCII text and treat it as UTF-8, and if
you have a UTF-8 text that doesn't use any non-ASCII characters then it is in
practice ASCII.
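
A small Python check of this property, with a sample string chosen only for
illustration:

```python
text = "Hello, world!"  # sample text containing only ASCII characters

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# The two encodings produce exactly the same bytes for ASCII text.
same_bytes = ascii_bytes == utf8_bytes

# Bytes written as ASCII can therefore be read back as UTF-8 unchanged.
round_trip = ascii_bytes.decode("utf-8") == text

print(same_bytes, round_trip)
```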
Now, if a text is actually not 7-bit ASCII but one of the 8-bit encodings that
are sometimes called "ASCII", then it needs to be transcoded to become UTF-8.
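
As a sketch of what transcoding means, assuming you already know the source
encoding: decode the bytes with that encoding, then encode the result as
UTF-8. The helper name transcode_to_utf8 is hypothetical.

```python
def transcode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding as UTF-8."""
    # Decode with the encoding the file was really written in,
    # then encode the resulting text as UTF-8.
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    latin1 = "Björn".encode("iso-8859-1")       # one byte for "ö"
    print(latin1)
    print(transcode_to_utf8(latin1, "iso-8859-1"))  # two bytes for "ö"
```

If the wrong source encoding is given, the result is still valid UTF-8 but
may contain the wrong characters, so the source encoding has to be right.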
> but for the files that don't show the current
> encoding I don't know what to do.
Open them and try different encodings. Try UTF-8 first, ISO 8859-1 second and
ISO 8859-15 third. Then continue with other encodings. When you find one that
makes the text look right, convert the file from that encoding to UTF-8.
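
If you would rather compare the candidates side by side than reopen the file
in an editor, something like the following Python sketch could help. The names
CANDIDATES and preview are made up for this example, and the list only holds
the three encodings mentioned above.

```python
CANDIDATES = ["utf-8", "iso-8859-1", "iso-8859-15"]


def preview(data: bytes, encodings=CANDIDATES) -> dict:
    """Decode the same bytes with each candidate encoding.

    Maps each encoding name to the decoded text, or to None when the
    bytes are not valid in that encoding.
    """
    results = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


if __name__ == "__main__":
    sample = "10 €".encode("iso-8859-15")  # stand-in for a file's contents
    for enc, text in preview(sample).items():
        print(enc, "->", text)
```

The ISO 8859 encodings accept almost any byte sequence, so a successful decode
proves nothing by itself. You still have to read the output and judge which
one looks right.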
Björn Persson