Curious characters in Thunderbird on Linux...

Tim ignored_mailbox at yahoo.com.au
Thu Jul 31 03:35:06 UTC 2008


On Wed, 2008-07-30 at 20:54 -0500, Kevin Martin wrote:
> So if messages are sent using an encoding that you are not using, this
> will happen?  Crud, how do you get around /that/?

Your client should automatically display the text correctly, transcoding
if it has to.  Of course, that will only work if:

1. The message correctly identifies which encoding it used.
2. It's an encoding that your client understands.
3. You have fonts that can provide the characters needed.
4. You haven't forced your client to use a particular encoding.
5. The message hasn't been mangled in transit.

Point 1 is a common problem, because authors will cut and paste text
using different encodings, using systems that don't transcode as they do
that.  And they may be forcing their client to use a particular encoding
scheme, one that's not what they're actually using.
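
To illustrate point 1, here is a minimal Python sketch of what a reader
sees when text is encoded one way but labelled another.  The sample
string and the choice of UTF-8 versus ISO-8859-1 are assumptions picked
for illustration, not anything taken from the original message.

```python
# Hypothetical sample text; any non-ASCII character would show the effect.
original = "café"

# The sender's system actually encodes the text as UTF-8 ...
raw_bytes = original.encode("utf-8")

# ... but the message is labelled ISO-8859-1, so the reader's client
# decodes it that way and shows two wrong characters in place of one.
mislabelled = raw_bytes.decode("iso-8859-1")

# Decoding with the encoding that was really used recovers the text.
correct = raw_bytes.decode("utf-8")

print(repr(mislabelled))
print(repr(correct))
```

The bytes never change here; only the label does, which is why a client
cannot fix this on its own unless it guesses.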

Point 2 is probably fine, in this day and age.  Point 3 is probably
fine, too, though missing fonts are still a likely problem area.

Point 4 can be a common problem.  Generally you shouldn't try to force
your client to always work in a particular encoding; you should leave
your client to work it out automatically.  Setting a default for
authoring messages is fine (your system's default would be best).
Setting a default for reading unidentified text is almost fine, though
the RFC default for unidentified text was ASCII, and I think is now
ISO-8859-1 (I can't be bothered checking, now, and the two are
equivalent, as far as the ASCII portion goes).
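
As a rough sketch of "work it out automatically, with a default only
for unidentified text", the following uses Python's standard email
package.  The function name decode_body and its fallback parameter are
made up for this example, and it assumes a simple single-part message;
a real client has to handle multipart messages and much more.

```python
from email import message_from_bytes


def decode_body(raw_message: bytes, fallback: str = "us-ascii") -> str:
    """Decode a single-part message body using the charset it declares.

    The fallback is only used when the message does not identify its
    encoding, or names one that this Python installation does not know.
    """
    msg = message_from_bytes(raw_message)
    # None when the Content-Type header carries no charset parameter.
    charset = msg.get_content_charset() or fallback
    # decode=True undoes any base64/quoted-printable transfer encoding
    # and returns the body as bytes.
    payload = msg.get_payload(decode=True)
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown encoding name: fall back rather than fail.
        return payload.decode(fallback, errors="replace")


labelled = (
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)
unlabelled = b"Content-Type: text/plain\n\nhello\n"

print(decode_body(labelled))
print(decode_body(unlabelled))
```

Forcing one encoding for everything would amount to ignoring the
declared charset and always using the fallback.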

Point 5 is a common problem when messages pass through some services
that want to transcode a message into their own schemes, and make a
pig's breakfast of it.  Some mailing list software was notorious for
doing that sort of thing.

When replying to a message, there are two common defaults:

a. Reply in the same scheme as the original message.
b. Reply in your usual encoding scheme.

Point a is fine, so long as your client can make use of that encoding
(it probably can, unless it's a truly strange one - some weird schemes
required specially-arranged fonts).

Point b is fine, so long as your client transcodes the original message,
if quoting it, into the encoding scheme that it's going to use.  Looking
at the mess some clients make, I wonder if they actually do that, rather
than just bodge the text in as-is, without changing the encoding.
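
The difference between transcoding the quoted text and bodging it in
as-is can be sketched in Python like this.  The function name
quote_for_reply and the sample bytes are hypothetical, and this is not
a description of how any particular mail client does it.

```python
def quote_for_reply(original_bytes: bytes,
                    original_charset: str,
                    reply_charset: str) -> bytes:
    """Quote a message body, transcoding it into the reply's encoding."""
    # Decode using the encoding the original message declared.
    text = original_bytes.decode(original_charset)
    # Prefix every line with "> ", keeping the original line endings.
    quoted = "".join("> " + line for line in text.splitlines(keepends=True))
    # Encode into the scheme the reply will be labelled with.
    return quoted.encode(reply_charset)


# Hypothetical original: "café" encoded as ISO-8859-1.
original = b"caf\xe9\n"

# Transcoded: the quoted text is valid in the reply's encoding.
good = quote_for_reply(original, "iso-8859-1", "utf-8")

# Bodged: the original bytes pasted in unchanged, although the reply
# will be labelled UTF-8.
bodged = b"> " + original

print(good.decode("utf-8"))
try:
    bodged.decode("utf-8")
except UnicodeDecodeError:
    print("bodged reply is not valid UTF-8")
```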

If you think all of that is a right headache, it is.  That's why there
was a push for Unicode all those years back.  One scheme for everyone,
and no transcoding required.
-- 
[tim at localhost ~]$ uname -r
2.6.25.11-97.fc9.i686

Don't send private replies to my address, the mailbox is ignored.  I
read messages from the public lists.





