
Re: UTF8 settings (was: Can scp be used to update a directory?)

On Friday 24 March 2006 13:34, James Wilkinson wrote:
> A backup from an FC3 machine listed
> SUPPORTED="en_GB.UTF-8:en_GB:en:en_US.UTF-8:en_US:en"
> although I doubt both en references are strictly necessary.
OK, I've altered the file.

> > Here's a sample -
> >
> > ../Mp3/marisa_monte/rose_and_charcoal/06_dan�_da_solid�.mp3
> >
> > The title should read
> >
> > 06_dança_da_solidão.mp3
> That's actually a different symptom of the same problem. UTF8 takes two
> bytes to store most common non-ASCII characters, whereas the ISO-8859
> family always uses one byte.
> What you first described was seeing the two UTF8 bytes in an ISO-8859
> program, so each accented character shows as two ISO-8859 characters
> (some of which will probably be "illegal", so you'll see spaces or
> something similar there).
It's quite possible that the two different displays were because, when first 
attempting to troubleshoot this, I experimented by setting different 
character sets in kde.
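
The "two bytes versus one" point quoted above can be checked quickly in
Python. This is only a minimal sketch: it assumes ISO-8859-1 as the
single-byte charset (the thread only says "ISO-8859 family"), and the
variable names are made up for illustration.

```python
# Part of the intended title from the sample filename.
name = "dança"

# The same text encoded both ways.
utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("iso-8859-1")  # assumption: ISO-8859-1

print(utf8_bytes)    # b'dan\xc3\xa7a'  (two bytes for the c-cedilla)
print(latin1_bytes)  # b'dan\xe7a'      (one byte for the c-cedilla)

# A program that assumes ISO-8859-1 turns each UTF-8 byte into its own
# character, so the one accented letter comes out as two characters.
seen_in_latin1 = utf8_bytes.decode("iso-8859-1")
print(len(name), len(seen_in_latin1))  # 5 6
```

Here seen_in_latin1 holds "danÃ§a", which is the kind of doubled-up
character described in the first symptom.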

> What you've just illustrated is an ISO-8859 name viewed in an UTF-8
> environment, where two ISO-8859 characters are interpreted as one
> illegal UTF-8 character.
> My first reaction is to blame the generating program (what was it?)

Grip generated the mp3s.  I first saw the problem in k3b, but then in 
konqueror and kmail, all under FC4.
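
The reverse case James describes, an ISO-8859 name read by a UTF-8
program, can be reproduced the same way. Again a sketch under the
assumption that the original bytes were ISO-8859-1; the names are
illustrative only.

```python
# The bytes a non-UTF-8 system would have written for the name.
latin1_bytes = "dança".encode("iso-8859-1")  # b'dan\xe7a'

# Strict UTF-8 decoding rejects them: 0xE7 announces a multi-byte
# sequence, but the byte that follows is not a continuation byte.
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# A lenient viewer substitutes U+FFFD (the replacement character),
# which is the usual source of the "�" glyph.
shown = latin1_bytes.decode("utf-8", errors="replace")

print(valid_utf8)                 # False
print(shown == "dan\ufffda")      # True
```

Python replaces only the offending byte. Other programs may also
swallow the character after it, which would match the sample listing
above where the letter following the accent is missing.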

> In 
> my experience, many MP3 programs, following Winamp's example, have gone
> flat-out for skins and custom text-handling. Too many of them don't
> support UTF8 in $LANG properly.
> Alternatively, what did the server box use to run? How did you transfer
> the files? Red Hat went to UTF-8 early, and many other distros took a
> lot longer to upgrade. And transferring files might not get the
> conversion right.
It was running Mandriva 10.0.  In truth, though, I can't remember whether the 
box that generated the files was running Mdv 10.1 or 10.2.  I don't think 
10.0 had utf-8 (could be wrong) but it's very likely that I never elected to 
use utf-8 when it first became available.

> (You used to use Mandriva, didn't you? I'm not sure when they adopted
> UTF-8...)
> I wrote:
> > As for the single e-mail -- I'd blame the other end, personally.
> Anne said:
> > Maybe.  Maybe he has the same problem as I do.
> Um. Mail clients have no business not knowing which encoding they're
> using. And if they know that, they've no business not putting it into
> the headers of outgoing e-mail properly.
> We've proved that your e-mail client can receive UTF-8. I suppose
> there's still the chance that your correspondent used a weird encoding
> that your client didn't understand. But you're not going to get the
> "right" message anyway in those situations, except by blind luck.
Well thanks for the insights I've got, anyway.  And finding convmv was another 
good thing to come out of it.  It all helps.
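
For anyone curious what a filename conversion of this kind involves,
here is a rough dry-run sketch in Python. It is not how convmv itself
works; fix_name and plan_renames are hypothetical helpers, and it
assumes every name that is not valid UTF-8 was written as ISO-8859-1.

```python
import os


def fix_name(raw: bytes) -> bytes:
    """Return the name as UTF-8 bytes, converting only if needed."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8 (plain ASCII included)
    except UnicodeDecodeError:
        # Assumption: anything else is ISO-8859-1.
        return raw.decode("iso-8859-1").encode("utf-8")


def plan_renames(root: bytes):
    """Yield (old_path, new_path) pairs under root; renames nothing."""
    # Passing bytes makes os.walk return raw byte names, so badly
    # encoded names are seen exactly as stored on disk. Walking
    # bottom-up lists a directory's contents before the directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = fix_name(name)
            if new != name:
                yield (os.path.join(dirpath, name),
                       os.path.join(dirpath, new))
```

Something like `for old, new in plan_renames(b"../Mp3"): print(old, new)`
would list the proposed changes without touching any files. Checking
that list by eye before renaming anything seems wise, since a name in
some other charset would be converted wrongly.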

> --
> E-mail address: james | In the Royal Air Force a landing's OK,
> @westexe.demon.co.uk  | If the pilot gets out and can still walk away.
>                       | But in the Fleet Air Arm the outlook is grim,
>                       | If your landings are duff and you've not learnt to
>                       | swim.

