Asian Squirrelmail trouble

Fri Apr 7 11:16:18 UTC 2006

On Thu, 2006-04-06 at 21:43 -0400, Warren Togami wrote:
> One of our European engineers dwmw2 tried to improve the Unicode 
> situation in our squirrelmail with the current scripted conversion hack 
> in an attempt to work around upstream squirrelmail's horrible 
> localization policy where their encodings are inconsistently mixed. 
> This improved things a bit only for some European languages, but not the 
> Asian languages.

Squirrelmail was a mess. It would confuse character sets whenever the
user quoted a mail with non-ASCII characters, if the character set of
the mail being quoted was different from the character set of the user.
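
As a rough illustration of that failure mode (a Python sketch, not
Squirrelmail's actual PHP code; the function names and the sample
charsets are assumptions chosen for the example), quoting goes wrong
when the bytes of the original mail are read as if they were in the
user's charset, and goes right when they are decoded with the mail's own
charset and re-emitted as UTF-8:

```python
def quote_naively(raw: bytes, user_charset: str) -> str:
    """Read the quoted bytes as if they were in the user's charset (the bug)."""
    return raw.decode(user_charset, errors="replace")


def quote_via_unicode(raw: bytes, mail_charset: str) -> bytes:
    """Decode with the mail's own charset, then emit UTF-8."""
    return raw.decode(mail_charset).encode("utf-8")


# Hypothetical example: an ISO-8859-1 mail quoted by a KOI8-R user.
original = "Grüße"
raw = original.encode("iso-8859-1")

garbled = quote_naively(raw, "koi8-r")
fixed = quote_via_unicode(raw, "iso-8859-1").decode("utf-8")

print(repr(garbled))  # no longer the original text
print(repr(fixed))    # the original text, now carried as UTF-8
```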

By switching it to use UTF-8 throughout, we fixed that problem -- and in
doing so we made it conform to the Fedora policy of using UTF-8 for
everything -- a policy which we've had for a long time now. So I changed
all the locales to use UTF-8, changed their help text, etc.

It was slightly more complicated for the Asian locales, as Warren says.
The Korean help text was _supposed_ to be in EUC-KR encoding, but
'iconv' refused to convert it to UTF-8, claiming that it wasn't valid
EUC-KR to start with.
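
One way to narrow down a failure like this is to ask a strict decoder
where the first offending byte is. The following is a minimal sketch
using Python's built-in 'euc_kr' codec rather than iconv; the function
name and the sample byte strings are made up for illustration and are
not taken from the actual help files:

```python
from typing import Optional


def first_invalid_offset(data: bytes, encoding: str = "euc_kr") -> Optional[int]:
    """Return the byte offset where strict decoding fails, or None if it succeeds."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# Hypothetical inputs: one valid EUC-KR string, one with bytes that
# cannot start an EUC-KR sequence.
good = "한글".encode("euc_kr")
bad = b"abc\xff\xff"

print(first_invalid_offset(good))
print(first_invalid_offset(bad))
```

Looking at the bytes around the reported offset would show whether the
file is merely damaged in a few spots or was never EUC-KR in the first
place.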

The Japanese support in Squirrelmail has 'EUC-JP' hard-coded in one or
two places beyond the normal configuration files, and it confusingly
uses EUC-JP in the Web interface but ISO-2022-JP in the email. I
_thought_ we'd switched both of those to UTF-8, but I'm afraid I lack
the knowledge of Japanese which would be necessary to do meaningful
tests.
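
A mechanical check is possible without reading Japanese, although it is
no substitute for testing by a native reader. The sketch below (Python,
with a hypothetical helper name and an arbitrary sample string) encodes
the same text in both legacy charsets and confirms that converting
either one yields identical UTF-8 bytes:

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Transcode bytes from a legacy charset to UTF-8, failing loudly on bad input."""
    return raw.decode(source_charset).encode("utf-8")


# Arbitrary sample text, encoded the way the Web interface (EUC-JP)
# and the outgoing mail (ISO-2022-JP) would each have carried it.
sample = "日本語のテスト"
web_bytes = sample.encode("euc_jp")
mail_bytes = sample.encode("iso2022_jp")

# The two legacy byte sequences differ, but both should map to the
# same UTF-8 bytes once converted.
print(web_bytes != mail_bytes)
print(to_utf8(web_bytes, "euc_jp") == to_utf8(mail_bytes, "iso2022_jp"))
```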

Yamakawa-san, if you are able to spare some of your time to help us test
the behaviour of Squirrelmail packages in the Japanese locale, then that
would be very much appreciated.

> As you may be well aware, it is currently infeasible to expect the Asian 
> countries to use only Unicode encoding due to the constantly moving 
> standards and different glyphs in different languages using the same 
> code-points, among other problems.

I've heard it said that UTF-8 is insufficient for Asian locales, but
I've also heard that claim disputed -- I lack the knowledge to judge
which side is right. I had supposed that the Fedora policy of using UTF-8
everywhere was sane and correct -- are you suggesting that it really
isn't? I know there are Luddites even in Europe who object to UTF-8. I
ignore those too.

I'd prefer to fix the bugs and make sure Squirrelmail works correctly in
UTF-8 for all languages, rather than reverting to obsolescent character
sets for certain locales.

-- 
dwmw2