From mikko.harjula at mediware.fi Tue Jan 10 06:12:41 2006
From: mikko.harjula at mediware.fi (Mikko Harjula)
Date: Tue, 10 Jan 2006 08:12:41 +0200
Subject: [Fedora-i18n-list] utf-8 national character input in Linux Konsole/KDE/X?
Message-ID: <17347.20569.768352.569593@mikko.mediware.fi>

I'm running KDE on Fedora Core 4, kernel 2.6.14-0.10.rrt.rhfc4.ccrma
(i.e. the Planet CCRMA patched FC4 kernel, but the same problem occurs
with the FC4 kernel too):

$ konsole -v
Qt: 3.3.4
KDE: 3.4.2 Level "b"
Konsole: 1.5.2

My problem is that I cannot input national characters from the keyboard
in KDE Konsole. This used to work fine with my old Red Hat 9 using
iso-8859-1. I have the adiaeresis etc. keys on my keyboard. I have
googled around, and others having problems like this are trying to type
Chinese or something much more complicated, so this must be something
that should work right out of the box!

In a virtual console (the text mode you get with Alt-Fn, n=1-6, outside
of X) I can type and see utf-8 characters. If XXX is a string with
national characters I can say:

$ cat >XXX
XXX
^D
$ cat XXX
XXX
$

Also less works if I set LESSCHARSET to utf-8. However, even in the
virtual console 'ls' shows utf-8 national characters in filenames as
two question marks.

In X (KDE) the default terminal emulator (konsole) is worse. I cannot
type national characters, not even as dead key sequences. National
character keys and 'dead' keys print nothing and also swallow the
following characters. The display shows only some small dots, smaller
and positioned a bit lower than a period. The ls command prints "??"
for each national char. 'cat XXX' shows the contents of the file OK -
of course I have to use wildcards to match the filename. Less works
too. So it seems konsole can show utf-8 but the input method is wrong.

In GNU Emacs 21.4.1, both under X and in konsole (option -nw), entering
national characters produces nothing; dired shows national chars in
filenames correctly and displays file content correctly.

XEmacs 21.4 (patch 17) under X seems to work correctly with national
characters, but with the -nw option input does not work anymore.

Xterm is a little better than Konsole because there I can type these
letters and they show up correctly, but every time I type one such
letter the next letter is not displayed before I type something else.
And this behaviour sticks even through returns, so it seems to be deep
inside some input processing, as if these were dead keys. By the way,
the dead-key mechanism (pressing the diaeresis followed by a produces
adiaeresis) does not work either.

Currently I have:

$ env|egrep '^(LANG|LC_)'
LC_COLLATE=posix
LANG=fi_FI.UTF-8

and locale is:

$ locale
locale: Cannot set LC_ALL to default locale: Tiedostoa tai hakemistoa ei ole
LANG=fi_FI.UTF-8
LC_CTYPE="fi_FI.UTF-8"
LC_NUMERIC="fi_FI.UTF-8"
LC_TIME="fi_FI.UTF-8"
LC_COLLATE=posix
LC_MONETARY="fi_FI.UTF-8"
LC_MESSAGES="fi_FI.UTF-8"
LC_PAPER="fi_FI.UTF-8"
LC_NAME="fi_FI.UTF-8"
LC_ADDRESS="fi_FI.UTF-8"
LC_TELEPHONE="fi_FI.UTF-8"
LC_MEASUREMENT="fi_FI.UTF-8"
LC_IDENTIFICATION="fi_FI.UTF-8"
LC_ALL=

The message "Tiedostoa tai hakemistoa ei ole" is "No such file or
directory" in Finnish.

My /etc/sysconfig/i18n contains:

LANG="fi_FI.UTF-8"
SYSFONT="latarcyrheb-sun16"
SUPPORTED="fi_FI.UTF-8:fi_FI:fi"

My /etc/X11/xorg.conf has:

Section "InputDevice"
    Identifier "Keyboard0"
    Driver "kbd"
    Option "XkbModel" "pc105"
    Option "XkbLayout" "fi"
EndSection

I don't have ~/.i18n or ~/.Xmodmap and I don't know if I should have one.

When I start xev and press and release 'ä' I get:

KeyPress event, serial 31, synthetic NO, window 0x2600001,
    root 0x5f, subw 0x0, time 36690043, (434,546), root:(438,567),
    state 0x10, keycode 48 (keysym 0xe4, adiaeresis), same_screen YES,
    XLookupString gives 1 bytes: (e4) "ä"
    XmbLookupString gives 1 bytes: (e4) "ä"
    XFilterEvent returns: False

KeyRelease event, serial 31, synthetic NO, window 0x2600001,
    root 0x5f, subw 0x0, time 36690142, (434,546), root:(438,567),
    state 0x10, keycode 48 (keysym 0xe4, adiaeresis), same_screen YES,
    XLookupString gives 1 bytes: (e4) "ä"

The adiaeresis seems to be correct but XLookupString gives something funny.

I have spent several days on this without any luck (including several
reinstalls) and would appreciate any help. Also, pointers to in-depth
documentation thoroughly explaining the mechanisms involved when
keycodes go through the kernel, X, KDE, konsole, bash and whatever else
would be most welcome.

-- 
Mikko Harjula                  puh. 010-525 1555
mikko.harjula at mediware.fi   gsm  040-778 6669

From ftnx at ksbase.com Tue Jan 10 15:50:44 2006
From: ftnx at ksbase.com (Not Here)
Date: Tue, 10 Jan 2006 11:50:44 -0400
Subject: [Fedora-i18n-list] utf-8 national character input in Linux Konsole/
In-Reply-To: <17347.20569.768352.569593@mikko.mediware.fi>
Message-ID: <1136894102@ksbase>

Tuesday January 10 2006 08:12, Mikko Harjula wrote to All:

 MH> I'm running KDE on Fedora Core 4, kernel
 MH> 2.6.14-0.10.rrt.rhfc4.ccrma
 MH> (i.e. Planet CCRMA patched FC4 kernel, but the same problem occurs
 MH> with the FC4 kernel too):

I used to type the ?, ? etc. on DOS and OS/2, and still do in Windows,
using Alt-keys and High ASCII. I have never been able to do that in
Linux under any version, either on the console or in any application.

Changing the language system-wide is out of the question, since en_US
is required everywhere else.

Kari Suomela

KARICO Business Services
Toronto, ON Canada
http://www.karico.ca

... Money is the root of all wealth.

From llch at redhat.com Thu Jan 19 05:09:43 2006
From: llch at redhat.com (Leon Ho)
Date: Thu, 19 Jan 2006 15:09:43 +1000
Subject: [Fedora-i18n-list] utf-8 national character input in Linux Konsole/KDE/X?
In-Reply-To: <17347.20569.768352.569593@mikko.mediware.fi>
References: <17347.20569.768352.569593@mikko.mediware.fi>
Message-ID: <200601191509.43981.llch@redhat.com>

Hi Mikko,

How does this compare to gnome-terminal?

And have you tried changing Settings -> Encodings to 'UTF-8' in konsole
to see if it works?

Did your KDE RPMs come from somewhere else?

[llch at dhcp-109 ~]$ konsole -v
Qt: 3.3.4
KDE: 3.4.2-0.fc4.1 Red Hat
Konsole: 1.5.2

Cheers,
Leon

On Tuesday 10 January 2006 16:12, Mikko Harjula wrote:
> I'm running KDE on Fedora Core 4, kernel 2.6.14-0.10.rrt.rhfc4.ccrma
> (i.e. Planet CCRMA patched FC4 kernel, but the same problem occurs
> with the FC4 kernel too):
>
> $ konsole -v
> Qt: 3.3.4
> KDE: 3.4.2 Level "b"
> Konsole: 1.5.2
>
> My problem is that I cannot input national characters from keyboard in KDE
> Konsole. This used to work fine with my old RedHat 9 using iso-8859-1. I
> have the adiaeresis etc. keys on my keyboard. I have googled around and
> others having problems like this are trying to type Chinese or something
> much more complicated, so this must be something that should work right
> out of the box!
>
> In virtual console (text mode you get with alt-FN, N=1-6 outside of X) I
> can type and see utf-8 characters. If XXX is a string with national
> characters I can say:
>
> $ cat >XXX
> XXX
> ^D
> $ cat XXX
> XXX
> $
>
> Also less works if I set LESSCHARSET to utf-8. However even in
> virtual console 'ls' shows utf-8 national characters in filenames as
> two question marks.
>
> In X (KDE) the default xterm (konsole) is worse. I cannot type
> national characters, not even as dead key sequences. National
> character keys and 'dead' keys print nothing and also swallow the
> following characters. The display shows only some small dots, smaller
> and positioned a bit lower than a period. The ls command prints the
> "??" for each national char. The 'cat XXX' shows the contents of the
> file OK - of course I have to use wildcards to match the filename.
> Less works too. So it seems konsole can show utf-8 but the input
> method is wrong.
>
> In GNU Emacs 21.4.1 under X and in konsole (option -nw) entering
> national characters produces nothing, dired shows national chars in
> filenames correctly and displays file content correctly.
>
> XEmacs 21.4 (patch 17) under X seems to work correctly with national
> characters, but with -nw option input does not work anymore.
>
> Xterm is a little better than Konsole because there I can type these
> letters and they show up correctly, but every time I type one letter
> the next letter is not displayed before I type something else. And
> this behaviour sticks even through returns, so it seems to be deep
> inside some input processing as if these would be like dead keys. By
> the way the dead-key mechanism (pressing the diaeresis followed by a
> produces adiaeresis) does not work either.
>
> Currently I have:
>
> $ env|egrep '^(LANG|LC_)'
> LC_COLLATE=posix
> LANG=fi_FI.UTF-8
>
> and locale is:
>
> $ locale
> locale: Cannot set LC_ALL to default locale: Tiedostoa tai hakemistoa ei ole
> LANG=fi_FI.UTF-8
> LC_CTYPE="fi_FI.UTF-8"
> LC_NUMERIC="fi_FI.UTF-8"
> LC_TIME="fi_FI.UTF-8"
> LC_COLLATE=posix
> LC_MONETARY="fi_FI.UTF-8"
> LC_MESSAGES="fi_FI.UTF-8"
> LC_PAPER="fi_FI.UTF-8"
> LC_NAME="fi_FI.UTF-8"
> LC_ADDRESS="fi_FI.UTF-8"
> LC_TELEPHONE="fi_FI.UTF-8"
> LC_MEASUREMENT="fi_FI.UTF-8"
> LC_IDENTIFICATION="fi_FI.UTF-8"
> LC_ALL=
>
> The message "Tiedostoa tai hakemistoa ei ole" is "no such file" in Finnish.
>
> My /etc/sysconfig/i18n contains:
>
> LANG="fi_FI.UTF-8"
> SYSFONT="latarcyrheb-sun16"
> SUPPORTED="fi_FI.UTF-8:fi_FI:fi"
>
> My /etc/X11/xorg.conf has:
>
> Section "InputDevice"
>     Identifier "Keyboard0"
>     Driver "kbd"
>     Option "XkbModel" "pc105"
>     Option "XkbLayout" "fi"
> EndSection
>
> I don't have ~/.i18n or ~/.Xmodmap and I don't know if I should have one.
>
> When I start xev and press and release 'ä'
> I get:
>
> KeyPress event, serial 31, synthetic NO, window 0x2600001,
>     root 0x5f, subw 0x0, time 36690043, (434,546), root:(438,567),
>     state 0x10, keycode 48 (keysym 0xe4, adiaeresis), same_screen YES,
>     XLookupString gives 1 bytes: (e4) "ä"
>     XmbLookupString gives 1 bytes: (e4) "ä"
>     XFilterEvent returns: False
>
> KeyRelease event, serial 31, synthetic NO, window 0x2600001,
>     root 0x5f, subw 0x0, time 36690142, (434,546), root:(438,567),
>     state 0x10, keycode 48 (keysym 0xe4, adiaeresis), same_screen YES,
>     XLookupString gives 1 bytes: (e4) "ä"
>
> The adiaeresis seems to be correct but XLookupString gives something funny.
>
> I have spent several days on this without any luck (including several
> reinstalls) and would appreciate any help. Also leads to in-depth
> documentation thoroughly explaining the mechanisms involved when keycodes
> go through kernel, X, KDE, konsole, bash and whatever would be most
> welcome.

From mikko.harjula at mediware.fi Fri Jan 20 12:28:07 2006
From: mikko.harjula at mediware.fi (Mikko Harjula)
Date: Fri, 20 Jan 2006 14:28:07 +0200
Subject: [Fedora-i18n-list] utf-8 national character input in Linux Konsole/KDE/X?
In-Reply-To: <200601191509.43981.llch@redhat.com>
References: <17347.20569.768352.569593@mikko.mediware.fi>
	<200601191509.43981.llch@redhat.com>
Message-ID: <17360.55127.18727.59501@mikko.mediware.fi>

Leon Ho writes:
 > Hi Mikko,
 >
 > How does this compare to gnome-terminal?
 >
 > And have you tried to change the Settings -> Encodings to 'UTF-8' in konsole
 > and see if it works?

Hi Leon,

Thanks for your input, but I found the problem. The solution came from
Enrique Perez-Terron in the comp.os.linux.setup newsgroup. He helped me
compare several config files, which kept me from wasting effort on
things that were already correct. I also got several good hints from
him:

- 'ls | od' is not a good way to check the output of the ls command, as
  ls behaves differently when its output is a terminal (I used to know
  this!).
  A better way is to use:

      script
      ls
      exit
      od -c typescript

- Another great idea was to check the environment of a running program
  from /proc/PID/environ.

- Third, we compared the environment and the output of the xev command,
  and indeed for him XLookupString returned 2 bytes and a valid utf-8
  code, whereas for me it returned one garbage byte!

- The finally decisive idea was to use strace -e trace=file locale.
  This revealed:

      open("/usr/lib/locale/posix/LC_COLLATE", O_RDONLY) = -1 ENOENT (No such file or directory)

  and the whole directory /usr/lib/locale/posix is in fact missing.

So the problem all along was LC_COLLATE=posix! I have had this setting
for as long as there has been an iso-8859-1 locale. As a programmer I
like to see Makefile and README first in the directory listing, and all
over the net everyone tells you to use the locale 'posix' for that.
Well, this locale does not exist in FC4. There is the locale 'C', which
seems to produce a similar result for LC_COLLATE.

So, problem solved!!! Now national character input works OK and ls
lists filenames OK! I hope this info helps someone else.

It seems to me that the LC_COLLATE setting should not affect the
results of the other locale variables, but maybe the locale processing
dies in the middle when it encounters an unknown setting, so I can see
where this could come from. There may be some need to improve the
robustness of locale processing in glibc.

-- 
Mikko Harjula                  puh. 010-525 1555
mikko.harjula at mediware.fi   gsm  040-778 6669
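
The root cause found above (a locale variable naming a locale that is
not installed) could probably have been spotted earlier with a small
check like the following. This is only a rough sketch, not part of the
original thread: the function names normalize_locales and
report_unknown_locales are made up for illustration, and the codeset
normalization (lower-casing and dropping '-' and '_', so that "UTF-8"
compares equal to the "utf8" spelling listed by locale -a) is an
assumption that only approximates glibc's own, more lenient, matching.

```shell
#!/bin/sh
# Sketch: flag locale settings that name a locale the system does not list.
# Function names are hypothetical; the matching is an approximation.

# Filter: normalize the codeset part of each locale name read from stdin,
# e.g. fi_FI.UTF-8 -> fi_FI.utf8, keeping any @modifier unchanged.
normalize_locales() {
    awk '{
        mod = ""
        at = index($0, "@")
        if (at) { mod = substr($0, at); $0 = substr($0, 1, at - 1) }
        dot = index($0, ".")
        if (dot) {
            cs = tolower(substr($0, dot + 1))
            gsub(/[-_]/, "", cs)
            $0 = substr($0, 1, dot) cs
        }
        print $0 mod
    }'
}

# report_unknown_locales AVAILABLE
# Reads NAME=value lines (as printed by `locale`) on stdin and prints the
# ones whose non-empty value is not in the newline-separated AVAILABLE list.
report_unknown_locales() {
    available=$(printf '%s\n' "$1" | normalize_locales)
    while IFS='=' read -r name value; do
        # `locale` wraps inherited values in double quotes; strip them.
        value=${value#\"}
        value=${value%\"}
        [ -n "$value" ] || continue
        norm=$(printf '%s\n' "$value" | normalize_locales)
        printf '%s\n' "$available" | grep -Fqx -- "$norm" \
            || printf 'unknown locale: %s=%s\n' "$name" "$value"
    done
}
```

On a real system the intended use would be something like
locale | report_unknown_locales "$(locale -a)". A hit on LC_COLLATE
would point at the same problem that strace revealed, but the sketch
may also report false positives for alias names that glibc itself
would accept.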