How to enter unicode in F9

Dan Thurman dant at cdkkt.com
Sun Nov 9 20:02:00 UTC 2008


Alan Cox wrote:
>
> > For example, there is umlaut - and that could be transliterated into
> > `u' for example.  Others may have strange looking unicode and I have
>
> That depends on the language. Unicode is just character encoding rules.
> You need more context to do transliterations. Not that you should need to
> as you can just install the relevant fonts. DejaVu has pretty good
> European coverage for example and is one of the standard installed fonts
> in current Fedora.
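
To illustrate why transliteration needs language context, here is a minimal Python sketch (Python and the helper name `strip_marks` are assumptions for illustration, not something from this thread). It decomposes each character and drops the combining marks, which is a crude, language-blind approach.

```python
import unicodedata

def strip_marks(text):
    """Drop combining marks after canonical decomposition (NFD).

    This is language-blind: it turns "ü" into "u", which may or may not
    be the transliteration a given language expects.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

print(strip_marks("Müller"))  # German convention would usually prefer "Mueller"
print(strip_marks("ß"))       # has no decomposition, so it is left unchanged
```

The result for "Müller" shows the limitation: a mechanical rule gives one answer, while the conventional spelling depends on the language.
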
>
> You also have to watch the encodings. It's not uncommon to find mis-coded
> information in OGG and similar files, where the track data is encoded
> in one of the legacy ISO-8859 code pages rather than UTF-8. That produces
> invalid UTF-8 sequences, which are displayed as the symbol for an invalid
> character.
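
A rough sketch of how such mis-encoded tag data could be handled, assuming Python and assuming the legacy code page is ISO-8859-1 (the helper `decode_tag` and the sample bytes are hypothetical; real files may use a different code page):

```python
def decode_tag(raw, fallback="iso-8859-1"):
    """Decode tag bytes as UTF-8; fall back to a legacy code page.

    The fallback encoding is a guess. ISO-8859-1 accepts any byte
    sequence, so a wrong guess produces wrong characters, not an error.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so assume the legacy encoding.
        return raw.decode(fallback)

legacy = "Für Elise".encode("iso-8859-1")  # contains the lone byte 0xFC
modern = "Für Elise".encode("utf-8")       # contains the pair 0xC3 0xBC

print(decode_tag(legacy))
print(decode_tag(modern))
```

Both calls should give the same readable title, whereas decoding the legacy bytes strictly as UTF-8 fails on the byte 0xFC.
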
>
> > no idea what it is supposed to be - so I cannot transliterate w/o knowing
> > what it is in the first place - so how do I find out?  The umlaut is
> > sometimes obvious - but others are not.  So is there a way to show this?
> > As I said,
>
> Load the right fonts and they will be rendered correctly.
>
> > I get binary icons so how do I get the unicode decimal representation so
> > that I can match against the unicode character table to see what it is?
>
> The 'four squares' shown for an unknown symbol should each contain a
> hex digit which together give you the symbol code which you can look up
> on the unicode web site.
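
When the text itself is available, the same lookup can be done without reading the digits off the screen. A minimal sketch, assuming Python and its standard `unicodedata` module (the helper name `describe` is hypothetical):

```python
import unicodedata

def describe(text):
    """Return one line per character: hex code point, decimal value, name."""
    lines = []
    for ch in text:
        # Characters without an assigned name get a placeholder instead.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X} (decimal {ord(ch)}) {name}")
    return lines

for line in describe("ü"):
    print(line)
# prints: U+00FC (decimal 252) LATIN SMALL LETTER U WITH DIAERESIS
```

The hex value after `U+` is the number to search for in the Unicode character tables.
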
>
> > Would it be: print \\%d, $1 ?
>
> It's UTF-8, so a variable-length encoding of the full Unicode symbol
> space. See www.unicode.org if you want the full details, but basically
> each symbol is encoded as a series of bytes such that the C special
> symbol \0 is never found mid-character and so that the ASCII range of
> symbols for American English is mapped 1:1 with UTF-8.
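
The two properties described above can be checked with a short sketch, assuming Python (the helper name `utf8_bytes` and the sample characters are chosen only for illustration):

```python
def utf8_bytes(ch):
    """Return the UTF-8 encoding of a character as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

for ch in ["A", "ü", "€"]:
    print(f"U+{ord(ch):04X} -> {utf8_bytes(ch)}")
# prints:
# U+0041 -> 41
# U+00FC -> C3 BC
# U+20AC -> E2 82 AC
```

The ASCII letter keeps its single-byte value, while every byte of a multi-byte sequence is 0x80 or above, so a zero byte can never appear inside one.
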
>
> Alan
>
Thanks for the tip!  I will review the link you gave me (already started!)

Dan



