ebook-speaker/UTF-8 long files

Linux for blind general discussion blinux-list at redhat.com
Sun Sep 20 12:30:36 UTC 2020


I have no experience with ebook-speaker, but when converting documents
to plain text, I find that using iconv to convert to ASCII is useful
for ensuring files don't contain characters that fail to display or
get spoken properly.

The command I use for this is:

iconv -f UTF-8 -t ascii//TRANSLIT inputFile > outputFile

Note that TRANSLIT is all uppercase, and there's a greater-than/right
angle bracket between inputFile and outputFile: iconv writes to the
screen by default and doesn't, as far as I know, have a built-in
option for setting the output file, hence the redirection.

I've only really used this on English-language text files, but it'll
do things like converting left and right curly single and double
quotes to straight quotes and, I presume, replacing accented letters
with their unaccented counterparts... No clue what it would do with
non-Latin text.
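
For example (just a sketch from memory; the sample text is made up and
the exact substitutions depend on the iconv implementation and locale):

echo 'a “curly” quote and a café' | iconv -f UTF-8 -t ascii//TRANSLIT

which on a typical GNU/Linux system prints something like:

a "curly" quote and a cafe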

It might not help with your problem, but in my experience doing this
cuts down on the number of "thorn" characters I come across when
reading files converted to plain text in nano, so if the issue is
specifically something UTF-8 related, it might help.

iconv can be used for other encoding conversions, though I've only
ever used it for collapsing UTF-8 files to ASCII... Also, without the
//TRANSLIT bit on the output encoding, I'm pretty sure the program
just halts the first time it encounters a character that isn't part of
the target charset (e.g. there are no curly quotes in ASCII, so without
the //TRANSLIT, converting to ASCII will fail at the first curly quote,
while with it, it'll convert to a straight quote and continue).
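
To illustrate the difference (a sketch only; the exact error message
without //TRANSLIT varies between iconv implementations):

# with //TRANSLIT: unconvertible characters are approximated and the
# whole file gets converted
iconv -f UTF-8 -t ascii//TRANSLIT inputFile > outputFile

# without //TRANSLIT: the conversion stops with an error at the first
# character that has no ASCII equivalent, leaving outputFile truncated
iconv -f UTF-8 -t ascii inputFile > outputFile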



