ebook-speaker/UTF-8 long files
Linux for blind general discussion
blinux-list at redhat.com
Sun Sep 20 12:30:36 UTC 2020
I have no experience with eBook readers, but when converting documents
to plain text, I find that using iconv to convert to ASCII is useful
for ensuring files don't contain characters that fail to display or get
spoken properly.
The command I use for this is:
iconv -f UTF-8 -t ascii//TRANSLIT inputFile > outputFile
Note that TRANSLIT is all uppercase and there's a greater-than/right
angle bracket between inputFile and outputFile: iconv writes to the
screen by default and doesn't, as far as I know, have a built-in
option for setting an output file, hence the redirection.
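As a quick end-to-end check of the command above (a sketch assuming GNU iconv from glibc; the file names and sample text are just for illustration):

```shell
# Create a small UTF-8 sample containing curly double quotes (U+201C/U+201D).
printf '\342\200\234hello\342\200\235\n' > inputFile

# Convert to ASCII, transliterating characters that have no ASCII
# representation, and redirect stdout to the output file.
iconv -f UTF-8 -t ascii//TRANSLIT inputFile > outputFile

# The curly quotes should now be plain straight quotes.
cat outputFile
```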
I've only really used this on English-language text files, but it'll
do things like converting left and right curly single and double
quotes to straight quotes, and I presume replacing accented letters
with their unaccented counterparts... No clue what it would do with
non-Latin text.
It might not help with your problem, but my experience is that doing
this cuts down on the number of "thorn" characters I come across when
reading files converted to plain text in nano, and if the issue is
specifically UTF-8 related, it might help.
iconv can be used for other encoding conversions, though I've only
ever used it for collapsing UTF-8 files to ASCII. Also, without the
//TRANSLIT bit on the output encoding, I'm pretty sure the program
just halts the first time it encounters a character that isn't part of
the target charset (e.g. there are no curly quotes in ASCII, so without
//TRANSLIT, converting to ASCII will fail at the first curly quote,
while with it, the quote is converted to a straight quote and the
conversion continues).