extracting text from png files
Linux for blind general discussion
blinux-list at redhat.com
Mon Dec 17 22:56:00 UTC 2018
I use tesseract for this.
With version 4.0, which was just released, I noticed that the results improved
a lot here (for German and English use cases).
Some official numbers can be found here:
The languages improve by 10 to 80 percent, depending on the language
and its previous level of support.
It seems to have gotten a new OCR engine based on neural networks.
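
For anyone who wants to script this, here is a minimal sketch of calling the tesseract command line tool from Python. It assumes tesseract 4.0 is installed and on the PATH, and that the German and English language data ("deu" and "eng") are installed. The function names build_tesseract_command and ocr_image are made up for this example; they are not part of tesseract.

```python
import subprocess


def build_tesseract_command(image_path, languages=("deu", "eng")):
    """Return the argument list for a tesseract call that prints to stdout."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognized text instead of writing it to a .txt file.
    # Several languages are combined with "+", for example "deu+eng".
    return ["tesseract", str(image_path), "stdout", "-l", "+".join(languages)]


def ocr_image(image_path, languages=("deu", "eng")):
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

Something like print(ocr_image("scan.png")) would then print the recognized text, where scan.png is a placeholder file name. How good the output is depends on the image, so some manual cleanup may still be needed.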
On 17.12.18 at 16:57, Linux for blind general discussion wrote:
> Disclaimer: I don't know which image formats either program supports
> directly, nor do I know of a good way to convert between image
> formats, though I'm pretty sure cuneiform supports at least .jpg and
> .png files directly.
> I also remember at least one OCR tutorial recommending some
> preprocessing to make images easier for the OCR program to work with,
> and I believe they used the convert command provided by imagemagick to
> do so, but I forget the details.
> Also, it's been a while since I've attempted any OCR'ing myself (how
> often I had to manually clean up the output kind of put me off), so
> there might be others on this list who can provide better, and more
> specific advice on this subject.
> Still, I hope I've at least got you started on the right track.
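
On the conversion and preprocessing question in the quoted message, here is a rough sketch using the convert command from imagemagick, again called from Python. The details of the tutorial mentioned above are not known, so the steps here (grayscale plus enlarging the image) are only an assumption about what commonly helps OCR, not that tutorial's recipe. The names build_convert_command and preprocess_image are made up for this example. convert picks the output format from the target file extension, so the same call can also turn a .jpg into a .png.

```python
import subprocess


def build_convert_command(source, target, scale_percent=200):
    """Return the argument list for an ImageMagick convert call."""
    return [
        "convert", str(source),
        "-colorspace", "Gray",            # drop color information
        "-resize", f"{scale_percent}%",   # enlarge small text (assumed helpful)
        str(target),                      # output format follows the extension
    ]


def preprocess_image(source, target, scale_percent=200):
    """Write a grayscale, resized copy of source to target and return target."""
    subprocess.run(build_convert_command(source, target, scale_percent), check=True)
    return target
```

For example, preprocess_image("photo.jpg", "photo.png") would write a grayscale, enlarged PNG that can then be passed to the OCR program. The file names are placeholders, and the scale factor is a guess that probably needs tuning per image.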