Any Accurate O C R Programs in DOS I can run in Linux?
John G. Heim
jheim at math.wisc.edu
Fri Jun 28 17:14:11 UTC 2013
Well, I was going to say that too -- that the problem with tesseract
probably wasn't the quality of the OCR itself but the orientation and
cropping. But that's small consolation to someone who is used to having
their software deal with all that on its own. There probably is
something out there in open-source land that does the rotation, at
least. If 20% of the words in the text aren't in the dictionary, rotate
the image and try again. Something like that would be easy enough to
write. But if there is something like that available as open source, I
am unaware of it.
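
For what it's worth, here is a rough sketch in Python of the rotate-and-retry idea. It is not an existing tool, just an illustration. The function names are made up for this example. It assumes a tesseract build that accepts "stdout" as the output name, ImageMagick's convert on the PATH, and a dictionary passed in as a set of lowercase words (for example, loaded from a word list such as /usr/share/dict/words, if your system has one).

```python
import re
import subprocess
import tempfile
from pathlib import Path


def unknown_fraction(text, dictionary):
    """Return the fraction of alphabetic words in text not found in dictionary."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    if not words:
        return 1.0  # no words at all counts as a complete miss
    misses = sum(1 for w in words if w not in dictionary)
    return misses / len(words)


def run_tesseract(image_path):
    # Assumption: this tesseract version accepts "stdout" as the output base.
    result = subprocess.run(
        ["tesseract", str(image_path), "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def rotate_image(src, dst, degrees):
    # ImageMagick: convert SRC -rotate DEGREES DST
    subprocess.run(
        ["convert", str(src), "-rotate", str(degrees), str(dst)],
        check=True,
    )


def ocr_with_rotation(image_path, dictionary, threshold=0.20,
                      ocr=run_tesseract, rotate=rotate_image):
    """Try 0, 90, 180 and 270 degrees; stop once fewer than `threshold`
    of the words are unknown. Returns (best_text, best_unknown_fraction)."""
    best_text, best_score = "", 1.1
    with tempfile.TemporaryDirectory() as tmp:
        for degrees in (0, 90, 180, 270):
            if degrees == 0:
                candidate = Path(image_path)
            else:
                candidate = Path(tmp) / f"rot{degrees}.png"
                rotate(image_path, candidate, degrees)
            text = ocr(candidate)
            score = unknown_fraction(text, dictionary)
            if score < best_score:
                best_text, best_score = text, score
            if score < threshold:
                break  # good enough, no need to try further rotations
    return best_text, best_score
```

If none of the four orientations gets under the threshold, the sketch returns whichever one had the fewest unknown words. It does nothing about cropping or small skew angles.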
On 06/27/13 17:44, aw585 at lafn.org wrote:
> As Mr. Hart is well aware, tesseract works well enough that
> it probably solved better than 90% of the captchas for
> slimrat-nox when this was needed for downloading from Rapidshare.
> What is probably needed is to rotate and crop the output
> images from SANE's scanimage using Imagemagick / convert
> or pnm tools, when scanning anything other than a sheet of paper
> oriented in portrait format.
> In fact, if you look at the tesseract man page, it lists
> the 'convert' program in the 'SEE ALSO' section.
> Dallas E. Legan II
> legan at acm.org / aw585 at lafn.org /
John G. Heim, 608-263-4189, jheim at math.wisc.edu