Any Accurate O C R Programs in DOS I can run in Linux?

John G. Heim jheim at math.wisc.edu
Fri Jun 28 17:14:11 UTC 2013


Well, I was going to say that too: the problem with tesseract 
probably wasn't the quality of the OCR itself but the orientation and 
cropping. But that's small consolation to someone who is used to having 
their software deal with all that on its own.  There probably is 
something out there in open-source land that does the rotation, at 
least. If 20% of the words in the text aren't in the dictionary, rotate 
the image and try again.  Something like that would be easy enough to 
write. But if there is something like that available as open source, I 
am unaware of it.
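
A rough sketch of that rotate-and-retry idea, in Python. Treat it as an
illustration under assumptions, not a tested tool: it assumes ImageMagick's
'convert' and the 'tesseract' command are on the PATH, that this tesseract
build accepts 'stdout' as its output name, and that a word list lives at
/usr/share/dict/words. The function names and the 0.20 threshold are made up
for this example.

```python
import re
import subprocess
import tempfile
from pathlib import Path


def load_dictionary(path="/usr/share/dict/words"):
    """Load a word list into a lowercase set (the path is an assumption)."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def unknown_word_ratio(text, dictionary):
    """Return the fraction of alphabetic words that are not in the dictionary."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    if not words:
        return 1.0  # no recognizable words at all counts as a total miss
    unknown = sum(1 for word in words if word not in dictionary)
    return unknown / len(words)


def ocr_rotated(image_path, degrees):
    """Rotate the image with ImageMagick, then OCR the result with tesseract."""
    with tempfile.TemporaryDirectory() as tmp:
        rotated = Path(tmp) / "rotated.png"
        subprocess.run(
            ["convert", str(image_path), "-rotate", str(degrees), str(rotated)],
            check=True,
        )
        result = subprocess.run(
            ["tesseract", str(rotated), "stdout"],
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout


def best_orientation(image_path, dictionary, ocr=ocr_rotated, threshold=0.20):
    """Try 0, 90, 180 and 270 degrees; stop early once the text looks sane.

    Returns (degrees, unknown_ratio, text) for the best rotation seen.
    """
    best = None
    for degrees in (0, 90, 180, 270):
        text = ocr(image_path, degrees)
        ratio = unknown_word_ratio(text, dictionary)
        if best is None or ratio < best[1]:
            best = (degrees, ratio, text)
        if ratio <= threshold:
            break  # 20% or fewer unknown words: good enough, stop rotating
    return best
```

Something like best_orientation("scan.pnm", load_dictionary()) would then
hand back the rotation that gave the most dictionary words, along with the
text. Cropping is not handled here at all.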

On 06/27/13 17:44, aw585 at lafn.org wrote:
>
> As Mr. Hart is well aware, tesseract works well enough that
> it probably solved better than 90% of the captchas for
> slimrat-nox when this was needed for downloading from Rapidshare.
>
> What is probably needed is to rotate and crop the output
> images from SANE's scanimage using ImageMagick / convert
> or pnm tools, when scanning anything other than a sheet of paper
> oriented in portrait format.
> In fact, if you look at the tesseract man page, it lists
> the 'convert' program in the 'SEE ALSO' section.
>
>
> Regards,
> Dallas E. Legan II
> legan at acm.org / aw585 at lafn.org /
> http://www.lafn.org/~aw585/index.html
>
>
> _______________________________________________
> Blinux-list mailing list
> Blinux-list at redhat.com
> https://www.redhat.com/mailman/listinfo/blinux-list
>

-- 
---
John G. Heim, 608-263-4189, jheim at math.wisc.edu



