what software used for ocr on linux

Willem van der Walt wvdwalt at csir.co.za
Fri Jul 11 07:19:43 UTC 2014


The ABBYY engine does automatic rotation detection, or whatever the correct 
term is.  I found that one needs to use the contrast setting of scanimage for 
best results.
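
For the open-source route discussed in the quoted messages below, here is a 
rough Python sketch of how the two kinds of rotation could be handled.  It 
assumes that Tesseract's orientation and script detection mode (--psm 0 on 
recent versions, -psm 0 on older ones, with the osd traineddata installed) 
prints a "Rotate:" line giving the correction in degrees, and that 
ImageMagick's convert is available for -rotate and -deskew.  The function 
names are made up for illustration, and the rotation direction should be 
checked on a test page before trusting it.

```python
import re
import shlex


def parse_osd_rotation(osd_text):
    """Return the correction in degrees from a Tesseract OSD report.

    Assumption: the report contains a line such as "Rotate: 180".
    Returns None when no such line is present.
    """
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    return int(match.group(1)) % 360 if match else None


def build_fix_command(src, dst, rotation):
    """Build an ImageMagick command line for coarse and fine correction.

    -rotate handles the 90/180/270 degree case (skipped when rotation is 0
    or None); -deskew 40% handles small misalignment, 40% being the
    threshold the ImageMagick documentation suggests for most images.
    """
    cmd = ["convert", src]
    if rotation:
        cmd += ["-rotate", str(rotation)]
    cmd += ["-deskew", "40%", dst]
    return cmd


if __name__ == "__main__":
    # Hypothetical OSD report, typed in by hand for illustration.
    sample = (
        "Page number: 0\n"
        "Orientation in degrees: 180\n"
        "Rotate: 180\n"
        "Orientation confidence: 4.52\n"
    )
    rotation = parse_osd_rotation(sample)
    command = build_fix_command("page.png", "page-fixed.png", rotation)
    print(" ".join(shlex.quote(part) for part in command))
```

The OSD text would come from something like 
"tesseract page.png stdout --psm 0".  I have not verified that every 
Tesseract version writes the report to standard output, so treat that 
invocation as an assumption.  The corrected image can then be passed to 
the OCR engine as usual.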


On Fri, 11 Jul 2014, Tony Baechler wrote:

> I vaguely recall Tesseract having an option for this, but it isn't
> automatic.  Convert from ImageMagick should do that as well, but it isn't
> automatic either.  The short answer is trial and error, if memory serves.  I
> remember thinking that the terrible OCR might be due to the pages not being
> aligned, so I tried rotating the images, but I didn't get any better
> results.  I haven't played with the other OCR engines.  I think FineReader
> is better about this.  I'm possibly wrong here, but as I understand it, the
> Windows software for the blind does the image rotation before passing it to
> the OCR engine and detects the page misalignment during the scanning
> process.  The Internet Archive seems to use FineReader and scans millions of
> books in all kinds of conditions, so perhaps it can handle the rotation
> automatically.
>
> On 2014-07-10 06:49 AM, Sam Hartman wrote:
>> Is there a way to get tesseract or openocr or anything open-source to
>> deal with rotations?
>> The commercial software along with anything targeted for the blind tends
>> to
>>
>> 1) Deal with 90 or 180-degree rotations--I put the book down on the glass
>> in the wrong orientation
>>
>> and
>>
>> 2) Deal with small rotations (it wasn't perfectly aligned) relatively
>> well.
>>
>> I find these features really important when scanning things myself.
>> Less so when OCRing images from the web etc.
>>
>> _______________________________________________
>> Blinux-list mailing list
>> Blinux-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/blinux-list
>>
>
> -- 
> Have a good day,
> Tony Baechler
> tony at baechler.net
>