OCR on linux

Willem van der Walt wvdwalt at csir.co.za
Thu Apr 24 11:27:37 UTC 2008


The different OCR engines differ in which image formats they support, but 
you can install ImageMagick and use a command like
convert file.jpg file.png
to get the image into the correct format for the engine.
ocropus used to require PNG, but I suspect that nowadays it might accept 
other image formats too.
Most engines also come with some test images when you build them from source.
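
As a rough sketch of the conversion step above, the shell functions below wrap
the same convert command so that it can be reused for several scans. The names
target_name and to_format are made up for this example, and it assumes that
ImageMagick's convert is on the PATH. Which target format to pick still
depends on the engine.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ImageMagick or any OCR engine).

# Swap the extension of a file name, e.g. file.jpg -> file.png.
target_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# Convert one image to the given format with ImageMagick's convert.
# Usage: to_format file.jpg png
to_format() {
    convert "$1" "$(target_name "$1" "$2")"
}
```

With these loaded, something like
for f in *.jpg; do to_format "$f" png; done
would convert each JPEG in the current directory, leaving the originals in place.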


On Thu, 24 Apr 2008, Daniel Dalton wrote:

> On Thu, 24 Apr 2008, Willem van der Walt wrote:
> 
> > You can try tesseract or another ocr engine without having sane working if
> > you have an image file that you know has text in like a .tif file or so.
> 
> Oh, is jpeg supported?
> Or how can I get a tif on windows? I know someone with a windows scanner
> setup.
> 
> > Sane is needed to get the stuff from the paper in the scanner into an
> > image file.
> 
> yep
> 
> > Note: the current latest svn version of ocropus does not run, try a
> > version from about two weeks ago.
> 
> Oh ok. I will. Thanks.
> 
> -- 
> Daniel Dalton
> 
> http://members.iinet.net.au/~ddalton/
> <d.dalton at iinet.net.au>
> 
> 




