OCR on linux

Daniel Dalton d.dalton at iinet.net.au
Fri Apr 25 11:19:29 UTC 2008


On Fri, 25 Apr 2008, Tony Baechler wrote:

> Daniel Dalton wrote:
>> If I were to buy a new scanner, which model is the easiest to set up and 
>> the best supported?
>> Which one would you recommend?
>> 
>
>
> Hi,
>
> Pretty much any scanner should work nowadays. You want one that's TWAIN 
> compatible. That includes most Epson, Canon, HP, etc. You probably want a 
> USB scanner. The only thing to watch is that some require their own Windows

OK, I'll make sure I do that.

> drivers which of course won't work in Linux. This seems true of HP but I had

indeed

> this with an Epson also. I don't yet do scanning in Linux so I can't really 
> give specific help besides that, but if in doubt, look for something like

oh ok

> "best scanner" or "supported scanner models" at http://www.google.com/linux

I will.

>
> If you get one to work, I would be interested in your results. I am 
> interested in trying to scan documents in Linux and have found the OCR thread 
> interesting. I would also be interested in which engine produces the best

I have too... and I'll let you know how I go...

> text quality. I know from trying different ones under Windows that results 
> can drastically vary depending on many factors.

I didn't do this under Windows, but ok.

>
> You asked about page images with text. First, be aware that there are at 
> least 4 different types of .tif images. One is compressed, one is for faxes, 
> one is for multiple pages and one is the standard, old-fashioned, single 
> page. You want the latter. You'll know that it's right because it will only 
> support one page per document and the files will be very big, about 1 MB per 
> file. I've had bad luck with the other .tif variations. Also, there are many

oh ok
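
For what it's worth, which kind of .tif a file is can be read from its
header. The following is only a minimal sketch in Python, using nothing but
the standard library. The function name describe_tiff and the file name
page.tif are made up for illustration. It assumes a classic TIFF file (not
BigTIFF), and it only reports the page count and the compression of each
page. It says nothing about how a given OCR engine copes with each type.

```python
import struct

# Names for a few common values of the TIFF Compression tag (tag 259).
COMPRESSION_NAMES = {
    1: "uncompressed",
    2: "CCITT RLE (fax)",
    3: "CCITT Group 3 (fax)",
    4: "CCITT Group 4 (fax)",
    5: "LZW",
    7: "JPEG",
    32773: "PackBits",
}


def describe_tiff(data):
    """Return (page_count, [compression name per page]) for classic TIFF bytes."""
    # The first two bytes give the byte order used by the rest of the file.
    if data[:2] == b"II":
        endian = "<"  # little-endian
    elif data[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")

    # Next comes the magic number 42 and the offset of the first directory.
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    compressions = []
    seen = set()  # guards against a looping chain of directories
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (count,) = struct.unpack_from(endian + "H", data, offset)
        compression = 1  # the TIFF default when the tag is absent
        for i in range(count):
            entry = offset + 2 + 12 * i  # each directory entry is 12 bytes
            tag, _field_type = struct.unpack_from(endian + "HH", data, entry)
            if tag == 259:
                (compression,) = struct.unpack_from(endian + "H", data, entry + 8)
        compressions.append(
            COMPRESSION_NAMES.get(compression, "other (%d)" % compression)
        )
        # The offset of the next directory (0 means no more pages) follows
        # the last entry.
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)

    return len(compressions), compressions
```

To try it on a scanned page, something like this should do:

    with open("page.tif", "rb") as f:
        print(describe_tiff(f.read()))

One page reported as "uncompressed" would be the plain single-page kind
described above. More than one page, or a fax or LZW compression, would be
one of the other variations.
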
> I hope this is helpful to you. Have a good weekend.

It is. Have a good weekend too!

-- 
Daniel Dalton

http://members.iinet.net.au/~ddalton/
<d.dalton at iinet.net.au>



