OCR in Fedora?

joachim.backes at rhrk.uni-kl.de joachim.backes at rhrk.uni-kl.de
Wed Jul 23 10:01:08 UTC 2008


Gustav Degreef wrote:
> On Mon, Jul 21, 2008 at 3:54 PM, Valent Turkovic
> <valent.turkovic at gmail.com> wrote:
>> On Mon, Jul 21, 2008 at 12:13 PM, Paul Smith <phhs80 at gmail.com> wrote:
>>> 2008/7/21 joachim.backes at rhrk.uni-kl.de <joachim.backes at rhrk.uni-kl.de>:
>>>>> Does anybody do OCR using software available in Fedora? Which ones do
>>>>> you use? How do you use them?
>>>>> I saw an article about OCRopus [1] and what a great app it is, but there
>>>>> is no ocropus package in Fedora currently.
>>>>>
>>>>> [1]
>>>>> http://arstechnica.com/news.ars/post/20071024-hands-on-with-googles-ocropus-open-source-scanning-software.html
>>>> I use gocr-0.45-2.fc9.i386
>>>>
>>>> I think it comes from the fedora repo.
>>> Tesseract is better:
>>>
>>> yum install tesseract
>>>
>>> Paul
>>>
>> Hi Joachim and Paul,
>> do gocr and tesseract have GUIs? How are you using them? Do you get
>> formatted text or just a plain text file? Do gocr and tesseract recognise
>> columns? Is it possible to get a formatted OpenOffice Writer document that
>> matches the original scanned page?
>>
>> I read the article about OCRopus that I posted the link to, and it seems that
>> it uses tesseract but somehow improves on it.
>>
>> Cheers,
>> Valent.
> 
> I've used both gocr and tesseract on the same text.  gocr has a GUI,

Sorry, but gocr has no GUI. I think gocr-gui has one:

http://www.openbsd.org/4.2_packages/i386/gocr-gui-0.44.tgz-long.html

> tesseract is command line only.  I've used both tools on various TIFF
> files.  There is a good writeup on the net (I forget where) on using
> tesseract on Ubuntu.  I got much better text recognition with
> tesseract from the same original scanned text.  Never tried ocropus.
> gustav
> 
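For anyone wanting to try the command-line route, here is a minimal sketch of
a wrapper script around tesseract. It assumes tesseract is installed (for
example with the yum command quoted earlier) and that the scans are TIFF
files. The function names ocr_output_base and ocr_image and the default
language "eng" are illustrative choices of mine, not part of the tesseract
package. tesseract takes an image and an output base name and writes plain
text to <base>.txt.

```shell
#!/bin/sh
# Sketch: OCR each image given on the command line with tesseract.
# Assumes tesseract is installed (e.g. via "yum install tesseract").

# Derive the output base name by dropping the final extension,
# e.g. scans/page1.tif -> scans/page1 (tesseract appends ".txt" itself).
ocr_output_base() {
    printf '%s\n' "${1%.*}"
}

# Run tesseract on one image; usage: ocr_image IMAGE [LANG]
ocr_image() {
    image=$1
    lang=${2:-eng}   # "eng" is an assumed default language
    tesseract "$image" "$(ocr_output_base "$image")" -l "$lang"
}

# Process every file named on the command line (does nothing with no arguments).
for image in "$@"; do
    ocr_image "$image"
done
```

Saved under a hypothetical name such as ocr.sh, it could be run as
"sh ocr.sh page1.tif page2.tif", which should leave page1.txt and page2.txt
next to the scans.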


-- 
Joachim Backes <joachim.backes at rhrk.uni-kl.de>
