Any Accurate O C R Programs in DOS I can run in Linux?
John G. Heim
jheim at math.wisc.edu
Fri Jun 28 20:03:58 UTC 2013
It looks as if tesseract can do something called "orientation and script
detection" but it doesn't do it by default. I haven't been able to try
it since my scanner is at home. But here is a quote from the tesseract
man page. Note that it says the default (option 3) is to not do OSD.
--- begin quote ---
tesseract - command-line OCR engine
tesseract imagename outbase [-l lang] [-psm N] [configfile ...]
Set Tesseract to only run a subset of layout analysis and
assume a certain form of image. The options for N are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR.
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
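
Going by the synopsis quoted above, asking for mode 1 instead of the default mode 3 should turn on OSD during page segmentation. The following is a minimal, untested sketch in Python that only builds the command line. The helper names build_tesseract_cmd and run_ocr and the file names scan.tif and scan are made up for illustration. The "-psm" spelling is taken from the quoted man page; newer tesseract releases spell the option "--psm", so adjust it for your version. OSD may also require the "osd" traineddata file to be installed.

```python
import subprocess


def build_tesseract_cmd(image, outbase, psm=1, lang=None):
    """Build an argument list following the quoted synopsis:
    tesseract imagename outbase [-l lang] [-psm N]

    psm=1 requests automatic page segmentation with OSD,
    unlike the default (3), which skips OSD.
    """
    if not 0 <= psm <= 10:
        raise ValueError("psm must be between 0 and 10")
    cmd = ["tesseract", image, outbase]
    if lang:
        cmd += ["-l", lang]
    # "-psm" as in the quoted man page; newer versions use "--psm".
    cmd += ["-psm", str(psm)]
    return cmd


def run_ocr(image, outbase, psm=1, lang=None):
    """Run tesseract; it must be installed and on the PATH."""
    subprocess.run(build_tesseract_cmd(image, outbase, psm, lang), check=True)


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command.
    print(" ".join(build_tesseract_cmd("scan.tif", "scan", psm=1)))
```

Calling run_ocr("scan.tif", "scan") would then run the command, and the recognized text should end up in scan.txt. Passing psm=0 asks for orientation and script detection only, which may be a quick way to check whether a page was scanned sideways or upside down.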