pdf documents
Tony Baechler
tony at baechler.net
Fri Dec 28 09:25:43 UTC 2007
Hi,
Have you experimented with the pdftotext -layout and -raw options? I've
noticed that running it with no command line options usually produces
usable output, but sometimes the -raw option works better. Then again,
sometimes it makes things worse. The output file is generally smaller
when using the -raw option. Also, since presumably the interest of most
people here is in making pdf documents accessible or somehow finding a
way to use the text from them, do you know of a way to get around the
protection bits? I'm not trying to pirate anything, but I know of at
least one major audio editing package which ships manuals that can't be
read with pdftotext because of the no-print and no-copy bits. I know
the text is there in the pdf file because they sent me an unprotected
copy upon request, but they have since been bought out by a major media
company.
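
For anyone who wants to compare the three modes side by side, here is a
rough sketch in Python. It assumes pdftotext is on your PATH and that
giving "-" as the output file sends the text to standard output. The
function names and the script name in the usage note are made up for
illustration, and output size is only a crude hint, not a measure of
which result reads better.

```python
import subprocess
import sys

# Modes discussed above: no option (default), -layout, and -raw.
MODES = {"default": [], "layout": ["-layout"], "raw": ["-raw"]}


def build_command(pdf_path, mode):
    """Return the pdftotext argument list for one mode.

    The trailing "-" asks pdftotext to write the text to stdout.
    """
    return ["pdftotext", *MODES[mode], pdf_path, "-"]


def extract(pdf_path, mode):
    """Run pdftotext and return the extracted text.

    Assumption: pdftotext is installed. Bytes that are not valid UTF-8
    are replaced rather than raising an error.
    """
    result = subprocess.run(build_command(pdf_path, mode),
                            capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


def size_report(outputs):
    """Map each mode name to the size in bytes of its extracted text."""
    return {mode: len(text.encode("utf-8")) for mode, text in outputs.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Extract the same document in every mode and print the sizes.
    outputs = {mode: extract(sys.argv[1], mode) for mode in MODES}
    for mode, size in size_report(outputs).items():
        print(f"{mode}: {size} bytes")
```

Saved as, say, compare_modes.py, it could be run as
"python3 compare_modes.py manual.pdf", after which you would still want
to read each output to judge which mode is actually most usable.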
Geoff Shang wrote:
> I have both pstotext and pdftotext installed here. Results seem to
> vary as to which is better and you may want to try both if a document
> is proving difficult to read and see which gives the best results.