pdf documents

Tony Baechler tony at baechler.net
Fri Dec 28 09:25:43 UTC 2007


Hi,

Have you experimented with the pdftotext -layout and -raw options?  I've 
noticed that running it with no command line options usually produces 
usable output, but sometimes the -raw option works better.  Then again, 
sometimes it makes it worse.  The output file is generally smaller when 
using the -raw option.  Also, since presumably the 
interest of most people here is in making pdf documents accessible or 
somehow finding a way to use the text from them, do you know of a way to 
get around the protection bits?  I'm not trying to pirate or anything 
else, but I know of at least one major audio editing package which ships 
manuals that can't be read with pdftotext because of the no-print and 
no-copy bits.  I know the text is there in the pdf file because they 
sent me an unprotected copy upon request, but they have been bought out 
by a major media company now.
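
For comparing the pdftotext modes mentioned above, here is a rough 
sketch in Python.  It assumes pdftotext is on your PATH.  The function 
names and the output file naming (manual.raw.txt and so on) are made up 
for illustration and are not anything pdftotext itself provides.  It 
runs pdftotext once with no options, once with -layout and once with 
-raw, then reports the size of each output file so you can see which 
mode to try reading first.

```python
import subprocess
from pathlib import Path

# Modes to compare: no options, -layout, and -raw.
MODES = {"default": [], "layout": ["-layout"], "raw": ["-raw"]}


def output_name(pdf_path, mode):
    """Hypothetical naming scheme: manual.pdf -> manual.raw.txt, etc."""
    pdf_path = Path(pdf_path)
    return pdf_path.with_name(f"{pdf_path.stem}.{mode}.txt")


def build_command(pdf_path, out_path, extra_args):
    """Return the pdftotext argument list for one mode."""
    return ["pdftotext", *extra_args, str(pdf_path), str(out_path)]


def compare_modes(pdf_path):
    """Run pdftotext once per mode; return {mode: output size in bytes}."""
    sizes = {}
    for mode, extra_args in MODES.items():
        out_path = output_name(pdf_path, mode)
        # check=True raises CalledProcessError if pdftotext fails,
        # for example on a document it refuses to extract.
        subprocess.run(build_command(pdf_path, out_path, extra_args), check=True)
        sizes[mode] = out_path.stat().st_size
    return sizes
```

Calling compare_modes("manual.pdf") leaves three text files next to the 
PDF and returns their sizes.  Size alone doesn't tell you which output 
reads best, so it is only a starting point.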

Geoff Shang wrote:
> I have both pstotext and pdftotext installed here.  Results seem to 
> vary as to which is better and you may want to try both if a document 
> is proving difficult to read and see which gives the best results.



