Is there Anything Like Catdocs for reading .docx files?

Daniel Dalton d.dalton at iinet.net.au
Tue Apr 14 11:31:35 UTC 2009


Hi Martin,

On Mon, Apr 13, 2009 at 04:18:50PM -0500, Martin McCormick wrote:
> 	Is there any Unix program that will turn the new
> Microsoft Office .docx files into ASCII, HTML or anything we
> can then turn into something readable?

Sorry, I had to go out today, but there are perhaps a couple of solutions
(untested with .docx):
unoconv -- converts between all the OpenOffice formats, from a quick scan of
the man page. It generated a nice HTML file from a .doc file for me before,
so this looks promising, considering OpenOffice keeps being updated to stay
current with Microsoft's new formats. One thing I couldn't do is dump straight
to text, but the following hack should work OK (untested, of course):
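#!/bin/sh

# Convert a document to plain text by way of HTML (untested with .docx).
[ -n "$1" ] || { echo "usage: $0 file" >&2; exit 1; }
base="${1%.*}"             # strip the extension: report.docx -> report
unoconv -f html "$1" &&    # should leave "$base".html next to the input
lynx -dump "$base".html > "$base".txt

The extension has to be stripped because lynx must be pointed at the HTML
file that unoconv writes, not at the original document. The script does that
with a shell expansion; basename can do the same job, so go read man
basename if you would rather use that.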
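
For the extension stripping, here is a minimal sketch of two ways to do it in plain sh. The function names strip_ext and base_no_suffix are made up for illustration and are not part of any tool mentioned here.

```shell
#!/bin/sh
# Sketch only: strip_ext and base_no_suffix are hypothetical helper names.

# Drop the last extension but keep the directory part:
#   docs/report.docx -> docs/report
strip_ext() {
    printf '%s\n' "${1%.*}"
}

# Drop the directory part and a known suffix using basename(1):
#   docs/report.docx .docx -> report
base_no_suffix() {
    basename "$1" "$2"
}
```

The first form keeps the output next to the input file; the second is handy when the output should land in the current directory. Note that the expansion in strip_ext cuts at the last dot anywhere in the path, so a file with no extension inside a directory whose name contains a dot would need extra care.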

Next option:
DISPLAY= abiword -t txt <word.doc>
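
Since either tool may or may not be installed on a given box, a wrapper script could first check which one is available. This is only a sketch under that assumption; pick_converter is a hypothetical helper name, not something shipped with unoconv or abiword.

```shell
#!/bin/sh
# Sketch only: pick_converter is a hypothetical helper.
# Print the first command from the argument list that is found on PATH;
# return non-zero if none of them is installed.
pick_converter() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}
```

It could be used as: conv=$(pick_converter unoconv abiword) || echo "no converter installed" >&2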

Good luck, and let me know if something doesn't work, because half of this is
just cut and pasted from old archived mails. I never got around to testing
them because my own scripts already do this job; I think I'll update those
to use unoconv now!

--
Daniel Dalton



> 
> Microsoft's new version of Office seems to love to put an extra
> X at the end of all their output files and that pretty well
> describes what the new formats do to applications like catdocs
> and xlhtml which have worked fairly well for several years.
> 
> 	Thanks for any constructive suggestions.
> 
> Martin McCormick WB5AGZ  Stillwater, OK 
> Systems Engineer
> OSU Information Technology Department Telecommunications Services Group
> 
> _______________________________________________
> Blinux-list mailing list
> Blinux-list at redhat.com
> https://www.redhat.com/mailman/listinfo/blinux-list