how to find out dead links

Tim ignored_mailbox at yahoo.com.au
Sun Nov 15 07:28:23 UTC 2009


On Sat, 2009-11-14 at 00:54 -0800, Eugeneapolinary Ju wrote:
> wget -r -p -U Firefox "http://www.somesite.com/" 2>&1 | grep 404 > 404.txt

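A single > truncates the file once, when the shell opens it, and
everything grep writes after that lands in the same file; individual
writes don't overwrite each other.  Use a double >> if you want to keep
adding lines of text to the same file across repeated runs.  An empty
file while wget is still running may just be grep buffering its output;
GNU grep's --line-buffered option makes each match appear immediately.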

> why come 404.txt is 0 Byte? how to put the STDOUT to a file with wget?

Here, when I try it against a local webserver, and with the modification
I mentioned, I do get a file.  However, there's no useful information
with the error messages, just a note that there was an error.

e.g. HTTP request sent, awaiting response... 404 Not Found
     2009-11-15 17:51:32 ERROR 404: Not Found.

You'd need to do more to make it list addresses with the errors.
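
One possible way to do that, as a rough sketch only: in its default
verbose output, wget appears to print a line starting with "--" and
ending in the URL before each request, so a small awk filter can
remember the last URL seen and print it when an "ERROR 404" line
follows.  The function name list_404 is made up for illustration, and
the log layout is an assumption based on the output above; it may differ
between wget versions.

```shell
# Hypothetical helper: read a wget log on stdin and print the URL of
# every request that ended in a 404.
# Assumption: each request starts with a line beginning with "--" whose
# last field is the URL, and failures are reported as "... ERROR 404: ...".
list_404() {
    awk '
        /^--/        { url = $NF }   # remember the most recent request URL
        / ERROR 404/ { print url }   # report it when that request fails
    '
}

# Usage (not run here):
#   wget -r -p -U Firefox "http://www.somesite.com/" 2>&1 | list_404 > 404.txt
```

Because awk keeps only the most recent URL, this relies on the error
line following the request line it belongs to, which holds when wget
fetches one URL at a time.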

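wget has logging options worth exploring: -o FILE writes the log to a
file instead of stderr, -a FILE appends to an existing log, and -nv
trims the output while still reporting errors.  See the man page.  But
if you're doing this to check a website for errors, there are dedicated
link-checking tools already set up for doing that.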

-- 
[tim@localhost ~]$ uname -r
2.6.27.25-78.2.56.fc9.i686

Don't send private replies to my address, the mailbox is ignored.  I
read messages from the public lists.





