Download tool for multiple files or wildcards in http

gumnos (Tim Chase) gumnos at hotmail.com
Tue Apr 6 21:04:13 UTC 2004


> Is there any kind of tool that can download by wildcard from http? I am
> using wget to download web braille. It works fine except that I either

Well, is there any naming convention to the web braille files?  If so,
you could have a script autogenerate the file names and pipe that list
to wget.  If you have file names along the lines of

    book-1.txt
    book-2.txt
    book-3.txt

you could then use the following

    seq 1 3 | sed "s/.*/book-&.txt/" | wget -i -

which roughly translates to (I'll try to line it up so it reads well in
braille...if you're using TTS, this may sound funny)

    seq 1 3        Produce a sequence of numbers from 1 to 3
    sed            Edit the output of that
      s            substituting
      .*           anything (the number from the previous output)
      book-&.txt   with "book-", the stuff on the line, then ".txt"
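
Before handing anything to wget, it can help to look at the list the
first two stages produce.  The snippet below is only a sketch: the
function name preview_names is made up for illustration, and ".txt" is
appended in the sed expression so the output matches the sample file
names above.

```shell
# Hypothetical helper: print the generated names instead of downloading.
preview_names() {
    # seq emits 1, 2, 3 (one per line); sed wraps each number in "book-" and ".txt"
    seq 1 3 | sed "s/.*/book-&.txt/"
}

preview_names
# prints:
#   book-1.txt
#   book-2.txt
#   book-3.txt
```

Once the list looks right, adding "| wget -i -" to the end of the
pipeline turns the preview into the actual download.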

The "/" characters in the sed expression are just delimiters.  If you're
gonna be using a lot of forward slashes in a URL, you can change this to
another character (I recommend the at-sign) so you could do

    seq 1 3 | sed "s@.*@http://www.foo.com/file-&.html@" | wget -i -

That creates the list of URLs you want.  The "-i" option tells wget to
read the URLs to fetch from a file, and the "-" after the -i means to
read them from standard input instead of a named file.
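
To show what the delimiter change buys you, here is a small sketch that
builds the same URL list two ways.  The function names with_slashes and
with_at_signs are made up for this example; the URL is the placeholder
one from above.  Neither function calls wget, so it is safe to run.

```shell
# Two ways to write the same substitution.
with_slashes() {
    # "/" as the delimiter: every "/" in the URL needs a backslash
    seq 1 3 | sed 's/.*/http:\/\/www.foo.com\/file-&.html/'
}

with_at_signs() {
    # "@" as the delimiter: the URL can be written as-is
    seq 1 3 | sed 's@.*@http://www.foo.com/file-&.html@'
}

with_slashes
with_at_signs
```

Both functions print the same three URLs; the at-sign version is just
easier to read and to type.  This only works as long as the URL itself
contains no at-sign.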

So, if you've got a pattern to your file names, it can be done fairly
quickly.

If there's some other pattern to the file names, I'd need a sample of
the file names to modify the scriptlet.  With a little work, this could
be massaged into a shell script that would take the number of files
(feeding this parameter to "seq"), and the file-name skeleton (feeding
this to "sed"), and then download all the files that match that pattern.
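
A rough sketch of what such a script might look like follows.  The
function name list_urls and its two arguments are my own invention, not
an existing tool, and it assumes the files are numbered from 1 and that
the skeleton uses "&" where the number goes.

```shell
#!/bin/sh
# Hypothetical helper: list_urls COUNT SKELETON
#   COUNT     how many files there are (handed to seq)
#   SKELETON  URL with "&" marking where the number goes (handed to sed)
# Limitation: the skeleton must not contain "@" or backslashes, since
# it is pasted straight into the sed expression.
list_urls() {
    count=$1
    skeleton=$2
    seq 1 "$count" | sed "s@.*@${skeleton}@"
}

# Print the list first to check it
list_urls 3 "http://www.foo.com/file-&.html"

# When it looks right, hand the same list to wget:
# list_urls 3 "http://www.foo.com/file-&.html" | wget -i -
```

The wget line is left commented out so the sketch can be tried without
downloading anything.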

Hope this gets you pointed in the right direction.  If you have more
info on the file names, I can try to help more.

-tim

More information about the Blinux-list mailing list