wget
Andrew Bacchi
bacchi at rpi.edu
Mon Jun 30 12:45:38 UTC 2008
I've already sent you a link that provides an explanation and examples. I
don't mind pointing someone in the right direction, but I won't sit here
and solve all your problems for you. Try searching Google.
Joy Methew wrote:
> Bacchi
>
> How do I use "robots.txt"? Please explain with an example.
>
> Daniel,
>
> It's working for "wget", but the site can still be downloaded with other
> utilities, such as "DownloadStudio".
>
> On 6/27/08, Daniel Carrillo <daniel.carrillo at gmail.com> wrote:
>
>> 2008/6/27 Joy Methew <ml4joy at gmail.com>:
>>
>>
>>> Hi all,
>>>
>>> Anyone can download an entire site with the "wget -r" option.
>>> How can I stop my site from being downloaded from my web server?
>>>
>> You can configure Apache to refuse connections whose User-Agent begins
>> with "wget", but note that wget can send any User-Agent (--user-agent
>> option).
>>
>> SetEnvIfNoCase User-Agent "^wget" blacklist
>> <Location />
>> ...
>> your options
>> ...
>> Order allow,deny
>> Allow from all
>> Deny from env=blacklist
>> </Location>
>>
>> BTW: robots.txt can only stop crawling by "good" crawlers, such as
>> Google, Yahoo, Alexa, etc.
>>
>>
>> --
>> redhat-list mailing list
>> unsubscribe mailto:redhat-list-request at redhat.com?subject=unsubscribe
>> https://www.redhat.com/mailman/listinfo/redhat-list
>>
>>
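For reference, here is a minimal sketch of the robots.txt point quoted
above. It uses Python's standard urllib.robotparser to show how a
well-behaved crawler would read a robots.txt file. The file contents, the
example.com URLs, and the helper name polite_crawler_may_fetch are all
made up for illustration; they are not taken from anyone's real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask wget to stay out entirely, and ask every
# other crawler to skip /private/ only.
ROBOTS_TXT = """\
User-agent: wget
Disallow: /

User-agent: *
Disallow: /private/
"""


def polite_crawler_may_fetch(user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt would fetch url."""
    parser = RobotFileParser()
    # Parse the rules from memory instead of fetching them over HTTP.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Wget/1.11", "http://www.example.com/index.html"),
        ("Googlebot", "http://www.example.com/index.html"),
        ("Googlebot", "http://www.example.com/private/report.html"),
    ]:
        print(agent, url, polite_crawler_may_fetch(agent, url))
```

Note that the check runs inside the client. The server never enforces
robots.txt, so a tool that skips the check (or wget with "-e robots=off")
simply downloads everything, which is why the file only helps against
"good" crawlers.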
--
veritatis simplex oratio est
-Seneca
Andrew Bacchi
Systems Programmer
Rensselaer Polytechnic Institute
phone: 518.276.6415 fax: 518.276.2809
http://www.rpi.edu/~bacchi/