wget

Joy Methew ml4joy at gmail.com
Mon Jun 30 08:16:12 UTC 2008


Bacchi

How do I use "robots.txt"? Please explain with an example.

Daniel,
It's working for "wget", but we can still download with other utilities
like "DownloadStudio".

On Sun, Jun 29, 2008 at 11:45 PM, Joy Methew <ml4joy at gmail.com> wrote:

> Bacchi
>
> How do I use "robots.txt"? Please explain with an example.
>
> Daniel,
>
> It's working for "wget", but we can still download with other utilities
> like "DownloadStudio".
>
> On 6/27/08, Daniel Carrillo <daniel.carrillo at gmail.com> wrote:
>>
>> 2008/6/27 Joy Methew <ml4joy at gmail.com>:
>>
>> > Hi all,
>> >
>> > We can download any site with the "wget -r" option.
>> > If I want to stop my site from being downloaded from the web server,
>> > how can I do this?
>>
>>
>> You can configure Apache to refuse connections with the User-Agent "wget",
>> but note that wget can send any User-Agent (--user-agent option).
>>
>> SetEnvIfNoCase User-Agent "^wget" blacklist
>> <Location />
>>   ...
>>   your options
>>   ...
>>   Order allow,deny
>>   Allow from all
>>   Deny from env=blacklist
>> </Location>
>>
>> BTW: robots.txt can only stop crawling by "good" crawlers, like
>> Google, Yahoo, Alexa, etc.
>>
>
>
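
As an illustration of the two mechanisms discussed in this thread, here is a
minimal Python sketch. It is not taken from the thread: the robots.txt content,
the example.com URLs, and the helper names allowed_by_robots and
blocked_by_apache_rule are all hypothetical. The first helper shows how a
well-behaved client would interpret a robots.txt file (placed at the site
root), and the second mimics the case-insensitive "^wget" match from the
SetEnvIfNoCase line quoted above. It also suggests why a tool such as
DownloadStudio is unaffected: its User-Agent does not start with "wget", and
robots.txt is only advisory.

```python
import re
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real file is served from the site root.
ROBOTS_TXT = """\
User-agent: wget
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed_by_robots(user_agent: str, url: str) -> bool:
    """Return True if a polite client with this User-Agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Approximates: SetEnvIfNoCase User-Agent "^wget" blacklist
BLACKLIST = re.compile(r"^wget", re.IGNORECASE)


def blocked_by_apache_rule(user_agent: str) -> bool:
    """Return True if the User-Agent would be tagged as blacklisted."""
    return BLACKLIST.search(user_agent) is not None


if __name__ == "__main__":
    for agent in ("Wget/1.11.4", "DownloadStudio", "Mozilla/5.0"):
        print(agent,
              "robots allows:", allowed_by_robots(agent, "http://example.com/index.html"),
              "apache blocks:", blocked_by_apache_rule(agent))
```

Both checks depend on the client reporting its User-Agent honestly, so neither
one stops a client that changes its User-Agent or ignores robots.txt.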


