[UPHPU] throttling bots and spiders

Velda Christensen velda at novapages.com
Fri Feb 22 16:13:22 MST 2008


cole at colejoplin.com wrote:
> If you really don't want to be crawled, you can alter the robots.txt 
> file (http://www.robotstxt.org/) for your web server, or do a 
> robots="nofollow" (http://www.robotstxt.org/meta.html). You could also 
> use a sitemap spec (http://sitemaps.org/) to control how often you get 
> crawled.
>
> -- Cole 
It's a tad off topic, but for the record, some bots and spiders 
ignore robots.txt directives, and the bad ones deliberately do the 
opposite of what you ask.  So be sure you're not relying on 
robots.txt as a privacy / security measure.
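
To make the point concrete, here is a minimal sketch in Python (standard
library only) of what a well-behaved crawler does. The rules, the bot
name "ExampleBot", and the example.com URLs are all made up for
illustration. Note that the check runs entirely on the crawler's side:
nothing on the server enforces it, so a bot that skips this step sees
everything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, inlined so the sketch needs no network.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def build_parser(robots_txt=ROBOTS_TXT):
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_may_fetch(url, agent="ExampleBot"):
    """Return True if the rules allow `agent` to fetch `url`.

    This is purely voluntary: the crawler asks itself this question.
    A rude bot simply never calls it.
    """
    return build_parser().can_fetch(agent, url)


def requested_delay(agent="ExampleBot"):
    """Return the Crawl-delay (seconds) requested for `agent`, or None."""
    return build_parser().crawl_delay(agent)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/report.html"):
        print(url, "allowed" if polite_may_fetch(url) else "disallowed")
    print("requested delay:", requested_delay())
```

A Disallow line also tells anyone who reads the file which paths you 
would rather keep quiet, so anything actually private needs real 
access control on the server (authentication, IP restrictions, and so 
on) rather than a robots.txt entry.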

