[UPHPU] throttling bots and spiders
Velda Christensen
velda at novapages.com
Fri Feb 22 16:13:22 MST 2008
cole at colejoplin.com wrote:
> If you really don't want to be crawled, you can alter the robots.txt
> file (http://www.robotstxt.org/) for your web server, or do a
> robots="nofollow" (http://www.robotstxt.org/meta.html). You could also
> use a sitemap spec (http://sitemaps.org/) to control how often you get
> crawled.
>
> -- Cole
It's a tad off topic, but for the record, some bots and spiders
ignore robots.txt directives, and the bad ones deliberately do the
opposite of what you ask. So be sure you're not relying on robots.txt
as a privacy or security measure.
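
To make the point concrete, here is a rough sketch in Python (not PHP, and not anyone's production code) of what a well-behaved crawler does with robots.txt. It uses the standard library's urllib.robotparser. The robots.txt contents, the "PoliteBot" name, and the example.com URLs are all made up for illustration. The check happens entirely on the crawler's side, so a bot that never runs it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler asks first and skips anything that is disallowed.
# "PoliteBot" and the URLs below are assumed names, not real services.
polite_ok = allowed_by_robots(
    ROBOTS_TXT, "PoliteBot", "http://example.com/private/notes.html"
)
public_ok = allowed_by_robots(
    ROBOTS_TXT, "PoliteBot", "http://example.com/index.html"
)

# A rude crawler never calls allowed_by_robots() at all. The server
# cannot tell the difference unless it enforces access some other way.
```

Note that the Disallow line is also readable by anyone, so it can point a bad bot straight at the paths you wanted left alone. Anything that really needs protecting should sit behind authentication or server-side access rules.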