 
Latest from O'Reilly Radar - Insight, analysis, and research about emerging technologies.
 

Robots.Txt and the .Gov TLD

Friday, November 20, 2009

I'm on the board of CommonCrawl.Org, a nonprofit corporation that is attempting to provide a web crawl for use by all. An interesting report was just sent to us about how robots.txt files, the mechanism defined by the Robots Exclusion Standard, are used within the .gov top-level domain.

In examining about 32,000 subdomains in .gov, it turns out that at least 1,188 of these have a robots.txt file with a "global ...


Original article from http://feedproxy.google.com/~r/oreilly/radar/atom/~3/qbGZieE0cwg/robotstxt-and-the-gov-tld.html