- Robots.Txt and the .Gov TLD
- Asia Continues to be Facebook's Strongest Growth Region
- Four short links: 20 November 2009
- Health gets personal in the cloud
- Four short links: 19 November 2009
- Four short links: 18 November 2009
- The iPhone: Tricorder Version 1.0?
- What Does Innovative Social Engagement Look Like For Businesses and Governments?
- Four short links: 17 November 2009
- Turning Predictions into Opportunities
- The War For the Web
- Four short links: 16 November 2009
- Ignite NYC on 11/16: Gov 2.0, Body Hacks, and Hi-Tech Craft
- It's in the Bag! The Apple Tablet Computing Device
- Four short links: 13 November 2009
- Four short links: 12 November 2009
- Counting Unique Users in Real-time with Streaming Databases
- Quarantined Conferences: Claustrophobic Technophiles or Attentive Audiences?
- Four short links: 11 November 2009
- Converting to Electronic Health Records: fits and starts
- Four short links: 10 November 2009
- The Minds Behind Some of the Most Addictive Games Around
- Four short links: 9 November 2009
- Unlikely Group Working Happily Together To Solve Patent Problem
- Three Paradoxes of the Internet Age - Part Three
Robots.Txt and the .Gov TLD
Friday, November 20, 2009
I'm on the board of CommonCrawl.Org, a nonprofit corporation that is attempting to provide a web crawl for use by all. An interesting report was just sent to us about the use of robots.txt files, a convention known as the Robots Exclusion Standard, within the .gov top-level domain.
In examining about 32,000 subdomains in .gov, it turns out that at least 1,188 of these have a robots.txt file with a "global ...
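For readers unfamiliar with how a robots.txt file is evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. The excerpt above is cut off, so this does not reproduce the report's method. It assumes, purely for illustration, that the property of interest is whether a file tells every crawler to stay away from the whole site. The function name `blocks_all_crawlers` and both sample files are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocks_all_crawlers(robots_txt: str, probe_path: str = "/") -> bool:
    """Return True if the rules deny the wildcard agent '*' access to probe_path.

    Illustrative assumption: a site "blocks everyone" when the generic
    user agent may not fetch the site root.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", probe_path)


# Hypothetical robots.txt bodies, not taken from any real .gov site.
SAMPLE_BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

SAMPLE_PARTIAL = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(blocks_all_crawlers(SAMPLE_BLOCK_ALL))
    print(blocks_all_crawlers(SAMPLE_PARTIAL))
```

A survey over many subdomains would fetch each site's `/robots.txt` and pass the body to a check like this one; fetching, timeouts, and error handling are left out of the sketch.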
Original article from http://feedproxy.google.com/~r/oreilly/radar/atom/~3/qbGZieE0cwg/robotstxt-and-the-gov-tld.html