

Robots.txt got a new life

I was looking at www.dre.pt while exploring some functionality for QuickLex and noticed that Diário da República’s robots.txt contained a directive I had never seen before:

User-agent: *
Visit-time: 0100-0400

I went to recheck the robots.txt standard at http://www.robotstxt.org/. Visit-time wasn’t in the standard. What the heck was happening here…
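For the curious, here is a minimal sketch of how a crawler might honor a line like that. It rests on assumptions of mine rather than anything dre.pt documents: the value is read as an HHMM-HHMM window, the times are treated as UTC, and the User-agent grouping is ignored (the first Visit-time line wins). The function names `parse_visit_time` and `may_visit` are made up for illustration.

```python
from datetime import time


def parse_visit_time(robots_txt):
    """Return (start, end) from the first valid Visit-time line, or None.

    Assumes the HHMM-HHMM form seen in the wild; User-agent groups are ignored.
    """
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "visit-time":
            continue
        start, dash, end = value.strip().partition("-")
        if not dash or len(start) != 4 or len(end) != 4:
            continue
        if not (start.isdigit() and end.isdigit()):
            continue
        try:
            return (time(int(start[:2]), int(start[2:])),
                    time(int(end[:2]), int(end[2:])))
        except ValueError:
            continue  # e.g. 2500-2600 is not a valid time of day
    return None


def may_visit(now, window):
    """True if `now` (a datetime.time, assumed UTC) falls inside the window."""
    start, end = window
    if start <= end:
        return start <= now <= end
    # Window wraps past midnight, e.g. 2300-0200.
    return now >= start or now <= end


sample = "User-agent: *\nVisit-time: 0100-0400\n"
window = parse_visit_time(sample)
```

A polite crawler would call `may_visit` with the current UTC time before fetching, and simply wait when it returns False.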

Google to the rescue. It turns out that robots.txt is evolving somewhere else: http://www.conman.org/people/spc/robots2.htm

Several new directives are being used in the wild, including the Sitemap directive. Check the end of http://google.com/robots.txt for an example.
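Pulling the Sitemap entries out of a robots.txt file takes only a few lines. The sketch below is illustrative only: `sitemap_urls` is a name I made up, the sample text uses a placeholder example.com URL, and it assumes each entry sits on its own `Sitemap: <url>` line.

```python
def sitemap_urls(robots_txt):
    """Collect the URLs from every 'Sitemap:' line, in file order."""
    urls = []
    for raw in robots_txt.splitlines():
        # Split on the first colon only, so the colon in "https://" survives.
        name, sep, value = raw.strip().partition(":")
        if sep and name.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


sample = (
    "User-agent: *\n"
    "Disallow: /search\n"
    "# comment line\n"
    "Sitemap: https://example.com/sitemap.xml\n"
    "sitemap: https://example.com/news-sitemap.xml\n"
)
found = sitemap_urls(sample)
```

If you would rather not parse by hand, Python 3.8 and later expose the same information through `urllib.robotparser.RobotFileParser.site_maps()`.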

So, for those who didn’t know (like me), it seems robots.txt is getting a new life.


Posted in Web.
