Search Engine Ranking

Saturday, 19 August 2006

All About Googlebot

Posted on 11:51 by Unknown
I've seen a lot of questions lately about robots.txt files and Googlebot's behavior. Last week at SES, I spoke on a new panel called the Bot Obedience course. And a few days ago, some other Googlers and I fielded questions on the WebmasterWorld forums. Here are some of the questions we got:

If my site is down for maintenance, how can I tell Googlebot to come back later rather than to index the "down for maintenance" page?
You should configure your server to return a status of 503 (Service Unavailable) rather than 200 (OK). That lets Googlebot know to try the pages again later.
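
As a rough illustration of the answer above, here is a minimal sketch of a stand-in server that answers every request with 503 while the real site is offline. It uses only Python's standard library; the names MaintenanceHandler, make_server and RETRY_AFTER_SECONDS are hypothetical, and the Retry-After header is a standard HTTP header added here as an optional hint, not something the answer above says Googlebot requires.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical value: how long clients are asked to wait before retrying.
RETRY_AFTER_SECONDS = 3600


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answers every GET or HEAD request with 503 instead of 200."""

    def _send_unavailable(self, include_body):
        body = b"Down for maintenance. Please try again later.\n"
        # 503 tells crawlers the outage is temporary, so the maintenance
        # page itself should not be treated as the real content.
        self.send_response(503, "Service Unavailable")
        # Optional standard header (an assumption, not from the post).
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._send_unavailable(include_body=True)

    def do_HEAD(self):
        self._send_unavailable(include_body=False)


def make_server(host="127.0.0.1", port=8080):
    """Builds (but does not start) the maintenance server."""
    return HTTPServer((host, port), MaintenanceHandler)
```

Calling make_server().serve_forever() would start it on the assumed address. In practice the same effect is usually achieved in the configuration of whatever web server or load balancer already fronts the site.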

What should I do if Googlebot is crawling my site too much?
You can contact us -- we'll work with you to make sure we don't overwhelm your server's bandwidth. We're experimenting with a feature in our webmaster tools for you to provide input on your crawl rate, and have gotten great feedback so far, so we hope to offer it to everyone soon.

Is it better to use the meta robots tag or a robots.txt file?
Googlebot obeys either, but meta tags apply to single pages only. If you have a number of pages you want to exclude from crawling, you can structure your site in such a way that you can easily use a robots.txt file to block those pages (for instance, put the pages into a single directory).
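
To make the difference in scope concrete, the sketch below puts the two mechanisms side by side: a robots.txt rule that covers a whole directory, and a meta robots tag that lives in one page. The /private/ directory, the sample page and the helper names are all hypothetical, and Python's standard-library parsers are used only as stand-ins; they are not Googlebot's own implementation.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one rule covers every page under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical page: the meta tag applies to this single page only.
PAGE_HTML = """\
<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Draft</title>
</head><body>Work in progress</body></html>
"""


class MetaRobotsParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def meta_robots_directives(html):
    """Returns the meta robots directives found in one HTML document."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


def blocked_by_robots_txt(robots_txt, user_agent, path):
    """Returns True if the given robots.txt text disallows the path."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(user_agent, path)
```

With these assumed inputs, blocked_by_robots_txt(ROBOTS_TXT, "Googlebot", "/private/report.html") is true for any page placed in that directory, while meta_robots_directives(PAGE_HTML) has to be satisfied separately by each page that carries the tag.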

If my robots.txt file contains a directive for all bots as well as a specific directive for Googlebot, how does Googlebot interpret the line addressed to all bots?
If your robots.txt file contains a generic directive for all bots plus a directive specifically for Googlebot, Googlebot obeys only the lines specifically directed at it and ignores the generic ones.

For instance, for this robots.txt file:
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
Googlebot will crawl everything in the site other than pages in the cgi-bin directory.

For this robots.txt file:
User-agent: *
Disallow: /
Googlebot won't crawl any pages of the site.
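
The two files above can also be checked locally. The following is a minimal sketch using Python's standard urllib.robotparser module, which happens to resolve these two files the same way as described here; it is a separate implementation from Googlebot's parser, so treat it as a sanity check rather than an authoritative answer. The helper name can_crawl, the constant names and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# First file from the post: a generic group plus a Googlebot group.
GENERIC_PLUS_GOOGLEBOT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
"""

# Second file from the post: only the generic group.
GENERIC_ONLY = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt, user_agent, path):
    """Returns True if this robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The Googlebot group wins over the generic one for Googlebot.
print(can_crawl(GENERIC_PLUS_GOOGLEBOT, "Googlebot", "/index.html"))
print(can_crawl(GENERIC_PLUS_GOOGLEBOT, "Googlebot", "/cgi-bin/search.cgi"))
# A bot without its own group falls back to the generic group.
print(can_crawl(GENERIC_PLUS_GOOGLEBOT, "SomeOtherBot", "/index.html"))
# With only the generic group, Googlebot is blocked as well.
print(can_crawl(GENERIC_ONLY, "Googlebot", "/index.html"))
```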

If you're not sure how Googlebot will interpret your robots.txt file, you can use our robots.txt analysis tool to test it. You can also test how Googlebot will interpret changes to the file.

For complete information on how Googlebot and Google's other user agents treat robots.txt files, see our webmaster help center.
Posted in crawling and indexing