Search Engine Ranking


Wednesday, 20 September 2006

How to verify Googlebot

Posted on 11:45 by Unknown
Lately I've heard a couple of smart people ask that search engines provide a way to know that a bot is authentic. After all, any spammer could name their bot "Googlebot" and claim to be Google, so which bots do you trust and which do you block?

The common request we hear is to post a list of Googlebot IP addresses in some public place. The problem with that is that if/when the IP ranges of our crawlers change, not everyone will know to check. In fact, the crawl team migrated Googlebot IPs a couple years ago and it was a real hassle alerting webmasters who had hard-coded an IP range. So the crawl folks have provided another way to authenticate Googlebot. Here's an answer from one of the crawl people (quoted with their permission):


Telling webmasters to use DNS to verify on a case-by-case basis seems like the best way to go. I think the recommended technique would be to do a reverse DNS lookup, verify that the name is in the googlebot.com domain, and then do a corresponding forward DNS->IP lookup using that googlebot.com name; e.g.:

> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

I don't think just doing a reverse DNS lookup is sufficient, because a spoofer could set up reverse DNS to point to crawl-a-b-c-d.googlebot.com.
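
The two-step check described above can be sketched in Python. This is only an illustration, not an official Google implementation: the function name is_verified_googlebot and its reverse_lookup/forward_lookup parameters are hypothetical, introduced here so the DNS calls can be swapped out for testing. It assumes IPv4, since socket.gethostbyname_ex only returns IPv4 addresses.

```python
import socket


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if ip passes the reverse-then-forward DNS check.

    The function name and the two lookup parameters are illustrative
    assumptions, not part of any Google-provided API.
    """
    # Step 1: reverse DNS lookup (IP -> host name).
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        # No PTR record or a resolver failure: cannot verify.
        return False

    # Step 2: the name must be in the googlebot.com domain.
    # The leading dot stops names like "evilgooglebot.com" from matching.
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(".googlebot.com"):
        return False

    # Step 3: forward DNS lookup (host name -> IPs) must include the
    # original IP. This is what defeats a spoofed reverse DNS record.
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

Calling is_verified_googlebot("66.249.66.1") would perform live DNS queries against whatever resolver the machine uses. Because each call can block on the network, a real server would probably want to cache the results rather than check on every request.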


This answer has also been provided to our help-desk, so I'd consider it an official way to authenticate Googlebot. In order to fetch from the "official" Googlebot IP range, the bot has to respect robots.txt and our internal hostload conventions so that Google doesn't crawl you too hard.

(Thanks to N. and J. for help on this answer from the crawl side of things.)
Posted in crawling and indexing, general tips