Search Engine Ranking

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Monday, 5 March 2007

Using the robots meta tag

Posted on 16:05 by Unknown
Recently, Danny Sullivan brought up good questions about how search engines handle meta tags. Here are some answers about how we handle these tags at Google.

Multiple content values
We recommend that you place all content values in one meta tag. This keeps the meta tags easy to read and reduces the chance for conflicts. For instance:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

If the page contains multiple meta tags of the same type, we will aggregate the content values. For instance, we will interpret

<META NAME="ROBOTS" CONTENT="NOINDEX">
<META NAME="ROBOTS" CONTENT="NOFOLLOW">

The same way as:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

If content values conflict, we will use the most restrictive. So, if the page has these meta tags:

<META NAME="ROBOTS" CONTENT="NOINDEX">
<META NAME="ROBOTS" CONTENT="INDEX">

We will obey the NOINDEX value.

Unnecessary content values
By default, Googlebot will index a page and follow links to it. So there's no need to tag pages with content values of INDEX or FOLLOW.

Directing a robots meta tag specifically at Googlebot
To provide instruction for all search engines, set the meta name to "ROBOTS". To provide instruction for only Googlebot, set the meta name to "GOOGLEBOT". If you want to provide different instructions for different search engines (for instance, if you want one search engine to index a page, but not another), it's best to use a specific meta tag for each search engine rather than use a generic robots meta tag combined with a specific one. You can find a list of bots at robotstxt.org.

Casing and spacing
Googlebot understands any combination of lowercase and uppercase. So each of these meta tags is interpreted in exactly the same way:

<meta name="ROBOTS" content="NOODP">
<meta name="robots" content="noodp">
<meta name="Robots" content="NoOdp">

If you have multiple content values, you must place a comma between them, but it doesn't matter if you also include spaces. So the following meta tags are interpreted the same way:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">

If you use both a robots.txt file and robots meta tags
If the robots.txt and meta tag instructions for a page conflict, Googlebot follows the most restrictive. More specifically:
  • If you block a page with robots.txt, Googlebot will never crawl the page and will never read any meta tags on the page.
  • If you allow a page with robots.txt but block it from being indexed using a meta tag, Googlebot will access the page, read the meta tag, and subsequently not index it.
Valid meta robots content values
Googlebot interprets the following robots meta tag values:
  • NOINDEX - prevents the page from being included in the index.
  • NOFOLLOW - prevents Googlebot from following any links on the page. (Note that this is different from the link-level NOFOLLOW attribute, which prevents Googlebot from following an individual link.)
  • NOARCHIVE - prevents a cached copy of this page from being available in the search results.
  • NOSNIPPET - prevents a description from appearing below the page in the search results, as well as prevents caching of the page.
  • NOODP - blocks the Open Directory Project description of the page from being used in the description that appears below the page in the search results.
  • NONE - equivalent to "NOINDEX, NOFOLLOW".
A word about content value "NONE"
As defined by robotstxt.org, the following direction means NOINDEX, NOFOLLOW.

<META NAME="ROBOTS" CONTENT="NONE">

However, some webmasters use this tag to indicate no robots restrictions and inadvertently block all search engines from their content.

Update: For more information, please see our robots meta tag documentation.
Email ThisBlogThis!Share to XShare to Facebook
Posted in crawling and indexing | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • Our Valentine's day gift: out of beta and adding comments
    Here at webmaster central , we love the webmaster community -- and today, Valentine's Day, we want to show you that our commitment to ...
  • Traveling Down Under: GWC at Search Engine Room and Search Summit Australia
    G'day Webmasters! Google Webmaster Central is excited to be heading to Sydney for Search Summit and Search Engine Room on March 1-2 ...
  • Come see us at SES London and hear tips on successful site architecture
    If you're planning to be at Search Engine Strategies London February 13-15, stop by and say hi to one of the many Googlers who will be ...
  • How to verify Googlebot
    Lately I've heard a couple smart people ask that search engines provide a way know that a bot is authentic. After all, any spammer cou...
  • Introducing Sitemaps for Google News
    Good news for webmasters of English-language news sites: If your site is currently included in Google News , you can now create News Sitemap...
  • Better details about when Googlebot last visited a page
    Most people know that Googlebot downloads pages from web servers to crawl the web. Not as many people know that if Googlebot accesses a page...
  • Update on Public Service Search
    Public Service Search is a service that enables non-profit, university, and government web sites to provide search functionality to their vi...
  • Discover your links
    Update on October 15, 2008 : For more recent news on links, visit Links Week on our Webmaster Central Blog. We're discussing internal l...
  • Setting the preferred domain
    Based on your input, we've recently made a few changes to the preferred domain feature of webmaster tools. And since you've had some...
  • Joint support for the Sitemap Protocol
    We're thrilled to tell you that Yahoo! and Microsoft are joining us in supporting the Sitemap protocol. As part of this development, we...

Categories

  • crawling and indexing
  • events
  • feedback and communication
  • general tips
  • localization
  • products and services
  • search results
  • sitemaps
  • webmaster guidelines
  • webmaster tools

Blog Archive

  • ▼  2007 (13)
    • ▼  March (3)
      • All about robots
      • Using the robots meta tag
      • Using the site: command
    • ►  February (7)
    • ►  January (3)
  • ►  2006 (34)
    • ►  December (5)
    • ►  November (7)
    • ►  October (7)
    • ►  September (8)
    • ►  August (7)
Powered by Blogger.

About Me

Unknown
View my complete profile