Introducing Yahoo! Robots-Nocontent Feature

Yahoo! has introduced a new feature that lets webmasters specify which sections of a web page should not be indexed.

In addition to the traditional robots.txt file, which tells search engine robots which pages not to crawl, this new directive from Yahoo! lets webmasters add the “robots-nocontent” class value to tags such as <p>, <div>, and <span>, as well as other tags that wrap blocks of content.

class="robots-nocontent"

For example:

<p class="robots-nocontent">Any piece of content I don’t want to be indexed.</p>

<span class="robots-nocontent">Another piece of content I don’t want to be indexed.</span>

<div class="robots-nocontent">Yet another piece of content I don’t want to be indexed.</div>

If the tag already has a “class” attribute, add “robots-nocontent” to the existing values, separated by a space:

<div class="header robots-nocontent">

Here, “header” is an existing value of the “class” attribute of the “div” tag.

Usage
When is this class value useful? By marking parts of a page such as navigation menus or advertisements with “robots-nocontent”, webmasters let Yahoo!’s search robot zero in on the core content of the page and bypass elements that are repeated throughout the site.
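
As a rough illustration of this usage, the sketch below marks a navigation menu and an advertisement block while leaving the main content unmarked. The page layout and the class names “nav”, “article”, and “ad” are hypothetical and chosen only for this example; only “robots-nocontent” comes from Yahoo!’s announcement.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example article page</title>
</head>
<body>
  <!-- Site-wide navigation, repeated on every page: marked as non-content -->
  <div class="nav robots-nocontent">
    <a href="/">Home</a>
    <a href="/archive">Archive</a>
  </div>

  <!-- Core content: left unmarked so it stays searchable -->
  <div class="article">
    <h1>Article title</h1>
    <p>The main text of the page.</p>
  </div>

  <!-- Advertisement block: also marked as non-content -->
  <div class="ad robots-nocontent">
    <p>Sponsored links go here.</p>
  </div>
</body>
</html>
```

Because “robots-nocontent” is just one more value in the “class” attribute, it can sit alongside whatever class names a site’s stylesheet already relies on.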

Yahoo! lists the following benefits for webmasters (“our” here refers to Yahoo!):

  1. It can improve our focus on the main content of your pages.
  2. It helps target your pages in search results by making sure the appropriate deep page in your site can surface for the right queries.
  3. It helps improve the abstracts for your pages in results by identifying unrelated text on the page and thus omitting it from consideration for the search result summaries.

Impact

  • Using this feature does not stop a page from being crawled. In fact, the page must be crawlable for the feature to work (that is, robots.txt must not block Yahoo!’s crawler from accessing the page).
  • It is not comparable to <div style="display:none">, which hides content from human visitors. Marked content stays visible on the page, so this is not a form of cloaking.
  • Pages that use the class value will still be indexed by search engines (only Yahoo! supports it at the moment). The only difference is that the marked content will not be searchable.
  • Unlike Google Bowling, it can’t be used against a competing site unless someone has control of that site’s source code.
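
To make the contrast with hidden content concrete, here is a minimal sketch of the two cases side by side. The text inside each tag is placeholder content made up for this example.

```html
<!-- Hidden from human visitors by CSS; this is NOT what robots-nocontent does -->
<div style="display:none">Text that visitors never see.</div>

<!-- Still rendered for human visitors; only flagged as non-content for Yahoo!'s indexing -->
<div class="robots-nocontent">Text that visitors can read as usual.</div>
```

The first block changes what people see, while the second changes only what Yahoo! treats as searchable content.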

With this feature available, webmasters can go a step further in telling robots which parts of a page they should focus on first.