All the ways to exclude a page in Google

In the world of SEO, both meta tags and X-Robots-Tag are crucial tools for determining how search engines interact with the content on your website. Although at first glance they appear to serve similar purposes, they differ in their application, flexibility and scope. Understanding the differences between the two can help you implement a more effective SEO strategy, especially when it comes to managing your website’s visibility in search engines.

Options to exclude parts of a website

If you want to block specific areas of your website from search engines, you have several options. These options vary in their application and suitability, depending on your specific needs and the nature of the content you want to hide.


Robots.txt file

The robots.txt file is your first line of defense. It is a file that you place in the root directory of your website, and it gives search engines instructions on which parts of your site they may or may not crawl. It is a powerful tool, but it has its limitations. Importantly, it does not guarantee that excluded content will not be indexed, as it is more of a “request” than a “ban”. This file is especially useful for excluding large sections of your site or certain file types.
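For illustration, a minimal robots.txt could look like this (the /internal/ path and the sitemap URL are placeholders; wildcard patterns such as /*.pdf$ are honored by Google but not by every crawler):

  # robots.txt – placed in the root of the domain
  User-agent: *
  Disallow: /internal/
  Disallow: /*.pdf$
  Sitemap: https://www.example.com/sitemap.xml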

Robots meta tag

For more granular control at the page level, use the robots meta tag. You place this tag in the <head> section of your HTML, where it lets you indicate specifically whether a page should be indexed or its links followed. This is useful for pages such as temporary promotions or internal search results that you don’t want to appear in search engine results.
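For example, an internal search results page that should stay out of the index while its links may still be followed could contain this in its <head> (a minimal sketch; the title is a placeholder):

  <head>
    <meta name="robots" content="noindex, follow">
    <title>Search results</title>
  </head>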

X-Robots-Tag HTTP header

The X-Robots-Tag HTTP header provides similar functionality to the robots meta tag, but at the server level. This means you can apply it to non-HTML files such as PDFs or images. It is especially useful when you have technical control over the server and when you need instructions beyond what you can do with HTML.
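A sketch of how this could look on an Apache server with mod_headers enabled, here applied to all PDF files via .htaccess (nginx would use add_header instead):

  <FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
  </FilesMatch>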

Request removal through the Google Search Console

Sometimes you want to remove pages from Google’s index faster than the normal crawl process would allow. In these cases, you can submit a removal request through Google’s Search Console. This is a temporary measure that causes the page to immediately disappear from search results, but it does not replace the need for a permanent method such as a noindex tag.

Effective use of noindex: a practical guide

Noindex is a powerful tool in your SEO arsenal, but it must be used carefully and strategically.

How noindex affects your page visibility

The noindex tag explicitly tells search engines not to include a page in their index. This means the page will not appear in search results. It is an effective way to avoid displaying certain pages, such as temporary content, private content, or duplicate pages. It is important to remember that although noindex reduces visibility in search results, it does not prevent the page from being crawled or the links on the page from being followed unless you also use the “nofollow” value.
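The difference between the two variants in one small example:

  <meta name="robots" content="noindex">            <!-- keep out of the index; links may still be followed -->
  <meta name="robots" content="noindex, nofollow">  <!-- keep out of the index and do not follow links -->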

Implementation of noindex: step-by-step

  1. Choose the right pages: Identify which pages you don’t want in search results. These can be duplicate pages, private pages, or pages with temporary or thin content.
  2. Add the noindex tag: Place the <meta name="robots" content="noindex"> tag in the <head> section of the HTML of the pages in question (a minimal example follows after this list).
  3. Verify implementation: Use tools such as Google’s Search Console to verify that the tag is correctly implemented and recognized by search engines.
  4. Monitor impact: Keep an eye on the index status of these pages. It may take some time for search engines to respond to the noindex tag, so be patient and check regularly.
  5. Update as needed: If you decide that a page should be visible again, remove the noindex tag so search engines can index the page once more.
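As a minimal illustration of step 2, a page carrying the tag could look like this (title and content are placeholders):

  <!DOCTYPE html>
  <html lang="en">
    <head>
      <meta charset="utf-8">
      <meta name="robots" content="noindex">
      <title>Temporary promotion</title>
    </head>
    <body>
      <p>This page should not appear in search results.</p>
    </body>
  </html>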

Removing pages through Google’s URL Removal Tool

Sometimes you run into situations where simply excluding pages from search engines is not enough, for example when you urgently need to remove sensitive information or want to quickly get an accidentally indexed page out of the search results. This is where Google’s URL Removal Tool comes in handy. It allows you to temporarily remove URLs from Google’s search results. It is a powerful tool, but note that it is only a temporary solution: for permanent removal, you must still use the appropriate noindex tags or remove the content from your site.

Quick action: temporary removal with the URL Removal Tool

The URL Removal Tool is ideal for quick action. You use this tool through the Google Search Console. It’s pretty simple: you enter the URL you want removed from the search results. However, this removal is temporary and lasts about six months. After this period, the page may reappear in search results unless you take further action, such as placing a noindex tag or permanently removing the page.

Long-term deletion: make sure your page does not return

For long-term or permanent removal of a page from search results, you need to go beyond the URL Removal Tool. This means you have to remove the content itself or add a noindex tag.

If you delete the page, make sure the server returns a 404 (not found) or 410 (gone) status code. These status codes tell search engines that the page no longer exists and, over time, the page will be removed from their indexes.
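On an Apache server you could, for example, report a deleted page as permanently gone like this (the path is a placeholder; the Redirect directive comes from mod_alias):

  # .htaccess – return a 410 for the removed page
  Redirect gone /old-promotion-page/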

All possibilities at a glance

Here is a table that outlines the various options for both meta tags and X-Robots-Tag, with a brief explanation for each option:

Possibility | Meta tags | X-Robots-Tag
Location | In the <head> section of an HTML page. | In the HTTP response header, server-side.
Scope | Only on the specific page where they are placed. | On any type of HTTP response, including non-HTML files.
Flexibility | Must be added manually to each page. | More flexible, can be applied server-wide.
Use for HTML pages | Instructions for indexing and following links. | Same capabilities as meta tags, but server-side.
Use for other files | Not applicable. | Can be used for images, PDFs and other media.
Complexity of instructions | Limited to basic instructions per page. | Can handle more complex instructions and conditions.
Example | <meta name="robots" content="noindex, nofollow"> | Header set X-Robots-Tag "noindex, noarchive, nosnippet"
Options to exclude a page.

This table shows that although meta tags and X-Robots-Tag have similar functions in terms of giving instructions to search engines, the X-Robots-Tag offers more flexibility and more extensive application possibilities, especially for non-HTML content and more complex scenarios.

Common mistakes when excluding pages

When excluding pages from indexing, it is important to avoid common mistakes. Using robots.txt, meta tags and X-Robots-Tag incorrectly can lead to undesirable results, such as pages that still appear in search results or unintended damage to your site’s SEO.

The pitfalls of robots.txt

A common mistake with robots.txt is assuming that blocking a page in robots.txt means it will not be indexed. This is not true: robots.txt prevents search engines from crawling the content of the page, but if the page is linked elsewhere, it can still appear in the index. Using noindex in a robots meta tag or X-Robots-Tag is a more effective way of ensuring that pages are not indexed.
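To make the difference concrete, the first snippet below only blocks crawling, while the second actually keeps the page out of the index (the path is a placeholder). Keep in mind that a page must remain crawlable for a noindex tag to be seen at all, so do not combine the two for the same URL:

  # robots.txt – blocks crawling, but the URL can still be indexed via external links
  User-agent: *
  Disallow: /campaign-page/

  <!-- on /campaign-page/ itself – keeps the URL out of the index -->
  <meta name="robots" content="noindex">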

Misunderstandings around meta tags and X-Robots

Another area where misunderstandings are common is the use of meta tags and the X-Robots-Tag. It is crucial to understand that these tags give search engines instructions about indexing and following links.

Incorrect configuration can lead to unwanted indexing or, on the contrary, exclude pages that you do want indexed. Make sure you clearly understand how these tags work and test the implementation to avoid unexpected SEO problems.
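You can check both variants yourself: view the page source for the meta tag, or request the response headers with curl, for example (the URL is a placeholder):

  curl -I https://www.example.com/whitepaper.pdf
  # look for a line such as:
  # X-Robots-Tag: noindex, nofollow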

What are the differences?

Meta tags and X-Robots-Tag are both tools used to give search engines instructions on how to treat certain content on a website. Although they perform similar functions, they differ in their application and flexibility.

  1. Meta tags:
    • Location: Meta tags are placed directly in the HTML of an individual Web page, usually in the <head> section.
    • Scope: They apply only to the specific page where they are placed.
    • Flexibility: Meta tags offer limited flexibility because they must be added manually to each page you want to influence.
    • Use: Typical use cases for meta tags include indicating whether search engines should index a page or follow its links (for example, with noindex, nofollow).
    • Example: <meta name="robots" content="noindex, nofollow">
  2. X-Robots-Tag:
    • Location: The X-Robots-Tag is an HTTP header and is therefore sent in the server’s HTTP response.
    • Scope: This tag can be applied to any type of HTTP response, not only HTML pages but also media such as PDF files or images.
    • Flexibility: The X-Robots-Tag is more flexible and powerful, especially for managing crawl instructions for non-HTML files.
    • Use: You can use more complex instructions, such as combining different directives for different search engines or applying rules based on certain criteria (see the example after this list).
    • Example: In a server configuration, you can add a rule such as Header set X-Robots-Tag "noindex, noarchive, nosnippet".
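As a sketch of such a more complex rule on an Apache server (mod_headers assumed; “otherbot” is a placeholder user agent), with different directives for different crawlers, limited to PDF files:

  <FilesMatch "\.pdf$">
    Header add X-Robots-Tag "googlebot: noindex, noarchive"
    Header add X-Robots-Tag "otherbot: noindex, nofollow"
  </FilesMatch>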

In summary, while meta tags are limited to providing instructions to search engines at the page level within the HTML code, the X-Robots-Tag provides a more versatile and powerful way to manage crawl instructions, applicable to a wide range of content types and through server configurations.


How to align exclusion strategies with your SEO goals

Aligning exclusion strategies with your SEO goals starts with a clear understanding of what you want to achieve with your website. Ask yourself: which parts of my site add value to my SEO efforts and which do not? Exclusion strategies are designed not only to hide certain content, but also to help search engines focus on the content that really matters. This means thinking strategically about using tools such as robots.txt, noindex tags, and the X-Robots-Tag. By excluding content that does not contribute to your SEO goals, such as duplicate pages or internal search results, you can improve the quality and relevance of your visible content.

Balancing between visibility and privacy

Search engine visibility is crucial for attracting traffic, but not all content is intended for public display. Privacy considerations may make it necessary to hide certain parts of your site, such as user-specific information or internal data.

It is important to strike a balance here: you want to make valuable content available for indexing, while at the same time shielding sensitive information. This requires careful planning and an understanding of the various page exclusion methods to meet both your visibility goals and privacy requirements.

Summary

Meta tags and X-Robots-Tag are both essential for managing how search engines treat your website content, but they serve different needs. Meta tags are ideal for basic instructions on individual HTML pages, while X-Robots-Tag provides a more powerful and flexible solution for a wider range of content types and more complex scenarios. By using the right tool at the right time, you can accurately direct the visibility and indexing of your website, contributing to a more effective and targeted SEO strategy.


Ralf van Veen

Senior SEO-specialist
My clients give me a 5.0 on Google out of 76 reviews

I have been working for 10 years as an independent SEO specialist for companies (in the Netherlands and abroad) that want to rank higher in Google in a sustainable way. During this period I have advised A-brands, set up large-scale international SEO campaigns and coached global development teams in the field of search engine optimization.

With this broad experience within SEO, I have developed the SEO course and helped hundreds of companies improve their findability in Google in a sustainable and transparent way. You can consult my portfolio, references and collaborations for this.

This article was originally published on 14 December 2023. The last update of this article was on 28 December 2023. The content of this page was written and approved by Ralf van Veen. Learn more about the creation of my articles in my editorial guidelines.