Configuring robots.txt correctly for SEO (my guide)

The robots.txt file is a small file with a big impact: it determines what search engines may and may not crawl on your website. An error in this file can result in blocked content, missed indexing or even ranking loss. In this article I explain step by step how to configure a robots.txt file correctly for SEO.

1. What is robots.txt?

The robots.txt is a plain-text file that you place in the root of your domain (e.g. https://jouwdomein.nl/robots.txt). Search engines request this file before crawling your site to determine which paths they may visit.

Important:

  • It is not a guarantee that something will not be indexed (also use noindex for that)
  • It blocks crawling, not necessarily indexing (the sketch below shows how a crawler applies the rules)
  • Erroneous rules can cause unintended SEO damage
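
To see how a crawler applies these rules in practice, here is a minimal sketch using Python's standard-library urllib.robotparser. The domain and paths are the placeholders from this article; note that this parser follows the original robots.txt specification and may not match Google's wildcard extensions exactly.

from urllib.robotparser import RobotFileParser

# Point the parser at the live robots.txt file and fetch it
rp = RobotFileParser()
rp.set_url("https://jouwdomein.nl/robots.txt")
rp.read()

# can_fetch() answers the question a polite bot asks before crawling a URL
print(rp.can_fetch("Googlebot", "https://jouwdomein.nl/admin/"))  # False if /admin/ is disallowed
print(rp.can_fetch("Googlebot", "https://jouwdomein.nl/blog/"))   # True if the path is not blocked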

2. Structure of a robots.txt file

A standard file looks like this:

User-agent: *
Disallow:
Sitemap: https://jouwdomein.nl/sitemap.xml

Explanation:

  • User-agent: * = applies to all bots
  • Disallow: without path = allow everything
  • Disallow: /admin/ = block everything in the /admin/ folder
  • Allow: /path/ = explicitly allow (useful for exceptions; see the example below)
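
As an illustration of such an exception (the folder and file names here are hypothetical): Google applies the most specific matching rule, i.e. the longest path, so the Allow line wins for that single file while the rest of the folder stays blocked:

User-agent: *
Disallow: /downloads/
Allow: /downloads/brochure.pdf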

3. What should you block and not block?

Do block:

  • Admin/login pages (/wp-admin/, /cart/, /checkout/)
  • Internal search results (/search/)
  • Filter pages with unnecessary parameters (?color=, ?sort=)
  • Test/dev directories (/beta/, /test/)

Do not block:

  • CSS and JS files (Google needs these to render your pages)
  • Important page types (SEO pages, blog, services)
  • Images (unless you deliberately want to keep them out of image search results)

Google must be able to render the site the way users see it, so don’t block styling or script files. The sketch below checks this for your most important assets.
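
A quick way to check this is to run your key CSS and JS URLs through the same standard-library parser as above (the asset URLs below are hypothetical; replace them with files from your own theme):

from urllib.robotparser import RobotFileParser

# Hypothetical asset URLs - use paths from your own site
ASSETS = [
    "https://jouwdomein.nl/wp-content/themes/mytheme/style.css",
    "https://jouwdomein.nl/wp-includes/js/jquery/jquery.min.js",
]

rp = RobotFileParser()
rp.set_url("https://jouwdomein.nl/robots.txt")
rp.read()

# Report every asset that Googlebot would not be allowed to fetch
for url in ASSETS:
    if not rp.can_fetch("Googlebot", url):
        print("Blocked for Googlebot:", url)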

4. Examples of good configuration

For WordPress:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /?s=
Disallow: /search/
Sitemap: https://jouwdomein.nl/sitemap_index.xml

For a webshop (e.g., WooCommerce):

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /?orderby=
Disallow: /*add-to-cart=*
Sitemap: https://jouwdomein.nl/sitemap.xml

5. Test your robots.txt file

Mistakes creep in quickly. Always test (a scripted spot check follows below):

  • Google Search Console > Settings > robots.txt report
  • Screaming Frog > Configuration > Robots.txt
  • Google Search Console > Page indexing > “Blocked by robots.txt” (shows the affected URLs)
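
You can also spot-check the file itself with a few lines of Python. The HTTP status matters: Google treats a 4xx robots.txt as “no restrictions apply”, while persistent 5xx errors can throttle or halt crawling entirely:

import urllib.error
import urllib.request

try:
    with urllib.request.urlopen("https://jouwdomein.nl/robots.txt") as resp:
        print(resp.status)  # should be 200
        print(resp.read().decode("utf-8", errors="replace"))  # the rules bots actually see
except urllib.error.HTTPError as e:
    print("HTTP error:", e.code)  # 4xx means Google assumes no restrictions; 5xx is a problem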

6. Common mistakes

  • Blocking everything with Disallow: / → apply this only in staging or temporary situations
  • Blocking CSS/JS → always leave these accessible for correct rendering
  • No Sitemap line included → add the sitemap at the bottom of the file
  • Using Disallow: /*? without testing → keep parameters that do have value accessible (see the example below)
  • Using robots.txt instead of noindex → use noindex to control indexing; robots.txt only controls crawling
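
To illustrate the parameter mistake from the list above (the parameter names are examples): a blanket wildcard also blocks parameter URLs that do have search value, so target only the worthless ones:

# Too broad: blocks every URL that contains a query string
Disallow: /*?
# Better: block only the parameters that add no value
Disallow: /*?color=
Disallow: /*?sort=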

7. Robots.txt and staging/testing environments

Want to shield test or staging environments? Use:

User-agent: *
Disallow: /

But: this only prevents crawling, not indexing. Combine it with:

  • HTTP authentication (basic protection)
  • noindex in <meta> tags
  • Blocking by IP address via .htaccess or a firewall

Keep in mind that Google can only see a noindex tag on pages it is allowed to crawl, so a blanket Disallow: / and noindex do not work together; HTTP authentication is the most reliable option.
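
For reference, the noindex variants look like this (the header variant is set in your server configuration; both only work on pages Google is allowed to request):

In the HTML <head>:
<meta name="robots" content="noindex, nofollow">

As an HTTP response header:
X-Robots-Tag: noindex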

In conclusion

A correctly configured robots.txt prevents crawl waste and protects your site from unintended indexing problems. Work with clear, controlled rules, and test after every change. Small file, big effect.

Ralf van Veen

Senior SEO-specialist

My clients give me a 5.0 on Google out of 85 reviews.

I have been working for 12 years as an independent SEO specialist for companies (in the Netherlands and abroad) that want to rank higher in Google in a sustainable way. During this period I have advised A-brands, set up large-scale international SEO campaigns and coached global development teams in search engine optimization.

With this broad experience in SEO, I have developed the SEO course and helped hundreds of companies improve their findability in Google in a sustainable and transparent way. You can consult my portfolio, references and collaborations for this.

This article was originally published on 3 June 2025. The last update of this article was on 18 July 2025. The content of this page was written and approved by Ralf van Veen. Learn more about the creation of my articles in my editorial guidelines.