SEO

    Robots.txt

    Quick definition

    Robots.txt is a plain-text file in your site's root that tells search engine crawlers which URLs they may or may not crawl.

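    Robots.txt is a crawl directive, not an indexing directive. Blocking a URL in robots.txt prevents crawling, but it does not guarantee the URL stays out of search results: if other pages link to it, a search engine can still index the bare URL without ever fetching its content.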

    Why Robots.txt matters

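    Use robots.txt to manage crawl budget, the limited number of URLs a crawler will fetch from your site in a given period. Typical candidates for blocking are URLs with endless parameter combinations, internal search results, and admin paths.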
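
    As a rough illustration, the sketch below parses a small robots.txt with Python's standard urllib.robotparser and checks which paths a crawler may fetch. The domain, the paths, and the sitemap URL are hypothetical and not taken from any real site. Note that this parser does plain prefix matching, so the example deliberately avoids wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks admin paths and internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
Sitemap: https://www.example.com/sitemap.xml
"""

BASE = "https://www.example.com"  # illustrative domain


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

for path in ("/pricing", "/admin/users", "/search?q=crm"):
    allowed = parser.can_fetch("Googlebot", BASE + path)
    print(f"{path}: {'crawlable' if allowed else 'blocked'}")

# site_maps() requires Python 3.8 or later.
print("Sitemaps:", parser.site_maps())
```

    Parsing from a string keeps the check offline, so a draft file can be tried out before it is deployed.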

    How Robots.txt works in practice

    To keep a page out of the index, use a meta robots 'noindex' tag or an HTTP X-Robots-Tag header rather than robots.txt. A noindexed page must remain crawlable (not blocked in robots.txt); otherwise crawlers never fetch the page and never see the directive.
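
    The interaction between the two mechanisms can be sketched as a small check: a noindex header only counts if robots.txt also lets the crawler fetch the page. The function name noindex_can_be_read and the example URLs are hypothetical, and the header parsing is simplified (it ignores user-agent-scoped values such as "googlebot: noindex").

```python
from urllib.robotparser import RobotFileParser


def noindex_can_be_read(robots_txt: str, url: str, response_headers: dict,
                        user_agent: str = "Googlebot") -> bool:
    """Return True only if the response carries noindex AND the URL is crawlable."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawlable = parser.can_fetch(user_agent, url)

    # Header names are case-insensitive; the value is a comma-separated list.
    header_value = ""
    for name, value in response_headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value.lower()
    directives = {part.strip() for part in header_value.split(",")}

    return crawlable and "noindex" in directives


ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
HEADERS = {"X-Robots-Tag": "noindex, nofollow"}

# Crawlable page: the crawler can fetch it and read the noindex header.
print(noindex_can_be_read(ROBOTS_TXT, "https://www.example.com/old-page", HEADERS))

# Blocked page: the header is never seen, so noindex has no effect.
print(noindex_can_be_read(ROBOTS_TXT, "https://www.example.com/private/report", HEADERS))
```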

    Best practices

    • Don't use robots.txt to deindex content; use noindex instead.
    • Validate every change before it goes live, for example with the robots.txt report in Search Console, which replaced the retired robots.txt Tester.
    • Include a Sitemap directive that gives the absolute URL of your XML sitemap.
    • Be careful with wildcard rules (* and $); a pattern that is too broad can block URLs you want crawled.
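
    To make the wildcard warning concrete, here is a minimal sketch of wildcard matching, where * matches any run of characters and a trailing $ anchors the end of the URL. It is a simplified stand-in written for this example, not the matcher any search engine actually runs, and the rules and paths are hypothetical.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    """True if a Disallow rule with this pattern would match the path."""
    # re.match anchors at the start, mirroring prefix matching in robots.txt.
    return rule_to_regex(pattern).match(path) is not None


BROAD = "/*print"      # intended to block printer-friendly pages
NARROW = "/*/print$"   # only paths whose last segment is exactly "print"

for path in ("/docs/guide/print", "/blog/blueprint-for-growth"):
    print(path, "| broad:", is_blocked(BROAD, path),
          "| narrow:", is_blocked(NARROW, path))
```

    The broad rule also matches the blog post because "blueprint" contains "print", while the narrower rule blocks only the printer-friendly URL.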
