Robots.txt

Robots.txt controls crawling, not indexing. Blocking a URL in robots.txt stops compliant crawlers from fetching it, but the URL can still appear in search results (usually without a description) if other pages link to it.

Why Robots.txt matters

Robots.txt lets you manage crawl budget. Blocking infinite-parameter URLs, internal search results, and admin paths keeps crawlers focused on the pages you actually want crawled.
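As a rough illustration, the sketch below parses a small robots.txt with Python's standard urllib.robotparser and checks which URLs a crawler may fetch. The domain, the paths (/search, /admin/, /filter), and the helper name build_parser are hypothetical, chosen only for the example. Note that urllib.robotparser does plain prefix matching and does not interpret wildcards, so the rules here are written as simple path prefixes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /admin/
Disallow: /filter
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

urls = [
    "https://example.com/search?q=shoes",      # internal search results
    "https://example.com/admin/login",         # admin path
    "https://example.com/filter?color=red",    # parameterised listing
    "https://example.com/products/blue-widget",
]

for url in urls:
    verdict = "crawlable" if parser.can_fetch("*", url) else "blocked"
    print(f"{verdict:9}  {url}")
```

Only the product URL should be reported as crawlable; the other three match a Disallow prefix.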

How Robots.txt works in practice

Use a meta robots 'noindex' tag or an HTTP X-Robots-Tag 'noindex' header to prevent indexing. A noindexed page must remain crawlable (not blocked in robots.txt); otherwise crawlers never fetch the page and never see the directive.
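The interaction between noindex and robots.txt is easy to get wrong, so a quick check can help. The sketch below shows the two standard forms of the noindex directive and a hypothetical helper, unreadable_noindex, that lists noindexed URLs which robots.txt blocks. The robots.txt content and URLs are made up for the example, and the check relies on urllib.robotparser's prefix matching rather than any particular search engine's parser.

```python
from urllib.robotparser import RobotFileParser

# The two standard ways to express noindex.
META_NOINDEX = '<meta name="robots" content="noindex">'   # in the page <head>
HEADER_NOINDEX = ("X-Robots-Tag", "noindex")               # HTTP response header


def unreadable_noindex(robots_txt, noindex_urls, user_agent="*"):
    """Return noindexed URLs that robots.txt blocks.

    A crawler that cannot fetch a page never sees its noindex directive,
    so each URL returned here is a conflict to resolve.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical site configuration.
robots_txt = """\
User-agent: *
Disallow: /private/
"""
noindex_urls = [
    "https://example.com/private/old-offer",  # blocked: noindex cannot be read
    "https://example.com/thank-you",          # crawlable: noindex can be read
]

conflicts = unreadable_noindex(robots_txt, noindex_urls)
print(conflicts)
```

For each URL reported, the usual fix is to remove the Disallow rule so the page can be crawled and its noindex directive read.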

Best practices

Don't use robots.txt to deindex content; use noindex instead.
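Test every change before deploying it. Search Console's robots.txt report, which replaced the retired robots.txt Tester, shows how Google fetched and parsed the file.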
Always include a Sitemap directive with the absolute URL of your XML sitemap.
Be careful with wildcard rules, because a broad pattern can block far more URLs than intended.
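One way to test a wildcard rule before deploying it is to run it against a list of paths that should and should not be blocked. The sketch below is a simplified matcher, assuming the common pattern semantics in which '*' matches any run of characters and a trailing '$' anchors the end of the path. The function names and sample paths are hypothetical, and the matcher checks a single Disallow pattern only: it ignores Allow rules, user-agent groups, and rule precedence.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each '*' into '.*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    """True if a Disallow rule with this pattern would match the path."""
    return pattern_to_regex(pattern).match(path) is not None


paths = [
    "/article/123/print",     # print view we want to block
    "/blog/blueprint-guide",  # regular page that must stay crawlable
]

# '/*print' matches any path containing 'print', so it over-blocks.
# '/*/print$' matches only paths that end in a '/print' segment.
for pattern in ["/*print", "/*/print$"]:
    for path in paths:
        verdict = "blocked" if is_blocked(pattern, path) else "crawlable"
        print(f"{pattern:10} {verdict:9} {path}")
```

The broad pattern blocks both paths, while the anchored pattern blocks only the print view. Keeping a list like this alongside the robots.txt file makes it easy to rerun the check after each change.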