Tells crawlers which areas to scan or block. Keeps private paths out of crawler traffic and focuses search engines on the right content.
About robots.txt generator

The robots.txt file is a simple text file placed in the root of your domain (for example: https://yourdomain.com/robots.txt). It gives instructions to search engine crawlers about which parts of your site should be crawled and which should be ignored. While it does not enforce security, it is an essential tool for managing SEO visibility and controlling crawler traffic.

Standards

This generator follows the official Robots Exclusion Protocol (RFC 9309). Supported directives include User-agent, Disallow, and Allow as defined in the RFC, plus the widely supported Sitemap extension.

Syntax
  • User-agent: Defines the target crawler (e.g., * for all, Googlebot, Bingbot).
  • Disallow: Blocks specific paths from being crawled (e.g., /admin, /private/).
  • Allow: Grants access to exceptions inside restricted areas (e.g., /admin/help).
  • Sitemap: Points to the XML sitemap URL (e.g., https://yourdomain.com/sitemap.xml).
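Assembling these directives into a file is mostly string formatting. A minimal sketch in Python (the `build_robots_txt` helper and its signature are illustrative, not part of any library):

```python
def build_robots_txt(groups, sitemaps):
    """Assemble robots.txt content from (user_agent, allows, disallows)
    groups plus a list of sitemap URLs."""
    blocks = []
    for user_agent, allows, disallows in groups:
        lines = [f"User-agent: {user_agent}"]
        lines += [f"Disallow: {path}" for path in disallows]
        lines += [f"Allow: {path}" for path in allows]
        blocks.append("\n".join(lines))
    # Sitemap lines are group-independent, so they go at the end.
    blocks += [f"Sitemap: {url}" for url in sitemaps]
    return "\n\n".join(blocks) + "\n"

print(build_robots_txt(
    [("*", ["/admin/help"], ["/admin/", "/private/"])],
    ["https://yourdomain.com/sitemap.xml"],
))
```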
Wildcards and Matching
  • * matches any sequence of characters (e.g., /search/*).
  • $ anchors the rule to the end of a URL (e.g., .pdf$ to target PDF files).
  • Matching is based on path prefixes and is case-sensitive under RFC 9309; whether two URL spellings resolve to the same resource depends on your server and file system.
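The * and $ special characters translate naturally to regular expressions. A minimal sketch of the matching semantics above, assuming RFC 9309 rules (the `rule_matches` function name is illustrative):

```python
import re

def rule_matches(rule: str, path: str) -> bool:
    """Check whether a robots.txt rule path matches a URL path.

    '*' matches any sequence of characters; '$' anchors the rule
    to the end of the URL. Otherwise matching is a prefix match.
    """
    # Escape regex metacharacters, then restore the two wildcards.
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"  # anchor to end of URL
    return re.match(pattern, path) is not None

print(rule_matches("/search/*", "/search/results?q=x"))  # True
print(rule_matches("*.pdf$", "/docs/manual.pdf"))        # True
print(rule_matches("*.pdf$", "/docs/manual.pdf?v=2"))    # False
```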
Behavior and Limits
  • The file must be accessible at /robots.txt in the root of the domain.
  • Rules are grouped by User-agent; a crawler obeys the group whose User-agent line matches it most specifically. Within a group, the longest matching Allow/Disallow rule wins, with Allow taking precedence on ties.
  • Use root-relative paths (starting with /) for Allow/Disallow, and absolute URLs for Sitemap.
  • Keep the file concise: large or overly complex robots.txt files can cause inconsistent crawler behavior.
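The precedence rule within a group can be sketched as follows, assuming RFC 9309 semantics: the longest matching rule wins, and on equal length Allow beats Disallow (the `is_allowed` helper and the tuple representation are illustrative):

```python
def is_allowed(rules, path):
    """Resolve Allow/Disallow rules for a path per RFC 9309:
    longest matching rule wins; Allow wins length ties."""
    best_len, allowed = -1, True  # no matching rule => crawling allowed
    for directive, rule_path in rules:
        if path.startswith(rule_path):
            is_longer = len(rule_path) > best_len
            is_tie_allow = len(rule_path) == best_len and directive == "Allow"
            if is_longer or is_tie_allow:
                best_len = len(rule_path)
                allowed = directive == "Allow"
    return allowed

rules = [("Disallow", "/admin/"), ("Allow", "/admin/help")]
print(is_allowed(rules, "/admin/settings"))  # False: /admin/ matches
print(is_allowed(rules, "/admin/help"))      # True: longer Allow wins
print(is_allowed(rules, "/blog/post"))       # True: no rule matches
```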
Best Practices
  • Do not block critical resources such as CSS and JavaScript needed for page rendering.
  • Use it to restrict administrative areas, duplicate content, or infinite URL filters.
  • Remember: robots.txt does not prevent indexing if a URL is linked elsewhere — use noindex for that.
  • Always include Sitemap: entries to improve content discovery.
Example
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /admin/help

User-agent: Googlebot
Allow: /
Disallow: /experiments/

Sitemap: https://yourdomain.com/sitemap.xml
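A generated file can be sanity-checked with Python's standard-library urllib.robotparser. Note that RobotFileParser predates RFC 9309: it evaluates rules in file order (first match wins) and does not support the * and $ wildcards, so its answers can differ from Google's longest-match behavior. In this sketch the Allow exception is listed before the broader Disallow so that first-match evaluation agrees with RFC 9309:

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Allow: /admin/help
Disallow: /admin/
Disallow: /private/

Sitemap: https://yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("mybot", "/admin/help"))      # True
print(parser.can_fetch("mybot", "/admin/settings"))  # False
print(parser.can_fetch("mybot", "/public/page"))     # True
print(parser.site_maps())  # ['https://yourdomain.com/sitemap.xml']
```

For matching against a live site, `parser.set_url("https://yourdomain.com/robots.txt")` followed by `parser.read()` fetches the file directly.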
                
Usage
  • Publish the file at https://yourdomain.com/robots.txt.
  • Include all Sitemap: entries for your XML sitemap or sitemap index.
  • Test your file using Google Search Console or Bing Webmaster Tools.

Tip: robots.txt is a public file. Avoid exposing sensitive paths; protect them with proper authentication and access controls.


© 2025 DRMA Tech.