About robots.txt
The robots.txt file is a plain-text file, defined by the Robots Exclusion Protocol, that tells web robots (typically search engine crawlers) which parts of your site they may crawl and which they may not.
Key points:
- Must be placed at the root of your site (e.g., https://example.com/robots.txt)
- Is publicly accessible (anyone can view it)
- Is not a security measure; protect sensitive pages with proper authentication instead
- Is advisory only: reputable crawlers honor it, but compliance is voluntary and badly behaved bots can ignore it (see the sketch below)
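To illustrate how a well-behaved crawler consults the file, here is a minimal sketch using Python's standard-library urllib.robotparser. The host, paths, and crawler name are placeholders, and the two directives used are described in the next section:

```python
from urllib.robotparser import RobotFileParser

# Rules for this sketch; a real crawler would instead call
# rp.set_url("https://example.com/robots.txt") followed by rp.read().
rules = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A polite crawler checks every URL against the rules before fetching it.
print(rp.can_fetch("MyCrawler", "https://example.com/private/notes.html"))  # False
print(rp.can_fetch("MyCrawler", "https://example.com/blog/post.html"))      # True
```

Nothing forces a client to perform this check, which is why robots.txt cannot substitute for authentication.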
Common directives (a combined example follows this list):
- User-agent: Specifies which crawler the rules apply to (* matches all crawlers)
- Disallow: Requests that crawlers not access paths beginning with the given value
- Allow: Explicitly permits a path that a broader Disallow rule would otherwise cover (compliant crawlers apply the most specific, i.e. longest, matching rule)
- Sitemap: Gives the absolute URL of your sitemap
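Putting the directives together, a minimal robots.txt might look like this (all paths and the sitemap URL are placeholder values):

```
User-agent: *
Disallow: /admin/
Allow: /admin/public/

User-agent: Googlebot
Disallow: /beta/

Sitemap: https://example.com/sitemap.xml
```

Here every crawler is asked to stay out of /admin/ except the /admin/public/ subtree. Note that a crawler obeys only the most specific User-agent group that matches it, so Googlebot follows just the Disallow: /beta/ rule, while all other crawlers follow the * group.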