# ============================================
# Robots.txt for legalserviceindia.com - Advanced Edition
# SEO | AI Governance | Security | Crawl Control
# ============================================

# Default rule for all crawlers
User-agent: *
Disallow: /Legal-Articles/admin/
Disallow: /Legal-Articles/manager/
Disallow: /Legal-Articles/auth/
Disallow: /Legal-Articles/pay/
Disallow: /Legal-Articles/chat/
Disallow: /Legal-Articles/waconnect/
Disallow: /Legal-Articles/waconnectintl/
Disallow: /Legal-Articles/updatestatus/
Disallow: /Legal-Articles/review-mail/
Disallow: /Legal-Articles/link/
Disallow: /Legal-Articles/get-cities
Disallow: /Legal-Articles/demoprofile/
Disallow: /Legal-Articles/middle-east/
Disallow: /Legal-Articles/ans/
Disallow: /Legal-Articles/error/
Disallow: /Legal-Articles/404/
Disallow: /Legal-Articles/500/
Disallow: /Legal-Articles/temp/
Disallow: /Legal-Articles/test/
Disallow: /Legal-Articles/private/
Disallow: /Legal-Articles/internal/
Disallow: /Legal-Articles/cgi-bin/
Disallow: /Legal-Articles/api/
Disallow: /Legal-Articles/backups/
Disallow: /Legal-Articles/wp-admin/
Disallow: /Legal-Articles/admin-panel/
Disallow: /backup-database/
Disallow: /Legal-Articles/honeypot/
Disallow: /Legal-Articles/fake-admin/
Disallow: /Legal-Articles/debug/
# Block dynamic and filtered URLs
Disallow: /Legal-Articles/?*sort_by=
Disallow: /Legal-Articles/?*gender=
Disallow: /Legal-Articles/?*experience=
Disallow: /Legal-Articles/?*fee=
Disallow: /Legal-Articles/?*exp=
Disallow: /Legal-Articles/?*city=
Disallow: /Legal-Articles/?*service_mode=
Disallow: /Legal-Articles/?*PageSpeed=noscript
Disallow: /Legal-Articles/?*utm_
Disallow: /Legal-Articles/?*ref=
Disallow: /Legal-Articles/?*sessionid=
Disallow: /Legal-Articles/?*token=
Disallow: /Legal-Articles/?*email=
Disallow: /Legal-Articles/?*page=
Disallow: /Legal-Articles/?*thank-you
Disallow: /Legal-Articles/?*checkout
Disallow: /Legal-Articles/?*login
Disallow: /Legal-Articles/*.php
Disallow: /Legal-Articles/*.cgi
Disallow: /Legal-Articles/*.json
# Allow essential public content
Allow: /Legal-Articles/
Allow: /legal/
Allow: /lawyers/
Allow: /copyright/
Allow: /helpline/
Allow: /articles/
Allow: /article/
Allow: /int_lawyers/
Allow: /Legal-Articles/experts/images/
Allow: /Legal-Articles/pdf/
Allow: /Legal-Articles/blog/
Allow: /Legal-Articles/articles/
Allow: /Legal-Articles/legal-advice/
Allow: /Legal-Articles/contact/
Allow: /Legal-Articles/about/
Allow: /Legal-Articles/terms/
Allow: /Legal-Articles/privacy/
Allow: /sitemap.xml

# AI Search Agents - Allowed (respectful bots)
User-agent: OAI-SearchBot
User-agent: OAI-SearchBot/1.0
User-agent: ChatGPT-User
User-agent: ChatGPT-User/1.0
User-agent: ChatGPT-User/2.0
User-agent: ClaudeBot
User-agent: Claude-User
User-agent: Claude-SearchBot
User-agent: FirecrawlAgent
User-agent: AndiBot
User-agent: ExaBot
User-agent: PhindBot
User-agent: YouBot
Allow: /

# AI Training / Non-Compliant Bots - Blocked
User-agent: GPTBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: GeminiBot
User-agent: MistralBot
User-agent: Amazon-AI
User-agent: ScaleBot
# Perplexity - documented stealth crawling
User-agent: PerplexityBot
User-agent: Perplexity-User
Disallow: /

# Traditional Search Engines - Full Access
User-agent: Googlebot
User-agent: Bingbot
User-agent: Yahoo! Slurp
User-agent: DuckDuckBot
User-agent: Baiduspider
User-agent: Yandex
Allow: /

# Crawl rate control (non-Google bots)
User-agent: Bingbot
Crawl-delay: 5

User-agent: Yandex
Crawl-delay: 10

# Sitemaps
Sitemap: https://www.legalserviceindia.com/Legal-Articles/sitemap_index.xml
Sitemap: https://www.legalserviceindia.com/legal/articles.xml
Sitemap: https://www.legalserviceindia.com/sitemap/lawyer.xml
Sitemap: https://www.legalserviceindia.com/sitemap/article-2024.xml
Sitemap: https://www.legalserviceindia.com/ror.xml
Sitemap: https://www.legalserviceindia.com/sitemap.xml

# 📌 Metadata
# Updated: 2025-08-09
# Maintainer: Tarun, Supreme Court Advocate
# Purpose: SEO optimization, AI governance, security hardening
# References:
# - Cloudflare: Perplexity Stealth Crawling Report (Aug 2025)
# - OpenAI Official Crawler Docs
# - Anthropic AI Crawler Identifiers