# Rahber Quran Institute — robots.txt
# Canonical host: https://rahberquraninstitute.com
# Last reviewed: 2026-06-10

# ─────────────────────────────────────────────────────────────
# Default: allow all crawlers, block private / system / param-noise
# ─────────────────────────────────────────────────────────────
User-agent: *
Allow: /
Disallow: /admin
Disallow: /admin/
Disallow: /api/
Disallow: /lovable/
Disallow: /*?*utm_
Disallow: /*?*gclid=
Disallow: /*?*fbclid=
Disallow: /*?*ref=
Disallow: /*?*session=
Disallow: /*?*sessionid=
Disallow: /*?*sort=
Disallow: /*?*filter=

# ─────────────────────────────────────────────────────────────
# Major search engines (explicit allow for clarity)
# ─────────────────────────────────────────────────────────────
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-News
Allow: /

User-agent: Googlebot-Video
Allow: /

User-agent: AdsBot-Google
Allow: /

User-agent: Bingbot
Allow: /

User-agent: BingPreview
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

User-agent: YandexBot
Allow: /

User-agent: YandexImages
Allow: /

User-agent: Baiduspider
Allow: /

User-agent: Baiduspider-image
Allow: /

User-agent: Applebot
Allow: /

User-agent: Sogou web spider
Allow: /

User-agent: Naverbot
Allow: /

User-agent: Yeti
Allow: /

User-agent: SeznamBot
Allow: /

User-agent: facebookexternalhit
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: Pinterestbot
Allow: /

User-agent: WhatsApp
Allow: /

User-agent: TelegramBot
Allow: /

# ─────────────────────────────────────────────────────────────
# AI / LLM crawlers (explicit allow — opt-in to AI search visibility)
# ─────────────────────────────────────────────────────────────
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Allow: /

User-agent: Amazonbot
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: cohere-training-data-crawler
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: FacebookBot
Allow: /

User-agent: DuckAssistBot
Allow: /

User-agent: YouBot
Allow: /

User-agent: MistralAI-User
Allow: /

User-agent: Diffbot
Allow: /

User-agent: AwarioBot
Allow: /

User-agent: PetalBot
Allow: /

User-agent: AI2Bot
Allow: /

User-agent: Timpibot
Allow: /

User-agent: Webzio-Extended
Allow: /

User-agent: ImagesiftBot
Allow: /

# ─────────────────────────────────────────────────────────────
# Sitemaps (index + per-locale + legacy single sitemap)
# ─────────────────────────────────────────────────────────────
Sitemap: https://rahberquraninstitute.com/sitemap-index.xml
Sitemap: https://rahberquraninstitute.com/sitemap.xml
Sitemap: https://rahberquraninstitute.com/sitemap-en-us.xml
Sitemap: https://rahberquraninstitute.com/sitemap-en-gb.xml
Sitemap: https://rahberquraninstitute.com/sitemap-en-ca.xml
Sitemap: https://rahberquraninstitute.com/sitemap-en-au.xml
Sitemap: https://rahberquraninstitute.com/sitemap-en-nz.xml
Sitemap: https://rahberquraninstitute.com/sitemap-en.xml

# Host directive (Yandex)
Host: https://rahberquraninstitute.com