improvement(seo): restore explicit AI/search bot allow-list and add link-preview rules #4480
waleedlatif1 merged 6 commits into staging
PR Summary
Medium Risk
Overview
Refactors the disallow lists into named constants (…
Reviewed by Cursor Bugbot for commit b9a1f58.
Greptile Summary
This PR restores a structured …
Confidence Score: 5/5
Safe to merge: the change is scoped entirely to robots.txt generation and carries no runtime risk to application behavior.
The restructuring is straightforward: one wildcard rule with a tight disallow list, and one named link-preview bot group with a looser list for OG card fetching. All previously identified issues (incorrect Grok UA strings, a no-op Bravebot entry, /playground/ and /w/ missing from the preview-bot disallow list) were corrected in a prior commit and are confirmed absent in the current file. Named user-agent groups take precedence over the wildcard per RFC 9309, so the routing logic is correct. No application logic, data access, or authentication paths are affected.
No files require special attention.
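The precedence point above can be illustrated with a minimal sketch. It is not the PR's code: the `RuleGroup` shape, the `selectGroup` helper, and the sample groups are hypothetical. It also simplifies RFC 9309 by returning the first named match, whereas the RFC says to combine the rules when several groups match.

```typescript
// Hypothetical shape of one robots.txt group; not taken from robots.ts.
interface RuleGroup {
  userAgents: string[]
  disallow: string[]
}

// Pick the group that applies to a crawler's product token.
// A named group wins; the wildcard group is only a fallback.
function selectGroup(groups: RuleGroup[], productToken: string): RuleGroup | undefined {
  const token = productToken.toLowerCase()
  const named = groups.find((group) =>
    group.userAgents.some((ua) => ua !== '*' && ua.toLowerCase() === token)
  )
  return named ?? groups.find((group) => group.userAgents.includes('*'))
}

// Illustrative groups only; the real lists are longer.
const sampleGroups: RuleGroup[] = [
  { userAgents: ['*'], disallow: ['/api/', '/chat/', '/form/'] },
  { userAgents: ['Twitterbot', 'Slackbot'], disallow: ['/api/'] },
]
```

With these sample groups, `selectGroup(sampleGroups, 'slackbot')` resolves to the named group even though the wildcard group is listed first, and an unlisted crawler falls through to the wildcard group.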
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Incoming Bot Request] --> B{User-Agent Match?}
B -->|Matches LINK_PREVIEW_BOTS\nTwitterbot, LinkedInBot,\nSlackbot, Discordbot, etc.| C[Apply LINK_PREVIEW_DISALLOWED_PATHS]
B -->|Wildcard *\nall other bots| D[Apply DISALLOWED_PATHS]
C --> E{Path Allowed?}
D --> F{Path Allowed?}
C -.->|Accessible for OG cards| G["/chat/, /form/"]
E -->|Disallowed| H["Block: /api/, /workspace/, /w/, /playground/, /resume/, /invite/, /unsubscribe/, /credential-account/, /_next/, /private/"]
E -->|Allowed| I[Crawl Permitted]
F -->|Disallowed| J["Block: same as preview list + /chat/, /form/, /blog*tag="]
F -->|Allowed| K[Crawl Permitted]
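The flowchart can be sketched in TypeScript roughly as follows. The constant names and path lists come from the flowchart itself. Everything else is an assumption: the `RobotsRule` type is a local stand-in that mirrors the rule shape of Next.js `MetadataRoute.Robots`, `isCrawlPermitted` is a hypothetical helper using plain prefix matching, the bot list is partial (the flowchart says "etc."), and the blog tag pattern is left out because it appears garbled in the source.

```typescript
// Local stand-in for the rule shape used by Next.js MetadataRoute.Robots (assumption).
type RobotsRule = { userAgent: string | string[]; allow?: string; disallow: string[] }

// Partial list: only the bots the flowchart names explicitly.
const LINK_PREVIEW_BOTS = ['Twitterbot', 'LinkedInBot', 'Slackbot', 'Discordbot']

const LINK_PREVIEW_DISALLOWED_PATHS = [
  '/api/', '/workspace/', '/w/', '/playground/', '/resume/',
  '/invite/', '/unsubscribe/', '/credential-account/', '/_next/', '/private/',
]

// Wildcard list: the preview list plus the pages preview bots may still fetch.
const DISALLOWED_PATHS = [...LINK_PREVIEW_DISALLOWED_PATHS, '/chat/', '/form/']

// The two rule groups described in the PR.
const rules: RobotsRule[] = [
  { userAgent: '*', allow: '/', disallow: DISALLOWED_PATHS },
  { userAgent: LINK_PREVIEW_BOTS, allow: '/', disallow: LINK_PREVIEW_DISALLOWED_PATHS },
]

// Hypothetical check mirroring the flowchart: choose the list, then prefix-match the path.
function isCrawlPermitted(userAgent: string, path: string): boolean {
  const isPreviewBot = LINK_PREVIEW_BOTS.some(
    (bot) => bot.toLowerCase() === userAgent.toLowerCase()
  )
  const disallowed = isPreviewBot ? LINK_PREVIEW_DISALLOWED_PATHS : DISALLOWED_PATHS
  return !disallowed.some((prefix) => path.startsWith(prefix))
}
```

Under these assumptions a link-preview bot can fetch a `/chat/` URL for its OG card while any other crawler cannot, and both are kept out of `/api/`.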
Reviews (3): Last reviewed commit: "chore(seo): trim verbose comments in rob..."
@cursor review
@greptile
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit b9a1f58.
…ink-preview rules (#4480)
* improvement(seo): restore explicit AI/search bot allow-list and add link-preview rules
* fix(seo): correct xAI UA strings, drop Bravebot, block /playground/ and /w/ from link-preview bots
* fix(seo): drop unverified Grok UAs, correct DeepSeekBot and ImagesiftBot tokens
* fix(seo): re-add Bravebot to allow-list per Brave Search docs
* improvement(seo): drop redundant named AI/search bot allow-list
* chore(seo): trim verbose comments in robots.ts
Summary
The wildcard rule (User-agent: *) allows crawling with a tight disallow list for authenticated surfaces and internal endpoints. Any new AI/search crawler is auto-allowed without code changes.
… /chat/ and /form/ URLs.
Type of Change
Testing
Tested manually: typecheck passes, bun run lint clean. Output verified to render the expected two rule groups per Next.js Metadata API.
Checklist