feat: manifest-first entity recall for all platforms (#224) #226
visahak wants to merge 8 commits into AgentToolkit:main
📝 Walkthrough

This PR implements a manifest-first recall workflow across the Claude, Codex, and Bob platform integrations. Instead of eagerly loading and parsing full entity markdown files, the system first generates a lightweight manifest containing only each entity's path, type, and trigger field, then expands relevant entities on demand by reading only the necessary files. Core changes include shared entity I/O utilities for manifest loading and deduplication, updated retrieval scripts, documentation, and comprehensive test coverage.
Sequence Diagram

sequenceDiagram
autonumber
participant Hook as Hook/<br/>Caller
participant Scanner as Manifest<br/>Scanner
participant FSys as File<br/>System
participant Parser as Manifest<br/>Loader
participant Filter as Dedup<br/>Filter
participant Reader as Entity<br/>Expander
participant Output as Output<br/>Formatter
Note over Hook,Output: OLD: Eager Load All Entities
Hook->>Scanner: Discover entity directories
Scanner->>FSys: List all .md files recursively
FSys-->>Scanner: File list
loop For each markdown file
Scanner->>Parser: Parse full markdown + frontmatter
FSys->>Parser: Read complete file
Parser->>Filter: Extract path, type, content, source
end
Filter-->>Output: All entity objects with content
Output-->>Hook: Formatted text list (content included)
rect rgba(100, 150, 200, 0.5)
Note over Hook,Output: NEW: Manifest-First On-Demand
Hook->>Scanner: Discover entity directories
Scanner->>FSys: List all .md files recursively
FSys-->>Scanner: File list
loop For each markdown file
Scanner->>Parser: Extract only YAML frontmatter
Parser->>FSys: Read file header (minimal bytes)
FSys-->>Parser: Frontmatter (path, type, trigger)
end
Parser->>Filter: Deduplicate by (path, type, trigger)
Filter-->>Output: Minimal manifest list
Output-->>Hook: JSON entries (no content)
Hook->>Reader: Select relevant entities by trigger
Reader->>FSys: Read full .md for matching triggers only
FSys-->>Reader: Complete entity content + rationale
Reader-->>Hook: Expanded entity details
end
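
The "NEW" half of the diagram can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PR's code: the names `read_frontmatter`, `scan_manifest`, `dedupe`, and `expand` are hypothetical, the frontmatter is assumed to be plain `key: value` lines between `---` delimiters, and selection is shown as a substring match on `trigger`, whereas in the PR the agent decides which files to read.

```python
from pathlib import Path


def read_frontmatter(path):
    """Read only the leading '---' block and stop before the body."""
    fields = {}
    with Path(path).open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return {}
        for line in fh:
            if line.strip() == "---":
                return fields
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    return {}  # closing delimiter never found


def scan_manifest(root):
    """Build manifest entries (path, type, trigger) without file bodies."""
    entries = []
    for md in sorted(Path(root).rglob("*.md")):
        fm = read_frontmatter(md)
        if fm:  # assumption: files without frontmatter are skipped
            entries.append({
                "path": str(md),
                "type": fm.get("type", ""),
                "trigger": fm.get("trigger", ""),
            })
    return entries


def dedupe(entries):
    """Drop repeats keyed on (path, type, trigger), keeping first-seen order."""
    seen, unique = set(), []
    for entry in entries:
        key = (entry["path"], entry["type"], entry["trigger"])
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


def expand(entries, keyword):
    """Read the full file only for entries whose trigger mentions keyword."""
    return {
        entry["path"]: Path(entry["path"]).read_text(encoding="utf-8")
        for entry in entries
        if keyword.lower() in entry["trigger"].lower()
    }
```

The point of the split is that `scan_manifest` never reads past the frontmatter, so the cost of the initial recall step grows with the number of entities rather than with their total size.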
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Replace full-body entity injection with compact manifest output in Claude's UserPromptSubmit hook. The hook now emits one JSON line per entity containing only path, type, and trigger; Claude reads full files on demand via the Read tool. Reuses the shared load_manifest and dedupe_manifest_entries helpers already in entity_io.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
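
As a rough illustration of the "one JSON line per entity" output described in this commit, the sketch below writes JSON Lines with only the three manifest fields. The function name `emit_manifest` and the entry shape are assumptions for illustration; the actual hook may format its output differently.

```python
import json
import sys


def emit_manifest(entries, out=sys.stdout):
    """Write one JSON object per line with path, type, and trigger only."""
    for entry in entries:
        line = {
            "path": entry["path"],
            "type": entry["type"],
            "trigger": entry["trigger"],
        }
        # Any other keys (for example a loaded body) are deliberately dropped.
        out.write(json.dumps(line) + "\n")
```

Building the output dict explicitly, rather than dumping the entry as-is, keeps full content out of the prompt even if an upstream helper attaches extra fields.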
subscribe.py catches audit write failures and warns instead of failing. Update the test to assert returncode 0, the warning message on stderr, and that the clone is preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
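
The warn-instead-of-fail behavior this commit tests for can be sketched as below. This is not the code in subscribe.py: `append_audit`, its arguments, and the warning text are hypothetical, and the sketch only shows the general pattern of downgrading an audit write error to a stderr warning.

```python
import sys
from pathlib import Path


def append_audit(audit_path, message):
    """Append one audit line; on failure, warn on stderr and return False."""
    try:
        with Path(audit_path).open("a", encoding="utf-8") as fh:
            fh.write(message + "\n")
    except OSError as exc:
        # The main operation already succeeded, so report and carry on.
        print(f"warning: could not write audit log: {exc}", file=sys.stderr)
        return False
    return True
```

A caller that ignores the `False` return still exits with status 0, which matches the three assertions named in the commit: zero return code, a warning on stderr, and the clone left in place.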
Replace full-body entity injection with human-readable manifest output in Bob's recall script. Uses shared load_manifest and dedupe helpers from entity_io.py. Output format is markdown lines with path, type, and trigger; Bob reads full files on demand via read_file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
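
For comparison with the JSON output used by the hooks, a human-readable manifest might look like the sketch below. The exact line layout in Bob's recall script is not shown in this PR page, so the bullet format and the name `format_manifest_markdown` are assumptions.

```python
def format_manifest_markdown(entries):
    """Render each manifest entry as one markdown bullet (assumed layout)."""
    lines = []
    for entry in entries:
        lines.append(f"- `{entry['path']}` [{entry['type']}]: {entry['trigger']}")
    return "\n".join(lines)
```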
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Catch UnicodeDecodeError in _parse_frontmatter_only
- Reject files missing closing --- delimiter
- Validate stdin JSON is a dict before accessing keys
- Add e2e marker to manifest retrieval tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
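
The first three hardening items above can be sketched together. This is an illustration only: `parse_frontmatter_only` and `parse_hook_input` are hypothetical stand-ins (the PR's `_parse_frontmatter_only` may differ in signature and return type), and the frontmatter is assumed to be simple `key: value` lines.

```python
import json


def parse_frontmatter_only(path):
    """Return frontmatter fields as a dict, or None if the file is unusable."""
    try:
        with open(path, encoding="utf-8") as fh:
            if fh.readline().strip() != "---":
                return None
            fields = {}
            for line in fh:
                if line.strip() == "---":
                    return fields  # closing delimiter found
                key, sep, value = line.partition(":")
                if sep:
                    fields[key.strip()] = value.strip()
    except (OSError, UnicodeDecodeError):
        # Unreadable or non-UTF-8 files are skipped rather than crashing.
        return None
    return None  # reached end of file without a closing '---'


def parse_hook_input(raw):
    """Parse stdin text; return a dict, or {} for invalid or non-dict JSON."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return data if isinstance(data, dict) else {}
```

`UnicodeDecodeError` is a `ValueError` subclass rather than an `OSError`, so it has to be caught explicitly. The `isinstance` check guards against valid JSON that is a list or a scalar, where key access would otherwise raise.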
Force-pushed from 218edec to 46ab12c
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Vatche Isahagian <vatchei@ibm.com>
Haven't tested this in the agent harnesses yet (Claude, Codex, and IBM Bob).
## Summary

- `UserPromptSubmit` hook (Claude/Codex) and manual recall skill (Bob) now emit only `path`, `type`, and `trigger` per entity; full content is read on demand
- `load_manifest` and `dedupe_manifest_entries` helpers in `entity_io.py` power all three implementations

## Changes

- `entity_io.py`: Added `_parse_frontmatter_only`, `load_manifest`, `dedupe_manifest_entries` shared helpers
- `retrieve_entities.py`: Switched to manifest JSON output
- `retrieve_entities.py`: Switched to manifest JSON output
- `retrieve_entities.py`: Switched to human-readable manifest output
- `test_claude_retrieve_manifest.py`, `test_codex_retrieve_manifest.py`; updated `test_retrieve.py`, `test_bob_sharing.py`, `test_codex_sharing.py`, `test_sync.py`, `test_subscribe.py`

## Test plan

- `path`, `type`, `trigger`

Addressing issue #224