feat: manifest-first entity recall for all platforms (#224) by visahak · Pull Request #226 · AgentToolkit/altk-evolve

visahak · 2026-04-28T14:08:59Z

## Summary

- Replace full-body entity injection with compact manifest output across all three platforms (Codex, Claude, Bob).
- The UserPromptSubmit hook (Claude/Codex) and the manual recall skill (Bob) now emit only path, type, and trigger per entity; full content is read on demand.
- Shared load_manifest and dedupe_manifest_entries helpers in entity_io.py power all three implementations.
- Claude/Codex output JSON lines; Bob outputs human-readable markdown (visible in the Cline UI).
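
To make the two output shapes concrete, here is a minimal sketch of how a manifest entry might be rendered as a JSON line (Claude/Codex) or as a markdown line (Bob). The function names, the sample entry, and the exact markdown layout are assumptions for illustration; only the three fields (path, type, trigger) come from the PR description.

```python
import json

# The three fields the PR says a manifest entry contains.
MANIFEST_KEYS = ("path", "type", "trigger")


def format_json_lines(entries):
    """Render one compact JSON object per line, dropping any extra keys."""
    return "\n".join(
        json.dumps({key: entry[key] for key in MANIFEST_KEYS}) for entry in entries
    )


def format_markdown(entries):
    """Render a human-readable bullet per entry (hypothetical layout)."""
    return "\n".join(
        f"- `{entry['path']}` ({entry['type']}): {entry['trigger']}"
        for entry in entries
    )


# Hypothetical entity; the "body" key stands in for the full content
# that the manifest deliberately leaves out.
entries = [
    {
        "path": "guidelines/retry.md",
        "type": "guideline",
        "trigger": "when calling flaky APIs",
        "body": "full text...",
    }
]

print(format_json_lines(entries))
print(format_markdown(entries))
```

Either way the full entity body stays out of the output, so the agent has to open the file at `path` when a trigger looks relevant.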

Changes

- entity_io.py: Added _parse_frontmatter_only, load_manifest, dedupe_manifest_entries shared helpers
- Codex retrieve_entities.py: Switched to manifest JSON output
- Claude retrieve_entities.py: Switched to manifest JSON output
- Bob retrieve_entities.py: Switched to human-readable manifest output
- SKILL.md (all platforms): Updated to document manifest-first two-step flow
- Tests: New test_claude_retrieve_manifest.py, test_codex_retrieve_manifest.py; updated test_retrieve.py, test_bob_sharing.py, test_codex_sharing.py, test_sync.py, test_subscribe.py

Test plan

- 294 platform_integrations tests passing
- Manifest shape: entries contain only path, type, trigger
- No full entity bodies in output
- Deterministic ordering and deduplication
- Symlinked entities filtered out
- Subscribed and public entities included
- Invalid stdin handled gracefully (Claude/Codex)
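
The manifest-shape check in the plan above can be sketched as a small helper pair: one that pulls JSON entries out of the script's stdout, and one that asserts each entry carries exactly the three manifest keys. Both helper names and the assumption that non-JSON lines (such as a header) can simply be skipped are hypothetical, not taken from the PR's test files.

```python
import json

EXPECTED_KEYS = {"path", "type", "trigger"}


def parse_json_lines(stdout):
    """Collect JSON objects from stdout, skipping headers and blank lines."""
    entries = []
    for line in stdout.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # assumed: anything that is not a JSON object is a header
        entries.append(json.loads(line))
    return entries


def assert_manifest_shape(entries):
    """Fail if any entry has missing or extra keys (for example a body)."""
    for entry in entries:
        assert set(entry) == EXPECTED_KEYS, f"unexpected keys: {sorted(entry)}"
```

A determinism check can then run the script twice and compare the two parsed lists for equality.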

Addresses issue #224.

Summary by CodeRabbit

Release Notes

Refactor
- Updated entity retrieval workflow across all integrations to use a manifest-first approach, loading only entity metadata (path, type, trigger) initially and expanding full content on-demand for improved efficiency.
- Changed output format from full markdown entity content to manifest entries (JSON lines for Claude/Codex; human-readable markdown for Bob).
Tests
- Added comprehensive test coverage for manifest loading and JSON output validation across all platform integrations.

coderabbitai · 2026-04-28T14:09:26Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1c70d2ad-e9dc-4928-bf96-612afb316e70

📥 Commits

Reviewing files that changed from the base of the PR and between 369f2da and 87d1b15.

📒 Files selected for processing (16)

platform-integrations/bob/evolve-lite/skills/evolve-lite:recall/SKILL.md
platform-integrations/bob/evolve-lite/skills/evolve-lite:recall/scripts/retrieve_entities.py
platform-integrations/claude/plugins/evolve-lite/lib/entity_io.py
platform-integrations/claude/plugins/evolve-lite/skills/recall/SKILL.md
platform-integrations/claude/plugins/evolve-lite/skills/recall/scripts/retrieve_entities.py
platform-integrations/codex/plugins/evolve-lite/skills/recall/SKILL.md
platform-integrations/codex/plugins/evolve-lite/skills/recall/scripts/retrieve_entities.py
tests/platform_integrations/conftest.py
tests/platform_integrations/test_bob_sharing.py
tests/platform_integrations/test_claude_retrieve_manifest.py
tests/platform_integrations/test_codex_retrieve_manifest.py
tests/platform_integrations/test_codex_sharing.py
tests/platform_integrations/test_entity_io_core.py
tests/platform_integrations/test_retrieve.py
tests/platform_integrations/test_subscribe.py
tests/platform_integrations/test_sync.py

📝 Walkthrough

This PR implements a manifest-first recall workflow across Claude, Codex, and Bob platform integrations. Instead of eagerly loading and parsing full entity markdown files, the system now first generates a lightweight manifest containing only each entity's path, type, and trigger field, then expands relevant entities on-demand by reading only the necessary files. Core changes include shared entity I/O utilities for manifest loading and deduplication, updated retrieval scripts, documentation, and comprehensive test coverage.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation Updates** `platform-integrations/bob/evolve-lite/skills/evolve-lite:recall/SKILL.md`, `platform-integrations/claude/plugins/evolve-lite/skills/recall/SKILL.md`, `platform-integrations/codex/plugins/evolve-lite/skills/recall/SKILL.md` | Updated "How It Works" sections to describe manifest-first retrieval: hook emits minimal manifest (path/type/trigger only), then full entities are expanded on demand. Removed references to eager loading and inline entity content/source annotations. |
| **Shared Entity I/O Library** `platform-integrations/claude/plugins/evolve-lite/lib/entity_io.py` | Added two new functions: `load_manifest(root_dir)` reads YAML frontmatter from markdown files without parsing bodies, and `dedupe_manifest_entries(entries)` ensures deterministic deduplication by (path, type, trigger) tuples. Both skip symlinks and enforce required frontmatter fields. |
| **Retrieval Scripts** `platform-integrations/bob/evolve-lite/skills/evolve-lite:recall/scripts/retrieve_entities.py`, `platform-integrations/claude/plugins/evolve-lite/skills/recall/scripts/retrieve_entities.py`, `platform-integrations/codex/plugins/evolve-lite/skills/recall/scripts/retrieve_entities.py` | Converted from markdown-file parsing with full entity content to manifest-based loading via `load_manifest` and `dedupe_manifest_entries`. Output changed from formatted bullet lists with entity bodies/rationale/metadata to JSON-serialized manifest entries (path/type/trigger only). Removed `_source` provenance annotations and body content. |
| **Configuration** `platform-integrations/install.sh` | Simplified Codex user-prompt hook filtering logic to use truthiness check instead of explicit length comparison (functionally equivalent). |
| **Test Fixtures & Integration Tests** `tests/platform_integrations/conftest.py`, `tests/platform_integrations/test_bob_sharing.py`, `tests/platform_integrations/test_sync.py`, `tests/platform_integrations/test_subscribe.py` | Updated guideline markdown fixtures to include `trigger` frontmatter field. Modified assertions to validate manifest-style output (trigger values present, entity body text absent) and verify symlink deduplication. One test scenario adjusted to expect graceful stderr warning for audit-write failures instead of fatal error. |
| **New Manifest-First Test Suites** `tests/platform_integrations/test_claude_retrieve_manifest.py`, `tests/platform_integrations/test_codex_retrieve_manifest.py`, `tests/platform_integrations/test_entity_io_core.py` | Introduced comprehensive test modules validating manifest output: header format, JSON entry structure (path/type/trigger only), absence of body content/extra fields, determinism, deduplication, symlink skipping, and graceful stdin handling. Entity I/O tests cover `load_manifest` frontmatter parsing, relative path conversion, and `dedupe_manifest_entries` behavior. |
| **Existing Test Suite Updates** `tests/platform_integrations/test_retrieve.py`, `tests/platform_integrations/test_codex_sharing.py` | Refactored assertions from validating raw entity text output to validating structured JSON manifest entries. Added `project_dir` subprocess context, JSON-line parsing helper, and updated test fixtures with `trigger` fields. Removed checks for source annotations (`[from: ...]`) and entity body content. |
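
For readers who want a feel for the shared helpers summarized above, the following is a rough sketch of frontmatter-only loading and deduplication. It is not the code in `entity_io.py`: the `_sketch` function names are hypothetical, and the `key: value` line splitting is an assumed stand-in for whatever YAML handling the real helpers use. The behaviors it imitates (stop at the closing `---`, reject files without one, skip symlinks, require `type` and `trigger`, emit paths relative to the root, dedupe on the (path, type, trigger) tuple) are the ones named in this PR.

```python
from pathlib import Path

REQUIRED_FIELDS = ("type", "trigger")


def parse_frontmatter_only_sketch(path):
    """Read lines up to the closing '---' and return the fields, else None."""
    fields = {}
    try:
        with path.open(encoding="utf-8") as handle:
            if handle.readline().strip() != "---":
                return None  # no frontmatter block at the top of the file
            for line in handle:
                if line.strip() == "---":
                    return fields  # closing delimiter reached; body never read
                key, sep, value = line.partition(":")
                if sep:
                    fields[key.strip()] = value.strip()
    except (OSError, UnicodeDecodeError):
        return None  # unreadable or non-UTF-8 file
    return None  # end of file without a closing delimiter


def load_manifest_sketch(root_dir):
    """Build path/type/trigger entries for every regular .md file under root_dir."""
    root = Path(root_dir)
    entries = []
    for md_file in sorted(root.rglob("*.md")):
        if md_file.is_symlink():
            continue  # symlinked entities are filtered out
        fields = parse_frontmatter_only_sketch(md_file)
        if fields is None or any(name not in fields for name in REQUIRED_FIELDS):
            continue
        entries.append(
            {
                "path": md_file.relative_to(root).as_posix(),
                "type": fields["type"],
                "trigger": fields["trigger"],
            }
        )
    return entries


def dedupe_manifest_entries_sketch(entries):
    """Drop repeated (path, type, trigger) tuples and return a sorted list."""
    seen = set()
    unique = []
    for entry in sorted(entries, key=lambda e: (e["path"], e["type"], e["trigger"])):
        key = (entry["path"], entry["type"], entry["trigger"])
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique
```

Sorting before deduplicating is one simple way to get the deterministic ordering the test plan asks for; the real helper may order entries differently.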

Sequence Diagram

sequenceDiagram
    autonumber
    participant Hook as Hook/<br/>Caller
    participant Scanner as Manifest<br/>Scanner
    participant FSys as File<br/>System
    participant Parser as Manifest<br/>Loader
    participant Filter as Dedup<br/>Filter
    participant Reader as Entity<br/>Expander
    participant Output as Output<br/>Formatter

    Note over Hook,Output: OLD: Eager Load All Entities
    Hook->>Scanner: Discover entity directories
    Scanner->>FSys: List all .md files recursively
    FSys-->>Scanner: File list
    loop For each markdown file
        Scanner->>Parser: Parse full markdown + frontmatter
        FSys->>Parser: Read complete file
        Parser->>Filter: Extract path, type, content, source
    end
    Filter-->>Output: All entity objects with content
    Output-->>Hook: Formatted text list (content included)

    rect rgba(100, 150, 200, 0.5)
    Note over Hook,Output: NEW: Manifest-First On-Demand
    Hook->>Scanner: Discover entity directories
    Scanner->>FSys: List all .md files recursively
    FSys-->>Scanner: File list
    loop For each markdown file
        Scanner->>Parser: Extract only YAML frontmatter
        Parser->>FSys: Read file header (minimal bytes)
        FSys-->>Parser: Frontmatter (path, type, trigger)
    end
    Parser->>Filter: Deduplicate by (path, type, trigger)
    Filter-->>Output: Minimal manifest list
    Output-->>Hook: JSON entries (no content)
    
    Hook->>Reader: Select relevant entities by trigger
    Reader->>FSys: Read full .md for matching triggers only
    FSys-->>Reader: Complete entity content + rationale
    Reader-->>Hook: Expanded entity details
    end
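
The second half of the diagram (select by trigger, then read only the matching files) can be illustrated with a small sketch. In the PR the agent itself decides which triggers are relevant and reads the files with its own file-reading tool; the naive word-overlap match below is only an assumed stand-in for that judgment, and both function names are hypothetical.

```python
from pathlib import Path


def select_by_trigger(entries, prompt):
    """Keep entries whose trigger shares at least one word with the prompt."""
    prompt_words = set(prompt.lower().split())
    return [
        entry
        for entry in entries
        if prompt_words & set(entry["trigger"].lower().split())
    ]


def expand_entity(root_dir, entry):
    """Read the full markdown file for one selected manifest entry."""
    return (Path(root_dir) / entry["path"]).read_text(encoding="utf-8")
```

Only the files behind selected entries are ever opened in full, which is where the savings over eager loading come from.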

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related issues

Implement two-step entity recall with a minimal manifest across Codex, Claude, and Bob #224: Directly describes the manifest-first recall implementation with frontmatter-only manifest, load_manifest + dedupe_manifest_entries utilities, and on-demand entity expansion that this PR implements in full.

Possibly related PRs

Codex/fix platform integration recall hooks #220: Shares symlink-skipping logic for entity file discovery and corresponding test updates across platform integrations.
feat(platform-integrations): add codex evolve-lite installer #111: Introduces the Codex evolve-lite plugin and retrieval scaffold that this PR refactors from eager-loading to manifest-first workflow.
feat(codex): add lite sharing skills and session-start sync #196: Adds Codex sharing/subscribe behavior with entity source annotations (_source, [from: ...]) that this PR removes in favor of trigger-based manifest selection.

Suggested reviewers

illeatmyhat
gaodan-fang

Poem

🐰 Hoppy hooray, the manifest's here!
No more loading files, the path is clear—
Just path and type and trigger small,
Then expand on-demand, that's all!
Swift as carrots, light as air! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.31% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: manifest-first entity recall for all platforms' clearly and concisely summarizes the main change: implementing a manifest-first approach for entity recall across multiple platforms (Codex, Claude, Bob). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Replace full-body entity injection with compact manifest output in Claude's UserPromptSubmit hook. The hook now emits one JSON line per entity containing only path, type, and trigger — Claude reads full files on demand via the Read tool. Reuses the shared load_manifest and dedupe_manifest_entries helpers already in entity_io.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

subscribe.py catches audit write failures and warns instead of failing. Update the test to assert returncode 0, the warning message on stderr, and that the clone is preserved. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
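
The warn-instead-of-fail behavior described in this commit message might look roughly like the sketch below. The function name, its signature, and the warning wording are assumptions; the PR only states that audit write failures are caught and reported on stderr while the command still exits with return code 0.

```python
import sys


def record_audit(audit_path, line, stderr=sys.stderr):
    """Append one audit line; on failure, warn on stderr and carry on."""
    try:
        with open(audit_path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")
        return True
    except OSError as exc:
        # A failed audit write should not undo or abort the subscription.
        print(f"warning: could not write audit log: {exc}", file=stderr)
        return False
```

Returning a boolean rather than raising lets the caller keep the clone in place and finish normally.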

Replace full-body entity injection with human-readable manifest output in Bob's recall script. Uses shared load_manifest and dedupe helpers from entity_io.py. Output format is markdown lines with path, type, and trigger — Bob reads full files on demand via read_file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Catch UnicodeDecodeError in _parse_frontmatter_only
- Reject files missing closing --- delimiter
- Validate stdin JSON is a dict before accessing keys
- Add e2e marker to manifest retrieval tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
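
The stdin validation mentioned in this commit can be sketched as follows. The helper name and the choice to fall back to an empty dict are assumptions; the commit only says the hook checks that the stdin JSON is a dict before accessing keys.

```python
import json


def read_hook_input(raw):
    """Return the stdin payload as a dict, or {} when it is unusable."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return {}  # empty, malformed, or non-string input
    # Valid JSON can still be a list, string, or number; only a dict has keys.
    return data if isinstance(data, dict) else {}
```

With this guard, a call such as `read_hook_input("[1, 2]").get("prompt")` returns `None` instead of raising.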

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Vatche Isahagian <vatchei@ibm.com>

visahak · 2026-04-28T20:58:32Z

Haven't tested this in the agent harnesses yet (Claude, Codex, and IBM Bob).

visahak mentioned this pull request Apr 28, 2026

Implement two-step entity recall with a minimal manifest across Codex, Claude, and Bob #224

Open

visahak and others added 7 commits April 28, 2026 16:38

feat(codex): implement manifest-first entity recall

59dbe99

chore: remove e2e testing proposal from branch

09a518d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style(tests): ruff format platform integration tests

e033aef

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

visahak force-pushed the feat/codex-recall-manifest-224 branch from 218edec to 46ab12c on April 28, 2026 20:44

fix(recall): include public dir in recall entity search

87d1b15

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Vatche Isahagian <vatchei@ibm.com>

visahak wants to merge 8 commits into AgentToolkit:main from visahak:feat/codex-recall-manifest-224


1 participant
