docs: add Puppetmaster integration pattern (#416) #702
Add docs/integration_puppetmaster.md showing how contextweaver consumes Puppetmaster-style job artifacts, worker summaries, logs, and follow-up reads without dumping raw artifacts into model context.
Covers:
- Artifact summary ingestion via ingest_tool_result_sync with firewall
- Drilldown via ArtifactRef handles (gateway + standalone paths)
- Route/answer phase budgeting over job history
- Follow-up prompt treatment as routable tool_call candidates
- Sensitivity guidance for logs containing credentials/PII
- Explicit boundaries: context consumer, not job orchestrator
Also updates mkdocs nav and CHANGELOG.md.
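To make the route/answer phase budgeting idea concrete, here is a minimal sketch. It does not use contextweaver's API: `PHASE_BUDGETS`, `JobItem`, `select_for_phase`, the budget values, and the whitespace-based token estimate are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-phase token budgets; real values depend on the deployment.
PHASE_BUDGETS = {"route": 50, "answer": 200}


@dataclass
class JobItem:
    job_id: str
    summary: str  # short worker summary, never the raw artifact


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def select_for_phase(history: list[JobItem], phase: str) -> list[JobItem]:
    """Keep the newest summaries that fit the phase budget, in original order."""
    remaining = PHASE_BUDGETS[phase]
    kept: list[JobItem] = []
    for item in reversed(history):  # walk newest first
        cost = estimate_tokens(item.summary)
        if cost > remaining:
            break
        kept.append(item)
        remaining -= cost
    kept.reverse()
    return kept


# Ten jobs, each with a 15-word summary.
history = [JobItem(f"job-{i}", "worker finished step " * 5) for i in range(10)]
route_items = select_for_phase(history, "route")
answer_items = select_for_phase(history, "answer")
```

With these assumed budgets the route phase sees only the three most recent summaries, while the answer phase has room for the whole history.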
Pull request overview
Adds a new documentation “integration pattern” describing how to integrate Puppetmaster-style job artifacts (summaries/logs/artifacts/follow-ups) with contextweaver’s firewall, artifact store, and phase budgets—aiming to prevent raw multi-KB artifacts from being dumped into the model context.
Changes:
- Add new guide page: docs/integration_puppetmaster.md
- Add the page to the MkDocs “Guides” nav
- Add an Unreleased changelog entry for the new documentation
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| mkdocs.yml | Adds the new Puppetmaster guide page to the Guides navigation. |
| docs/integration_puppetmaster.md | Introduces the Puppetmaster integration pattern (architecture, ingestion, budgeting, drilldown, follow-ups, sensitivity). |
| CHANGELOG.md | Records the new integration-pattern documentation under Unreleased. |
Benchmark delta (vs
| size | recall@k (head Δ vs base) | MRR (head Δ vs base) | p99 (ms) |
|---|---|---|---|
| 50 | ✅ 0.5649 (+0.0000) | ✅ 0.4978 (+0.0000) | ✅ 0.560 (base 0.759) |
| 83 | ✅ 0.3825 (+0.0000) | ✅ 0.3242 (+0.0000) | ✅ 0.614 (base 1.134) |
| 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 32.304 (base 41.711) |
Per-backend × per-size matrix
| backend | size | recall@k (Δ) | MRR (Δ) | p99 (ms) |
|---|---|---|---|---|
| bm25 | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3399 (+0.0000) | ✅ 4.572 (base 8.140) |
| bm25 | 500 | ✅ 0.2250 (+0.0000) | ✅ 0.2165 (+0.0000) | ✅ 22.831 (base 38.989) |
| bm25 | 1000 | ✅ 0.1575 (+0.0000) | ✅ 0.1525 (+0.0000) | ✅ 66.679 (base 111.716) |
| embedding_hashing | 100 | ✅ 0.5175 (+0.0000) | ✅ 0.4360 (+0.0000) | ✅ 5.925 (base 7.225) |
| embedding_hashing | 500 | ✅ 0.2700 (+0.0000) | ✅ 0.2674 (+0.0000) | ✅ 32.874 (base 44.182) |
| embedding_hashing | 1000 | ✅ 0.2000 (+0.0000) | ✅ 0.1931 (+0.0000) | ✅ 78.263 (base 98.277) |
| embedding_st | 100 | skipped (skipped: missing sentence-transformers) | — | — |
| embedding_st | 500 | skipped (skipped: missing sentence-transformers) | — | — |
| embedding_st | 1000 | skipped (skipped: missing sentence-transformers) | — | — |
| fuzzy | 100 | skipped (skipped: missing rapidfuzz) | — | — |
| fuzzy | 500 | skipped (skipped: missing rapidfuzz) | — | — |
| fuzzy | 1000 | skipped (skipped: missing rapidfuzz) | — | — |
| tfidf | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3220 (+0.0000) | ✅ 1.206 (base 1.102) |
| tfidf | 500 | ✅ 0.2325 (+0.0000) | ✅ 0.2314 (+0.0000) | ✅ 7.427 (base 11.492) |
| tfidf | 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 26.472 (base 50.755) |
Context pipeline (per scenario)
| scenario | tokens | dropped | dedup |
|---|---|---|---|
| large_catalog | 1480 (base 1514, Δ-34) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| long_conversation | 2500 (base 2548, Δ-48) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| mixed_payload | 488 (base 497, Δ-9) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| short_conversation | 487 (base 496, Δ-9) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| stress_conversation | 6590 (base 6651, Δ-61) | 11 (base 7, Δ+4) | 4 (base 4, Δ+0) |
| tiny_payload | 256 (base 267, Δ-11) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.
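For readers unfamiliar with the two quality columns, the sketch below computes recall@k and MRR using their textbook definitions. The benchmark's exact settings (the value of k, how queries with several relevant items are scored) are not stated in this thread, so treat this as an assumption rather than the scorecard's implementation. The query data is made up.

```python
def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical (ranked results, relevant set) pairs for two queries.
queries = [
    (["a", "b", "c"], {"b"}),
    (["x", "y", "z"], {"z", "q"}),
]

mean_recall = sum(recall_at_k(r, rel, k=2) for r, rel in queries) / len(queries)
mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
```

Both metrics are averaged over queries, which is why a delta of +0.0000 in the tables means ranking output did not change on the benchmark catalog.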
Address review feedback on docs/integration_puppetmaster.md:
- Correct ingest_tool_result_sync signature in architecture diagram
- Use json.dumps + StructuredFirewall for structured raw_output
- Fix default firewall_threshold (2000, not 2048)
- Replace nonexistent tool_view with dispatch_meta_tool
- Fix artifact handle format to artifact:result:{tool_call_id}
- Fix sensitivity snippet imports, budget definition, and action
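The interaction between the fixes above (JSON-serialised structured output, a 2000 threshold, and the artifact:result:{tool_call_id} handle form) can be sketched without the library. Everything below is a hypothetical stand-in: `ingest`, `artifact_store`, and the summary wording are illustrative, and the strict "greater than" comparison is an assumption, not contextweaver's documented behaviour.

```python
import json

FIREWALL_THRESHOLD = 2000  # the default value cited in this thread

# Hypothetical in-memory stand-in for an artifact store.
artifact_store: dict[str, str] = {}


def ingest(tool_call_id: str, raw_output: dict) -> str:
    """Return text that is small enough to place in model context."""
    text = json.dumps(raw_output, sort_keys=True)
    if len(text) <= FIREWALL_THRESHOLD:
        return text  # small payloads pass through unchanged
    # Large payloads are stored out of band and replaced by a short pointer.
    handle = f"artifact:result:{tool_call_id}"
    artifact_store[handle] = text
    keys = ", ".join(sorted(raw_output))
    return f"[{len(text)} chars stored at {handle}; top-level keys: {keys}]"


small = ingest("tc-1", {"status": "ok"})
large = ingest("tc-2", {"log": "x" * 5000, "status": "ok"})
```

A later drilldown would look the handle up in the store instead of re-reading the raw payload from context.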
dgenio left a comment
Review summary — recommendation: Request Changes.
(Posting as a comment since I'm the PR author; please treat the two major items below as merge-blocking.)
The five API-accuracy issues from the previous round are all correctly resolved in d64249f — diagram signature, json.dumps + StructuredFirewall, threshold value, dispatch_meta_tool, the artifact:result: handle form, and the security-sensitive sensitivity config (sensitivity_action="redact" + floor + hooks). Nice turnaround.
Two of the code examples still won't run as written (both introduced by the fix commit), plus one unit nit. Details inline:
- Ingest example: json.dumps(...) and StructuredFirewall(...) are used but not imported → NameError.
- Gateway drilldown: dispatch_meta_tool takes args=, not arguments=, and is async (needs await).
- Nit: the threshold gate counts characters, not bytes.
Everything else (imports, ContextBudget/Phase/ContextItem/ItemKind, build_sync, mgr.ingest, artifact_store.get/exists, ArtifactRef) was verified against source. Glad to re-review once the two examples are runnable.
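The characters-versus-bytes nit matters in practice for non-ASCII logs. The sketch below uses two hypothetical helper functions (not library code) to show the same string landing on opposite sides of a 2000 threshold depending on the unit.

```python
def over_threshold_chars(text: str, threshold: int = 2000) -> bool:
    # Counts Unicode code points, which is what len() on a str measures.
    return len(text) > threshold


def over_threshold_bytes(text: str, threshold: int = 2000) -> bool:
    # Counts UTF-8 encoded bytes instead.
    return len(text.encode("utf-8")) > threshold


log_line = "é" * 1500  # 1500 characters, 3000 bytes in UTF-8

by_chars = over_threshold_chars(log_line)
by_bytes = over_threshold_bytes(log_line)
```

Here `by_chars` is False and `by_bytes` is True, so documentation that says "bytes" would predict the wrong outcome for this input.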
Address review feedback on docs/integration_puppetmaster.md:
- Add missing import json and StructuredFirewall import
- Fix firewall threshold wording (bytes -> characters)
- Correct dispatch_meta_tool signature (args, not arguments; await; reorder params)
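The two call-site mistakes fixed here (wrong keyword name, missing await) are easy to reproduce with a stub. `dispatch_stub` and the `read_artifact` tool name below are hypothetical and only mimic the shape described in the review: an async callable that accepts `args=`. This is not the real dispatch_meta_tool signature.

```python
import asyncio
from typing import Any


async def dispatch_stub(tool_name: str, *, args: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical async dispatcher stand-in; not the contextweaver API."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"tool": tool_name, "args": args}


async def main() -> tuple[dict[str, Any], bool]:
    # Correct call: keyword is `args`, and the coroutine is awaited.
    result = await dispatch_stub(
        "read_artifact", args={"handle": "artifact:result:tc-1"}
    )

    # Wrong keyword: Python rejects it when binding arguments.
    wrong_keyword_rejected = False
    try:
        await dispatch_stub("read_artifact", arguments={"handle": "x"})
    except TypeError:
        wrong_keyword_rejected = True
    return result, wrong_keyword_rejected


result, wrong_keyword_rejected = asyncio.run(main())
```

Dropping the `await` on the first call would leave `result` bound to a coroutine object rather than a dict, which is why an example written synchronously fails even when the keyword is right.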
Summary
Fixes #416
Add a docs/integration_puppetmaster.md integration pattern showing how contextweaver consumes Puppetmaster-style job artifacts, worker summaries, logs, and follow-up reads without dumping raw artifacts into the model context.
Changes
- docs/integration_puppetmaster.md: Puppetmaster integration pattern page covering:
  - Artifact summary ingestion via ingest_tool_result_sync with firewall
  - Phase budgeting (route/answer) over job history
  - Drilldown via ArtifactRef handles (gateway + standalone paths)
  - Follow-up prompts treated as routable tool_call candidates
- mkdocs.yml: added Puppetmaster: integration_puppetmaster.md to the Guides nav
- CHANGELOG.md: added entry under ## [Unreleased]
Checklist
- CHANGELOG.md updated under ## [Unreleased]
- api/public_api.txt with make api (gated by make api-check): N/A (no API surface change)
Notes for reviewers
No code changes; this PR is strictly documentation. The warning from mkdocs build about MkDocs 2.0 is a pre-existing upstream notice, not introduced by this PR.
A pre-existing Windows path-separator test failure in test_skill_body_source_round_trip (scripts\\run.py vs scripts/run.py) and pre-existing mypy errors in store/redis_artifacts.py / store/redis_event_log.py were observed during local verification and are unrelated to this change.
How verified
- ruff check src/ tests/ examples/ scripts/: All checks passed!
- pytest tests/test_tokens.py tests/test_sensitivity.py tests/test_firewall.py -q: 69 passed
- python -m pytest --cov=contextweaver --cov-report=term-missing -q (full suite): 755 passed, 5 skipped, 1 failed (pre-existing Windows os.sep bug in test_adapters_agent_skills.py)
- python -m mkdocs build --clean: completed with only pre-existing upstream MkDocs 2.0 warning
- python -m mkdocs build --strict --clean: completed with only pre-existing link warnings (no errors for new page)