Commit 9732f30

Merge remote-tracking branch 'upstream/main' into dev
2 parents 5c05945 + a389f8d commit 9732f30

1 file changed

Lines changed: 129 additions & 0 deletions

ROADMAP.md

@@ -6014,3 +6014,132 @@ New users see these commands in the help output but have no explanation of:

**Source.** Clayhip nudge, 2026-04-21 23:18: dogfood surface clean, Phase 1 proven solid, natural next step is symmetry across output formats.

## Pinpoint #157. Structured remediation registry for error hints (Phase 3 of #77 / §4.44)

**Gap.** #77 Phase 1 added machine-readable `kind` discriminants and #156 extended them to text-mode output. However, the `hint` field is still prose derived from splitting the existing error message text, not a stable, registry-backed remediation contract. Downstream claws inspecting the `hint` field still need to parse human wording to decide whether to retry, escalate, or terminate.

**Impact.** A claw receiving `{"kind": "missing_credentials", "hint": "export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY..."}` cannot programmatically determine the remediation action (e.g., `retry_with_env`, `escalate_to_operator`, `terminate_session`) without regex or substring matching on the hint prose. The `kind` is structured but the `hint` is not, so half the error contract is still unstructured.

**Fix shape.**

1. **Remediation registry:** A function `remediation_for(kind: &str, operation: &str) -> Remediation` that maps `(error_kind, operation_context)` pairs to stable remediation structs:
   ```rust
   struct Remediation {
       action: RemediationAction, // retry, escalate, terminate, configure
       target: &'static str,      // "env:ANTHROPIC_API_KEY", "config:model", etc.
       message: &'static str,     // stable human-readable hint
   }
   ```
2. **Stable hint outputs per class:** Each `error_kind` maps to exactly one remediation shape per operation, with no prose splitting.
3. **Golden fixture tests:** Test each `(kind, operation)` pair against expected remediation output as golden fixtures instead of the current `split_error_hint()` string hacks.

**Acceptance.**
- `remediation_for("missing_credentials", "prompt")` returns a stable struct with `action: Configure`, `target: "env:ANTHROPIC_API_KEY"`.
- JSON output includes `remediation.action` and `remediation.target` fields.
- Golden fixture tests cover all 12+ known error kinds.
- `split_error_hint()` is replaced or deprecated.

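From the consumer side, the acceptance criteria mean a claw could branch on `remediation.action` rather than pattern-matching the hint. The following is a minimal Python sketch of such a consumer, not the registry itself (which the roadmap specifies in Rust). The lower-case action strings, the `decide` helper, and the fallback to `escalate` are all assumptions made for illustration; the roadmap does not fix the wire format of `RemediationAction`.

```python
import json
from typing import Optional

# Assumed wire names, lower-cased from the struct comment
# ("retry, escalate, terminate, configure"); not confirmed by the roadmap.
KNOWN_ACTIONS = {"retry", "escalate", "terminate", "configure"}


def decide(error_json: str) -> tuple[str, Optional[str]]:
    """Return (action, target) from a structured error payload.

    Hypothetical helper: falls back to ("escalate", None) when the
    remediation block is missing or names an unrecognised action.
    """
    payload = json.loads(error_json)
    remediation = payload.get("remediation") or {}
    action = remediation.get("action")
    if action not in KNOWN_ACTIONS:
        return ("escalate", None)
    return (action, remediation.get("target"))


example = json.dumps({
    "kind": "missing_credentials",
    "remediation": {"action": "configure", "target": "env:ANTHROPIC_API_KEY"},
})
print(decide(example))  # ('configure', 'env:ANTHROPIC_API_KEY')
```

A payload that carries only `kind` and a prose `hint` takes the fallback branch, which is the behaviour a claw would otherwise have to approximate with substring checks.
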
**Blocker.** None. Natural Phase 3 progression from #77 P1 (JSON kind) → #156 (text kind) → #157 (structured remediation).

**Source.** gaebal-gajae dogfood sweep, 2026-04-22 05:30 KST: identified that `kind` is structured but `hint` remains prose-derived, leaving downstream claws with half an error contract.

## Pinpoint #158. `compact_messages_if_needed` drops turns silently — no structured compaction event emitted

**Gap.** `QueryEnginePort.compact_messages_if_needed()` (`src/query_engine.py:129`) silently truncates `mutable_messages` and `transcript_store` whenever turn count exceeds `compact_after_turns` (default 12). The truncation is invisible to any consumer: `TurnResult` carries no compaction indicator, the streaming path emits no `compaction_occurred` event, and `persist_session()` persists only the post-compaction slice. A claw polling session state after compaction sees the same `session_id` but a different (shorter) context window with no structured signal that turns were dropped.

**Repro.**
```python
import sys; sys.path.insert(0, 'src')
from query_engine import QueryEnginePort, QueryEngineConfig

engine = QueryEnginePort.from_workspace()
engine.config = QueryEngineConfig(compact_after_turns=3)
for i in range(5):
    r = engine.submit_message(f'turn {i}')
    # TurnResult has no compaction field
    assert not hasattr(r, 'compaction_occurred')  # passes every time
print(len(engine.mutable_messages))  # 3: silently truncated from 5
```

**Root cause.** `compact_messages_if_needed` is called inside `submit_message` with no return value and no side-channel notification. `stream_submit_message` yields a `message_stop` event that includes `transcript_size` but not a `compaction_occurred` flag or `turns_dropped` count.

**Fix shape (~15 lines).**
1. Add `compaction_occurred: bool` and `turns_dropped: int` to `TurnResult`.
2. In `compact_messages_if_needed`, return `(bool, int)`: whether compaction ran and how many turns were dropped.
3. Propagate into `TurnResult` in `submit_message`.
4. In `stream_submit_message`, include `compaction_occurred` and `turns_dropped` in the `message_stop` event.

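Steps 1 to 3 can be sketched with reduced stand-ins. `MiniEngine` and this trimmed `TurnResult` are hypothetical and model only the compaction path; the real `QueryEnginePort` and `TurnResult` carry more state. The sketch also assumes compaction keeps the newest `compact_after_turns` messages, which matches the repro's observed length of 3 but is not confirmed against `src/query_engine.py`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TurnResult:
    # Stand-in for the real TurnResult; only the proposed fields are modelled.
    output: str
    compaction_occurred: bool = False
    turns_dropped: int = 0


@dataclass
class MiniEngine:
    # Hypothetical stand-in for QueryEnginePort, reduced to the compaction path.
    compact_after_turns: int = 12
    mutable_messages: list = field(default_factory=list)

    def compact_messages_if_needed(self) -> tuple:
        """Trim to the newest compact_after_turns messages and report it."""
        overflow = len(self.mutable_messages) - self.compact_after_turns
        if overflow <= 0:
            return (False, 0)
        del self.mutable_messages[:overflow]
        return (True, overflow)

    def submit_message(self, prompt: str) -> TurnResult:
        self.mutable_messages.append(prompt)
        compacted, dropped = self.compact_messages_if_needed()
        # Propagate the compaction outcome instead of discarding it.
        return TurnResult(output=f"echo: {prompt}",
                          compaction_occurred=compacted,
                          turns_dropped=dropped)


engine = MiniEngine(compact_after_turns=3)
results = [engine.submit_message(f"turn {i}") for i in range(5)]
print([(r.compaction_occurred, r.turns_dropped) for r in results])
# [(False, 0), (False, 0), (False, 0), (True, 1), (True, 1)]
```

Returning the tuple keeps `compact_messages_if_needed` free of any knowledge about `TurnResult`, so the streaming path can reuse the same return value for its `message_stop` event.
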
**Acceptance.** A claw watching the stream can detect from a single `message_stop` event that compaction occurred and how many turns were dropped, without comparing `transcript_size` across two consecutive turns.

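A stream consumer that satisfies this criterion might look like the sketch below. It assumes events are plain dicts with a `type` key, which the roadmap does not specify; `compaction_notices` and the hand-written `fake_stream` are illustrative only and stand in for real `stream_submit_message` output.

```python
from typing import Iterable, Iterator


def compaction_notices(events: Iterable[dict]) -> Iterator[int]:
    """Yield turns_dropped for each message_stop event reporting compaction."""
    for event in events:
        if event.get("type") != "message_stop":
            continue
        if event.get("compaction_occurred"):
            yield int(event.get("turns_dropped", 0))


# Hand-written events; the dict shape is an assumption, not the real schema.
fake_stream = [
    {"type": "message_start"},
    {"type": "message_stop", "transcript_size": 3,
     "compaction_occurred": True, "turns_dropped": 1},
    {"type": "message_stop", "transcript_size": 3,
     "compaction_occurred": False, "turns_dropped": 0},
]
print(list(compaction_notices(fake_stream)))  # [1]
```

Note that both `message_stop` events report the same `transcript_size`, so a consumer relying on size alone could not tell the two turns apart.
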
**Blocker.** None.

**Source.** Jobdori dogfood sweep, 2026-04-22 06:36 KST: probed `query_engine.py` compact path, confirmed no structured compaction signal in `TurnResult` or stream output.

## Pinpoint #159. `run_turn_loop` hardcodes empty denied_tools — permission denials silently absent from multi-turn sessions

**Gap.** `PortRuntime.run_turn_loop` (`src/runtime.py:163`) calls `engine.submit_message(turn_prompt, command_names, tool_names, ())` with a hardcoded empty tuple for `denied_tools`. By contrast, `bootstrap_session` calls `_infer_permission_denials(matches)` and passes the result. Result: any tool that would be denied (e.g., bash-family tools gated as "destructive") silently appears unblocked across all turns in `turn-loop` mode. The `TurnResult.permission_denials` tuple is always empty for multi-turn runs, giving a false "clean" permission picture to any claw consuming those results.

**Repro.**
```python
import sys; sys.path.insert(0, 'src')
from runtime import PortRuntime
results = PortRuntime().run_turn_loop('run bash ls', max_turns=2)
for r in results:
    assert r.permission_denials == ()  # passes: denials never surfaced
```

Compare `bootstrap_session` for the same prompt: it produces a `PermissionDenial` for bash-family tools.

**Root cause.** `src/runtime.py:163`: `engine.submit_message(turn_prompt, command_names, tool_names, ())`. The `()` is a hardcoded literal; `_infer_permission_denials` is never called in the turn-loop path.

**Fix shape (~5 lines).** Before the turn loop, compute:
```python
denials = tuple(self._infer_permission_denials(matches))
```
Then pass `denied_tools=denials` to every `submit_message` call inside the loop. Mirrors the existing pattern in `bootstrap_session`.

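The intended symmetry can be illustrated with a self-contained stand-in. `MiniRuntime`, the `PermissionDenial` fields, the tool names, and the substring rule for "bash-family" are all hypothetical; the real `PortRuntime` routes prompts and calls an engine, which this sketch replaces with a trivial `_submit`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionDenial:
    # Field names are assumed for illustration.
    tool_name: str
    reason: str


@dataclass(frozen=True)
class TurnResult:
    prompt: str
    permission_denials: tuple


class MiniRuntime:
    """Hypothetical stand-in for PortRuntime; routing and the engine are faked."""

    def _infer_permission_denials(self, matches):
        # Assumption: bash-family tools are the ones gated as destructive.
        return [PermissionDenial(name, "destructive tool gated")
                for name in matches if "bash" in name.lower()]

    def _submit(self, prompt, denied_tools=()):
        return TurnResult(prompt=prompt, permission_denials=tuple(denied_tools))

    def bootstrap_session(self, prompt, matches):
        return self._submit(prompt, self._infer_permission_denials(matches))

    def run_turn_loop(self, prompt, matches, max_turns=3):
        # Computed once before the loop, then passed on every turn.
        denials = tuple(self._infer_permission_denials(matches))
        return [self._submit(f"{prompt} [turn {i + 1}]", denied_tools=denials)
                for i in range(max_turns)]


runtime = MiniRuntime()
matches = ["BashTool", "FileReadTool"]
single = runtime.bootstrap_session("run bash ls", matches)
multi = runtime.run_turn_loop("run bash ls", matches, max_turns=2)
# Every turn reports the same denials as the single-turn bootstrap.
assert all(r.permission_denials == single.permission_denials for r in multi)
```

The closing assertion doubles as a regression check: a test of this shape against the real runtime would fail today, since the turn loop always yields `()`.
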
**Acceptance.** Every `TurnResult` returned by `run_turn_loop('run bash ls')` has a non-empty `permission_denials` that matches what `bootstrap_session` returns for the same prompt. Multi-turn session security posture is symmetric with single-turn bootstrap.

**Blocker.** None.

**Source.** Jobdori dogfood sweep, 2026-04-22 06:46 KST: diffed `bootstrap_session` vs `run_turn_loop` in `src/runtime.py`, confirmed asymmetric permission denial propagation.

## Pinpoint #160. `session_store` has no `list_sessions`, `delete_session`, or `session_exists` — claw cannot enumerate or clean up sessions without filesystem hacks

**Gap.** `src/session_store.py` exposes exactly two public functions: `save_session` and `load_session`. There is no `list_sessions`, `delete_session`, or `session_exists`. Any claw that needs to enumerate stored sessions, verify a session exists before loading (to avoid `FileNotFoundError`), or clean up stale sessions must reach past the module and glob `DEFAULT_SESSION_DIR` directly. This couples callers to the on-disk layout (`<dir>/<session_id>.json`) and makes it impossible to swap storage backends (e.g., sqlite, remote store) without touching every call site.

**Repro.**
```python
import sys; sys.path.insert(0, 'src')
import session_store, inspect
print([n for n, _ in inspect.getmembers(session_store, inspect.isfunction)
       if not n.startswith('_')])
# ['asdict', 'dataclass', 'load_session', 'save_session']
# asdict and dataclass are presumably imports from dataclasses, not store API
# list_sessions, delete_session, session_exists: all absent
```

Try to enumerate sessions without the module:
```python
from pathlib import Path
sessions = list(Path('.port_sessions').glob('*.json'))
# Works today, breaks if the dir layout ever changes: no abstraction layer
```

Try to load a session that doesn't exist:
```python
from session_store import load_session  # assumes src/ is on sys.path, as above
load_session('nonexistent')  # raises FileNotFoundError with no structured error type
```

**Root cause.** `src/session_store.py` was scaffolded with the minimum needed to save/load a single session and was never extended with the CRUD surface a claw actually needs to manage session lifecycle.

**Fix shape (~25 lines).**
1. `list_sessions(directory: Path | None = None) -> list[str]`: glob `*.json` in target dir, return sorted session ids (filename stems). Claws can call this to discover all stored sessions without touching the filesystem directly.
2. `session_exists(session_id: str, directory: Path | None = None) -> bool`: `(target_dir / f'{session_id}.json').exists()`. Use before `load_session` to get a bool check instead of catching `FileNotFoundError`.
3. `delete_session(session_id: str, directory: Path | None = None) -> bool`: unlink the file if present, return True on success, False if not found. Claws can use this for cleanup without knowing the path scheme.

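The three functions could be sketched as follows. This is a standalone illustration, not the module's code: `DEFAULT_SESSION_DIR` is redefined here on the assumption that it points at `.port_sessions`, `_target_dir` is a hypothetical helper, and returning an empty list for a missing directory is a design choice the roadmap does not state.

```python
import tempfile
from pathlib import Path
from typing import List, Optional

# Assumption: mirrors the '.port_sessions' directory named in the roadmap.
DEFAULT_SESSION_DIR = Path(".port_sessions")


def _target_dir(directory: Optional[Path]) -> Path:
    # Hypothetical helper resolving the optional directory argument.
    return directory if directory is not None else DEFAULT_SESSION_DIR


def list_sessions(directory: Optional[Path] = None) -> List[str]:
    """Return sorted session ids (filename stems) found in the target dir."""
    target = _target_dir(directory)
    if not target.is_dir():
        return []
    return sorted(p.stem for p in target.glob("*.json"))


def session_exists(session_id: str, directory: Optional[Path] = None) -> bool:
    return (_target_dir(directory) / f"{session_id}.json").exists()


def delete_session(session_id: str, directory: Optional[Path] = None) -> bool:
    """Remove the session file; True if it was deleted, False if absent."""
    path = _target_dir(directory) / f"{session_id}.json"
    try:
        path.unlink()
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "b.json").write_text("{}")
    (store / "a.json").write_text("{}")
    print(list_sessions(store))        # ['a', 'b']
    print(delete_session("a", store))  # True
    print(session_exists("a", store))  # False
    print(delete_session("a", store))  # False
```

Catching `FileNotFoundError` around `unlink()` rather than checking `exists()` first avoids a race where another process removes the file between the check and the delete.
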
**Acceptance.** A claw can call `list_sessions()`, `session_exists(id)`, and `delete_session(id)` without importing `Path` or knowing the `.port_sessions/<id>.json` layout. `load_session` on a missing id raises a typed `SessionNotFoundError`, a subclass of `KeyError` (not `FileNotFoundError`), so callers can distinguish "not found" from IO errors.

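The typed error could be introduced along these lines. This `load_session` is a simplified stand-in that returns a plain dict and takes an explicit directory; the real function's return type and defaults are not shown in the roadmap, so only the error translation should be read as the point of the sketch.

```python
import json
import tempfile
from pathlib import Path


class SessionNotFoundError(KeyError):
    """Raised when no stored session matches the requested id."""

    def __init__(self, session_id: str):
        super().__init__(session_id)
        self.session_id = session_id


def load_session(session_id: str, directory: Path) -> dict:
    # Simplified stand-in for the real load_session (assumed shape).
    path = directory / f"{session_id}.json"
    try:
        text = path.read_text()
    except FileNotFoundError as exc:
        # Translate only the "missing file" case; other OSErrors propagate.
        raise SessionNotFoundError(session_id) from exc
    return json.loads(text)


with tempfile.TemporaryDirectory() as tmp:
    try:
        load_session("nonexistent", Path(tmp))
    except SessionNotFoundError as err:
        print(f"not found: {err.session_id}")  # not found: nonexistent
```

Because `KeyError` does not derive from `OSError`, an `except OSError` handler around `load_session` would still see permission or disk failures but never a missing-session case.
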
**Blocker.** None.

**Source.** Jobdori dogfood sweep, 2026-04-22 08:46 KST: inspected `src/session_store.py` public API, confirmed only `save_session` + `load_session` present, no list/delete/exists surface.