fix(hooks): self-diagnosing dispatch denials for stale-team/store mismatch #1010

Merged
michael-wojcik merged 5 commits into Synaptic-Labs-AI:main from michael-wojcik:fix/dispatch-deny-stale-team-diagnosis on Jun 22, 2026

@michael-wojcik

Collaborator

Summary

Message-only cheap-win for the restart/persistence reliability cluster. When a Claude Code restart or fork mints a new session team while PACT's persisted team_name/session_id go stale, the gate-resolved task store diverges from the live store and dispatch denials show misleading messages (no Task assigned / team name unavailable) that never name the real cause. This PR makes the two restart-symptom deny sites self-diagnose the stale-team/store mismatch and tell the user how to re-align.

What landed

  • hooks/shared/stale_session.py (NEW) — extracted SSOT for detect_stale_session_block (+ its two constants) out of bootstrap_prompt_gate.py, so more than one hook can reuse the live-vs-recorded session_id mismatch detector. Behavior-preserving: bootstrap_prompt_gate re-binds the historical private name via an aliased import (object identity preserved → existing imports + the test monkeypatch stay unchanged).
  • hooks/dispatch_gate.py — surfaces the detector at the two restart-symptom deny sites (team_name_unavailable, no-task-assigned). Message-only: the gate decision is untouched; augmentation runs in main() after evaluate_dispatch returns, behind a never-raises wrapper, so any detector failure falls back to the original message and can never break dispatch. Not marker-gated (a restart leaves a present-but-stale bootstrap marker). Non-symptom denials are excluded so the re-align hint never misdirects.
  • hooks/shared/hook_infra_classifier.py: _SEAM_HOOK_HELPER_CLOSURE updated for dispatch_gate's new import edge.
  • Tests (tests/test_dispatch_gate_stale_diagnosis.py, NEW) — 15 tests: both-modes invariance matrix (in-process + tmux) × enable/disable at both deny sites, paired enable/disable non-vacuity (not git-revert), and a defensive test (detector raises → original message, no escape).
  • v4.4.36 (PATCH).
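The message-only augmentation described for hooks/dispatch_gate.py can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `augment_deny_message`, the `SYMPTOM_REASONS` set, the reason-code spellings, and the zero-argument detector callable are all assumptions made for the example.

```python
from typing import Callable, Optional

# Hypothetical reason codes for the two restart-symptom deny sites.
SYMPTOM_REASONS = frozenset({"team_name_unavailable", "no_task_assigned"})


def augment_deny_message(
    reason_code: str,
    message: str,
    detector: Callable[[], Optional[str]],
) -> str:
    """Return the deny message, with a stale-session hint appended if one applies.

    The allow/deny decision is made elsewhere; this only touches the text.
    """
    # Non-symptom denials keep their original message so the hint never misdirects.
    if reason_code not in SYMPTOM_REASONS:
        return message
    try:
        hint = detector()
    except Exception:
        # Never-raises wrapper: any detector failure falls back to the original.
        return message
    # The detector returns None when it cannot diagnose (graceful degradation).
    if not hint:
        return message
    return f"{message}\n\n{hint}"
```

In this sketch a detector that raises, or that returns None, leaves the original message untouched, which mirrors the "can never break dispatch" property claimed above.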

Why

Part of the restart/persistence reliability cluster (#994 / #992). This is the message-only cheap-win — it improves diagnosis but does not yet repair the underlying stale-team join (where CLAUDE.md is absent/unreadable the detector returns None → graceful degradation to the original message). The durable fix (live-team reconciliation) lands in follow-up PRs.

⚠️ Do NOT auto-close #994 / #992 on merge — closure is gated on the durable fix.

Testing

Full hooks suite green via rtk proxy python -m pytest: 9397 passed, 14 skipped, 0 failed, 0 errors. Behavior-preserving extraction verified (49 bootstrap-gate tests unchanged via the is-identity alias).
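Why the aliased import keeps the existing monkeypatch working can be shown with a small stand-alone sketch. The two modules are simulated with `types.ModuleType`; the private name `_detect_stale_session_block`, the detector's two-argument signature, and the `gate_check` helper are hypothetical and chosen only for illustration.

```python
import types


def detect_stale_session_block(live_session_id, recorded_session_id):
    """Stand-in detector: a warning string on mismatch, else None."""
    if live_session_id == recorded_session_id:
        return None
    return "stale session"


# Stand-in for the new shared leaf module.
shared = types.ModuleType("stale_session")
shared.detect_stale_session_block = detect_stale_session_block

# Stand-in for bootstrap_prompt_gate. An aliased import of the form
#   from shared.stale_session import detect_stale_session_block as _detect_stale_session_block
# binds the same function object under the old private name.
gate = types.ModuleType("bootstrap_prompt_gate")
gate._detect_stale_session_block = shared.detect_stale_session_block


def gate_check(live_session_id, recorded_session_id):
    # Looks the name up on the gate module at call time, so a test that
    # monkeypatches gate._detect_stale_session_block still takes effect.
    return gate._detect_stale_session_block(live_session_id, recorded_session_id)
```

Because the alias is the same object (`is`-identical), callers of the old name see no behavioral change, and patching the attribute on the gate module affects only the gate's lookup, not the shared module.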

Move detect_stale_session_block and its two module constants out of
bootstrap_prompt_gate.py into a new shared/stale_session.py leaf so more than
one hook can reuse the live-vs-recorded session_id mismatch detector.

Behavior-preserving: bootstrap_prompt_gate re-binds the historical private
name via an aliased import (object identity preserved), so existing imports
and the test monkeypatch stay unchanged; the now-dead os/re imports are
removed.

On a Claude Code restart or fork the platform mints a new session team while
the persisted team_name/session_id go stale, so the gate-resolved task store
diverges from the live store. The dispatch gate then denies every spawn with
a misleading message ('no Task assigned' / team name unavailable) that never
names the real cause.

Surface the existing stale-session detector at the gate's two restart-symptom
deny sites (team-name-unavailable and no-task-assigned), telling the user the
recorded team has drifted from the live session and how to re-align.

Message-only: the gate decision is unchanged. The augmentation runs in main()
after evaluate_dispatch returns, behind a never-raises wrapper, so any detector
failure (absent/unreadable project memory, unexpected input) falls back to the
original message and can never break dispatch. Detection is not marker-gated,
since a restart leaves a present-but-stale bootstrap marker. Non-symptom
denials are excluded so the re-align hint never misdirects. Adds both-modes,
paired enable/disable non-vacuity, and defensive-raise tests.

Bound the recorded and live session-id values interpolated into the
stale-session warning to 64 chars at the shared source, applied after the
full-value mismatch compare so detection is unaffected. Without the bound a
pathological multi-kilobyte recorded id in the project-memory Resume line would
inflate the gate's deny message; the cap is non-lossy for a 36-char UUID and
applies to both consumers of the shared template.
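The ordering described here (compare the full values first, truncate only for display) might look like the sketch below. The constant name `MAX_ID_DISPLAY_CHARS`, the function name, and the warning wording are assumptions for illustration; only the 64-character cap and the compare-then-truncate order come from the commit message.

```python
from typing import Optional

# 64-char cap from the commit message; the constant name is hypothetical.
MAX_ID_DISPLAY_CHARS = 64


def stale_session_warning(recorded_id: str, live_id: str) -> Optional[str]:
    """Return a bounded warning when the ids differ, else None."""
    # Compare the full, untruncated values so the cap cannot hide a mismatch
    # between two ids that share their first 64 characters.
    if recorded_id == live_id:
        return None
    # Truncate only the values interpolated into the message.
    shown_recorded = recorded_id[:MAX_ID_DISPLAY_CHARS]
    shown_live = live_id[:MAX_ID_DISPLAY_CHARS]
    return (
        f"Recorded session {shown_recorded} does not match "
        f"live session {shown_live}."
    )
```

A 36-character UUID passes through whole, while a multi-kilobyte recorded id contributes at most 64 characters to the message.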

Also scope the module docstring to the CLAUDE.md-recorded-vs-live signal and
add a forward note that a context-file-keyed sibling detector will land
alongside it.

- Assert the in-process and tmux deny legs produce a byte-identical reason
  string, pinning mode-invariance directly (a future mode-keyed regression
  would diverge them).
- Add gate-site integration tests for graceful-degradation paths (project
  memory absent / project dir unset / bad session id -> original message).
- Add a dedicated unit-test module for the extracted stale-session detector
  (None branches, two-path precedence, recorded-id regex) instead of relying
  only on the bootstrap-gate suite via the aliased import.
- Assert the 64-char interpolation cap bounds an oversized recorded id.
@michael-wojcik michael-wojcik merged commit 4b14318 into Synaptic-Labs-AI:main Jun 22, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Resume after Claude Code restart splits PACT team from live session team → dispatch_gate denies all spawns (misleading 'no Task assigned')
