Pluggable agent harnesses: claude behind interface, codex/gemini ready#58
Pluggable agent harnesses: claude behind interface, codex/gemini ready#58rr0hit wants to merge 11 commits into
Conversation
Introduces internal/harness with a Harness interface that abstracts the
agent CLI (Claude Code today; codex / gemini next). claude moves to
internal/harness/claude/ behind New(); the app package wires it via a
single defaultHarness() helper.
The interface accommodates both pre-allocating harnesses (claude mints
a UUID up front via --session-id) and self-allocating ones (codex/
gemini mint their own at startup) without per-harness branches in
flow's caller code:
- PrepareSpawn() returns "" for self-allocating harnesses; callers
branch on the empty string rather than on harness identity.
- HookEnvForSpawn(slug) returns nil for claude (DB binding is
already committed pre-spawn) and {FLOW_TASK: slug} for harnesses
that complete the bind via SessionStart hook.
Symbols moved out of internal/app:
- newUUID, EncodeCwdForClaude → claude.NewUUID, claude.EncodeCwd
- claudeRunner → claude.HeadlessRunner
- psRunner, liveClaudeSessions → claude.PSRunner, (LiveSessionIDs)
- sessionUUIDRe → claude.ValidateSessionID
- install/uninstallClaudeHook → claude.{Install,Uninstall}*Hook
- skill install/version paths → claude.SkillInstallPath etc.
Behavior is preserved end to end — the byte-identity tests in
internal/harness/claude/claude_test.go pin LaunchCmd / ResumeCmd to
the exact strings the pre-refactor do.go produced. Full test suite
(go test ./...) passes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…pers
A follow-up audit on the harness abstraction found 7 sites where
claude-specific knowledge was leaking outside internal/harness/claude/
and internal/app/harness.go. All fixed here:
- spawner.FocusSession now takes a `binary string` parameter; each
backend (iterm, terminal, zellij, kitty) filters its process / pane
list by that name instead of the hardcoded literal "claude". Caller
in do.go passes h.Binary(). Internal helpers renamed
ttyForClaudeSession → ttyForHarnessSession (and similar for the
zellij / kitty pane lookups); claudeSessionRowRe →
sessionUUIDRowRe.
- internal/app/helpers.go's currentSessionID() now reads
defaultHarness().SessionIDEnvVar() instead of CLAUDE_CODE_SESSION_ID
directly. Multi-harness probing will replace the single read later.
- Two flag-help strings in do.go and run.go that named "claude"
explicitly are now harness-agnostic ("the spawned harness").
After this, claude is named in code only in two places: the impl
package internal/harness/claude/ and the single instantiation point
internal/app/harness.go. A re-audit confirmed clean.
Deferred (per the implementation plan, not addressed here):
- internal/app/hook.go (still emits claude-shaped hook JSON)
- internal/app/skill/SKILL.md (embedded skill content)
- User-facing error message strings mentioning "Claude session"
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…transcript end-to-end Five interface-level changes: 1. PrepareSpawn → NewSessionID. Docstring now says every impl returns a real id; claude generates locally, codex/gemini will probe their CLI to mint and capture. Flow's caller has a single uniform spawn path — no empty-string branches, no deferred-bind logic. 2. HeadlessRun → SkipPermissionsRun. Seam HeadlessRunner → SkipPermissionsRunner. Inner runHeadless → runSkipPermissions. 3. LaunchOpts.SkipApprovals → SkipPermissions. Matches the method name; one word for the concept across the package. 4. HookEnvForSpawn dropped from the interface. Flow no longer injects any env vars into spawned harness processes; the only injection that survives is $FLOW_ROOT (flow's data root, not a binding correlator). Codex/Gemini deferred-bind will use the probe-and- mint approach in NewSessionID rather than a hook callback — keeping flow's env-var contract one-way (harness exports, flow reads, never the reverse). 5. TranscriptPath → RenderTranscript. The old API returned only a path string; callers in app/ then hardcoded claude's jsonl schema to parse it. That's the leak — transcript format differs per harness (claude jsonl with claude messages, codex jsonl with codex events, gemini single-object json) and the parser couldn't stay in app/. Each harness now owns end-to-end transcript handling and writes a normalized text stream. The ~270 lines of claude jsonl decoders move from internal/app/transcript.go to internal/harness/claude/transcript.go. Parser-level tests move to internal/harness/claude/transcript_test.go; cmdTranscript wiring tests stay in internal/app/. internal/app/transcript.go is now a thin dispatcher. Behavior preserved end to end — go test ./... passes uncached. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Vestige from before the harness refactor — the skip-permissions flag now propagates via LaunchOpts.SkipPermissions which each harness translates to its own flag (claude: --dangerously-skip-permissions, codex: --dangerously-bypass-…, gemini: --yolo). The local var was appended to but never read. Caught by the post-refactor audit pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a tasks.harness column so flow remembers which agent CLI owns each session's transcript / resume / close-out lifecycle. New tasks adopt the ambient harness on first bind; pinned tasks resume in their own harness regardless of where you run flow from. Schema (additive, no backfill): - ALTER TABLE tasks ADD COLUMN harness TEXT (nullable). NULL is the back-compat signal for claude — every row that existed before this column landed reads as claude in the app layer. Bootstrap binding (cmdDo, fresh path): - harnessForSpawn(task) picks the adapter: pinned task.harness wins; else the harness running THIS flow process (ambient probe via each adapter's SessionIDEnvVar); else claude default. - The chosen name persists alongside session_id atomically — future invocations read the same adapter even from a different ambient context. In-session bind (cmdDoHere): - --here probes ambient explicitly; refuses if the user isn't inside any known harness. - Cross-harness binding (task pinned to X, ambient is Y) is rejected without --force; with --force the harness column updates alongside session_id and a warning notes the orphaned prior transcript. - The bind UPDATE always writes harness, so a previously-unpinned task gets pinned on first --here too. Routing through task.harness: - cmdDone close-out sweep, cmdTranscript renderer, and the per-task [live] markers (show.go, list.go) all now use harnessForTask(t) instead of defaultHarness(). list.go batches LiveSessionIDs to one call per unique harness across the result set. - currentSessionID() probes every known harness's env var; previous hardcoded claude env-var read is gone. Helper module (internal/app/harness.go): - allHarnesses() — registry; codex/gemini drop in as one line each. - harnessByName / harnessForTask / harnessForSpawn / ambientHarness / defaultHarness — covers the four context shapes (have task & it's pinned, have task & ambient matters, have neither, just want a default). Tests cover ambient detection, harnessForTask fallback, bootstrap persistence, --here persistence, cross-harness rejection without --force, --force harness switch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude's on-disk transcript path is keyed by the pair (cwd, session_id): ~/.claude/projects/<encode(cwd)>/<session_id>.jsonl. Pre-this-commit flow assumed cwd == task.work_dir for that encode — true for fresh `flow do` (we spawn the tab with cwd=work_dir) but BROKEN for `flow do --here` when claude was started in a directory other than work_dir. Most visible after `flow do --here --force` binds where the user explicitly attaches a foreign session. Reproducing case: task run-2-2 with work_dir=/tmp got --here-bound from a claude session running in /Users/rohit. Transcript lookup hit ~/.claude/projects/-tmp/<sid>.jsonl (didn't exist); actual file was at ~/.claude/projects/-Users-rohit/<sid>.jsonl. Same shape for any --here bind across directory boundaries. Fix: record the cwd alongside session_id at bind time. New nullable column tasks.session_cwd: - fresh `flow do`: session_cwd = task.work_dir (equal by construction — we spawn the tab with cwd=work_dir). - `flow do --here`: session_cwd = os.Getwd() (inherited from claude's startup cwd via the shell). - legacy rows (NULL): fall back to task.work_dir for transcript lookups, matching pre-commit behavior. Transcript lookup now passes session_cwd (with work_dir fallback) to harness.RenderTranscript instead of work_dir. Error messages on lookup failure now explain which cwd was assumed, why the file might be missing, and how to recover (re-bind via `flow do --here` to record the right cwd). The claude impl distinguishes "file not found at expected path" from other open errors and names the project dir in the message. Tests cover the new column being written on both `flow do` bootstrap and `flow do --here`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
anshulsao
left a comment
There was a problem hiding this comment.
Solid extraction overall — the interface seams are well-chosen, the schema migration is clean (additive + nullable, no backfill), and make test is green on the branch. The session_cwd fix is a real correctness win that the PR body undersells: it actually makes flow do --here + flow transcript work for the --force and "started claude elsewhere" cases, which I've hit personally.
Five non-blocking-ish items below, ordered by impact. Two are [should-fix] mostly because they'd bite the moment a second harness lands; the rest are nits.
Backend flow diagrams
flow do <task> — harness pick on bootstrap
BEFORE AFTER
────── ─────
cmdDo(args) → findTask → task cmdDo(args) → findTask → task
│ │
▼ ▼
(claude assumed everywhere) [+ h = harnessForSpawn(task)]
│ 1. task.harness pinned → use it
▼ 2. ambient env-var set → use it
liveClaudeSessions() 3. else → claude
│ │
▼ ▼
spawner.FocusSession(sid) h.LiveSessionIDs() → map[id]count
│ spawner.FocusSession(sid, h.Binary())
▼ │
newUUID() ▼
│ h.NewSessionID()
▼ │
UPDATE session_id, … ▼
│ UPDATE session_id,
▼ [+ session_cwd = task.work_dir],
"claude --session-id <id> …" prompt [+ harness = h.Name()]
│ │
▼ ▼
spawner.SpawnTab(...) command = h.LaunchCmd(sid, prompt, opts)
spawner.SpawnTab(...)
flow do --here <task> — new mismatch gate
BEFORE AFTER
────── ─────
sid = $CLAUDE_CODE_SESSION_ID [+ h = ambientHarness()]
v4 UUID regex check - probes every h.SessionIDEnvVar()
│ - nil if none / multiple set
▼ │
findTask → task ▼
│ sid = os.Getenv(h.SessionIDEnvVar())
▼ [+ h.ValidateSessionID(sid)]
"session bound elsewhere?" check │
│ ▼
▼ findTask → task
UPDATE session_id, status │
▼
[+ if task.harness ≠ h.Name() and pinned:
require --force; warn that prior
transcript history orphans]
│
▼
"session bound elsewhere?" check
│
▼
UPDATE session_id,
[+ session_cwd = os.Getwd()],
[+ harness = h.Name()],
status
flow transcript <task> — data-flow rewire
BEFORE AFTER
────── ─────
sessionJSONLPath(task) [+ h = harnessForTask(task)]
= ~/.claude/projects/ │
EncodeCwdForClaude(task.work_dir)/ ▼
<session_id>.jsonl cwd = task.session_cwd
│ ?? task.work_dir ← legacy fallback
▼ │
renderTranscript(path, compact, cutoff) ▼
- jsonl decoder lives in h.RenderTranscript(cwd, sid, compact,
internal/app/transcript.go cutoff, os.Stdout)
- claude impl owns path resolution
AND format decode; jsonl decoder
moves to harness/claude/transcript.go
- decoder targets io.Writer (was stdout)
- error path: enhanced hint pointing
users at `flow do --here` for legacy
session_cwd=NULL rows
Schema delta — tasks row
BEFORE AFTER
────── ─────
tasks columns: tasks columns:
... ...
session_id TEXT session_id TEXT
session_started TEXT session_started TEXT
session_last_resumed TEXT session_last_resumed TEXT
[+ session_cwd TEXT ]
[+ harness TEXT ]
ALTER ADD COLUMN, nullable,
no backfill. NULL = back-compat
read paths (work_dir / claude).
Package layout
BEFORE AFTER
────── ─────
internal/app/ internal/app/
bootstrap.go newUUID, EncodeCwd harness.go (dispatcher: 6 helpers)
sessions.go liveClaudeSessions transcript.go (slim; delegates to h)
transcript.go ~270L jsonl decoder
internal/harness/ (new)
harness.go (Name, LaunchOpts,
14-method Harness iface)
claude/ (new)
claude.go
transcript.go (jsonl decoder relocated)
*_test.go
Vibecode dial
Vibecode dial:
●
tight ──────────────┴────────── vibecoded
↑
(mostly tight refactor; modest helper sprawl + a couple of
unowned coercion paths nudge it one notch right)
Receipts:
Files changed: 36 (all in scope — harness extraction touches
every spawn backend by design)
Lines added (code): ~1500
Lines added (test): ~656 (harness_test.go 251 + claude tests 405)
Lines added (comments): heavy — design-rationale doc blocks throughout
New abstractions: 1 (Harness interface, 14 methods)
New helper funcs: 6 (harnessForTask, harnessForSpawn, ambientHarness,
defaultHarness, harnessByName, liveSessionsForTasks)
Adjacent refactors: 0
Stale-comment edits: 0 (signature-driven; necessary)
Drive-by changes: 0
What kept it from full-tight: 6 dispatch helpers in internal/app/harness.go where 2–3 would suffice; duplicated encodeCwd / EncodeCwd; one --here comment that asserts more invariant than the code actually upholds; harnessByName's silent unknown-value coercion has a write-back path that can erase non-NULL non-claude values on --fresh bootstrap.
[should-fix] - do.go duplicate-process warning and live-session error no longer hardcode "claude"; format string uses h.Binary() so a future codex/gemini duplicate-tab warning reads correctly. - Bootstrap UPDATE in cmdDo no longer naively overwrites the harness column. The harness column is "set once on first bind" (per the doc comment in flowdb/db.go); a COALESCE clause now guards the write so a pre-existing non-NULL pin survives. `flow do --here --force` remains the explicit lane for harness switches and writes the column unconditionally there. Adds a regression test (TestCmdDoBootstrapPreservesExistingHarnessPin) that pre-pins a task to "codex" and verifies cmdDo doesn't silently coerce it back to "claude". - harnessByName doc-comment now flags the read-side coercion hazard for unknown non-empty values so future maintainers see the constraint when the registry grows. - RenderTranscript's first parameter renamed `workDir` → `cwd` in both the interface signature (internal/harness/harness.go) and the claude impl. Matches the doc comment's wording and removes the false invariant that the value is always task.work_dir (callers pass task.session_cwd when set). Error message at the file-not-found path no longer claims `session_cwd=%q` when the caller actually passed work_dir as a legacy fallback. [nit] - --here cwd-invariant comment in do.go softened: notes that a chained `cd /elsewhere && flow do --here task` in a single Bash call yields os.Getwd() != claude's startup cwd. Same blind spot the legacy work_dir lookup had; documented as a known limitation rather than asserted as an invariant. - Dropped the duplicate `encodeCwd` in internal/harness/claude/. EncodeCwd is the single function; internal callers and the test now both use it. - TestHarnessForSpawnPrefersAmbientForUnpinned renamed to TestHarnessForSpawn_AllPathsLandOnClaudeToday; doc comment now acknowledges the test can't prove precedence (ambient vs fallback are indistinguishable when both are claude). Real precedence coverage waits on a second registered adapter. - Trimmed the stray blank line inside skill.go's import block. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Thanks for the thorough pass @anshulsao — all 7 items addressed in 835a308. Per-comment replies above; net of the changes: [should-fix]
[nit]
Deferred: tightening Ready for another look when you have a minute. |
The remaining deferred item from Anshul's review. harnessByName now returns (Harness, error) so callers see unsupported pins explicitly instead of silently coercing to claude. Behavior: empty/NULL → (claude, nil) # back-compat known → (adapter, nil) unknown → (nil, error) # was: (claude, nil) Propagated through harnessForTask and harnessForSpawn. Each caller handles the error per its context: - cmdDo: errors out with the message + registered-names list. The pin survives (no UPDATE runs), so a future build that registers the pinned harness can still pick the task up. - cmdDone: warns and skips the close-out sweep but keeps the status flip — the flip is the contract, the sweep is best-effort. - cmdTranscript: errors out. - show.go [live] marker: silently skips; the task still renders. - list.go liveSessionsForTasks: silently skips; the task still renders without a [live] marker. The previous "preserve the pin via COALESCE" guard (from earlier in this PR) becomes belt-and-braces: the bootstrap UPDATE still only writes harness when NULL, but cmdDo can no longer reach that UPDATE with a coerced wrong-harness value because harnessForSpawn errors out first. Test updates: - TestHarnessForTask: "unknown name → claude fallback" case becomes "unknown name → error". - TestCmdDoBootstrapPreservesExistingHarnessPin → renamed TestCmdDoRefusesUnsupportedHarnessPin. Asserts both the explicit refusal AND the pin's survival (no session_id written either), so the future-binary upgrade path is exercised. - TestHarnessForSpawn_AllPathsLandOnClaudeToday: wrapped its resolution in a t.Helper() that fails on unexpected error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
anshulsao
left a comment
There was a problem hiding this comment.
All 7 asks addressed. CI rollup green (vet/test/build), go test ./... green locally. Approving.
Verification per comment:
| # | Tag | Ask | Fix |
|---|---|---|---|
| 1 | should-fix | do.go:177 hardcoded "claude processes" |
both stderr templates now interpolate h.Binary() (live-duplicate warning + running-elsewhere error) |
| 2 | should-fix | harnessByName silent coercion |
harnessByName → (harness, error); empty/NULL → claude+nil for back-compat, unknown non-empty → nil+error listing registered names. Plus a belt-and-braces CASE WHEN harness IS NULL OR harness = '' THEN ? ELSE harness END on the bootstrap UPDATE so even a same-name pin doesn't get re-written. TestCmdDoRefusesUnsupportedHarnessPin pins the future-pin → refuse-and-preserve contract |
| 3 | should-fix | workDir param lies |
interface + impl renamed workDir→cwd; error message switched to cwd=%q so it no longer claims a session_cwd source the caller may not have used |
| 4 | nit | --here cwd-invariant comment |
rewritten to acknowledge the chained-cd blind spot explicitly |
| 5 | nit | duplicate encodeCwd/EncodeCwd |
unexported version dropped; doc on the exported form picked up the merge |
| 6 | nit | test name oversells | renamed TestHarnessForSpawn_AllPathsLandOnClaudeToday; docstring now spells out what is and isn't testable today |
| 7 | nit | stray blank line in skill.go imports |
removed |
Two notes I like beyond the asks:
- The COALESCE-style guard on the bootstrap UPDATE is a tighter answer than what I suggested. I asked for one of (a) error path or (b) preserve-pin; you did both, which means even if a future code path resolved an unknown pin to claude, the column wouldn't get overwritten silently anymore. Belt-and-braces is the right call for a "set once on first bind" invariant.
done.go's warn-and-skip onharnessForTaskerror keeps the status flip intact when the close-out sweep can't run. That's the right tradeoff — the flip is the user-facing contract; the sweep is best-effort plumbing.
Vibecode dial — re-review
Vibecode dial:
●
tight ───────┴────────────── vibecoded
↑
(tightened from near-tight to tight after fixes —
no scope creep, fixes match asks, one extra test
per concern, no drive-bys)
Receipts on the fix delta (bdfe9c8..386511a):
Files changed: 11 (all directly traceable to one of the asks)
Lines added (code): ~150
Lines added (test): ~70 (one regression test per should-fix:
future-pin refusal; existing tests
updated to thread the new error return)
Lines added (comments): heavy — but all of it is doc on the new
error semantics (where unknown values
error vs. where they're tolerated)
New abstractions: 0
New helper funcs: 1 (registeredHarnessNames — used in the
error message; reasonable)
Adjacent refactors: 0
Stale-comment edits: 1 (the encodeCwd→EncodeCwd doc merge)
Drive-by changes: 0
What kept it from full-tight: a single new helper (registeredHarnessNames) was added to format the error message. Could have been inlined into the fmt.Errorf, but it's small and named so I'd leave it. Otherwise nothing to nit — fixes are surgical, comments are load-bearing, no whilst-we're-here cleanups landed.
…59) flow do --here used to silently bind a claude session whose jsonl was written under a cwd different from task.work_dir. Future `flow do <slug>` resumes spawned at work_dir; claude --resume looked for the jsonl under work_dir's encoded path, didn't find it, and either errored (-p mode) or silently started a fresh conversation (interactive). That's GH #59. Design: - Reverted the session_cwd column from bdfe9c8. The column was the wrong shape — it patched the symptom (record where the jsonl is) rather than the cause (work_dir and session_id allowed to disagree). Since the column was unreleased, no migration baggage. - New invariant: any task with session_id non-null has work_dir equal to the cwd that session was created at. Resume always spawns at work_dir; the invariant guarantees the jsonl is there. - Enforcement via new harness method ValidateSession(workDir, sessionID). Claude impl stats ~/.claude/projects/<encode(workDir)>/<sid>.jsonl. Gemini will share the layout; codex (sid-only) will no-op. - Gates: flow do --here refuses on validation failure; flow update task --work-dir refuses if a session is bound AND the new work_dir doesn't satisfy the invariant. The latter is the supported "fix a broken row" path — point work_dir at where the jsonl actually lives. Honest check, not naive: comparing os.Getwd() to work_dir would be fooled by `cd <work_dir> && flow do --here task` from inside a claude Bash invocation — the subprocess cwd has nothing to do with where the session jsonl lives. Statting the on-disk path is the only check that's robust to that. Run-playbook --here: the inserted run-task now takes work_dir = os.Getwd() instead of pb.WorkDir, since the run is by definition about the binding session's actual cwd. Skill update: §4.16 explicitly documents the new safety property, adds a Cwd-mismatch sub-recipe (Open in new tab / Point work_dir at this session's directory), and adds anti-patterns warning against the cd-cheat and against --force as a bypass. Tests: - TestCmdDoHereRefusesWhenTranscriptMissing exercises the cheat-resistance directly (seed at cwd, stub stat as missing, still refuses). - claude.StatFn is the indirection point so app-level tests stub the on-disk check without materializing fake jsonls. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
New commit The shift in approach: Enforcement is per-harness: new Why stat, not Gates:
Skill update: §4.16 documents the new safety property and adds a cwd-mismatch sub-recipe (Open in new tab / Point work_dir at this session's directory). Anti-patterns explicitly warn against the cd-cheat and against Existing rows that already violate the invariant (your audit found ≥6) error cleanly on next use; the A cleanup commit follows to drop the orphan |
bdfe9c8 introduced tasks.session_cwd to record where each harness session was started; eda3a3c reverted the column from the schema in favor of the cwd==work_dir invariant + Harness.ValidateSession check. Fresh DBs no longer have the column, but DBs that ran intermediate `harness-pluggable` builds during PR-58 development still carry it as an orphan. Add an idempotent ALTER TABLE DROP COLUMN to the migration block so those DBs converge to the clean shape on next open. Test exercises the happy path: ADD COLUMN to simulate a stale DB, re-open, confirm it's gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 3cd894c.
Summary
internal/harnesspackage — a 14-method interface abstracting the agent CLI driver (Claude Code today; Codex/Gemini ready). Existing claude-specific code moves intointernal/harness/claude/behindclaude.New().tasks.harnesscolumn for per-task adapter pinning. NULL = claude (back-compat for pre-column DBs). Set on firstflow do/flow do --herefrom the ambient harness;--forceis required to switch.tasks.session_cwdcolumn — fixes a pre-existing transcript-not-found bug for--here-bound sessions where claude was started in a directory other thantask.work_dir(commonly via--forcebinds). Session identity is now keyed on(cwd, session_id)to match claude's on-disk layout.CLAUDE_CODE_SESSION_ID, futureCODEX_THREAD_ID,GEMINI_SESSION_ID) —flow do <new-task>from inside a codex shell would pick codex when that adapter lands; falls back to claude.What this unlocks
Codex and Gemini adapters can drop in as one line each in
allHarnesses(). Their per-CLI specifics (binary name, headless flag spelling, transcript location/format, hook config path) live entirely inside their own sub-packages. Flow's caller code never branches on harness name — it dispatches via the interface or viaharnessForTask(task)/harnessForSpawn(task)/ambientHarness()helpers ininternal/app/harness.go.Design choices worth flagging
NewSessionID()mints a v4 UUID locally (passed via--session-id). Codex/Gemini will probe their CLI (codex exec/gemini -p) to mint and capture a session id from stdout before opening the user-facing tab. The interface contract is uniform — every impl returns a real id, callers have one spawn path.FLOW_TASKor any other env var on spawned harness processes. The harness's own session-id env var is the only correlator, and flow only reads it. Avoids polluting nested-session environments and keeps--here's discovery path symmetric with first-spawn binding.RenderTranscript(workDir, sessionID, compact, cutoff, w)lives on the interface; each harness handles both path resolution AND format decoding (claude jsonl, codex event log, gemini single-object json) and writes normalized text. The ~270 lines of claude jsonl decoders move intointernal/harness/claude/transcript.go.--force.flow do --here --forceon a task pinned to a different harness switches the pinning alongside the session_id rebind, with a warning that the prior transcript is orphaned from flow's view.Commits (chronological)
PrepareSpawn→NewSessionID,HeadlessRun→SkipPermissionsRun, dropHookEnvForSpawn, replaceTranscriptPath→RenderTranscriptharnesscolumn + ambient detectionsession_cwdcolumn — fix(cwd, session_id)identity for--herebindsTest plan
flowinvocation after upgrade)flow do <new-task>writes bothharnessandsession_cwdcolumns; verify withsqlite3 ~/.flow/flow.db "SELECT harness, session_cwd FROM tasks WHERE slug='<slug>'"flow do --here <task>from a directory ≠ work_dir writessession_cwd = os.Getwd(); subsequentflow transcript <task>finds the fileflow do --here --force <task>on a task pinned to a different harness emits the warning and switches the pinningharness/session_cwd) continue to render transcripts via the work_dir + claude fallbacksflow done <task>still drives the headless close-out sweep against the task's pinned harnessmake testpasses locally (verified uncached on the branch)🤖 Generated with Claude Code