fix(spawn): wait for inner agent subshell before returning PID by rubin-johnson · Pull Request #327 · nyldn/claude-octopus

rubin-johnson · 2026-04-28T21:02:33Z

Problem

probe_discover backgrounds the outer spawn_agent wrappers and tracks their PIDs:

spawn_agent ... &
pids+=($!)

Inside spawn_agent, the real work runs in an inner subshell:

(
    # ... set up env, run claude/codex/gemini agent ...
) &
pid=$!

The outer wrapper captures that inner PID, writes it to the PID file, then exits — in about 1 second. display_rich_progress tracks the outer wrapper PIDs, sees them all go dead almost immediately, concludes all agents have finished, and triggers synthesis. At that point the inner agent subshells are still running and the output files are empty or partial.

Result: every multi-provider probe returns an empty synthesis, making the whole orchestration useless.

Fix

Add wait "$pid" 2>/dev/null || true in spawn_agent after the PID file write, before the log INFO line. This keeps the outer wrapper alive until the inner agent subshell completes. display_rich_progress now sees accurate liveness and synthesis fires only after real output exists.

+    wait "$pid" 2>/dev/null || true
+
     log INFO "Agent spawned with PID: $pid"
     echo "$pid"
 }

Verification

Tested on WSL2/Debian with OCTOPUS_DISPATCH_STRATEGY=full dispatching codex + gemini + claude-sonnet. Before the fix: synthesis triggered with 0B output files. After: all agents completed with real output (codex: 255KB, claude-sonnet: 23K+16K, gemini: 8K+8K).

Summary by CodeRabbit

Chores
- Improved internal process management for background agent execution.

spawn_agent backgrounds an inner subshell ( ... ) & and captures its PID. The outer wrapper then exits in ~1s. display_rich_progress tracks the outer PIDs, sees them all dead almost immediately, and triggers synthesis before any agent has produced output. Adding wait "$pid" keeps the wrapper alive until the inner agent finishes. Progress monitoring now sees accurate liveness and synthesis fires only after real output exists.

coderabbitai · 2026-04-28T21:02:46Z

📝 Walkthrough

Walkthrough

Adds a best-effort synchronous wait on the backgrounded agent PID with stderr suppressed and failure ignored to scripts/lib/spawn.sh. This occurs before logging and echoing the spawn PID, with no other control flow or error handling modifications.

Changes

Cohort / File(s)	Summary
Agent PID Synchronization `scripts/lib/spawn.sh`	Adds best-effort synchronous wait on backgrounded agent PID with stderr suppressed before logging/echoing spawn PID.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related issues

Issue #326: The main change directly addresses the spawn agent PID-handling race condition, ensuring proper synchronization between inner provider PID and outer wrapper PID during agent spawning.

Poem

🐰 A rabbit waits with patient heart,
The agent's PID plays its part,
No races run in misaligned time,
With synchronous wait, all things align,
The spawn now flows so smooth and right! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main fix: adding a synchronous wait before returning the inner subshell PID in the spawn function, which directly addresses the core issue.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/lib/spawn.sh`:
- Around line 826-827: The unconditional wait on "$pid" in spawn_agent makes
spawn_agent blocking; change it to only wait when an explicit synchronous/probe
mode is requested (e.g., a flag like PROBE_MODE or BLOCK_ON_SPAWN set earlier in
scripts/lib/spawn.sh or passed into spawn_agent), so by default spawn_agent
returns the child PID immediately and async callers in
scripts/async-tmux-features.sh and scripts/orchestrate.sh continue to work;
alternatively, move progress/liveness tracking into the code that consumes the
PID instead of waiting here—update the check around wait "$pid" to gate it on
that mode/flag and ensure callers that need blocking behavior set that flag.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6a89dcdd-d155-44e4-86fe-48b1dcee6f0e

📥 Commits

Reviewing files that changed from the base of the PR and between 14e8ee4 and aa96043.

📒 Files selected for processing (1)

scripts/lib/spawn.sh

coderabbitai · 2026-04-28T21:05:14Z

+    wait "$pid" 2>/dev/null || true
+


⚠️ Potential issue | 🔴 Critical

Unconditional wait turns spawn_agent into a blocking API and breaks async callers

Line 826 now blocks until the child exits, so PID return is no longer immediate. This regresses async flows in scripts/async-tmux-features.sh (spawn + later coordinated wait) and makes scripts/orchestrate.sh spawn behave synchronously.

Gate this wait behind an explicit mode used only by probe/discover (or move progress/liveness tracking to inner PID consumers) so default spawn_agent behavior stays asynchronous.

Suggested minimal guard at this call site

- wait "$pid" 2>/dev/null || true + if [[ "${OCTOPUS_WAIT_FOR_AGENT_COMPLETION:-false}" == "true" ]]; then + wait "$pid" 2>/dev/null || true + fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

wait "$pid" 2>/dev/null || true

if [[ "${OCTOPUS_WAIT_FOR_AGENT_COMPLETION:-false}" == "true" ]]; then

wait "$pid" 2>/dev/null || true

fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/lib/spawn.sh` around lines 826 - 827, The unconditional wait on "$pid" in spawn_agent makes spawn_agent blocking; change it to only wait when an explicit synchronous/probe mode is requested (e.g., a flag like PROBE_MODE or BLOCK_ON_SPAWN set earlier in scripts/lib/spawn.sh or passed into spawn_agent), so by default spawn_agent returns the child PID immediately and async callers in scripts/async-tmux-features.sh and scripts/orchestrate.sh continue to work; alternatively, move progress/liveness tracking into the code that consumes the PID instead of waiting here—update the check around wait "$pid" to gate it on that mode/flag and ensure callers that need blocking behavior set that flag.

nyldn · 2026-04-28T21:16:17Z

Thanks for jumping on this. I opened #328 with a related fix that keeps spawn_agent non-blocking and moves PID correctness into the parallel callers. That preserves async/tmux behavior while still making probe/develop/YAML/retry progress track the inner provider PID instead of the short-lived wrapper.

coderabbitai Bot requested changes Apr 28, 2026

View reviewed changes

nyldn mentioned this pull request Apr 28, 2026

tangle_develop captures wrong PID; quality gate aborts at 0% before providers finish #326

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(spawn): wait for inner agent subshell before returning PID#327

fix(spawn): wait for inner agent subshell before returning PID#327
rubin-johnson wants to merge 1 commit intonyldn:mainfrom
rubin-johnson:fix/probe-pid-tracking-wait

rubin-johnson commented Apr 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Uh oh!

nyldn commented Apr 28, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

rubin-johnson commented Apr 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

nyldn commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rubin-johnson commented Apr 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading

nyldn commented Apr 28, 2026 •

edited

Loading