fix(spawn): wait for inner agent subshell before returning PID#327
fix(spawn): wait for inner agent subshell before returning PID#327rubin-johnson wants to merge 1 commit intonyldn:mainfrom
Conversation
spawn_agent backgrounds an inner subshell ( ... ) & and captures its PID. The outer wrapper then exits in ~1s. display_rich_progress tracks the outer PIDs, sees them all dead almost immediately, and triggers synthesis before any agent has produced output. Adding wait "$pid" keeps the wrapper alive until the inner agent finishes. Progress monitoring now sees accurate liveness and synthesis fires only after real output exists.
📝 WalkthroughWalkthroughAdds a best-effort synchronous Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/lib/spawn.sh`:
- Around line 826-827: The unconditional wait on "$pid" in spawn_agent makes
spawn_agent blocking; change it to only wait when an explicit synchronous/probe
mode is requested (e.g., a flag like PROBE_MODE or BLOCK_ON_SPAWN set earlier in
scripts/lib/spawn.sh or passed into spawn_agent), so by default spawn_agent
returns the child PID immediately and async callers in
scripts/async-tmux-features.sh and scripts/orchestrate.sh continue to work;
alternatively, move progress/liveness tracking into the code that consumes the
PID instead of waiting here—update the check around wait "$pid" to gate it on
that mode/flag and ensure callers that need blocking behavior set that flag.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6a89dcdd-d155-44e4-86fe-48b1dcee6f0e
📒 Files selected for processing (1)
scripts/lib/spawn.sh
| wait "$pid" 2>/dev/null || true | ||
|
|
There was a problem hiding this comment.
Unconditional wait turns spawn_agent into a blocking API and breaks async callers
Line 826 now blocks until the child exits, so PID return is no longer immediate. This regresses async flows in scripts/async-tmux-features.sh (spawn + later coordinated wait) and makes scripts/orchestrate.sh spawn behave synchronously.
Gate this wait behind an explicit mode used only by probe/discover (or move progress/liveness tracking to inner PID consumers) so default spawn_agent behavior stays asynchronous.
Suggested minimal guard at this call site
- wait "$pid" 2>/dev/null || true
+ if [[ "${OCTOPUS_WAIT_FOR_AGENT_COMPLETION:-false}" == "true" ]]; then
+ wait "$pid" 2>/dev/null || true
+ fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| wait "$pid" 2>/dev/null || true | |
| if [[ "${OCTOPUS_WAIT_FOR_AGENT_COMPLETION:-false}" == "true" ]]; then | |
| wait "$pid" 2>/dev/null || true | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/lib/spawn.sh` around lines 826 - 827, The unconditional wait on
"$pid" in spawn_agent makes spawn_agent blocking; change it to only wait when an
explicit synchronous/probe mode is requested (e.g., a flag like PROBE_MODE or
BLOCK_ON_SPAWN set earlier in scripts/lib/spawn.sh or passed into spawn_agent),
so by default spawn_agent returns the child PID immediately and async callers in
scripts/async-tmux-features.sh and scripts/orchestrate.sh continue to work;
alternatively, move progress/liveness tracking into the code that consumes the
PID instead of waiting here—update the check around wait "$pid" to gate it on
that mode/flag and ensure callers that need blocking behavior set that flag.
|
Thanks for jumping on this. I opened #328 with a related fix that keeps |
Problem
probe_discoverbackgrounds the outerspawn_agentwrappers and tracks their PIDs:Inside
spawn_agent, the real work runs in an inner subshell:( # ... set up env, run claude/codex/gemini agent ... ) & pid=$!The outer wrapper captures that inner PID, writes it to the PID file, then exits — in about 1 second.
display_rich_progresstracks the outer wrapper PIDs, sees them all go dead almost immediately, concludes all agents have finished, and triggers synthesis. At that point the inner agent subshells are still running and the output files are empty or partial.Result: every multi-provider probe returns an empty synthesis, making the whole orchestration useless.
Fix
Add
wait "$pid" 2>/dev/null || trueinspawn_agentafter the PID file write, before thelog INFOline. This keeps the outer wrapper alive until the inner agent subshell completes.display_rich_progressnow sees accurate liveness and synthesis fires only after real output exists.Verification
Tested on WSL2/Debian with
OCTOPUS_DISPATCH_STRATEGY=fulldispatching codex + gemini + claude-sonnet. Before the fix: synthesis triggered with 0B output files. After: all agents completed with real output (codex: 255KB, claude-sonnet: 23K+16K, gemini: 8K+8K).Summary by CodeRabbit