fix: multi-pass response extraction, schema sanitization, workspace safety, and SandboxJS try-catch by buger · Pull Request #583 · probelabs/visor

buger · 2026-06-24T11:11:14Z

Summary

track-execution: Replace naive single-pass response extraction with multi-pass approach that filters ProbeAgent tool output (├─┬, [task:), unwraps JSON-wrapped text ({"text":"..."} with control char handling), and falls back through generate-response entries before defaulting to "Execution completed". Fixes tasks showing raw JSON or tool tree output instead of actual AI responses.
mcp-custom-sse-server: Add fixRequiredFields recursive sanitizer that strips required entries referencing non-existent properties (Gemini rejects these schemas). Also apply normalizeInputSchema to external MCP tools (regularTools) — previously only workflow/HTTP/UTCP tools were normalized.
workflow-tool-executor: Fix argsOverrides filter from truthiness check (!argsOverrides[r]) to proper key existence (!(r in argsOverrides)) so falsy override values like 0, "", or false are correctly handled.
workspace-manager: Add safety guards to cleanupStale() — skip directories that are real git repos (have .git directory, not worktree file), validate worktree gitdir paths actually point to .git/worktrees/, and refuse to remove worktrees from parent repos outside the workspace basePath. Prevents accidental deletion of user repositories.
SandboxJS: Bump @nyariv/sandboxjs to probelabs/SandboxJS@d0d8c8a which fixes three bugs in try-catch handling for async/await: catch variable extraction (regex group 2→3), ExecReturn leaking that prevented code after try-catch from executing, and try-finally without catch silently swallowing errors.
assistant.yaml: Add session continuation prompt guidance for workflow tools.
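The multi-pass extraction described for track-execution can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names `unwrapJsonText`, `isToolOutput`, and `extractBestResponseText` come from the commit message, but the `HistoryEntry` shape, the helper `toText`, and the exact pass order are assumptions.

```typescript
// Hypothetical shape of one execution-history entry; the real type may differ.
interface HistoryEntry {
  checkId: string;
  output: unknown;
}

// Unwrap {"text":"..."} payloads, tolerating raw control characters in the string.
function unwrapJsonText(raw: string): string {
  const trimmed = raw.trim();
  if (!trimmed.startsWith("{")) return trimmed;
  const tryParse = (s: string): string | undefined => {
    try {
      const parsed: unknown = JSON.parse(s);
      if (parsed && typeof parsed === "object") {
        const text = (parsed as { text?: unknown }).text;
        if (typeof text === "string") return text;
      }
    } catch {
      // Not valid JSON; fall through and let the caller try something else.
    }
    return undefined;
  };
  // Second attempt: escape raw newlines/tabs that make JSON.parse throw.
  const escaped = trimmed.replace(/[\u0000-\u001f]/g, (c) =>
    "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0"));
  return tryParse(trimmed) ?? tryParse(escaped) ?? trimmed;
}

// Detect ProbeAgent tool output by the markers named in the summary.
function isToolOutput(text: string): boolean {
  return text.includes("├─┬") || text.includes("[task:");
}

// Assumed normalization of an entry's output into plain text.
function toText(output: unknown): string {
  if (typeof output === "string") return unwrapJsonText(output);
  if (output && typeof output === "object") {
    const text = (output as { text?: unknown }).text;
    if (typeof text === "string") return text.trim();
  }
  return "";
}

function extractBestResponseText(history: HistoryEntry[]): string {
  const usable = (entry: HistoryEntry | undefined): string | undefined => {
    if (!entry) return undefined;
    const text = toText(entry.output);
    return text && !isToolOutput(text) ? text : undefined;
  };
  // Pass 1: the most recent entry, if it is a real response.
  const last = usable(history[history.length - 1]);
  if (last) return last;
  // Pass 2: walk back through generate-response entries, newest first.
  for (const entry of [...history].reverse()) {
    if (entry.checkId !== "generate-response") continue;
    const text = usable(entry);
    if (text) return text;
  }
  // Final fallback named in the summary.
  return "Execution completed";
}
```

With a history whose last entry is a tool tree and whose earlier `generate-response` entry holds `{"text":"Final answer"}`, this sketch returns the unwrapped answer rather than the tree.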

Test plan

15 new tests for multi-pass response extraction (JSON unwrapping, tool output detection, generate-response fallback, empty history, production-like scenarios)
Updated workspace-manager test to match new safety guards
All 3444 existing unit tests pass
56 new tests in SandboxJS for async try-catch scenarios
All 230 SandboxJS tests pass (174 existing + 56 new)

🤖 Generated with Claude Code

…to SSE server

Two bugs fixed:

1. Workflow output value_js received raw ReviewSummary wrappers instead of unwrapped step outputs. This caused every workflow tool (slack-search, slack-read-thread, discourse-read-thread, discourse-reply) to return "Unknown error" because outputs['step'].success was undefined (actual data was nested in .output). Script steps already unwrapped correctly via buildProviderTemplateContext, but workflow-executor's value_js, if conditions, and Liquid contexts did not.
2. executeHttpClientTool in mcp-custom-sse-server only handled bearer auth, not oauth2_client_credentials. When the AI called http_client tools with oauth2 auth (e.g. MongoDB Atlas), no token exchange happened and requests went out without Authorization headers, causing 401 errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
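The first bug can be illustrated with a small sketch of the unwrapping step. The helper name `unwrapStepOutputs` and the wrapper shape (an object carrying the step's data under `output`) are assumptions for illustration; only the symptom, `outputs['step'].success` being undefined because the data sat in `.output`, comes from the commit message.

```typescript
// Hypothetical helper: replace each wrapper with the step data nested in `.output`.
function unwrapStepOutputs(raw: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [stepId, value] of Object.entries(raw)) {
    const isWrapper =
      typeof value === "object" && value !== null && "output" in value;
    // Values that are not wrappers pass through unchanged.
    result[stepId] = isWrapper ? (value as { output: unknown }).output : value;
  }
  return result;
}

// Assumed wrapper shape: the real ReviewSummary may carry more fields.
const rawOutputs: Record<string, unknown> = {
  step: { issues: [], output: { success: true, messages: ["hi"] } },
};
const unwrapped = unwrapStepOutputs(rawOutputs);
```

Before unwrapping, `rawOutputs.step` has no `success` field; afterwards `unwrapped.step` is the inner object, so expressions such as `outputs['step'].success` resolve as the workflow author expects.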

- Fix on_message trigger dispatch to match normal message path: seed setFirstMessage so human_input checks auto-resolve and the full intent-router → build-config → generate-response chain runs with proper tool loading (Jira MCP, Slack, etc.)
- Inject trigger.inputs.text as the AI message with original Slack message appended, so triggers can give specific instructions
- Fix live update race condition: serialize publish() calls via a promise queue in SlackTaskLiveUpdateSink to prevent duplicate Slack messages when tick() and complete() run concurrently
- Track inflightTick promise so complete()/fail() await in-flight ticks before publishing the final update
- Fix self-bot message detection for bot_message subtypes by also checking ev.bot_id against the bot's own bot_id from auth.test
- Add resolveChannelName() to SlackClient for #channel-name support in scheduler output targets via conversations.list with caching
- Allow cron jobs without workflow (inputs.text as user message)
- Make StaticCronJob.workflow optional in types
- Fix workflow output warning to only fire for undefined (not null)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
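The promise-queue fix for the live update race can be sketched as below. The class name `SerializedPublisher` and the `send` callback are hypothetical stand-ins; the actual SlackTaskLiveUpdateSink is not shown in this PR text, so this only illustrates the general technique of chaining each publish onto the previous one.

```typescript
// Hypothetical stand-in for a sink whose publish() calls must not overlap.
class SerializedPublisher {
  private queue: Promise<void> = Promise.resolve();
  private readonly send: (text: string) => Promise<void>;

  constructor(send: (text: string) => Promise<void>) {
    this.send = send;
  }

  publish(text: string): Promise<void> {
    // Each call starts only after the previous one has settled.
    const run = this.queue.then(() => this.send(text));
    // Keep the chain usable even if one send rejects.
    this.queue = run.catch(() => undefined);
    return run;
  }
}
```

Because every call is appended to the same chain, a slow `tick()` publish always finishes before a concurrent `complete()` publish starts, which is the ordering the commit message relies on to avoid duplicate messages.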

Add debounce-manager for throttling check executions and integrate it into level-dispatch. Supports configurable throttle settings per check via config types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
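As a rough illustration of per-check throttling, the sketch below gates executions by a minimum interval per check ID. The class name `ThrottleGate`, the `shouldRun` method, and the leading-edge semantics (run the first call, drop calls inside the window) are assumptions; the PR text does not describe how debounce-manager is actually implemented or configured.

```typescript
// Hypothetical per-check throttle gate with an injectable clock for testing.
class ThrottleGate {
  private readonly lastRun = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns true if the check may run now and records the run time.
  shouldRun(checkId: string, minIntervalMs: number): boolean {
    const current = this.now();
    const previous = this.lastRun.get(checkId);
    if (previous !== undefined && current - previous < minIntervalMs) {
      return false; // Still inside the throttle window for this check.
    }
    this.lastRun.set(checkId, current);
    return true;
  }
}
```

A dispatcher could call `shouldRun(checkId, intervalFromConfig)` before executing a check and skip the execution when it returns false; the interval value would come from the per-check throttle settings mentioned above.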

Add global uncaughtException handler that suppresses transient I/O errors (EIO, EPIPE, ECONNRESET, ERR_STREAM_DESTROYED) from dying child processes instead of crashing the entire visor process.

Three layers of defense:

- Global handler in child-process-error-handler.ts (imported early)
- Worktree manager skips process.exit(1) for transient I/O errors
- Stream-level error handlers on MCP transport stderr pipes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
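The first layer might look something like the following sketch. The error codes are the ones listed in the commit message; the function names `transientIoCode` and `installChildProcessErrorHandler` and the logging format are assumptions, not the contents of child-process-error-handler.ts.

```typescript
// Codes named in the commit message as transient child-process I/O failures.
const TRANSIENT_IO_CODES = new Set(["EIO", "EPIPE", "ECONNRESET", "ERR_STREAM_DESTROYED"]);

// Returns the error code if it is one of the transient codes, else undefined.
function transientIoCode(err: unknown): string | undefined {
  const code = (err as { code?: unknown } | null | undefined)?.code;
  return typeof code === "string" && TRANSIENT_IO_CODES.has(code) ? code : undefined;
}

// Hypothetical installer; call once, as early as possible during startup.
function installChildProcessErrorHandler(): void {
  process.on("uncaughtException", (err: unknown) => {
    const code = transientIoCode(err);
    if (code !== undefined) {
      // Suppress: a dying child process should not take the parent down.
      console.error(`Suppressed transient I/O error: ${code}`);
      return;
    }
    // Anything else keeps the default crash behaviour.
    console.error(err);
    process.exit(1);
  });
}
```

Registering an `uncaughtException` listener replaces Node's default crash-on-throw behaviour, so the explicit `process.exit(1)` for non-transient errors is what keeps genuine bugs fatal.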

- Update @probelabs/probe to v0.6.0-rc313 with enriched task telemetry (agent scope fields, full task state on events, task.items_json)
- Parse task.items_json from batch events for proper titles on batch created/updated/completed/deleted operations
- Collapse sub-agent scopes (engineer, code-explorer) that lack meaningful task titles into deduplicated single-line entries instead of showing repetitive generic "Engineer Task" items
- Preserve sub-agent task titles when they exist (from task tool snapshots)
- Group repeated sub-agent iterations under a single scope label

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Teaches the assistant to reuse inner ProbeAgent sessions via continue_session when making follow-up calls to the same tool, avoiding expensive cold-start re-execution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ce safety

- track-execution: replace naive single-pass response extraction with multi-pass approach (unwrapJsonText, isToolOutput, extractBestResponseText) that filters tool output, unwraps JSON-wrapped text, and falls back through generate-response entries before defaulting to "Execution completed"
- mcp-custom-sse-server: add fixRequiredFields to strip invalid entries from required arrays (Gemini rejects schemas referencing non-existent properties); apply normalizeInputSchema to external MCP tools (regularTools)
- workflow-tool-executor: fix argsOverrides filter from truthiness check to proper key existence (!(r in argsOverrides)) so falsy override values work
- workspace-manager: add safety guards to cleanupStale(): skip directories that are real git repos, validate worktree gitdir paths, refuse to touch parent repos outside the workspace basePath
- sandboxjs: bump to probelabs/SandboxJS@d0d8c8a which fixes try-catch with async/await (catch variable extraction, ExecReturn leaking, try-finally without catch)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
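A recursive sanitizer in the spirit of fixRequiredFields could be sketched as follows. Only the function name and its purpose (dropping `required` entries that name non-existent properties) come from the commit message; the traversal strategy, the `JsonSchema` alias, and the choice to delete an emptied `required` array are assumptions.

```typescript
type JsonSchema = { [key: string]: unknown };

function isPlainObject(value: unknown): value is JsonSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a copy of the schema whose `required` arrays only name declared properties.
function fixRequiredFields(schema: unknown): unknown {
  if (Array.isArray(schema)) return schema.map((item) => fixRequiredFields(item));
  if (!isPlainObject(schema)) return schema;

  // Recurse first so nested object schemas are sanitized too.
  const out: JsonSchema = {};
  for (const [key, value] of Object.entries(schema)) {
    out[key] = fixRequiredFields(value);
  }

  const required = out["required"];
  if (Array.isArray(required)) {
    const propsRaw = out["properties"];
    const props: JsonSchema = isPlainObject(propsRaw) ? propsRaw : {};
    const kept = required.filter(
      (name): name is string =>
        typeof name === "string" &&
        Object.prototype.hasOwnProperty.call(props, name),
    );
    // Assumption: drop the keyword entirely rather than emit an empty array.
    if (kept.length > 0) out["required"] = kept;
    else delete out["required"];
  }
  return out;
}
```

This sketch walks every key generically, so it would also rewrite `required` arrays that happen to appear inside `default` or `enum` values; a production version would likely restrict recursion to schema-bearing keywords.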

buger and others added 12 commits March 26, 2026 06:34

fix: stabilize task traces and live updates

9313227

feat: add check execution throttling/debounce support

284589f

fix: harden live updates and user-facing workflow errors

e6e082b

fix: surface empty assistant response failures

1acf52e

fix: load bundled default includes outside project root (#574)

3610b50

chore: merge main into fix/workflow-output-unwrapping

340eadc

buger wants to merge 12 commits into main from fix/workflow-output-unwrapping