fix(decopilot): sanitize non-Error stream errors instead of leaking raw JSON #3946

Merged: pedrofrxncx merged 1 commit into main from fix/decopilot-stream-error-sanitize on Jun 16, 2026

Conversation

@pedrofrxncx pedrofrxncx commented Jun 16, 2026

Collaborator

Summary

Production logs surfaced a decopilot run failing on an upstream 504 ("Upstream idle timeout exceeded"). Tracing the handling turned up two defects in how non-Error stream errors are processed. Gateway/provider errors frequently arrive as plain objects ({ code, message, metadata }) rather than Error instances, and both code paths assumed instanceof Error:

  1. Raw JSON leaked to the client. sanitizeStreamError only sanitized Error instances; everything else fell through to JSON.stringify. As a result, the message persisted to the thread and shown to the user was the literal {"code":504,"message":"Upstream idle timeout exceeded","metadata":{"error_type":"timeout"}} blob, and the provider-URL/branding stripping was skipped entirely.
  2. Useless log line. stringifyProviderError interpolated object errors with `${error}`, so the [runAgentLoop:...] Error line logged [object Object] and hid the real cause.
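
For reviewers who want to see both failure modes in isolation, here is a minimal reproduction. It is a sketch only: `naiveSanitize` is a hypothetical stand-in that mirrors the old "Error instances only" behavior described above, not the actual pre-fix source.

```typescript
// The object shape quoted above for the 504 error.
const upstreamError: unknown = {
  code: 504,
  message: "Upstream idle timeout exceeded",
  metadata: { error_type: "timeout" },
};

// Defect 2: interpolating a plain object into a template literal calls
// Object.prototype.toString, which discards every field.
const logged = `${upstreamError}`;

// Defect 1 (hypothetical stand-in): a sanitizer that only special-cases
// Error instances ends up serializing the whole object for everything else.
function naiveSanitize(error: unknown): string {
  if (error instanceof Error) return error.message;
  return JSON.stringify(error);
}

const shown = naiveSanitize(upstreamError);

console.log(logged); // [object Object]
console.log(shown); // the raw JSON blob, metadata included
```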

Changes

  • stream-error.ts: add extractErrorParts(error) that pulls a string message and a numeric statusCode (falling back to a numeric code) out of object errors. sanitizeStreamError now runs both Error and object errors through the same credits-detection + URL/branding-strip logic.
  • native-agent-loop-core.ts: stringifyProviderError now prefers a string message field on object errors, falling back to JSON.
  • Add stream-error.test.ts covering the 504-object path for both sanitizeStreamError and classifyStreamError, plus the credits/opaque-object branches.
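
The stream-error.ts change can be pictured roughly as follows. This is a hedged sketch rather than the merged code: `extractErrorParts` and `sanitizeStreamError` are the names from this PR, but `ErrorParts`, `looksLikeCreditsIssue`, `stripProviderDetails`, `GENERIC_MESSAGE`, the 402 check, and the URL regex are all illustrative assumptions. The real credits heuristics, branding rules, and opaque-object fallback are not shown in this description.

```typescript
interface ErrorParts {
  message?: string;
  statusCode?: number;
}

// Pull a string message and a numeric status out of an Error or a plain object.
function extractErrorParts(error: unknown): ErrorParts {
  if (typeof error !== "object" || error === null) return {};
  const record = error as Record<string, unknown>;
  const message = typeof record.message === "string" ? record.message : undefined;
  // Prefer statusCode; fall back to a numeric code (e.g. 504).
  const statusCode =
    typeof record.statusCode === "number"
      ? record.statusCode
      : typeof record.code === "number"
        ? record.code
        : undefined;
  return { message, statusCode };
}

// Assumption: placeholder text for errors that carry no usable message.
const GENERIC_MESSAGE = "The request failed. Please try again.";

// Assumption: a simplified credits check (HTTP 402 or the word "credit").
function looksLikeCreditsIssue(parts: ErrorParts): boolean {
  return parts.statusCode === 402 || /credit/i.test(parts.message ?? "");
}

// Assumption: a simplified strip that only removes URLs and tidies whitespace.
function stripProviderDetails(message: string): string {
  return message.replace(/https?:\/\/\S+/g, "").replace(/\s{2,}/g, " ").trim();
}

// Error instances and plain objects share one path, so neither leaks raw JSON.
function sanitizeStreamError(error: unknown): string {
  const parts = extractErrorParts(error);
  if (parts.message === undefined) return GENERIC_MESSAGE;
  const clean = stripProviderDetails(parts.message) || GENERIC_MESSAGE;
  return looksLikeCreditsIssue(parts) ? `[CREDITS] ${clean}` : clean;
}

const fromObject = sanitizeStreamError({
  code: 504,
  message: "Upstream idle timeout exceeded",
  metadata: { error_type: "timeout" },
});
console.log(fromObject); // Upstream idle timeout exceeded
```

The point of the design is that the `instanceof Error` branch disappears: anything object-shaped with a string `message` gets the same treatment, and objects without one never reach `JSON.stringify` on the user-facing path.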

Scope / not included

This does not change retry/recovery behavior. The underlying resilience gap — a mid-stream idle timeout has no auto-retry or resume, since the SDK's maxRetries only covers the initial request — is intentionally left out as a separate design discussion.

Testing

  • bun test packages/harness/src/decopilot/stream-error.test.ts native-agent-loop-core.test.ts — pass
  • tsc --noEmit on packages/harness — clean
  • bun run fmt — applied

Summary by cubic

Sanitizes non-Error stream errors so users see clean messages instead of raw JSON. Also fixes provider error logging to show real messages from object errors.

  • Bug Fixes
    • Unified error handling for Error and object errors: extract message + statusCode/code, strip provider URLs/branding, and tag credit issues with [CREDITS].
    • Improved logs: prefer object message in stringifyProviderError; fall back to JSON to avoid [object Object].
    • Tests added for 504 idle-timeout objects, credit detection, opaque objects, and error classification.

Written for commit 4669007. Summary will update on new commits.

…aw JSON

Gateway/provider stream errors often arrive as plain objects rather than
`Error` instances — e.g. the `{ code, message, metadata }` shape a 504
"Upstream idle timeout exceeded" carries. Two spots mishandled this:

- `sanitizeStreamError` only sanitized `Error` instances and otherwise fell
  back to `JSON.stringify`, so the user-facing / persisted error text became
  the raw `{"code":504,...}` blob and the provider-URL/branding stripping was
  bypassed entirely. Now extract `message` + a numeric `statusCode`/`code`
  from object errors and run them through the same credits + strip logic.
- `stringifyProviderError` logged `[object Object]` for object errors (the
  `[runAgentLoop:...] Error` line), hiding the real cause. Now prefers a
  string `message` field, falling back to JSON.
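
A rough sketch of the logging-side behavior described in the second bullet, under stated assumptions: only the "prefer a string `message`, fall back to JSON" rule comes from this commit. The handling of `Error` instances, primitives, and values that cannot be serialized is guessed for illustration.

```typescript
function stringifyProviderError(error: unknown): string {
  // Assumption: Error instances are logged by their message.
  if (error instanceof Error) return error.message;
  if (typeof error === "object" && error !== null) {
    // Prefer a string `message` field on object errors.
    const message = (error as { message?: unknown }).message;
    if (typeof message === "string") return message;
    try {
      // Fall back to JSON so the log keeps the fields instead of [object Object].
      return JSON.stringify(error);
    } catch {
      // Assumption: circular structures make JSON.stringify throw; degrade gracefully.
      return Object.prototype.toString.call(error);
    }
  }
  return String(error);
}

const logLine = stringifyProviderError({
  code: 504,
  message: "Upstream idle timeout exceeded",
});
console.log(logLine); // Upstream idle timeout exceeded
```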

Add unit tests covering the 504-object path for both sanitize and classify.
@pedrofrxncx pedrofrxncx enabled auto-merge (squash) June 16, 2026 03:29
@pedrofrxncx pedrofrxncx merged commit 04b1cc4 into main Jun 16, 2026
14 checks passed
@pedrofrxncx pedrofrxncx deleted the fix/decopilot-stream-error-sanitize branch June 16, 2026 03:32
decocms Bot pushed a commit that referenced this pull request Jun 16, 2026
PR: #3946 fix(decopilot): sanitize non-Error stream errors instead of leaking raw JSON
Bump type: patch

- @decocms/harness (packages/harness/package.json): 1.6.0 -> 1.6.1