fix: default encoding_error_handler to replace in stdio_client for UTF-8 resilience #2553

Open
Maanik23 wants to merge 1 commit into modelcontextprotocol:main from Maanik23:fix/stdio-client-utf8-resilience

Maanik23 commented May 7, 2026

Fixes #2454

Summary

The stdio_client transport crashes when the spawned child process writes invalid UTF-8 bytes to stdout. The transport decodes child stdout with encoding_error_handler="strict" by default, so malformed bytes raise UnicodeDecodeError during TextReceiveStream iteration. That exception escapes the decoding loop and brings down the transport task group instead of surfacing the bad line as an in-stream parse error.
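
The difference between the two error handlers can be reproduced with the standard library alone. This is a minimal sketch, assuming the transport's text stream behaves like Python's incremental UTF-8 decoder; `decode_chunks` is a hypothetical helper for illustration and is not part of the SDK.

```python
import codecs


def decode_chunks(chunks: list[bytes], errors: str) -> str:
    """Decode byte chunks incrementally, as a streaming text reader would."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors=errors)
    text = "".join(decoder.decode(chunk) for chunk in chunks)
    # Flush any bytes still buffered at end of input.
    return text + decoder.decode(b"", final=True)


bad_line = [b"\xff\xfe\n"]  # two bytes that are never valid in UTF-8

# With "strict", decoding raises; in the transport this is the exception
# that escapes the read loop.
try:
    decode_chunks(bad_line, "strict")
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# With "replace", each invalid byte becomes U+FFFD and decoding continues.
replaced = decode_chunks(bad_line, "replace")
```

With "replace", the line still reaches the JSON parser, which is where the failure is meant to be reported.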

Asymmetry with server side

The SDK already hardened the server side for the analogous case in PR #2302 (fix: handle non-UTF-8 bytes in stdio server stdin). That change explicitly preferred:

  • Replace invalid bytes with U+FFFD
  • Let JSON validation fail on the malformed line
  • Keep the transport alive so subsequent valid messages can still be processed
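
The three points above can be sketched as a line-oriented read loop. This is an illustration under stated assumptions, not the SDK's code: `read_messages` is a hypothetical name, and `json.loads` stands in for the SDK's JSON-RPC message validation.

```python
import json
from typing import Any, Union


def read_messages(raw: bytes) -> list[Union[dict[str, Any], Exception]]:
    """Decode stdout bytes and parse one JSON message per line."""
    # Invalid bytes become U+FFFD instead of raising UnicodeDecodeError.
    text = raw.decode("utf-8", errors="replace")
    items: list[Union[dict[str, Any], Exception]] = []
    for line in text.split("\n"):
        if not line.strip():
            continue
        try:
            items.append(json.loads(line))
        except ValueError as exc:
            # The malformed line is reported in-stream; the loop keeps going.
            items.append(exc)
    return items


items = read_messages(
    b"\xff\xfe\n" + b'{"jsonrpc": "2.0", "id": 1, "result": {}}\n'
)
```

The bad line yields an exception item and the following valid message is still delivered, which is the behavior this PR brings to the client side.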

The client side was still using "strict", creating an inconsistency.

Changes

src/mcp/client/stdio.py

  • Changed StdioServerParameters.encoding_error_handler default from "strict" to "replace"
  • Updated docstring to explain the rationale
  • Changed logger.exception to logger.warning for JSON parse failures (avoids noisy full tracebacks for expected validation errors on malformed input)
  • Removed # pragma: no cover from the now-reachable exception handling path
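
To illustrate the logging change, the following stdlib-only sketch captures what each call records. The logger name, the message text, and `ListHandler` are assumptions made for the example; only the difference between `logger.exception` and `logger.warning` is the point.

```python
import json
import logging


class ListHandler(logging.Handler):
    """Collect log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("stdio_sketch")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

try:
    json.loads("\ufffd\ufffd")  # a U+FFFD-replaced line is not valid JSON
except ValueError:
    # Old behavior: ERROR level with the full traceback attached.
    logger.exception("Failed to parse message")
    # New behavior: WARNING level, no traceback unless exc_info is passed.
    logger.warning("Failed to parse message")

exception_record, warning_record = handler.records
```

The first record carries `exc_info` and is logged at ERROR; the second is a plain WARNING, which is quieter for input that is expected to be malformed at times.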

tests/client/test_stdio.py

  • Added test_stdio_client_invalid_utf8_resilience — spawns a child that writes \xff\xfe\n (invalid UTF-8) followed by a valid JSON-RPC message, then asserts:
    • First item from read stream is an Exception (JSON validation failure on the U+FFFD-replaced line)
    • Second item is a valid SessionMessage with the expected payload
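
The shape of that regression test can be approximated without the SDK. The sketch below is not the test added in this PR: it spawns a child with `subprocess` and parses its output with `json.loads`, whereas the real test goes through stdio_client and its read stream. `CHILD_SOURCE` and `run_child` are hypothetical names.

```python
import json
import subprocess
import sys

# Child program: one invalid UTF-8 line, then one valid JSON-RPC message.
CHILD_SOURCE = r'''
import sys
sys.stdout.buffer.write(b"\xff\xfe\n")
sys.stdout.buffer.write(b'{"jsonrpc": "2.0", "id": 1, "result": {}}\n')
sys.stdout.buffer.flush()
'''


def run_child() -> bytes:
    """Run the child with the current interpreter and return raw stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        capture_output=True,
        check=True,
    )
    return completed.stdout


raw = run_child()
results = []
for line in raw.decode("utf-8", errors="replace").splitlines():
    try:
        results.append(json.loads(line))
    except ValueError as exc:
        # Mirrors the first assertion: the bad line becomes an Exception item.
        results.append(exc)
```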

Backward compatibility

  • Users who explicitly set encoding_error_handler="strict" retain the old behavior
  • The default change is safe: any code consuming the read stream already handles Exception items (the type signature is MemoryObjectReceiveStream[SessionMessage | Exception])
  • No new dependencies
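
For reference, handling Exception items on the consumer side typically looks like the sketch below. `FakeSessionMessage` and `handle_stream` are hypothetical stand-ins used only for illustration; a real consumer would iterate the read stream asynchronously and receive the SDK's SessionMessage objects.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Union


@dataclass
class FakeSessionMessage:
    """Hypothetical stand-in for the SDK's SessionMessage."""

    message: dict[str, Any]


def handle_stream(
    items: Iterable[Union[FakeSessionMessage, Exception]],
) -> tuple[list[dict[str, Any]], list[Exception]]:
    """Split stream items into handled payloads and reported errors."""
    handled: list[dict[str, Any]] = []
    errors: list[Exception] = []
    for item in items:
        if isinstance(item, Exception):
            # A parse failure is just another item; note it and move on.
            errors.append(item)
            continue
        handled.append(item.message)
    return handled, errors


handled, errors = handle_stream(
    [ValueError("bad line"), FakeSessionMessage({"jsonrpc": "2.0", "id": 1})]
)
```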

…F-8 resilience

The stdio_client transport previously defaulted to encoding_error_handler=
strict, causing the transport to crash when the child process emits
invalid UTF-8 bytes.  This is asymmetric with the server-side fix in PR
modelcontextprotocol#2302, which already uses errors=replace for stdio_server.

Changes:
- Default StdioServerParameters.encoding_error_handler to replace
- Invalid bytes are now substituted with U+FFFD and the resulting line
  fails JSON validation, surfacing as an in-stream Exception
- The transport stays alive for subsequent valid messages
- Changed logger.exception to logger.warning for parse failures (avoids
  noisy tracebacks for expected validation errors)
- Removed pragma: no cover from the now-reachable exception handling path

Add regression test that spawns a child emitting invalid UTF-8 followed by
a valid JSON-RPC message, asserting both are delivered correctly.

Fixes modelcontextprotocol#2454

Development

Successfully merging this pull request may close these issues.

stdio_client crashes on malformed UTF-8 from child stdout instead of surfacing parse error