Validate upstream JSON-RPC responses in transparent proxy#5288
Validate upstream JSON-RPC responses in transparent proxy#5288bishnubista wants to merge 7 commits into
Conversation
The transparent proxy forwarded malformed upstream MCP frames to clients with HTTP 200 even when the response violated JSON-RPC 2.0 structure. This adds a boundary check in NoOpResponseProcessor that rejects structurally invalid upstream frames and returns a synthetic 502 carrying a JSON-RPC error to the client, so the proxy stops being a silent amplifier for malformed (or adversarial) upstream servers. Validation runs only for streamable-http POST/200 responses that carry an MCP request signal (MCP-Protocol-Version or Mcp-Session-Id) and an application/json content type, with non-identity Content-Encoding traffic passed through untouched. Body reads are bounded to 100 MiB to match existing streamable-HTTP limits in pkg/vmcp. Rewritten error responses replace headers wholesale so upstream session/cookie/cache metadata is not smuggled into the proxy-generated error. SSE traffic is unaffected. Closes stacklok#5247 Signed-off-by: bishnubista <bista.developer@gmail.com>
4fcdb4b to
a7badd9
Compare
| if dec.More() { | ||
| return fmt.Errorf("JSON-RPC response must contain a single JSON value") | ||
| } | ||
| if err := dec.Decode(&struct{}{}); err != io.EOF { |
There was a problem hiding this comment.
The second Decode here is unreachable. When dec.More() returns false there is nothing left in the stream, so the subsequent Decode will always return io.EOF and the condition never fires. The More() check above already covers the trailing-value case completely. Nit, but worth removing the dead branch.
|
|
||
| // NoOpResponseProcessor is a processor that does nothing. | ||
| // Used for transports that don't require response processing (e.g., streamable-http). | ||
| // NoOpResponseProcessor is the default processor for non-SSE transports. |
There was a problem hiding this comment.
Scoping to non-SSE makes sense for now — SSE validation would need a fundamentally different approach. By the time ProcessResponse runs on an SSE response, the 200 OK and Content-Type: text/event-stream headers are already committed downstream, so there is no way to rewrite to 502 for a malformed mid-stream event. A proper fix would need a per-event streaming interceptor inside the SSE processor that can either synthesize an error event or close the stream — a meaningfully larger scope than this PR.
Worth opening a follow-up issue to track it explicitly, so it is clear the gap is deferred rather than forgotten.
…-jsonrpc-responses
Remove the top-level json.Decoder.More check from upstream JSON-RPC response validation. More is intended for array and object iteration, so using it as a top-level single-value guard can miss malformed trailing delimiter bytes. Keep the second Decode call as the exact-single-value check and compare its result with io.EOF via errors.Is. This rejects trailing JSON values and trailing syntax junk while accepting clean EOF after the first decoded response. Add a regression case for a valid JSON-RPC response followed by a stray closing delimiter so malformed trailing bytes cannot pass validation.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5288 +/- ##
==========================================
- Coverage 68.81% 68.79% -0.03%
==========================================
Files 627 627
Lines 63594 63700 +106
==========================================
+ Hits 43762 43822 +60
- Misses 16583 16618 +35
- Partials 3249 3260 +11 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
We have middleware that does the validation and parsing. This happens outside of the transparent proxy, but still in the path. Why not use the existing middleware instead? |
Move strict JSON-RPC response body validation into pkg/mcp so the transparent proxy reuses shared MCP parsing code instead of carrying a private copy. Use parsed MCP request context from the parser middleware as the primary signal for response validation, while preserving protocol/session header fallback for batch and compatibility cases that the request parser does not currently cover. Keep the transparent proxy's ModifyResponse path responsible for rejecting malformed upstream responses with the existing bounded read and 502 rewrite.
…-jsonrpc-responses
Good point. I agree the JSON-RPC validation logic should not live as bespoke protocol code in I traced the path again: the existing MCP parser middleware runs before the transparent proxy and parses inbound requests into context, but I didn’t find an existing generic response validator that can inspect upstream response bodies and rewrite them before they reach the client. The response-side parsing I found is authz-specific or audit-only/lenient. I’ll adjust this by moving the response validation helper into |
Summary
The transparent proxy (
thv run) currently forwards upstream MCP server responses to clients with HTTP 200 even when the JSON-RPC frame is structurally invalid — missingjsonrpc:"2.0", invalididtype, non-object body, etc. The proxy sits between potentially untrusted upstream MCP servers and trusted clients, so a compromised or misconfigured upstream can push malformed frames straight through, where they may crash strict JSON-RPC parsers, be misinterpreted as legitimate responses, or trigger undefined behaviour in client-side parsers.This PR validates upstream JSON-RPC frames at the proxy boundary and rewrites invalid responses into a structured JSON-RPC error so the proxy stops being a silent amplifier for malformed (or adversarial) upstream servers.
NoOpResponseProcessor.ProcessResponse(streamable-http/ default transport). SSE is unaffected.jsonrpc=="2.0",id∈ {string, number, null}, exactly one ofresult/error;error.codemust be an integer,error.messagemust be a string.POST+200,application/json(-rpc) media type, non-identityContent-Encodingpassed through untouched, request must carry an MCP streamable-HTTP signal (MCP-Protocol-VersionorMcp-Session-Id) so non-MCP JSON traffic flowing through the catch-all is not rewritten.{"jsonrpc":"2.0","id":null,"error":{"code":-32000,"message":"Invalid upstream JSON-RPC response","data":"<detail>"}}.-32000is the JSON-RPC implementation-defined server-error code;-32603is reserved for internal JSON-RPC errors. Response headers are replaced wholesale so upstreamMcp-Session-Id,Set-Cookie,ETag,Cache-Control, etc. are not smuggled into the proxy-generated error.io.LimitReaderatmaxJSONRPCResponseBytes = 100 << 20(100 MiB), matching streamable-HTTP precedent inpkg/vmcp/clientandpkg/vmcp/session/internal/backend. Oversized responses are rejected with the same 502 shape so the proxy cannot be amplified into a memory DoS.Closes #5247
Type of change
Test plan
task test)task test-e2e)task lint-fix)Local verification:
go test ./pkg/transport/proxy/transparent/... -count=1— pass.golangci-lint run --timeout=5m ./pkg/transport/proxy/transparent/...— 0 issues.New test cases in
pkg/transport/proxy/transparent/response_processor_test.go:result: null,application/json; charset=utf-8.jsonrpc, invalididtype, non-object body,result+errorboth present, trailing JSON, fractionalerror.code.text/event-stream,application/jsonsomethingelse.Content-Encoding: gzip) passed through unchanged for both valid and malformed payloads; explicitContent-Encoding: identitystill validates.MCP-Protocol-VersionorMcp-Session-Id→ validate.Mcp-Session-Id,Set-Cookie,Etag,Cache-Controlfrom the response.maxJSONRPCResponseBytes) is rejected with a size-limit error.API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.No CRD or operator API surface is touched.
Does this introduce a user-facing change?
Yes. Clients of the transparent proxy that previously received malformed upstream responses verbatim with HTTP 200 will now receive HTTP 502 with a synthetic JSON-RPC error body. Conformant upstream MCP servers are unaffected — only structurally invalid responses are rewritten.
Special notes for reviewers
Two scope decisions called out explicitly so reviewers can push back if either is wrong for the project:
Validation runs in
ModifyResponse, aftertracingTransport.RoundTripmay have observed an upstreamMcp-Session-Idand registered proxy-side session state. This PR strips upstream session headers from the proxy-generated 502 so clients never receive a session id derived from a malformed initialize response, but it does not roll back server-side proxy session state created before validation. Moving validation earlier (or adding a rollback path for invalid initialize responses) touchestracingTransportand felt outside the scope of "structurally validate the upstream frame." Happy to open a follow-up — or fold it into this PR if maintainers prefer.Backward-compat clients that POST without
MCP-Protocol-Versionon the very first initialize will not trigger validation — same as today's behaviour, so no regression. If broader coverage is preferred,tracingTransport.RoundTripalready buffers and parses the request body to detectinitialize; a small context-propagated marker would letProcessResponsevalidate those frames too. Easy follow-up.Open to feedback on:
pkg/vmcpprecedent; can tighten to 10 MiB if maintainers prefer an explicit security default).