fix(mcp): Fix legacy MCP SSE connections#2301
Conversation
There was a problem hiding this comment.
Code Review
This pull request transitions MCP JSON-RPC request IDs from numeric values to strings to improve compatibility with various gateways, while maintaining support for numeric echoes. It also routes legacy SSE endpoint URLs directly to SseTransport instead of HttpTransport. The review feedback highlights a potential issue where custom headers (such as authentication) are ignored when bypassing HttpTransport for legacy SSE connections. Additionally, several performance optimizations are suggested to pass references instead of cloning string IDs when constructing JSON payloads.
| if is_legacy_sse_endpoint_url(url) { | ||
| Box::new( | ||
| SseTransport::connect( | ||
| client, | ||
| url.clone(), | ||
| cancel_token.clone(), | ||
| Duration::from_secs(connect_timeout_secs), | ||
| ) | ||
| .await?, | ||
| ) |
There was a problem hiding this comment.
By bypassing HttpTransport and directly connecting via SseTransport::connect for legacy SSE URLs, any custom headers configured in config.headers (such as Authorization or API keys) will be completely ignored. SseTransport currently does not accept or apply custom headers in its connection loop or message sending. If a legacy SSE server is behind an API gateway or requires authentication headers, the connection will fail. Consider updating SseTransport to accept and apply these custom headers.
| self.send(serde_json::json!({ | ||
| "jsonrpc": "2.0", | ||
| "id": init_id, | ||
| "id": init_id.clone(), |
There was a problem hiding this comment.
You can avoid cloning init_id here by passing a reference &init_id to the serde_json::json! macro. Since serde_json::json! only needs to serialize the value, a reference is sufficient and avoids an unnecessary heap allocation. This also leaves init_id owned and available to be passed to self.recv(init_id) later without cloning.
| "id": init_id.clone(), | |
| "id": &init_id, |
| self.send(serde_json::json!({ | ||
| "jsonrpc": "2.0", | ||
| "id": list_id, | ||
| "id": list_id.clone(), |
| self.send(serde_json::json!({ | ||
| "jsonrpc": "2.0", | ||
| "id": list_id, | ||
| "id": list_id.clone(), |
| self.send(serde_json::json!({ | ||
| "jsonrpc": "2.0", | ||
| "id": list_id, | ||
| "id": list_id.clone(), |
| self.send(serde_json::json!({ | ||
| "jsonrpc": "2.0", | ||
| "id": list_id, | ||
| "id": list_id.clone(), |
| self.send(serde_json::json!({ | ||
| "jsonrpc": "2.0", | ||
| "id": call_id, | ||
| "id": call_id.clone(), |
|
Addressed the review feedback in
Verification:
|
| if response_id_matches(value.get("id"), &expected_id) { | ||
| if let Some(error) = value.get("error") { | ||
| if is_mcp_stale_session_body(&error.to_string()) { | ||
| anyhow::bail!("MCP session expired: {error}"); | ||
| } | ||
| } | ||
| return Ok(value); |
There was a problem hiding this comment.
Stale-session check fires on application-level tool errors
error.to_string() serializes the full JSON-RPC error object (e.g. {"code":-32000,"message":"The 'session_id' field is invalid"}). is_mcp_stale_session_body then checks the entire JSON string for "session" + "invalid/expired", so it will match whenever a parameter name, field key, or message text happens to contain those words — even when the error is a domain-level tool failure with no relation to the MCP transport session.
When that false-positive fires the recv call bails with "MCP session expired", McpPool teardown/reconnects, retries the tool call, gets the same error again, and finally propagates it — masking the real error message on the first attempt and adding an unnecessary reconnect on every call to that tool.
| if response_id_matches(value.get("id"), &expected_id) { | |
| if let Some(error) = value.get("error") { | |
| if is_mcp_stale_session_body(&error.to_string()) { | |
| anyhow::bail!("MCP session expired: {error}"); | |
| } | |
| } | |
| return Ok(value); | |
| if response_id_matches(value.get("id"), &expected_id) { | |
| if let Some(error) = value.get("error") { | |
| // Only inspect the human-readable `message` field, not | |
| // the full serialised JSON. Serialising the entire error | |
| // object means any parameter name or field key that | |
| // contains "session_id", "invalid_session", etc. would | |
| // be mistaken for a transport-level session expiry. | |
| let msg = error | |
| .get("message") | |
| .and_then(serde_json::Value::as_str) | |
| .unwrap_or(""); | |
| if is_mcp_stale_session_body(msg) { | |
| anyhow::bail!("MCP session expired: {error}"); | |
| } | |
| } | |
| return Ok(value); |
Summary
Bugfix
This fixes legacy SSE MCP servers that expose an initial GET /sse stream with an endpoint event, but reject direct Streamable HTTP POSTs to /sse. One observed failure was a 403 api-key not found response from the wrong POST path even though the correct SSE flow does not require an API key.
Verification
Greptile Summary
This PR fixes legacy MCP SSE connections by replacing the old URL path heuristic (
/ssesuffix detection) with an explicittransport: \"sse\"config field and CLI flag, and addresses the previously noted custom-header forwarding gap for SSE transports. It also sends JSON-RPC request IDs as strings (accepting numeric echoes for backwards compatibility), and adds stale-session detection with a single reconnect-and-retry inMcpPool.McpServerConfiggains atransportfield;/mcp add sseand--transport ssewrite it;is_legacy_sse_transportreads it. Streamable-HTTP servers whose URL ends in/sseare no longer misrouted.SseTransport::connectnow accepts and forwardsheadersto both the initialGET /sseand the subsequentPOST, fixing silent header loss for auth-gated legacy endpoints.StreamableSendError::StaleSessionis added;McpPool::call_toolreconnects and retries once on transport-level session expiry (404/session-keyword HTTP responses and closed SSE streams).Confidence Score: 4/5
Safe to merge after addressing the stale-session false-positive in recv(); all other changes are well-tested and targeted.
The recv() function now serialises the full JSON-RPC error object to a string before checking for session-expiry keywords. Any tool that returns an error whose parameter name or message text contains 'session' and 'invalid' or 'expired' (common in auth-heavy APIs) will have that error silently reclassified as a transport session expiry, triggering an unnecessary reconnect, swallowing the original error message on the first attempt, and surfacing it only on the second call.
crates/tui/src/mcp.rs — specifically the recv() stale-session body check and the related is_mcp_stale_session_body helper reuse.
Important Files Changed
Sequence Diagram
sequenceDiagram participant App participant McpPool participant McpConnection participant Transport Note over App,Transport: Legacy SSE path (transport = "sse") App->>McpPool: call_tool(name, args) McpPool->>McpConnection: "connect() [is_legacy_sse_transport=true]" McpConnection->>Transport: SseTransport::connect(url, headers) Transport->>Transport: GET /sse (with custom headers) Transport-->>McpConnection: endpoint event → endpoint_url McpPool->>McpConnection: call_tool → send + recv McpConnection->>Transport: POST endpoint_url (with custom headers) Note over App,Transport: Streamable HTTP path (transport = None) App->>McpPool: call_tool(name, args) McpPool->>McpConnection: "connect() [is_legacy_sse_transport=false]" McpConnection->>Transport: HttpTransport::new → try_establish_session GET Transport-->>McpConnection: Mcp-Session-Id header (optional) McpPool->>McpConnection: call_tool → send + recv McpConnection->>Transport: POST /url (session-id if known) alt Stale session (404 / session-keyword body) Transport-->>McpConnection: StaleSession error McpConnection-->>McpPool: MCP Streamable HTTP session expired McpPool->>McpPool: connections.remove + get_or_connect (reconnect) McpPool->>McpConnection: call_tool retry endReviews (5): Last reviewed commit: "Retry MCP calls after stale SSE connecti..." | Re-trigger Greptile