fix(claude-sdk): emit token-by-token streaming to the client #720
peterus wants to merge 2 commits into
Three coordinated changes so the SDK's partial output actually reaches the UI:

1. server/claude-sdk.js: set sdkOptions.includePartialMessages = true so the Claude Agent SDK emits SDKPartialAssistantMessage events (stream_event) alongside the consolidated assistant messages.
2. claude-sessions.provider.ts: unwrap stream_event envelopes (event.type === 'content_block_delta'/'content_block_stop') into the existing stream_delta / stream_end normalized kinds. The previous flat-event branch never matched the real SDK shape.
3. server/claude-sdk.js: drop text parts from the consolidated assistant message once they were streamed for the same turn; otherwise the client would render both the streamed buffer (finalized by stream_end) and a duplicate text message. Tool-use and thinking blocks pass through unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Partial-message streaming is conditionally enabled for WebSocket/SSE "writer" consumers via
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client as Writer (WS/SSE)
participant Server as App Server
participant SDK as Claude SDK
participant Provider as Anthropic Normalizer
Client->>Server: open writer session / subscribe
Server->>SDK: enable includePartialMessages for writer
SDK->>Provider: receive stream_event(content_block_delta with text_delta)
Provider-->>Server: emit stream_delta (text fragment)
Server-->>Client: send streamed text fragments
SDK->>Provider: receive stream_event(content_block_stop)
Provider-->>Server: emit stream_end
SDK->>Server: send consolidated assistant message
Server->>Server: detect textWasStreamed -> strip 'text' parts from consolidated payload
Server-->>Client: send consolidated message without duplicate text
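
The client half of the diagram can be sketched roughly as follows. This is a minimal illustration, not the project's UI code: `createStreamRenderer`, the `rendered` array, and the event field names (`kind`, `text`, `content`) are hypothetical; only the `stream_delta` / `stream_end` kinds and the idea of a text-free consolidated message come from the PR.

```javascript
// Hypothetical client-side reducer. Only the event kinds mirror the diagram;
// every other name is an assumption made for illustration.
function createStreamRenderer() {
  const rendered = []; // finalized items, in display order
  let buffer = '';     // text accumulated from stream_delta fragments

  return {
    rendered,
    handle(event) {
      if (event.kind === 'stream_delta') {
        // Append each streamed fragment to the in-progress buffer.
        buffer += event.text;
      } else if (event.kind === 'stream_end') {
        // Finalize the streamed buffer exactly once.
        if (buffer) rendered.push({ type: 'text', text: buffer });
        buffer = '';
      } else if (event.kind === 'assistant') {
        // Consolidated message: text parts are assumed to have been stripped
        // server-side, so only non-text blocks are expected here.
        for (const part of event.content) rendered.push(part);
      }
    },
  };
}

const renderer = createStreamRenderer();
[
  { kind: 'stream_delta', text: 'Hel' },
  { kind: 'stream_delta', text: 'lo' },
  { kind: 'stream_end' },
  { kind: 'assistant', content: [{ type: 'tool_use', name: 'Read' }] },
].forEach((event) => renderer.handle(event));

console.log(renderer.rendered);
```

If the server did not strip `text` parts, the last event would carry the full text again and this reducer would render it twice, which is the duplicate the third change is meant to prevent.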
🚥 Pre-merge checks: ✅ 5 passed
Pull request overview
Enables true token-by-token streaming for the Claude provider by turning on partial message emission in the Claude Agent SDK and wiring the SDK’s streaming event shape into the existing stream_delta / stream_end pipeline while preventing duplicate final text rendering.
Changes:
- Enable `includePartialMessages` in the Claude Agent SDK options to emit partial assistant output events.
- Normalize SDK `stream_event` envelopes into existing `stream_delta` / `stream_end` messages in the Claude sessions provider.
- Deduplicate streamed text by removing `text` parts from the consolidated assistant message after streaming has occurred.
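
The envelope unwrapping in the second item could look something like the sketch below. It is an assumption-laden illustration rather than the provider's actual code: `normalizeStreamEvent` and the output field `content` are hypothetical, and the body of `readObjectRecord` is a stand-in for the helper the PR mentions by name.

```javascript
// Stand-in for the provider's readObjectRecord helper (body is assumed):
// returns the value only if it is a plain, non-array object.
function readObjectRecord(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value)
    ? value
    : null;
}

// Hypothetical normalizer: maps an SDK stream_event envelope onto the
// stream_delta / stream_end kinds, and returns null for anything else.
function normalizeStreamEvent(raw) {
  const envelope = readObjectRecord(raw);
  if (!envelope || envelope.type !== 'stream_event') return null;

  const event = readObjectRecord(envelope.event);
  if (!event) return null;

  if (event.type === 'content_block_delta') {
    const delta = readObjectRecord(event.delta);
    // Only text deltas are forwarded as streamed text fragments.
    if (delta && delta.type === 'text_delta' && typeof delta.text === 'string') {
      return { kind: 'stream_delta', content: delta.text };
    }
    return null;
  }

  if (event.type === 'content_block_stop') {
    return { kind: 'stream_end' };
  }

  return null;
}

const wrapped = {
  type: 'stream_event',
  event: { type: 'content_block_delta', delta: { type: 'text_delta', text: 'Hi' } },
};
// A flat event, the shape the old branch looked for, is not an envelope.
const flat = { type: 'content_block_delta', delta: { type: 'text_delta', text: 'Hi' } };

console.log(normalizeStreamEvent(wrapped));
console.log(normalizeStreamEvent(flat));
```

The flat event falls through to `null` here because this sketch only accepts the wrapped shape; the old branch had the opposite problem of only matching the flat shape.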
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| server/claude-sdk.js | Enables partial message streaming and strips consolidated assistant text to prevent duplicate rendering. |
| server/modules/providers/list/claude/claude-sessions.provider.ts | Unwraps SDK stream_event envelopes into normalized streaming events. |
…owing

- Only enable includePartialMessages and the consolidated-text stripping for streaming writers (WebSocket, SSE). ResponseCollector and the git commit-message generator call queryClaudeSDK without streaming and rely on the consolidated assistant text payload.
- Use readObjectRecord for stream_event and event.delta narrowing, matching the rest of the provider's defensive parsing.

Addresses Copilot review feedback on siteboon#720.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
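
One way to picture the writer gating described in this commit is the sketch below. How the real code detects a streaming writer is not shown in this thread, so the `supportsStreaming` property and the `buildSdkOptions` helper are purely hypothetical; only `includePartialMessages` and the `writerStreams` name appear in the PR.

```javascript
// Hypothetical helper: enables partial messages only for streaming writers.
// `supportsStreaming` is an assumed marker, not a property from the PR.
function buildSdkOptions(baseOptions, writer) {
  const writerStreams = Boolean(writer && writer.supportsStreaming);
  const sdkOptions = { ...baseOptions };

  if (writerStreams) {
    // Streaming consumers (WebSocket, SSE) receive stream_event messages.
    sdkOptions.includePartialMessages = true;
  }
  // Non-streaming callers keep the default and rely on the consolidated
  // assistant text payload.
  return { sdkOptions, writerStreams };
}

const streaming = buildSdkOptions({ model: 'example' }, { supportsStreaming: true });
const collector = buildSdkOptions({ model: 'example' }, { supportsStreaming: false });

console.log(streaming.sdkOptions, collector.sdkOptions);
```

Returning `writerStreams` alongside the options lets the same flag gate the consolidated-text stripping later, so both behaviours switch on and off together.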
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/claude-sdk.js`:
- Around line 687-699: The duplicate-suppression currently only runs when
message.message?.content is an array, so textWasStreamed can remain true for
non-array assistant consolidations; update the consolidation logic around
writerStreams/textWasStreamed to always clear streamed text and reset
textWasStreamed regardless of payload shape: inspect message.type ===
'assistant' and message.message?.content, if it's an array filter out parts with
part.type === 'text' (as done now), and if it's a single content object handle
the single-object case (remove or replace the text content) before assigning
transformedMessage; in all cases ensure textWasStreamed = false after the
assistant consolidation step so duplicate suppression is reset (refer to
writerStreams, textWasStreamed, message.message?.content, transformedMessage).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dc9798d1-bda9-4d3d-bc39-2d38484ebab4
📒 Files selected for processing (2)
- server/claude-sdk.js
- server/modules/providers/list/claude/claude-sessions.provider.ts
if (
  writerStreams &&
  textWasStreamed &&
  message.type === 'assistant' &&
  Array.isArray(message.message?.content)
) {
  const filtered = message.message.content.filter((part) => part.type !== 'text');
  transformedMessage = {
    ...transformedMessage,
    message: { ...message.message, content: filtered },
  };
  textWasStreamed = false;
}
Make duplicate-suppression reset independent of array-only assistant payloads.
If streamed text is seen but the consolidated assistant payload is not an array, this branch won’t clear duplicate text and won’t reset textWasStreamed. Handle both payload shapes and always reset after the assistant consolidation step.
💡 Suggested hardening patch
- if (
-   writerStreams &&
-   textWasStreamed &&
-   message.type === 'assistant' &&
-   Array.isArray(message.message?.content)
- ) {
-   const filtered = message.message.content.filter((part) => part.type !== 'text');
-   transformedMessage = {
-     ...transformedMessage,
-     message: { ...message.message, content: filtered },
-   };
-   textWasStreamed = false;
- }
+ if (
+   writerStreams &&
+   textWasStreamed &&
+   message.type === 'assistant' &&
+   message.message
+ ) {
+   const content = message.message.content;
+   if (Array.isArray(content)) {
+     const filtered = content.filter((part) => part.type !== 'text');
+     transformedMessage = {
+       ...transformedMessage,
+       message: { ...message.message, content: filtered },
+     };
+   } else if (typeof content === 'string') {
+     transformedMessage = {
+       ...transformedMessage,
+       message: { ...message.message, content: [] },
+     };
+   }
+   textWasStreamed = false;
+ }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server/claude-sdk.js` around lines 687 - 699, The duplicate-suppression
currently only runs when message.message?.content is an array, so
textWasStreamed can remain true for non-array assistant consolidations; update
the consolidation logic around writerStreams/textWasStreamed to always clear
streamed text and reset textWasStreamed regardless of payload shape: inspect
message.type === 'assistant' and message.message?.content, if it's an array
filter out parts with part.type === 'text' (as done now), and if it's a single
content object handle the single-object case (remove or replace the text
content) before assigning transformedMessage; in all cases ensure
textWasStreamed = false after the assistant consolidation step so duplicate
suppression is reset (refer to writerStreams, textWasStreamed,
message.message?.content, transformedMessage).
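
For reasoning about the reset behaviour in isolation, the suggested hardening can be restated as a standalone function. This is a sketch under assumptions, not code from `server/claude-sdk.js`: `stripStreamedText` and the `state` object are hypothetical, and the string branch mirrors the reviewer's suggestion of replacing string content with an empty array.

```javascript
// Hypothetical restatement of the suggested patch. `state` holds the
// writerStreams and textWasStreamed flags that the real code keeps as locals.
function stripStreamedText(message, state) {
  if (!state.writerStreams || !state.textWasStreamed) return message;
  if (message.type !== 'assistant' || !message.message) return message;

  // Reset regardless of payload shape so the flag cannot leak into
  // the next turn.
  state.textWasStreamed = false;

  const content = message.message.content;
  if (Array.isArray(content)) {
    // Keep tool_use and thinking blocks; drop already-streamed text parts.
    const filtered = content.filter((part) => part.type !== 'text');
    return { ...message, message: { ...message.message, content: filtered } };
  }
  if (typeof content === 'string') {
    // A plain string payload is entirely text, so nothing remains.
    return { ...message, message: { ...message.message, content: [] } };
  }
  return message;
}

const state = { writerStreams: true, textWasStreamed: true };
const consolidated = {
  type: 'assistant',
  message: {
    content: [
      { type: 'text', text: 'Hello' },
      { type: 'tool_use', name: 'Read' },
    ],
  },
};

const stripped = stripStreamedText(consolidated, state);
console.log(stripped.message.content, state.textWasStreamed);
```

Because the function returns a new object instead of mutating `consolidated`, the original SDK message stays intact for any other consumer that still needs the full text.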
Summary
Assistant responses arrive as a single block instead of streaming token by token. Three coordinated changes are needed to make streaming actually flow through the existing `stream_delta` / `stream_end` pipeline:

1. `server/claude-sdk.js`: set `sdkOptions.includePartialMessages = true` so the Claude Agent SDK emits `SDKPartialAssistantMessage` events during a turn. Without this flag the SDK only emits the final consolidated assistant message.
2. `server/modules/providers/list/claude/claude-sessions.provider.ts`: unwrap the `{ type: 'stream_event', event: { ... } }` envelope into the existing `stream_delta` / `stream_end` normalized kinds. The previous flat-event branch (`raw.type === 'content_block_delta'`) never matched the real SDK shape and was effectively dead code.
3. `server/claude-sdk.js`: once text was streamed for a turn, strip `text` parts from the consolidated `SDKAssistantMessage` so the client does not render both the streamed buffer (finalized by `stream_end`) and a duplicate text message. `tool_use` and `thinking` blocks pass through unchanged so tool grouping and thinking rendering still work.

Splitting these into separate PRs would leave intermediate states broken (flag-only would emit unhandled stream events; provider-only would have no stream events to unwrap; either without the dedup would double-render the response).
Test plan
- … (`Read`, `Bash`) and is grouped under its parent assistant message
- … (`message.type === 'result'` path unchanged)

🤖 Generated with Claude Code