Align OpenAI-compatible streaming reasoning_content with non-streaming responses #6373

Open
jewoodev wants to merge 1 commit into spring-projects:main from jewoodev:gh-5898-streaming-reasoning-content

@jewoodev jewoodev commented Jun 10, 2026

Contributor

Why

OpenAiChatModel already preserves provider reasoning metadata such as DeepSeek's reasoning_content on the non-streaming call() path, but the OpenAI-compatible streaming path drops the same fields before they can be captured in reasoningContent, leaving that metadata empty on streamed responses. ChunkMerger.chunkToChatCompletion rebuilds each streamed chunk's message from content/refusal/toolCalls only, so provider-specific delta fields such as DeepSeek's reasoning_content (or OpenRouter's reasoning) never reach the message that getReasoningContent reads. The next-turn replay in createRequest then has nothing to re-attach, so providers that require reasoning on assistant tool-call messages reject the request with "thinking is enabled but reasoning_content is missing in assistant tool call message".
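
To make the failure mode concrete, here is a minimal, self-contained sketch of how rebuilding a message from known fields only loses provider extras. The `Delta` and `Message` records and both `rebuild…` methods are hypothetical stand-ins written for this illustration; they are not the Spring AI or OpenAI SDK types, and the real `chunkToChatCompletion` is more involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DroppedFieldSketch {

    // Hypothetical stand-in for a streamed delta: one known field plus provider extras.
    public record Delta(String content, Map<String, Object> additionalProperties) {}

    // Hypothetical stand-in for the message rebuilt from a delta.
    public record Message(String content, Map<String, Object> additionalProperties) {}

    // Rebuilds from known fields only, so provider extras are silently lost.
    public static Message rebuildKnownFieldsOnly(Delta delta) {
        return new Message(delta.content(), Map.of());
    }

    // Rebuilds and copies the extras across, so reasoning fields stay readable.
    public static Message rebuildWithExtras(Delta delta) {
        return new Message(delta.content(), new LinkedHashMap<>(delta.additionalProperties()));
    }

    public static void main(String[] args) {
        Delta delta = new Delta("", Map.of("reasoning_content", "Check the weather tool first."));

        // prints "false": the reasoning field never reaches the rebuilt message
        System.out.println(rebuildKnownFieldsOnly(delta).additionalProperties().containsKey("reasoning_content"));

        // prints "Check the weather tool first."
        System.out.println(rebuildWithExtras(delta).additionalProperties().get("reasoning_content"));
    }
}
```
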

How

The fix stays inside OpenAiChatModel and has three parts. chunkToChatCompletion now carries the delta's additional properties onto the rebuilt message, matching the non-streaming path. mergeDeltas concatenates reasoning fragments across merged tool-call chunks. The streaming pipeline accumulates reasoning per choice, so the final emitted response holds the full reasoning content and survives the last-wins metadata aggregation that MessageAggregator-based advisors use to build the assistant message they re-inject on the next turn. As a consequence, intermediate streamed responses carry the reasoning accumulated so far rather than the per-chunk fragment (previously this metadata was always empty on the streaming path).
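
The interaction between per-choice accumulation and last-wins aggregation can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the `Chunk` record, `accumulate`, and `lastWins` are hypothetical names, and the real pipeline is reactive rather than list-based.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReasoningAccumulatorSketch {

    // Hypothetical stand-in for a streamed chunk: choice index plus an optional reasoning fragment.
    public record Chunk(int choiceIndex, String reasoningFragment) {}

    // For each incoming chunk, returns the reasoning accumulated so far for that chunk's choice.
    public static List<String> accumulate(List<Chunk> chunks) {
        Map<Integer, StringBuilder> perChoice = new HashMap<>();
        List<String> emitted = new ArrayList<>();
        for (Chunk chunk : chunks) {
            StringBuilder soFar = perChoice.computeIfAbsent(chunk.choiceIndex(), i -> new StringBuilder());
            String fragment = chunk.reasoningFragment();
            if (fragment != null && !fragment.isEmpty()) {
                soFar.append(fragment);
            }
            emitted.add(soFar.toString());
        }
        return emitted;
    }

    // Last-wins aggregation: only the final emitted value survives.
    public static String lastWins(List<String> values) {
        return values.isEmpty() ? null : values.get(values.size() - 1);
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(
                new Chunk(0, "Need the "),
                new Chunk(0, "weather tool."),
                new Chunk(0, null)); // e.g. a trailing tool-call chunk with no reasoning

        // prints "Need the weather tool."
        System.out.println(lastWins(accumulate(chunks)));
    }
}
```

If each response carried only its own fragment, the last value in this example would be the empty trailing chunk, and a last-wins aggregator would end up with no reasoning at all. Emitting the running total is what lets the final response carry the complete text.
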

Test

Unit tests reproduce the failure with mocked streaming chunks: reasoning followed by a split tool call (parameterized over DeepSeek's reasoning_content and OpenRouter's reasoning) and reasoning without tools. The new regression cases fail on main with empty reasoningContent and pass with this change. The tool-call test also replays the aggregated assistant message through createRequest and asserts the reasoning_content wire field is re-attached:

./mvnw test -pl models/spring-ai-openai -Dtest=OpenAiChatModelTests

The native DeepSeek starter loses reasoning at a different point (MessageAggregator flattening of DeepSeekAssistantMessage, #6026), which this PR intentionally does not touch.

See #5898

@jewoodev jewoodev changed the title Preserve streaming reasoning_content through chunk merge and aggregation Align OpenAI-compatible streaming reasoning_content with non-streaming responses Jun 10, 2026
@jewoodev jewoodev force-pushed the gh-5898-streaming-reasoning-content branch 2 times, most recently from 0b1fa9d to 293bfd9 Compare June 11, 2026 03:15
@jewoodev jewoodev changed the title Align OpenAI-compatible streaming reasoning_content with non-streaming responses Align OpenAI-compatible streaming reasoning_content with non-streaming responses Jun 11, 2026

private static String stringProperty(Delta delta, String key) {
JsonValue value = delta._additionalProperties().get(key);
return value == null ? "" : (String) value.asString().orElse("");

Member

I'm not very keen on transforming a null into "". If a value isn't there, it isn't there. Or does the downstream code rely on a value being present in all cases?

Contributor Author

Good point. The downstream replay path only needs the value when actual reasoning content is present, so missing provider fields do not need to be modeled as empty strings here. I updated the merge logic to leave missing fields absent and only concatenate/propagate reasoning fields when the incoming chunk carries a non-empty fragment.
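
One way to express "leave missing fields absent" is to return an `Optional` instead of an empty string, roughly as sketched below. This is an assumption about the shape of the change, not the updated code itself: it reads from a plain `Map<String, Object>` rather than the SDK's `JsonValue`, and `merge` is a hypothetical helper.

```java
import java.util.Map;
import java.util.Optional;

public class OptionalReasoningSketch {

    // Reads a string property; absent, non-string, and empty values all yield Optional.empty().
    public static Optional<String> stringProperty(Map<String, Object> additionalProperties, String key) {
        Object value = additionalProperties.get(key);
        return value instanceof String s && !s.isEmpty() ? Optional.of(s) : Optional.empty();
    }

    // Concatenates only when the incoming chunk carries a fragment; otherwise keeps the previous state.
    public static Optional<String> merge(Optional<String> previous, Optional<String> incoming) {
        if (incoming.isEmpty()) {
            return previous;
        }
        return Optional.of(previous.orElse("") + incoming.get());
    }

    public static void main(String[] args) {
        Optional<String> none = stringProperty(Map.of(), "reasoning_content");
        Optional<String> first = stringProperty(Map.of("reasoning_content", "Need "), "reasoning_content");
        Optional<String> second = stringProperty(Map.of("reasoning_content", "tools"), "reasoning_content");

        // prints "Optional.empty": a missing field stays missing
        System.out.println(merge(none, none));

        // prints "Optional[Need tools]"
        System.out.println(merge(merge(none, first), second));
    }
}
```
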

In the OpenAI-compatible streaming path, ChunkMerger.chunkToChatCompletion
rebuilt the message from content/refusal/toolCalls only, so
provider-specific delta fields such as reasoning_content (DeepSeek) or
reasoning (OpenRouter) were never copied onto the message that
getReasoningContent reads. The reasoningContent metadata was therefore
always empty in streaming, and the next-turn replay in createRequest had
nothing to re-attach, so providers that require reasoning on assistant
tool-call messages reject the replayed request.

- Copy delta additional properties onto the rebuilt message, matching
  the non-streaming path
- Concatenate reasoning fragments across merged tool-call chunks
- Accumulate reasoning across the stream so the final response carries
  the full reasoning content through last-wins metadata aggregation

See spring-projects#5898

Signed-off-by: jewoodev <jewoos15@naver.com>