Strip orphan tool_calls from chat memory in streaming tool-call turns#6356

Open
jewoodev wants to merge 1 commit into
spring-projects:mainfrom
jewoodev:fix/6340-strip-orphan-tool-calls-memory
@jewoodev jewoodev commented Jun 9, 2026

With stream() + ToolCallingAdvisor.streamToolCallResponses(true) + MessageChatMemoryAdvisor, a tool-calling turn persists a single AssistantMessage that carries both the final text and the intermediate tool_calls. Replaying that history on the next turn can fail on OpenAI-compatible backends with HTTP 400 ("An assistant message with 'tool_calls' must be followed by tool messages").
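To make the failure mode concrete, here is a minimal sketch of the replay constraint that the 400 error message describes. The `Msg` record and `firstOrphanIndex` helper are hypothetical and are not Spring AI or OpenAI types; the check only approximates what an OpenAI-compatible backend enforces.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ReplayHistoryCheck {

    // Hypothetical, simplified message shape (assumption, not a Spring AI type).
    // toolCallIds: ids requested by an assistant message (empty otherwise).
    // toolCallId:  id answered by a tool message (null otherwise).
    record Msg(String role, List<String> toolCallIds, String toolCallId) {}

    // Returns the index of the first assistant message whose tool_calls are not
    // all answered by the tool messages immediately following it, or -1 if none.
    static int firstOrphanIndex(List<Msg> history) {
        for (int i = 0; i < history.size(); i++) {
            Msg m = history.get(i);
            if (!"assistant".equals(m.role()) || m.toolCallIds().isEmpty()) {
                continue;
            }
            Set<String> pending = new HashSet<>(m.toolCallIds());
            int j = i + 1;
            // Consume the run of tool messages directly after the assistant message.
            while (j < history.size() && "tool".equals(history.get(j).role())) {
                pending.remove(history.get(j).toolCallId());
                j++;
            }
            if (!pending.isEmpty()) {
                return i;
            }
        }
        return -1;
    }
}
```

Under this model, the history persisted by the buggy path (user, then assistant with text and tool_calls, then the next user message) is flagged at the assistant message, because no tool message follows it.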

MessageChatMemoryAdvisor (order HIGHEST_PRECEDENCE + 200) sits outside ToolCallingAdvisor (+300). With streamToolCallResponses(true), the multi-round flux (the tool-call round, concatWith the recursive post-tool text round) therefore reaches the memory advisor and is folded through a single ChatClientMessageAggregator. The aggregator is built to fold one response stream into one AssistantMessage, so it merges the two rounds into AssistantMessage(text + tool_calls), and after() persists that message verbatim. Its tool_calls are orphaned because no ToolResponseMessage follows it in memory.
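The merging effect can be illustrated with a deliberately simplified model of a single-stream fold. `Chunk`, `Folded` and `fold` are hypothetical names for this sketch; the real ChatClientMessageAggregator works on ChatClientResponse elements of a Flux and does more than this.

```java
import java.util.ArrayList;
import java.util.List;

final class FoldSketch {

    // Hypothetical stand-ins for streamed response chunks and the folded result.
    record Chunk(String textDelta, List<String> toolCalls) {}

    record Folded(String text, List<String> toolCalls) {}

    // Folds one stream into one message: text deltas are concatenated and
    // tool calls are collected, with no notion of round boundaries.
    static Folded fold(List<Chunk> stream) {
        StringBuilder text = new StringBuilder();
        List<String> calls = new ArrayList<>();
        for (Chunk c : stream) {
            text.append(c.textDelta());
            calls.addAll(c.toolCalls());
        }
        return new Folded(text.toString(), List.copyOf(calls));
    }
}
```

Feeding this fold a tool-call chunk followed by the text chunks of the post-tool round yields one result that holds both the text and the tool call, which is the shape of the orphaned AssistantMessage described above.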

This PR sanitizes each assistant message before it is persisted as cross-turn history, as a framework default. An aggregated/base assistant message that still carries text keeps its text, metadata and media and loses only its tool_calls; a pure tool-call frame with no text is an intra-turn artifact and is dropped entirely. The tool-call round-trip is intra-turn structure owned by the tool-calling loop, not long-term conversation history.
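The sanitization rule described above can be sketched as follows. This is not the code from the PR: `Assistant`, `ToolCall`, `sanitize` and `sanitizeAll` are hypothetical stand-ins for illustration, and treating a blank string as "no text" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class OrphanToolCallSanitizer {

    // Hypothetical stand-ins for the real message types (assumption).
    record ToolCall(String id, String name, String arguments) {}

    record Assistant(String text, Map<String, Object> metadata,
                     List<String> media, List<ToolCall> toolCalls) {}

    // Decides what, if anything, is persisted for one assistant message.
    static Optional<Assistant> sanitize(Assistant m) {
        if (m.toolCalls().isEmpty()) {
            return Optional.of(m); // no tool_calls: store unchanged
        }
        if (m.text() == null || m.text().isBlank()) {
            return Optional.empty(); // pure tool-call frame: drop entirely
        }
        // Text present: keep text, metadata and media, drop only tool_calls.
        return Optional.of(new Assistant(m.text(), m.metadata(), m.media(), List.of()));
    }

    // Applies the rule to every message about to be written to memory.
    static List<Assistant> sanitizeAll(List<Assistant> messages) {
        List<Assistant> out = new ArrayList<>();
        for (Assistant m : messages) {
            sanitize(m).ifPresent(out::add);
        }
        return out;
    }
}
```

Returning an Optional keeps the "strip" and "drop" outcomes distinct, which mirrors the separate cases the unit tests in this PR pin down.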

Verified with a unit-level reproduction that folds a two-round stream through the real ChatClientMessageAggregator and asserts that the persisted memory contains no orphan tool_calls, plus unit tests pinning the text/metadata/media preservation and the pure-frame drop:

./mvnw test -pl spring-ai-client-chat -am -Dtest=MessageChatMemoryAdvisorTests -Dsurefire.failIfNoSpecifiedTests=false

Closes #6340

When stream() is combined with
ToolCallingAdvisor.streamToolCallResponses(true), the MessageAggregator
folds the tool-call round and the recursive post-tool text round into a
single AssistantMessage(text + tool_calls). MessageChatMemoryAdvisor.after()
then persisted that message verbatim. Its tool_calls are orphaned in
memory (there is no following ToolResponseMessage), so replaying the
history on the next turn can fail on OpenAI-compatible backends with HTTP
400 ("An assistant message with 'tool_calls' must be followed by tool
messages").

Fix: sanitize each assistant message before it is persisted as
cross-turn history. For an aggregated/base assistant message that still
carries text, keep the text, metadata and media and drop only the
tool_calls; a pure tool-call frame with no text is an intra-turn
artifact and is dropped entirely. The tool-call round-trip is intra-turn
structure owned by the tool-calling loop, not long-term conversation
history.

Adds unit tests covering MessageChatMemoryAdvisor persistence behavior:
- tool_calls are stripped while the assistant text is kept
- text, metadata and media are preserved when tool_calls are stripped
- a pure tool-call frame is dropped even when it carries metadata/media
- an assistant message with no tool_calls is stored unchanged
- a unit-level reproduction through the real ChatClientMessageAggregator
  folding path (mocked StreamAdvisorChain) asserting replay-clean memory

Closes spring-projects#6340

Signed-off-by: jewoodev <jewoos15@naver.com>
@jewoodev jewoodev changed the title Strip orphan tool_calls from chat memory in streaming tool-call turns Strip orphan tool_calls from chat memory in streaming tool-call turns Jun 11, 2026

Development

Successfully merging this pull request may close these issues.

MessageChatMemoryAdvisor persists malformed history on streaming + tool calls (next turn Bad Request 400)
