feat(chat): redesign smart_summary as Minimal Trail with phase-specific history #3131
Merged
Track the effective search query used (original or regenerated) and set it on the assistant ChatMessage via setSearchQuery(), mirroring the streamChatEnhanced flow (Task 2). SUMMARY and UNCLEAR intents leave the field null, which is correct.
…mart_summary redesign
… split
Add two new protected methods and two private helpers to ChatClient that shape conversation history differently for the Intent Detection and Answer Generation prompts. In smart_summary mode, Intent uses paired (user, assistant) rendered lines while Answer uses Q-prefixed combined lines; other modes delegate to the existing buildAssistantHistoryContent logic.
Narrow the catch block in getHistoryTitlesMaxCount to NumberFormatException only, and fix the 4 Task-6 test config overrides to return defaultValue directly instead of delegating to super.getOrDefault (which NPEs because SimpleImpl.prop is uninitialized in test context).
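The narrowed catch described above could look roughly like the sketch below. `HistoryConfigSketch` and its map-backed constructor are hypothetical stand-ins for the real config class, and falling back to the default on a malformed value is an assumption; only the property name and the default of 5 come from this PR.

```java
import java.util.Map;

// Hypothetical stand-in for the real config class; only the parsing pattern is illustrated.
final class HistoryConfigSketch {
    // Default value stated in this PR for rag.chat.history.titles.max.count.
    static final int DEFAULT_TITLES_MAX_COUNT = 5;

    private final Map<String, String> props;

    HistoryConfigSketch(Map<String, String> props) {
        this.props = props;
    }

    int getHistoryTitlesMaxCount() {
        String raw = props.get("rag.chat.history.titles.max.count");
        if (raw == null) {
            return DEFAULT_TITLES_MAX_COUNT;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // Assumption: a malformed number falls back to the default.
            // Any other exception type is deliberately not caught and propagates.
            return DEFAULT_TITLES_MAX_COUNT;
        }
    }
}
```

Catching only `NumberFormatException` keeps unrelated failures (such as an uninitialized property source) visible instead of silently masking them with the default.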
…hase to answer history
Replace the single extractHistory(session) call in chat() and streamChatEnhanced() with two phase-specific calls: extractHistoryForIntent and extractHistoryForAnswer. All intent/query-regeneration operations use historyForIntent; all answer-generation operations (generate*, streamGenerate*) use historyForAnswer.
…tests
Remove 11 obsolete smart_summary tests and the dead TestableChatClient.extractHistory override; migrate 8 testExtractHistory calls to testExtractHistoryForAnswer; rewrite test_extractHistory_defaultMode_isSmartSummary to assert the new intent/answer phase rendering contract.
Mirror the new chat-related label keys (retrying/waiting/hit_count/fallback_*/warning_token_exhausted) and the new rag.chat.history.titles.max.count config (default 5), plus the expanded history-mode comment, into the generated Java sources.
Describe the routing behavior (chunk/error -> inner; retry/waiting/warning -> phaseCallback tagged with phase) at the top of the constructor doc.
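The routing rule in that constructor doc (chunk/error go to the inner callback; retry/waiting/warning go to a phase callback tagged with the phase) can be illustrated with a minimal sketch. All type and method names below are hypothetical; the actual callback interfaces in the PR are not shown here and may differ.

```java
// Hypothetical callback types; the real interfaces in the PR may differ.
interface StreamCallbackSketch {
    void onChunk(String chunk);
    void onError(Throwable error);
}

interface PhaseCallbackSketch {
    void onPhaseEvent(String phase, String kind, String detail);
}

// Wraps an inner callback and tags side-channel events with the current phase.
final class PhaseRoutingCallbackSketch implements StreamCallbackSketch {
    private final StreamCallbackSketch inner;
    private final PhaseCallbackSketch phaseCallback;
    private final String phase;

    PhaseRoutingCallbackSketch(StreamCallbackSketch inner,
                               PhaseCallbackSketch phaseCallback,
                               String phase) {
        this.inner = inner;
        this.phaseCallback = phaseCallback;
        this.phase = phase;
    }

    // chunk and error are forwarded unchanged to the inner callback.
    @Override
    public void onChunk(String chunk) {
        inner.onChunk(chunk);
    }

    @Override
    public void onError(Throwable error) {
        inner.onError(error);
    }

    // retry, waiting and warning are routed to the phase callback with the phase tag.
    void onRetry(String detail) {
        phaseCallback.onPhaseEvent(phase, "retry", detail);
    }

    void onWaiting(String detail) {
        phaseCallback.onPhaseEvent(phase, "waiting", detail);
    }

    void onWarning(String detail) {
        phaseCallback.onPhaseEvent(phase, "warning", detail);
    }
}
```

Keeping the two channels separate means the consumer of answer text never sees retry or waiting notices mixed into the stream, while the phase tag lets a UI tell an intent-phase retry from an answer-phase one.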
Summary
Replaces the head-tail `smart_summary` history mode with a Minimal Trail structure that drops assistant body text and keeps only `{searchQuery, retrievedTitles}` per turn. Each turn is now rendered differently for Intent Detection (`searched: "..." -> found: [...]`) vs Answer Generation (`Q: "..." (searched: "...", refs: [...])`).
- Other modes (`full`, `source_titles`, `source_titles_and_urls`, `truncated`, `none`) unchanged
- New config `rag.chat.history.titles.max.count=5`

Design
… `full` mode.

Implementation
- Adds `extractHistoryForIntent` / `extractHistoryForAnswer` / `renderIntentHistoryTurn` / `renderAnswerHistoryTurn` / `escapeForLine`
- Removes `buildSmartSummaryContent` and the legacy single `extractHistory(ChatSession)` (the `smart_summary` case in `buildAssistantHistoryContent` now throws `IllegalStateException` defensively)
- Adds the `ChatMessage.searchQuery` field, populated from the final successful (re)search query in both `chat()` and `streamChatEnhanced()`
- Escapes `"`, `\n`, `\r` in user-supplied parts (search query, user query) to prevent format injection

Test plan
- … (`mvn test`)
- `titlesMaxCount <= 0` guard

Breaking change note
The default `smart_summary` mode now produces a different (much smaller) prompt for assistant history. Operators wanting behavior closer to the old mode can set `rag.chat.history.assistant.content=full`. No `legacy_smart_summary` alias is provided.
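
To make the size difference concrete, the two Minimal Trail renderings from the Summary can be sketched as follows. This is an illustration under assumptions, not the merged code: the method names match those listed under Implementation, but the signatures, the backslash-based escaping scheme, and the behavior of the `titlesMaxCount <= 0` guard (rendering an empty list) are guesses.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: signatures and escaping details are assumptions, not the PR's code.
final class MinimalTrailSketch {

    // Assumed scheme: backslash-escape the three characters named in the PR
    // so user-supplied text cannot break out of the quoted, single-line format.
    static String escapeForLine(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("\"", "\\\"").replace("\n", "\\n").replace("\r", "\\r");
    }

    // Keeps at most titlesMaxCount titles; assumed to render "[]" when the limit is <= 0.
    static String joinTitles(List<String> titles, int titlesMaxCount) {
        if (titles == null || titlesMaxCount <= 0) {
            return "[]";
        }
        return titles.stream()
                .limit(titlesMaxCount)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    // Intent Detection form: searched: "..." -> found: [...]
    static String renderIntentHistoryTurn(String searchQuery, List<String> titles, int titlesMaxCount) {
        return "searched: \"" + escapeForLine(searchQuery) + "\" -> found: "
                + joinTitles(titles, titlesMaxCount);
    }

    // Answer Generation form: Q: "..." (searched: "...", refs: [...])
    static String renderAnswerHistoryTurn(String userQuery, String searchQuery,
                                          List<String> titles, int titlesMaxCount) {
        return "Q: \"" + escapeForLine(userQuery) + "\" (searched: \""
                + escapeForLine(searchQuery) + "\", refs: "
                + joinTitles(titles, titlesMaxCount) + ")";
    }
}
```

For example, under these assumptions `renderIntentHistoryTurn("install guide", List.of("A", "B", "C"), 2)` yields `searched: "install guide" -> found: [A, B]`, a single short line in place of the assistant's full answer body.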