feat(chat): redesign smart_summary as Minimal Trail with phase-specific history#3131

Merged: marevol merged 13 commits into master from feature/smart-summary-redesign on May 4, 2026

@marevol marevol commented May 4, 2026

Summary

Replaces the head-tail smart_summary history mode with a Minimal Trail structure that drops assistant body text and keeps only {searchQuery, retrievedTitles} per turn. Each turn is now rendered differently for Intent Detection (searched: "..." -> found: [...]) vs Answer Generation (Q: "..." (searched: "...", refs: [...])).

  • ~80% reduction in per-turn history size (e.g., OpenAI: ~900 chars → ~150 chars/turn)
  • Strengthens "reference indication" follow-ups (e.g., "the 3rd document?") which the previous head-tail mode often dropped
  • No additional LLM calls; rendering is pure regex/string ops
  • Other history modes (full, source_titles, source_titles_and_urls, truncated, none) unchanged
  • New config: rag.chat.history.titles.max.count=5
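The two per-turn formats described above can be illustrated with a minimal sketch. It is not the code from this PR: the class name `MinimalTrailSketch`, the method names, the comma-joined title list, and the empty list returned for a non-positive limit are all assumptions made for illustration, and escaping of user-supplied text is left out here.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch only; names and exact formatting are assumptions,
// not the ChatClient implementation from this PR.
final class MinimalTrailSketch {

    // Keep at most maxCount titles in retrieval order, so ordinal
    // follow-ups ("the 3rd document?") can still be resolved.
    // Assumption: a non-positive limit yields an empty list.
    static String joinTitles(List<String> titles, int maxCount) {
        if (maxCount <= 0) {
            return "[]";
        }
        return titles.stream()
                .limit(maxCount)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    // Intent Detection line: searched: "..." -> found: [...]
    static String renderIntentTurn(String searchQuery, List<String> titles, int maxCount) {
        return "searched: \"" + searchQuery + "\" -> found: " + joinTitles(titles, maxCount);
    }

    // Answer Generation line: Q: "..." (searched: "...", refs: [...])
    static String renderAnswerTurn(String userQuery, String searchQuery,
            List<String> titles, int maxCount) {
        return "Q: \"" + userQuery + "\" (searched: \"" + searchQuery
                + "\", refs: " + joinTitles(titles, maxCount) + ")";
    }
}
```

Because only the query and an ordered, capped title list are kept, the line length is bounded by the title limit rather than by the size of the assistant's answer.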

Design

  • Trade-off: P4 (comparison/diff) follow-ups become weaker since past assistant body is no longer kept; users wanting prior behavior can switch to full mode.
  • P1 (drill-down), P3 (narrowing), and P5 (context carry-over) are covered by the search-query trail; P2 (reference indication) is now strengthened by ordered title preservation.

Implementation

  • 11 atomic commits (feat / refactor / fix / test / config)
  • New: extractHistoryForIntent / extractHistoryForAnswer / renderIntentHistoryTurn / renderAnswerHistoryTurn / escapeForLine
  • Removed: buildSmartSummaryContent and the legacy single extractHistory(ChatSession) (the smart_summary case in buildAssistantHistoryContent now throws IllegalStateException defensively)
  • Added: ChatMessage.searchQuery field, populated from the final successful (re)search query in both chat() and streamChatEnhanced()
  • Escapes ", \n, \r in user-supplied parts (search query, user query) to prevent format injection
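As an illustration of the escaping step, the sketch below shows one way to neutralize `"`, `\n`, and `\r` so that user-supplied text cannot close the quoted field or start a new history line. The PR does not show the body of `escapeForLine`; the backslash-style replacements, the null handling, and the class name `EscapeSketch` are assumptions.

```java
// Hypothetical sketch; the real escapeForLine in this PR may differ.
final class EscapeSketch {

    // Replace characters that could break the one-line-per-turn format.
    // Assumption: null is treated as an empty string.
    static String escapeForLine(String value) {
        if (value == null) {
            return "";
        }
        return value
                .replace("\"", "\\\"") // double quote -> backslash + quote
                .replace("\n", "\\n")  // line feed -> literal backslash-n
                .replace("\r", "\\r"); // carriage return -> literal backslash-r
    }
}
```

After this step the value contains no raw line breaks, so each turn stays on a single line regardless of what the user typed.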

Test plan

  • All 5615 unit tests pass (mvn test)
  • 9 new render-helper tests cover happy path, titles overflow, missing query, escapes for both user query and search query, and titlesMaxCount <= 0 guard
  • 4 new phase-specific extractHistory tests cover intent/answer pairing semantics, orphan-assistant skip, non-smart_summary delegation
  • Existing tests for other history modes (full / source_titles / source_titles_and_urls / truncated / none) still pass
  • Manual verification with a running Fess + LLM: P1 (drill-down), P2 (reference indication), P3 (narrowing), P5 (context carry-over)
  • Update fess-docs (separate PR)

Breaking change note

The default smart_summary mode now produces a different (much smaller) prompt for assistant history. Operators wanting behavior closer to the old mode can set rag.chat.history.assistant.content=full. No legacy_smart_summary alias is provided.

marevol added 11 commits May 4, 2026 14:23
Track the effective search query used (original or regenerated) and set
it on the assistant ChatMessage via setSearchQuery(), mirroring the
streamChatEnhanced flow (Task 2). SUMMARY and UNCLEAR intents leave the
field null, which is correct.
… split

Add two new protected methods and two private helpers to ChatClient that
shape conversation history differently for the Intent Detection and Answer
Generation prompts. In smart_summary mode, Intent uses paired (user,
assistant) rendered lines while Answer uses Q-prefixed combined lines;
other modes delegate to the existing buildAssistantHistoryContent logic.
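
The (user, assistant) pairing mentioned above, together with the orphan-assistant skip listed in the test plan, might be sketched as follows. The `Msg` type, the role strings, and the handling of a trailing unanswered user message are assumptions for illustration; this is not the `ChatSession` or `ChatMessage` API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the pairing logic; not the ChatClient code.
final class PairingSketch {

    // Simplified stand-in for a chat message (assumed shape).
    static final class Msg {
        final String role;        // "user" or "assistant"
        final String text;        // user query text; unused for assistant
        final String searchQuery; // null when no search was run

        Msg(String role, String text, String searchQuery) {
            this.role = role;
            this.text = text;
            this.searchQuery = searchQuery;
        }
    }

    // Returns one {userQuery, searchQuery} entry per completed turn.
    static List<String[]> pairTurns(List<Msg> messages) {
        List<String[]> pairs = new ArrayList<>();
        Msg pendingUser = null;
        for (Msg m : messages) {
            if ("user".equals(m.role)) {
                pendingUser = m; // wait for the matching assistant message
            } else if ("assistant".equals(m.role)) {
                if (pendingUser == null) {
                    continue; // orphan assistant: no preceding user message
                }
                pairs.add(new String[] { pendingUser.text, m.searchQuery });
                pendingUser = null;
            }
        }
        return pairs; // a trailing user message without an answer is dropped
    }
}
```
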
Narrow the catch block in getHistoryTitlesMaxCount to NumberFormatException
only, and fix the 4 Task-6 test config overrides to return defaultValue
directly instead of delegating to super.getOrDefault (which NPEs because
SimpleImpl.prop is uninitialized in test context).
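
A narrowed catch of the kind described might look like the sketch below: only a malformed number falls back to the default, while any other exception propagates. The class name, the `parse` helper, and the null handling are assumptions; only the default of 5 comes from this PR.

```java
// Hypothetical sketch; getHistoryTitlesMaxCount itself is not shown in this PR page.
final class TitlesMaxCountSketch {

    static final int DEFAULT_MAX_COUNT = 5; // default stated in this PR

    // Parse the raw config value, falling back to the default only when
    // the value is missing or not a valid integer.
    static int parse(String raw) {
        if (raw == null) {
            return DEFAULT_MAX_COUNT;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // Narrow catch: unrelated runtime errors are not swallowed.
            return DEFAULT_MAX_COUNT;
        }
    }
}
```
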
…hase to answer history

Replace the single extractHistory(session) call in chat() and streamChatEnhanced()
with two phase-specific calls: extractHistoryForIntent and extractHistoryForAnswer.
All intent/query-regeneration operations use historyForIntent; all answer-generation
operations (generate*, streamGenerate*) use historyForAnswer.
…tests

Remove 11 obsolete smart_summary tests and the dead TestableChatClient.extractHistory
override; migrate 8 testExtractHistory calls to testExtractHistoryForAnswer; rewrite
test_extractHistory_defaultMode_isSmartSummary to assert the new intent/answer phase
rendering contract.
@marevol marevol self-assigned this May 4, 2026
@marevol marevol added this to the 15.7.0 milestone May 4, 2026
marevol added 2 commits May 4, 2026 18:01
Mirror the new chat-related label keys (retrying/waiting/hit_count/
fallback_*/warning_token_exhausted) and the new
rag.chat.history.titles.max.count config (default 5), plus the
expanded history-mode comment, into the generated Java sources.
Describe the routing behavior (chunk/error -> inner;
retry/waiting/warning -> phaseCallback tagged with phase) at the
top of the constructor doc.
@marevol marevol merged commit 0e3aa34 into master May 4, 2026
1 check passed