feat(rag): notify browser of LLM retry/waiting/fallback/warning and search hit count #3130

Merged: marevol merged 18 commits into master from feature/ai-chat-progress-notifications on May 3, 2026
marevol commented on May 3, 2026

Summary

Surfaces previously-invisible in-progress events from the AI Search (RAG/Chat) flow to the browser UI via new SSE event types and an extended phase-complete payload.

New SSE events

| Event | Payload | When |
|---|---|---|
| retry | {phase, operation, attempt, maxAttempts, sleepMs, cause?} | LLM HTTP call retry (fired once per retry attempt by the LLM plugin) |
| waiting | {phase, reason, elapsedMs, timeoutMs} | When streamChatWithConcurrencyControl blocks on an exhausted permit |
| fallback | {phase, reason, originalQuery, newQuery} | Before re-running search with a regenerated query (no_results / no_relevant_results) |
| warning | {phase, code, detail} | When intent detection silently falls back due to reasoning-model token exhaustion |
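
To make the wire format concrete, here is a minimal sketch of how a retry event might be serialized as a Server-Sent Events frame. It is not the ChatApiManager code: the class SseFrameSketch, the signature given to putIfNotNull, the hand-rolled JSON encoder, and the sample values ("answer", "streamChat") are all assumptions for illustration. Only the event name and the payload field names come from the table above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the actual ChatApiManager implementation.
final class SseFrameSketch {

    // Adds the entry only when the value is non-null, so optional fields such as "cause" are omitted.
    // (The real putIfNotNull helper may have a different signature.)
    static void putIfNotNull(Map<String, Object> map, String key, Object value) {
        if (value != null) {
            map.put(key, value);
        }
    }

    // Minimal escaping for the characters this sketch expects; a real encoder handles more.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Encodes a flat map of strings, numbers, and booleans as a JSON object.
    static String toJson(Map<String, Object> payload) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : payload.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append('"').append(escape(String.valueOf(v))).append('"');
            }
        }
        return sb.append('}').toString();
    }

    // An SSE frame is an "event:" line, a "data:" line, and a terminating blank line.
    static String frame(String event, Map<String, Object> payload) {
        return "event: " + event + "\n" + "data: " + toJson(payload) + "\n\n";
    }

    // Builds the retry payload with the field names listed in the table.
    static String retryFrame(String phase, String operation, int attempt, int maxAttempts, long sleepMs, String cause) {
        Map<String, Object> payload = new LinkedHashMap<>();
        payload.put("phase", phase);
        payload.put("operation", operation);
        payload.put("attempt", attempt);
        payload.put("maxAttempts", maxAttempts);
        payload.put("sleepMs", sleepMs);
        putIfNotNull(payload, "cause", cause);
        return frame("retry", payload);
    }
}
```

Calling SseFrameSketch.retryFrame("answer", "streamChat", 1, 3, 2000L, null) would produce a frame whose data line carries the five required fields and no cause key, which matches the optional marker (cause?) in the table.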

The existing event: phase (status: complete) payload is now extended for the search phase to include hitCount so the browser can show "N documents found" before the longer evaluate/fetch phases finish.

Changes

  • LlmStreamCallback — adds default no-op onRetry / onWaiting / onWarning. @FunctionalInterface preserved.
  • ChatPhaseCallback — adds default no-op onRetry / onWaiting / onFallback / onWarning and a payload-aware onPhaseComplete(String, Map<String,Object>) that defaults to delegating to the legacy single-arg form.
  • PhaseAwareStreamCallback (new) — bridges LLM-layer LlmStreamCallback events to phase-aware ChatPhaseCallback events.
  • ChatClient.streamChatEnhanced — wraps every LlmStreamCallback lambda with PhaseAwareStreamCallback, emits onFallback before query regeneration, includes hitCount in search phase completion, fires onWarning when intent detection falls back.
  • AbstractLlmClient.streamChatWithConcurrencyControl — fires onWaiting when the concurrency permit is unavailable.
  • IntentDetectionResult — adds isFallback() flag (additive; only fallbackSearch(...) sets true).
  • ChatApiManager — emits the new SSE events. New emitSseEventSafely + putIfNotNull helpers reduce boilerplate.
  • chat.js / chat.jsp — adds listeners for the new events and i18n labels.
  • fess_label*.properties — adds 6 new keys across all 17 language files (English source for non-translated languages, Japanese translations included).
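
The bridging role described for PhaseAwareStreamCallback can be sketched roughly as follows. The interfaces below are simplified stand-ins that reuse the names from this PR; their method signatures (including onChunk) are assumptions, and the real Fess interfaces are larger. The point is only to show how default no-op methods keep LlmStreamCallback usable as a lambda while a wrapper attaches the current phase name to LLM-layer events.

```java
// Simplified stand-in; signatures are assumed for illustration.
@FunctionalInterface
interface LlmStreamCallback {
    void onChunk(String chunk, boolean done);

    // Default no-ops keep the interface functional: lambdas only supply onChunk.
    default void onRetry(String operation, int attempt, int maxAttempts, long sleepMs, String cause) {
    }

    default void onWaiting(String reason, long elapsedMs, long timeoutMs) {
    }

    default void onWarning(String code, String detail) {
    }
}

// Simplified stand-in for the phase-level callback; every event carries a phase name.
interface ChatPhaseCallback {
    default void onRetry(String phase, String operation, int attempt, int maxAttempts, long sleepMs, String cause) {
    }

    default void onWaiting(String phase, String reason, long elapsedMs, long timeoutMs) {
    }

    default void onWarning(String phase, String code, String detail) {
    }
}

// Wraps an LLM-layer callback and re-emits its side-channel events with the phase attached.
final class PhaseAwareStreamCallback implements LlmStreamCallback {
    private final String phase;
    private final LlmStreamCallback delegate;
    private final ChatPhaseCallback phaseCallback;

    PhaseAwareStreamCallback(String phase, LlmStreamCallback delegate, ChatPhaseCallback phaseCallback) {
        this.phase = phase;
        this.delegate = delegate;
        this.phaseCallback = phaseCallback;
    }

    @Override
    public void onChunk(String chunk, boolean done) {
        // Streaming content passes through unchanged.
        delegate.onChunk(chunk, done);
    }

    @Override
    public void onRetry(String operation, int attempt, int maxAttempts, long sleepMs, String cause) {
        delegate.onRetry(operation, attempt, maxAttempts, sleepMs, cause);
        phaseCallback.onRetry(phase, operation, attempt, maxAttempts, sleepMs, cause);
    }

    @Override
    public void onWaiting(String reason, long elapsedMs, long timeoutMs) {
        delegate.onWaiting(reason, elapsedMs, timeoutMs);
        phaseCallback.onWaiting(phase, reason, elapsedMs, timeoutMs);
    }

    @Override
    public void onWarning(String code, String detail) {
        delegate.onWarning(code, detail);
        phaseCallback.onWarning(phase, code, detail);
    }
}
```

In this sketch the wrapper forwards each event to both the wrapped callback and the phase callback; whether the actual class also forwards to the delegate is not stated in this description.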

Backwards compatibility

All interface additions are default no-op methods. Existing implementers (including the 3 LLM plugins) compile and behave unchanged. Plugin PRs to actually fire onRetry follow this PR — see "Companion PRs" below.
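
The compatibility argument for the payload-aware onPhaseComplete can be illustrated with a small sketch. PhaseCompleteCallback and PhaseCompleteDemo are hypothetical names that reduce the callback to the two overloads in question; the "search" phase name and the hit count of 12 are sample values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical reduction of the phase callback to the two overloads discussed above.
interface PhaseCompleteCallback {
    // Legacy single-argument form that existing implementers already override.
    void onPhaseComplete(String phase);

    // New payload-aware form; by default it ignores the payload and calls the legacy form.
    default void onPhaseComplete(String phase, Map<String, Object> payload) {
        onPhaseComplete(phase);
    }
}

final class PhaseCompleteDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // An existing implementer that knows nothing about payloads.
        PhaseCompleteCallback legacy = phase -> log.add("legacy:" + phase);

        // A newer implementer that reads hitCount from the payload.
        PhaseCompleteCallback aware = new PhaseCompleteCallback() {
            @Override
            public void onPhaseComplete(String phase) {
                log.add("aware:" + phase);
            }

            @Override
            public void onPhaseComplete(String phase, Map<String, Object> payload) {
                log.add("aware:" + phase + ":hitCount=" + payload.get("hitCount"));
            }
        };

        Map<String, Object> payload = Map.of("hitCount", 12);
        legacy.onPhaseComplete("search", payload); // falls through to the single-arg form
        aware.onPhaseComplete("search", payload);  // sees the payload
        return log;
    }
}
```

The caller always invokes the two-argument form, so old implementers keep receiving their single-argument notification while new ones can read the extra fields.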

Companion PRs (must merge after this one is merged and a new SNAPSHOT is published)

  • codelibs/fess-llm-openai — invoke LlmStreamCallback#onRetry from executeWithRetry
  • codelibs/fess-llm-ollama — same
  • codelibs/fess-llm-gemini — same

Until those merge, the retry SSE event will not fire; the other events fire from this PR alone.
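
As a rough idea of what the companion change could look like, the sketch below fires onRetry once per retry attempt from a retry loop. It is not taken from any of the plugins: RetryAwareCallback, RetrySketch, the executeWithRetry signature, and the linear backoff are all assumptions; only the onRetry argument names follow the retry payload in this PR.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the onRetry hook on the stream callback.
interface RetryAwareCallback {
    default void onRetry(String operation, int attempt, int maxAttempts, long sleepMs, String cause) {
    }
}

final class RetrySketch {

    // Runs the call up to maxAttempts times, notifying the callback before each sleep.
    static <T> T executeWithRetry(String operation, int maxAttempts, long baseSleepMs, Callable<T> call,
            RetryAwareCallback callback) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no further retry, so no retry notification
                }
                long sleepMs = baseSleepMs * attempt; // assumed linear backoff
                callback.onRetry(operation, attempt, maxAttempts, sleepMs, e.getMessage());
                Thread.sleep(sleepMs);
            }
        }
        throw last;
    }
}
```

With maxAttempts set to 3 and a call that fails twice before succeeding, this sketch notifies the callback twice (attempt 1 and attempt 2) and then returns the result of the third call.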

Test plan

  • mvn test — 8 touched test classes, 72 tests, 0 failures
    • LlmStreamCallbackTest (1)
    • ChatPhaseCallbackTest (9 — 7 preexisting preserved + 2 new)
    • PhaseAwareStreamCallbackTest (4)
    • AbstractLlmClientWaitingTest (2)
    • ChatClientFallbackTest (1)
    • ChatClientHitCountTest (2)
    • ChatClientWarningTest (2)
    • ChatApiManagerTest (51 — 47 preexisting preserved + 4 new)
  • mvn formatter:format && mvn license:format — clean (no-op)
  • node -c chat.js — JS syntax OK

marevol added 18 commits May 3, 2026 11:14