Tts fallback #797

Open
priyanshi-2003 wants to merge 2 commits into juspay:release from priyanshi-2003:tts-fallback

Conversation

@priyanshi-2003 priyanshi-2003 commented Jun 1, 2026

Contributor

feat: add TTS fallback circuit breaker

  • app/services/fallback/__init__.py: enable tts in _FALLBACK_DEFAULTS
    (fallback_provider="cartesia"), add check_and_reset_tts_fallback()
    poller, update initialize_fallback_tasks() to register both STT and
    TTS reset tasks independently
  • app/ai/voice/agents/breeze_buddy/tts/__init__.py: add TTSServiceResult
    and get_tts_service_with_fallback() (proactive routing when circuit is
    open, init-time fallback with record_failure() on primary error)
  • app/ai/voice/agents/breeze_buddy/agent/pipeline.py: use
    get_tts_service_with_fallback() in create_services(), return
    TTSServiceResult in third position
  • app/ai/voice/agents/breeze_buddy/agent/__init__.py: store tts_provider
    from TTSServiceResult, add _tts_failure_recorded / _mid_call_tts_alert_sent
    state, detect TTS processor errors in on_pipeline_error, add
    _send_mid_call_tts_alert() Slack alert helper

Summary by CodeRabbit

  • New Features

    • Added automatic fallback for STT and TTS services when primary providers fail during calls.
    • Slack alerting now notifies teams when speech-to-text or text-to-speech fails mid-call.
  • Improvements

    • Enhanced error recovery with configurable fallback thresholds and timing windows.
    • STT provider now supports configurable automatic reconnection on error.

Devansh-1218 and others added 2 commits May 20, 2026 18:23
- Make fallback provider-agnostic (remove soniox hardcode)
- Log EndFrame errors instead of silently swallowing them
- Move FallbackSettings dataclass and _FALLBACK_DEFAULTS to services/fallback
- BB_FALLBACK_CONFIG returns typed FallbackSettings from services/fallback
- BB_FALLBACK_RAW_CONFIG in dynamic.py returns raw dict via json.loads pattern
- Remove no_delay from DeepgramConfig constructor (field not supported by pipecat)
- Deduplicate mid-call STT alert with _mid_call_alert_sent guard
- Fix reset alert timing: poll every 60s via notify_on_expiry() instead of
  deleting active key early; Redis TTL is sole authority on fallback expiry
- app/services/fallback/__init__.py: enable tts in _FALLBACK_DEFAULTS
  (fallback_provider=cartesia), add check_and_reset_tts_fallback()
  poller, update initialize_fallback_tasks() to register both STT and
  TTS reset tasks independently
- app/ai/voice/agents/breeze_buddy/tts/__init__.py: add TTSServiceResult
  and get_tts_service_with_fallback() — proactive routing when circuit is
  open, init-time fallback with record_failure() on primary error
- app/ai/voice/agents/breeze_buddy/agent/pipeline.py: use
  get_tts_service_with_fallback() in create_services(), return
  TTSServiceResult in third position
- app/ai/voice/agents/breeze_buddy/agent/__init__.py: store tts_provider
  from TTSServiceResult, add _tts_failure_recorded / _mid_call_tts_alert_sent
  state, detect TTS processor errors in on_pipeline_error, add
  _send_mid_call_tts_alert() Slack alert helper
Copilot AI review requested due to automatic review settings June 1, 2026 07:43

coderabbitai Bot commented Jun 1, 2026


Review Change Stack

Walkthrough

This PR introduces a Redis-backed circuit-breaker fallback system for STT and TTS services in the Breeze Buddy voice agent. When a service exceeds configured failure thresholds, it automatically routes to a fallback provider. The Agent detects mid-call failures, records them to the circuit, and sends Slack alerts without blocking call termination.
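
The threshold-then-TTL behaviour described in the walkthrough can be pictured with a minimal in-memory sketch. This is not the PR's ServiceFallback: the class name, field names, default values, and the sliding-window counting are assumptions for illustration, and the PR keeps this state in Redis rather than in process memory.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InMemoryFallbackCircuit:
    """Illustrative stand-in for a Redis-backed circuit (all names hypothetical)."""

    failure_threshold: int = 3
    failure_window_s: float = 60.0
    fallback_ttl_s: float = 300.0
    _failures: List[float] = field(default_factory=list)
    _active_until: float = 0.0

    def is_active(self, now: Optional[float] = None) -> bool:
        """True while the fallback window has not expired."""
        now = time.monotonic() if now is None else now
        return now < self._active_until

    def record_failure(self, now: Optional[float] = None) -> bool:
        """Record one failure; return True if this call opened the circuit."""
        now = time.monotonic() if now is None else now
        # Keep only failures that are still inside the counting window.
        self._failures = [t for t in self._failures if now - t < self.failure_window_s]
        self._failures.append(now)
        if len(self._failures) >= self.failure_threshold and not self.is_active(now):
            # Threshold reached: route to the fallback provider for fallback_ttl_s.
            self._active_until = now + self.fallback_ttl_s
            self._failures.clear()
            return True
        return False


circuit = InMemoryFallbackCircuit(failure_threshold=2, failure_window_s=10, fallback_ttl_s=30)
first = circuit.record_failure(now=0.0)   # below threshold, circuit stays closed
second = circuit.record_failure(now=1.0)  # threshold met, circuit opens until t=31
```

Passing `now` explicitly keeps the sketch deterministic; in a Redis-backed version the key TTL would play the role of `_active_until`.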

Changes

Service Fallback Circuit Breaker and Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Fallback configuration and type system (app/services/fallback/__init__.py) | FallbackSettings dataclass and BB_FALLBACK_CONFIG function merge defaults with DevCycle/Redis configuration for STT and TTS fallback settings. |
| ServiceFallback circuit-breaker implementation (app/services/fallback/__init__.py) | ServiceFallback class records failures atomically via Lua script, activates with TTL when thresholds are met, sends deduplicated Slack alerts for failures/activation/reset, and exposes is_active() for routing decisions. |
| Background fallback polling and task registration (app/services/fallback/__init__.py) | Background tasks check_and_reset_stt_fallback and check_and_reset_tts_fallback poll Redis for circuit expiry; initialize_fallback_tasks conditionally registers them when fallback is enabled. |

STT Service Fallback Integration

| Layer / File(s) | Summary |
| --- | --- |
| STT service result type and internal builder (app/ai/voice/agents/breeze_buddy/stt/__init__.py) | STTServiceResult dataclass wraps provider name and service instance; _build_stt_provider constructs provider-specific services without fallback logic. |
| STT service creation with fallback routing (app/ai/voice/agents/breeze_buddy/stt/__init__.py) | create_stt_from_config and get_stt_service now consult fallback circuit and route to fallback when active or on primary init failure. |

TTS Service Fallback Integration

| Layer / File(s) | Summary |
| --- | --- |
| TTS service with fallback routing (app/ai/voice/agents/breeze_buddy/tts/__init__.py) | New get_tts_service_with_fallback function returns TTSServiceResult and routes based on circuit status, with primary→fallback failover on init error. |

Agent-Level Failure Detection and Mid-Call Alerting

| Layer / File(s) | Summary |
| --- | --- |
| Agent instance fields and alert helpers (app/ai/voice/agents/breeze_buddy/agent/__init__.py) | Per-call state tracks STT/TTS fallback failures and mid-call alerts; helper methods send Slack alerts for STT and TTS failures. |
| Pipeline error handler with fallback routing (app/ai/voice/agents/breeze_buddy/agent/__init__.py) | Enhanced on_pipeline_error detects STT vs TTS failures, records the first per-call failure to the fallback circuit, sends fire-and-forget Slack alerts, and enqueues EndFrame to terminate the call. |
| Agent service initialization and pipeline wiring (app/ai/voice/agents/breeze_buddy/agent/__init__.py) | Agent.run receives STTServiceResult and TTSServiceResult from create_services, stores provider names on the agent instance, and passes service implementations to build_pipeline. |

Pipeline Service Creation Updates

| Layer / File(s) | Summary |
| --- | --- |
| STT and TTS service creation in pipeline (app/ai/voice/agents/breeze_buddy/agent/pipeline.py) | create_services imports get_tts_service_with_fallback, uses stt_result and tts_result variables, and returns result objects instead of raw services. |

Configuration, Utilities, and Monitoring

| Layer / File(s) | Summary |
| --- | --- |
| Dynamic configuration functions (app/core/config/dynamic.py) | BB_STT_SERVICE reads provider name from Redis; BB_FALLBACK_RAW_CONFIG parses JSON fallback config with safe defaults. |
| Background task registration at startup (app/main.py) | initialize_fallback_tasks is called during FastAPI startup to register background circuit-expiry polling. |
| Fire-and-forget task scheduling utility (app/ai/voice/agents/breeze_buddy/utils/common.py) | fire_and_forget function creates and retains asyncio tasks to prevent premature garbage collection during non-blocking Slack alerts. |
| Soniox reconnect-on-error configuration (app/ai/voice/stt/soniox/config.py) | SonioxConfig.reconnect_on_error flag is propagated to the service constructor. |
| Configuration type conversion and Slack alert extensions (app/services/live_config/utils.py, app/services/slack/alert.py) | convert_type gains dict/JSON parsing support; Alert.send() gains optional tag_users parameter to override default mentions. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 When services stumble mid-call with a cry,
The circuit breaks in, with a fallback nearby,
Slack alerts hop along, no time to despair,
Redis keeps track while we route with great care! 🔄✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 78.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The PR title 'Tts fallback' is vague and overly generic. While it references TTS fallback, it does not clearly convey what specific functionality was added, changed, or improved, making it difficult for reviewers to understand the scope at a glance. | Consider revising to a more descriptive title that captures the main implementation, such as 'Add TTS service fallback with circuit-breaker pattern' or 'Implement TTS fallback and error recovery'. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR introduces a Redis-backed circuit-breaker style fallback mechanism for STT/TTS, with Slack alerting and background reset tasks, and wires it into Breeze Buddy’s service creation and error handling.

Changes:

  • Add a generic Redis-backed ServiceFallback with failure counting, activation TTL, and Slack alerts; register background tasks to notify on fallback expiry.
  • Add STT/TTS “build with fallback” routing so calls can proactively use fallback providers or fall back on init failures.
  • Extend Slack alert tagging and live-config type conversion utilities to support new configuration and alerting needs.
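
The "build with fallback" routing in the second bullet can be sketched roughly as follows. This is a simplified, synchronous illustration and not the PR's code: `ServiceResult`, `build_with_fallback`, and the injected callables are hypothetical names, and the real builders in the PR are provider-specific.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ServiceResult:
    """Hypothetical wrapper: resolved provider name plus the service object."""

    provider: str
    service: object


def build_with_fallback(
    primary: str,
    fallback: str,
    build: Callable[[str], object],
    circuit_is_active: Callable[[], bool],
    record_failure: Callable[[str], None],
) -> ServiceResult:
    # Proactive routing: the circuit is already open, so skip the primary.
    if circuit_is_active():
        return ServiceResult(fallback, build(fallback))
    try:
        return ServiceResult(primary, build(primary))
    except Exception as exc:  # broad catch is deliberate in this sketch
        # Init-time failover: count the failure, then build the fallback.
        record_failure(str(exc))
        return ServiceResult(fallback, build(fallback))


failures: List[str] = []


def demo_build(name: str) -> object:
    # Pretend the "elevenlabs" provider fails to initialise.
    if name == "elevenlabs":
        raise RuntimeError("boom")
    return f"svc:{name}"


init_failover = build_with_fallback("elevenlabs", "cartesia", demo_build, lambda: False, failures.append)
proactive = build_with_fallback("deepgram", "soniox", demo_build, lambda: True, failures.append)
```

Returning the provider name alongside the service lets the caller attribute later mid-call failures to the provider that was actually used.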

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
| --- | --- |
| app/services/slack/alert.py | Adds per-call override for Slack user tagging. |
| app/services/live_config/utils.py | Adds dict parsing support to dynamic type conversion. |
| app/services/fallback/__init__.py | New Redis-backed fallback state machine, Slack alerts, and background tasks. |
| app/main.py | Registers fallback reset background tasks at app startup. |
| app/core/config/dynamic.py | Adds dynamic config accessors for STT provider and fallback config JSON. |
| app/ai/voice/stt/soniox/config.py | Adds Soniox reconnect-on-error configuration passthrough. |
| app/ai/voice/agents/breeze_buddy/utils/common.py | Adds fire_and_forget helper to keep async tasks from being GC’d. |
| app/ai/voice/agents/breeze_buddy/tts/__init__.py | Adds TTS init-time fallback routing and result wrapper. |
| app/ai/voice/agents/breeze_buddy/stt/__init__.py | Adds STT init-time fallback routing and result wrapper; updates legacy provider selection. |
| app/ai/voice/agents/breeze_buddy/agent/pipeline.py | Updates pipeline service creation to use fallback-enabled STT/TTS builders. |
| app/ai/voice/agents/breeze_buddy/agent/__init__.py | Records mid-call STT/TTS failures into fallback circuit and ends call with alerts. |

Comment on lines +148 to +156
async def _send_failure_alert(self, count: int, error_msg: str) -> None:
    provider = self.config.primary_provider_name.capitalize()
    threshold = self.config.failure_threshold
    try:
        await slack_alert.send(
            title=f"⚠️ STT Failure on {provider} ({count}/{threshold})",
            fields=[{"name": "Fail Count", "value": f"{count}/{threshold}"}],
            sections=(
                [{"title": "Error", "text": f"```{error_msg}```"}]
Comment on lines +255 to +260
async def record_failure(
    self,
    error_msg: str = "",
    call_sid: str = "",
    context: str = "unknown",
) -> bool:
Comment on lines +397 to +400
if active:
    # Still within the fallback window — record that we've seen it.
    await redis.set(self._key_seen_active, "1")
    return
Comment on lines +23 to +27
def fire_and_forget(coro) -> None:
    """Schedule a coroutine as a fire-and-forget task, preventing GC."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
Comment on lines +39 to +45
elif target_type == dict:
    if isinstance(value, dict):
        return value
    try:
        return json.loads(value)
    except (ValueError, TypeError):
        return None
Comment on lines +30 to +32
# Slack tag used for all fallback alerts
_FALLBACK_TAG = "@breeze-sentinals"
_ALERT_TAG = f"{_FALLBACK_TAG},{SLACK_TAG_USERS}" if SLACK_TAG_USERS else _FALLBACK_TAG
Comment on lines +293 to +298
async def _send_mid_call_stt_alert(self) -> None:
    """Send Slack alert when STT fails mid-call and call must end."""
    from app.core.config.static import SLACK_TAG_USERS

    _fallback_tag = "@breeze-sentinals"
    tag = f"{_fallback_tag},{SLACK_TAG_USERS}" if SLACK_TAG_USERS else _fallback_tag

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🧹 Nitpick comments (5)
app/ai/voice/agents/breeze_buddy/utils/common.py (1)

23-27: 💤 Low value

Annotate the coro parameter.

The GC-retention pattern is correct, but the coro parameter is unannotated. Add a type hint to satisfy the signature-typing requirement.

♻️ Proposed type hint
-from typing import Any, Dict, List, Optional, Tuple, cast
+from typing import Any, Coroutine, Dict, List, Optional, Tuple, cast
@@
-def fire_and_forget(coro) -> None:
+def fire_and_forget(coro: Coroutine[Any, Any, Any]) -> None:
     """Schedule a coroutine as a fire-and-forget task, preventing GC."""

As per coding guidelines: "Add type hints on all function signatures".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/utils/common.py` around lines 23 - 27,
Annotate the untyped coroutine parameter in fire_and_forget: change the
signature to accept a typed coroutine (e.g., def fire_and_forget(coro:
Coroutine[Any, Any, Any]) -> None) and add the required imports (from typing
import Any, Coroutine) at the top; keep the existing GC-retention logic using
_background_tasks and task.add_done_callback unchanged.
app/main.py (1)

184-186: 💤 Low value

Comment is inaccurate.

The comment says "STT fallback reset tasks" but initialize_fallback_tasks registers both STT and TTS fallback tasks.

📝 Proposed fix
-            # Initialize STT fallback reset tasks
+            # Initialize STT/TTS fallback reset tasks
             await initialize_fallback_tasks(_background_scheduler)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/main.py` around lines 184 - 186, The comment above the call to
initialize_fallback_tasks is misleading—update it to accurately state that the
function registers both STT and TTS fallback reset tasks; locate the call to
initialize_fallback_tasks (and the preceding comment string) and change the
comment text from "STT fallback reset tasks" to something like "STT and TTS
fallback reset tasks" (or similar concise wording) so it reflects both
responsibilities.
app/ai/voice/agents/breeze_buddy/tts/__init__.py (1)

214-219: ⚡ Quick win

Use @dataclass for consistency with STTServiceResult.

STTServiceResult in stt/__init__.py uses @dataclass, but TTSServiceResult is a plain class. Using dataclass reduces boilerplate and ensures consistent behavior (automatic __repr__, __eq__, etc.).

Proposed fix

Add the import at the top of the file:

from dataclasses import dataclass

Then replace the class:

+@dataclass
 class TTSServiceResult:
     """Wraps a TTS service instance with the resolved provider name."""
-
-    def __init__(self, provider: str, service: object):
-        self.provider = provider
-        self.service = service
+    provider: str
+    service: object
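
A short standalone check of what the dataclass form provides, using the class exactly as proposed above. The instance values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class TTSServiceResult:
    """Wraps a TTS service instance with the resolved provider name."""

    provider: str
    service: object


# Placeholder values; a real call would pass an actual TTS service object.
a = TTSServiceResult(provider="cartesia", service="svc")
b = TTSServiceResult(provider="cartesia", service="svc")
```

With the plain class, `a == b` would compare identities and be False; the generated `__eq__` compares fields instead.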
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/tts/__init__.py` around lines 214 - 219,
TTSServiceResult is implemented as a plain class while STTServiceResult uses
`@dataclass`; convert TTSServiceResult to a dataclass to reduce boilerplate and
ensure consistent behavior by importing dataclass from dataclasses and
decorating the TTSServiceResult class with `@dataclass` and defining provider: str
and service: object as annotated fields (leave class name TTSServiceResult
intact so usages remain valid).
app/ai/voice/agents/breeze_buddy/agent/pipeline.py (1)

108-122: ⚡ Quick win

Update docstring and return type to reflect result objects.

The function now returns STTServiceResult and TTSServiceResult wrapper objects instead of raw services. The docstring and return type annotation should be updated for clarity:

Proposed fix
+from app.ai.voice.agents.breeze_buddy.stt import STTServiceResult
+
 async def create_services(
     configurations: Optional[ConfigurationModel],
     include_llm: bool = True,
-) -> tuple[Optional[Any], Optional[Any], Optional[Any]]:
+) -> tuple[Optional[STTServiceResult], Optional[Any], Optional[TTSServiceResult]]:
     """Create STT, LLM, and TTS services.

     Args:
         configurations: Template configuration model
         include_llm: When False, skip LLM creation (stream mode). LLM will be None.

     Returns:
-        Tuple of (stt_service, llm_service_or_None, tts_service). For realtime
-        / speech-to-speech LLMs, both ``stt_service`` and ``tts_service`` are
+        Tuple of (stt_result, llm_service_or_None, tts_result). For realtime
+        / speech-to-speech LLMs, both ``stt_result`` and ``tts_result`` are
         ``None`` because the realtime LLM handles audio in/out natively.
     """
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/agent/pipeline.py` around lines 108 - 122,
The create_services signature and docstring must reflect that STT and TTS are
returned as wrapper objects: change the return type annotation to
tuple[Optional[STTServiceResult], Optional[Any], Optional[TTSServiceResult]] (or
the concrete LLM result type if available) and update the docstring to state
that the first element is an STTServiceResult (or None), the second is the LLM
service/result (or None when include_llm is False or when a realtime LLM handles
audio), and the third is a TTSServiceResult (or None); update mentions of
stt_service and tts_service in the docstring to indicate they are wrapper/result
objects and keep ConfigurationModel and include_llm description intact.
app/ai/voice/agents/breeze_buddy/agent/__init__.py (1)

728-745: 💤 Low value

Consider extracting keyword tuples to module-level constants.

The hardcoded keyword tuples for detecting STT vs TTS errors are inline within the handler. Extracting these to module-level constants would improve maintainability when new providers are added and make the handler body more readable.

♻️ Example refactor

At module level (after imports):

_STT_PROCESSOR_KEYWORDS = ("stt", "soniox", "deepgram", "transcri", "google", "sarvam")
_TTS_PROCESSOR_KEYWORDS = ("tts", "elevenlabs", "cartesia", "gemini")

Then in the handler:

-            stt_keywords = (
-                "stt",
-                "soniox",
-                "deepgram",
-                "transcri",
-                "google",
-                "sarvam",
-            )
-            tts_keywords = (
-                "tts",
-                "elevenlabs",
-                "cartesia",
-                "gemini",
-            )
-            is_stt_error = any(kw in processor_str for kw in stt_keywords)
-            is_tts_error = any(kw in processor_str for kw in tts_keywords)
+            is_stt_error = any(kw in processor_str for kw in _STT_PROCESSOR_KEYWORDS)
+            is_tts_error = any(kw in processor_str for kw in _TTS_PROCESSOR_KEYWORDS)
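
A small runnable sketch of the refactor, using the keyword tuples from the suggestion above. `classify_processor` and the example processor names are hypothetical. One point worth checking when adopting this: with plain substring matching, a name ending in "STTService" lowercases to a string that also contains "tts", so both flags can be true and the order in which the handler tests them matters.

```python
from typing import Tuple

_STT_PROCESSOR_KEYWORDS = ("stt", "soniox", "deepgram", "transcri", "google", "sarvam")
_TTS_PROCESSOR_KEYWORDS = ("tts", "elevenlabs", "cartesia", "gemini")


def classify_processor(processor_name: str) -> Tuple[bool, bool]:
    """Return (is_stt_error, is_tts_error) using substring matching."""
    processor_str = processor_name.lower()
    is_stt_error = any(kw in processor_str for kw in _STT_PROCESSOR_KEYWORDS)
    is_tts_error = any(kw in processor_str for kw in _TTS_PROCESSOR_KEYWORDS)
    return is_stt_error, is_tts_error


# Example names are illustrative, not taken from the PR.
tts_case = classify_processor("CartesiaTTSService")
overlap_case = classify_processor("DeepgramSTTService")  # "sttservice" contains "tts"
neither_case = classify_processor("OpenAILLMService")
```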
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/agent/__init__.py` around lines 728 - 745,
Extract the inline keyword tuples into module-level constants and reference them
from the handler: create constants (e.g. _STT_PROCESSOR_KEYWORDS = ("stt",
"soniox", "deepgram", "transcri", "google", "sarvam") and
_TTS_PROCESSOR_KEYWORDS = ("tts", "elevenlabs", "cartesia", "gemini")) near the
top of the module (after imports), then replace the inline tuples used to
compute processor_str, is_stt_error and is_tts_error with these constants so the
handler reads is_stt_error = any(kw in processor_str for kw in
_STT_PROCESSOR_KEYWORDS) and is_tts_error = any(kw in processor_str for kw in
_TTS_PROCESSOR_KEYWORDS).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/ai/voice/agents/breeze_buddy/agent/__init__.py`:
- Around line 319-343: The fallback Slack tag in _send_mid_call_tts_alert
contains a typo ("`@breeze-sentinals`"); update the _fallback_tag value to
"`@breeze-sentinels`" so the tag correctly reads "sentinels" (ensure
SLACK_TAG_USERS concatenation logic with tag remains unchanged), then run tests
or a quick manual verify to confirm alerts now target the correct user group.
- Around line 293-318: Fix the typo in the fallback Slack tag inside async
method _send_mid_call_stt_alert: change the string value of _fallback_tag from
"`@breeze-sentinals`" to the correct "`@breeze-sentinels`" so the tag variable used
in slack_alert.send() correctly targets the intended Slack user group; update
the _fallback_tag declaration near the top of _send_mid_call_stt_alert to the
corrected spelling.

In `@app/ai/voice/agents/breeze_buddy/tts/__init__.py`:
- Line 42: The import line in app.ai.voice.agents.breeze_buddy.tts.__init__.py
is over 88 chars and breaks Black formatting; split the import from
app.services.fallback across multiple lines (either using parentheses or
one-per-line) so the names BB_FALLBACK_CONFIG, ServiceFallback, and
ServiceFallbackConfig are each wrapped to respect line-length (follow the same
multiline pattern used in stt.__init__.py).

In `@app/services/fallback/__init__.py`:
- Line 24: The import line importing BB_FALLBACK_RAW_CONFIG, BB_STT_SERVICE, and
BB_TTS_SERVICE from app.core.config.dynamic needs to be reformatted to satisfy
Black (split across multiple lines); update the import in __init__.py so each
imported symbol (or a logical grouping) is on its own line or use parentheses
with line breaks around the names to conform to Black's multi-line import
formatting for the module import statement.
- Around line 148-167: The alert text in ServiceFallback._send_failure_alert
incorrectly hardcodes "STT"; change it to use the service name from the config
(e.g., service = self.config.service_name.capitalize()) and substitute that
variable into both the title and fallback_text (and any other hardcoded "STT"
occurrences) so alerts reflect the actual service (TTS or STT); keep provider
usage (self.config.primary_provider_name) and existing count/threshold
formatting as-is and preserve the error section and exception handling.
- Around line 275-276: The fallback branch that does `count = await
redis.incr(self._key_failure_count)` when `run_script` returns None must also
set an expiry so the counter doesn't persist forever; after incrementing the key
call the Redis expire command (e.g., `await
redis.expire(self._key_failure_count, ...)`) using the same TTL used elsewhere
for failure counters (for example `self._failure_window_seconds` or the existing
TTL constant) so the key gets a time-to-live consistent with the normal path.
- Around line 402-416: The reset path can race because multiple pods can see
seen_active=True and each delete the sentinel then call _clear_provider_health
and _send_reset_alert; modify the block that reads/deletes self._key_seen_active
to perform an atomic check-and-delete (use redis.getdel(self._key_seen_active)
if available, or execute a small Lua script that returns and deletes the key in
one step) and only proceed with _clear_provider_health and _send_reset_alert
when the atomic operation indicates this caller actually removed the sentinel;
alternatively implement an NX-based deduplication key (similar to _key_notified)
that you set with SET NX and TTL and only the owner proceeds to call
_send_reset_alert. Ensure you reference and update uses of
self._key_seen_active, _clear_provider_health, and _send_reset_alert
accordingly.
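
The atomic check-and-delete idea for the reset path can be illustrated without a Redis server. In this sketch `FakeRedis`, the key name, and `maybe_send_reset` are all hypothetical; the only assumption about a real client is that it offers an operation with GETDEL semantics (read and delete in one step).

```python
import asyncio
from typing import Dict, List, Optional


class FakeRedis:
    """In-memory stand-in so the sketch runs without a server."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    async def set(self, key: str, value: str) -> None:
        self._data[key] = value

    async def getdel(self, key: str) -> Optional[str]:
        # Read and delete in one step, mirroring GETDEL semantics.
        return self._data.pop(key, None)


async def maybe_send_reset(redis: FakeRedis, pod: str, sent: List[str]) -> None:
    # Only the caller that actually removed the sentinel sends the alert.
    if await redis.getdel("fallback:tts:seen_active") is not None:
        sent.append(pod)


async def _reset_race_demo() -> List[str]:
    redis = FakeRedis()
    await redis.set("fallback:tts:seen_active", "1")
    sent: List[str] = []
    # Three "pods" poll at the same time; one of them wins the delete.
    await asyncio.gather(*(maybe_send_reset(redis, f"pod-{i}", sent) for i in range(3)))
    return sent


reset_senders = asyncio.run(_reset_race_demo())
```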

In `@app/services/live_config/utils.py`:
- Around line 39-45: The branch handling target_type == dict must guard the
json.loads result against non-dict JSON; change the block in the function that
checks target_type == dict so you first check target_type is dict (to satisfy
ruff) then if value is a dict return it, otherwise parse with parsed =
json.loads(value) and return parsed only if isinstance(parsed, dict) else return
None; this ensures json scalars/arrays don't slip through and preserves the
original behavior when value is already a dict.
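
The guarded dict branch described above could look like the following standalone sketch. `to_dict` is a hypothetical helper name; in the PR this logic sits inside the type-conversion function's dict branch.

```python
import json
from typing import Any, Dict, Optional


def to_dict(value: Any) -> Optional[Dict[str, Any]]:
    """Return a dict for dict or JSON-object input, otherwise None."""
    if isinstance(value, dict):
        return value
    try:
        parsed = json.loads(value)
    except (ValueError, TypeError):
        # Invalid JSON (ValueError) or a non-string input such as None (TypeError).
        return None
    # Valid JSON scalars and arrays are rejected here.
    return parsed if isinstance(parsed, dict) else None
```

For example, `to_dict('[1, 2]')` parses successfully but is rejected because the result is a list, not a dict.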


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 61f0fbaa-099b-4887-96d5-af5c153ce9a4

📥 Commits

Reviewing files that changed from the base of the PR and between 311ae9b and 00c51ce.

📒 Files selected for processing (11)
  • app/ai/voice/agents/breeze_buddy/agent/__init__.py
  • app/ai/voice/agents/breeze_buddy/agent/pipeline.py
  • app/ai/voice/agents/breeze_buddy/stt/__init__.py
  • app/ai/voice/agents/breeze_buddy/tts/__init__.py
  • app/ai/voice/agents/breeze_buddy/utils/common.py
  • app/ai/voice/stt/soniox/config.py
  • app/core/config/dynamic.py
  • app/main.py
  • app/services/fallback/__init__.py
  • app/services/live_config/utils.py
  • app/services/slack/alert.py

Comment on lines +293 to +318
async def _send_mid_call_stt_alert(self) -> None:
    """Send Slack alert when STT fails mid-call and call must end."""
    from app.core.config.static import SLACK_TAG_USERS

    _fallback_tag = "@breeze-sentinals"
    tag = f"{_fallback_tag},{SLACK_TAG_USERS}" if SLACK_TAG_USERS else _fallback_tag
    provider = (self.stt_provider or "unknown").capitalize()
    try:
        await slack_alert.send(
            title="🚨 STT Failed — Call Ended (Breeze Buddy)",
            fields=[
                {"name": "Provider", "value": provider},
                {"name": "Call SID", "value": self.call_sid or "unknown"},
            ],
            sections=[
                {
                    "title": "What Happened",
                    "text": "STT failed mid-call. Call could not continue.",
                }
            ],
            fallback_text=f"STT failed, call ended — {self.call_sid or 'unknown'}",
            tag_users=tag,
        )
    except Exception as e:
        logger.warning(f"Failed to send mid-call STT alert: {e}")


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Typo in Slack tag: "sentinals" should be "sentinels".

The tag @breeze-sentinals appears to be misspelled. This will result in incorrect Slack user group tagging.

✏️ Proposed fix
     async def _send_mid_call_stt_alert(self) -> None:
         """Send Slack alert when STT fails mid-call and call must end."""
         from app.core.config.static import SLACK_TAG_USERS

-        _fallback_tag = "@breeze-sentinals"
+        _fallback_tag = "@breeze-sentinels"
         tag = f"{_fallback_tag},{SLACK_TAG_USERS}" if SLACK_TAG_USERS else _fallback_tag
🧰 Tools
🪛 Ruff (0.15.14)

[warning] 316-316: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/agent/__init__.py` around lines 293 - 318,
Fix the typo in the fallback Slack tag inside async method
_send_mid_call_stt_alert: change the string value of _fallback_tag from
"`@breeze-sentinals`" to the correct "`@breeze-sentinels`" so the tag variable used
in slack_alert.send() correctly targets the intended Slack user group; update
the _fallback_tag declaration near the top of _send_mid_call_stt_alert to the
corrected spelling.

Comment on lines +319 to +343
    async def _send_mid_call_tts_alert(self) -> None:
        """Send Slack alert when TTS fails mid-call and call must end."""
        from app.core.config.static import SLACK_TAG_USERS

        _fallback_tag = "@breeze-sentinals"
        tag = f"{_fallback_tag},{SLACK_TAG_USERS}" if SLACK_TAG_USERS else _fallback_tag
        provider = (self.tts_provider or "unknown").capitalize()
        try:
            await slack_alert.send(
                title="🚨 TTS Failed — Call Ended (Breeze Buddy)",
                fields=[
                    {"name": "Provider", "value": provider},
                    {"name": "Call SID", "value": self.call_sid or "unknown"},
                ],
                sections=[
                    {
                        "title": "What Happened",
                        "text": "TTS failed mid-call. Call could not continue.",
                    }
                ],
                fallback_text=f"TTS failed, call ended — {self.call_sid or 'unknown'}",
                tag_users=tag,
            )
        except Exception as e:
            logger.warning(f"Failed to send mid-call TTS alert: {e}")

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same typo in TTS alert method.

Same "sentinals" → "sentinels" fix needed here.

✏️ Proposed fix
     async def _send_mid_call_tts_alert(self) -> None:
         """Send Slack alert when TTS fails mid-call and call must end."""
         from app.core.config.static import SLACK_TAG_USERS

-        _fallback_tag = "@breeze-sentinals"
+        _fallback_tag = "@breeze-sentinels"
         tag = f"{_fallback_tag},{SLACK_TAG_USERS}" if SLACK_TAG_USERS else _fallback_tag
🧰 Tools
🪛 Ruff (0.15.14)

[warning] 342-342: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/agent/__init__.py` around lines 319 - 343,
The fallback Slack tag in _send_mid_call_tts_alert contains a typo
("`@breeze-sentinals`"); update the _fallback_tag value to "`@breeze-sentinels`" so
the tag correctly reads "sentinels" (ensure SLACK_TAG_USERS concatenation logic
with tag remains unchanged), then run tests or a quick manual verify to confirm
alerts now target the correct user group.
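
Since the same tag string is spelled out in both the STT and TTS alert helpers, one possible way to keep the two from drifting apart is a shared constant and a small builder. The sketch below is only an illustration: `FALLBACK_ALERT_TAG` and `build_alert_tag` are hypothetical names that do not exist in this PR, and the corrected group name is assumed to be `@breeze-sentinels`.

```python
from typing import Optional

# Hypothetical module-level constant (not in the PR); assumes the intended
# Slack group is spelled "sentinels".
FALLBACK_ALERT_TAG = "@breeze-sentinels"


def build_alert_tag(extra_users: Optional[str]) -> str:
    """Put the fallback group first, then any configured extra users."""
    if extra_users:
        return f"{FALLBACK_ALERT_TAG},{extra_users}"
    return FALLBACK_ALERT_TAG
```

Both `_send_mid_call_stt_alert` and `_send_mid_call_tts_alert` could then call `build_alert_tag(SLACK_TAG_USERS)`, so a spelling fix would only need to be made once.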

SARVAM_API_KEY,
)
from app.core.logger import logger
from app.services.fallback import BB_FALLBACK_CONFIG, ServiceFallback, ServiceFallbackConfig

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix Black formatting violation.

This import line exceeds the 88-character limit. Split it across multiple lines to match the pattern in stt/__init__.py:

Proposed fix
-from app.services.fallback import BB_FALLBACK_CONFIG, ServiceFallback, ServiceFallbackConfig
+from app.services.fallback import (
+    BB_FALLBACK_CONFIG,
+    ServiceFallback,
+    ServiceFallbackConfig,
+)

As per coding guidelines: Use Black for code formatting with line-length=88.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/tts/__init__.py` at line 42, The import line
in app.ai.voice.agents.breeze_buddy.tts.__init__.py is over 88 chars and breaks
Black formatting; split the import from app.services.fallback across multiple
lines (either using parentheses or one-per-line) so the names
BB_FALLBACK_CONFIG, ServiceFallback, and ServiceFallbackConfig are each wrapped
to respect line-length (follow the same multiline pattern used in
stt.__init__.py).

from dataclasses import dataclass

from app.core.background_tasks import BackgroundTaskScheduler
from app.core.config.dynamic import BB_FALLBACK_RAW_CONFIG, BB_STT_SERVICE, BB_TTS_SERVICE

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix Black formatting to resolve pipeline failure.

The pipeline indicates Black would reformat this import, which exceeds the 88-character line length. Split it across multiple lines per Black's formatting rules.

🔧 Proposed fix
-from app.core.config.dynamic import BB_FALLBACK_RAW_CONFIG, BB_STT_SERVICE, BB_TTS_SERVICE
+from app.core.config.dynamic import (
+    BB_FALLBACK_RAW_CONFIG,
+    BB_STT_SERVICE,
+    BB_TTS_SERVICE,
+)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/services/fallback/__init__.py` at line 24, The import line importing
BB_FALLBACK_RAW_CONFIG, BB_STT_SERVICE, and BB_TTS_SERVICE from
app.core.config.dynamic needs to be reformatted to satisfy Black (split across
multiple lines); update the import in __init__.py so each imported symbol (or a
logical grouping) is on its own line or use parentheses with line breaks around
the names to conform to Black's multi-line import formatting for the module
import statement.

Comment on lines +148 to +167
    async def _send_failure_alert(self, count: int, error_msg: str) -> None:
        provider = self.config.primary_provider_name.capitalize()
        threshold = self.config.failure_threshold
        try:
            await slack_alert.send(
                title=f"⚠️ STT Failure on {provider} ({count}/{threshold})",
                fields=[{"name": "Fail Count", "value": f"{count}/{threshold}"}],
                sections=(
                    [{"title": "Error", "text": f"```{error_msg}```"}]
                    if error_msg
                    else []
                ),
                fallback_text=f"STT failure on {provider} ({count}/{threshold})",
                tag_users=_ALERT_TAG,
            )
        except Exception as e:
            logger.warning(
                f"Service fallback ({self.config.service_name}) "
                f"failure alert failed: {e}"
            )

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Hardcoded "STT" in failure alert breaks TTS fallback notifications.

This generic ServiceFallback class is used for both STT and TTS services, but _send_failure_alert hardcodes "STT" in the title and fallback text. When a TTS failure occurs, operators will receive confusing alerts saying "STT Failure on Cartesia".

🐛 Proposed fix
     async def _send_failure_alert(self, count: int, error_msg: str) -> None:
         provider = self.config.primary_provider_name.capitalize()
         threshold = self.config.failure_threshold
+        service = self.config.service_name.upper()
         try:
             await slack_alert.send(
-                title=f"⚠️ STT Failure on {provider} ({count}/{threshold})",
+                title=f"⚠️ {service} Failure on {provider} ({count}/{threshold})",
                 fields=[{"name": "Fail Count", "value": f"{count}/{threshold}"}],
                 sections=(
                     [{"title": "Error", "text": f"```{error_msg}```"}]
                     if error_msg
                     else []
                 ),
-                fallback_text=f"STT failure on {provider} ({count}/{threshold})",
+                fallback_text=f"{service} failure on {provider} ({count}/{threshold})",
                 tag_users=_ALERT_TAG,
             )
🧰 Tools
🪛 Ruff (0.15.14)

[warning] 163-163: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/services/fallback/__init__.py` around lines 148 - 167, The alert text in
ServiceFallback._send_failure_alert incorrectly hardcodes "STT"; change it to
use the service name from the config (e.g., service =
self.config.service_name.upper()) and substitute that variable into both
the title and fallback_text (and any other hardcoded "STT" occurrences) so
alerts reflect the actual service (TTS or STT); keep provider usage
(self.config.primary_provider_name) and existing count/threshold formatting
as-is and preserve the error section and exception handling.
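
The choice of string method matters for the acronym: `str.capitalize()` would render a lowercase service name as "Tts", whereas `str.upper()` gives "TTS". The sketch below shows the difference, assuming `service_name` is stored in lowercase (for example "stt" or "tts"); `alert_title` is a hypothetical standalone helper used only for illustration, not a function in the PR.

```python
def alert_title(service_name: str, provider: str, count: int, threshold: int) -> str:
    """Build the failure-alert title from the configured service name."""
    # Assumption: service_name is lowercase, e.g. "stt" or "tts".
    service = service_name.upper()
    return f"⚠️ {service} Failure on {provider.capitalize()} ({count}/{threshold})"


capitalized = "tts".capitalize()  # "Tts": only the first letter is uppercased
uppercased = "tts".upper()  # "TTS": reads correctly as an acronym
title = alert_title("tts", "cartesia", 2, 3)
```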

Comment on lines +275 to +276
            if count is None:
                count = await redis.incr(self._key_failure_count)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fallback path omits TTL, risking permanent counter.

If run_script returns None (e.g., scripting disabled or error), the fallback incr increments the counter but never sets the TTL. This counter could persist indefinitely, causing incorrect threshold calculations or permanent fallback activation.

🐛 Proposed fix: Set TTL in fallback path
             if count is None:
-                count = await redis.incr(self._key_failure_count)
+                # Lua script unavailable - fallback to separate INCR + EXPIRE
+                count = await redis.incr(self._key_failure_count)
+                if count == 1:
+                    await redis.expire(
+                        self._key_failure_count, self.config.failure_window_secs
+                    )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/services/fallback/__init__.py` around lines 275 - 276, The fallback
branch that does `count = await redis.incr(self._key_failure_count)` when
`run_script` returns None must also set an expiry so the counter doesn't persist
forever; after incrementing the key call the Redis expire command (e.g., `await
redis.expire(self._key_failure_count, ...)`) using the same TTL used elsewhere
for failure counters (for example `self._failure_window_seconds` or the existing
TTL constant) so the key gets a time-to-live consistent with the normal path.
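
The INCR-then-EXPIRE fallback can be sketched in isolation as follows. This is a minimal illustration, not the PR's code: `FakeRedis` is an in-memory stand-in for the real async client, and `record_failure_fallback` and the key name `bb:stt:failures` are hypothetical.

```python
import asyncio


class FakeRedis:
    """In-memory stand-in for an async Redis client (illustration only)."""

    def __init__(self) -> None:
        self.values: dict[str, int] = {}
        self.ttls: dict[str, int] = {}

    async def incr(self, key: str) -> int:
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    async def expire(self, key: str, seconds: int) -> bool:
        if key not in self.values:
            return False
        self.ttls[key] = seconds
        return True


async def record_failure_fallback(redis: FakeRedis, key: str, window_secs: int) -> int:
    """INCR the counter and set its TTL only when this call created the key."""
    count = await redis.incr(key)
    if count == 1:
        await redis.expire(key, window_secs)
    return count


async def demo() -> tuple[int, int]:
    redis = FakeRedis()
    key = "bb:stt:failures"  # hypothetical key name
    await record_failure_fallback(redis, key, 60)
    count = await record_failure_fallback(redis, key, 60)
    return count, redis.ttls[key]


result = asyncio.run(demo())
```

Setting the TTL only when `count == 1` keeps the failure window anchored at the first failure instead of extending it on every increment. Against a real server the two commands are not atomic, so a crash between them could still leave a counter without a TTL; that gap is what a single Lua script closes.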

Comment on lines +402 to +416
            # Not active — check if it *was* active (sentinel present).
            seen = bool(await redis.exists(self._key_seen_active))
            if not seen:
                return  # Never activated during this server lifetime.

            # TTL just expired — clear sentinel and fire the reset alert.
            await redis.delete(self._key_seen_active)
            await self._clear_provider_health(
                redis, self.config.primary_provider_name.lower()
            )
            logger.info(
                f"Service fallback ({self.config.service_name}) TTL expired — "
                "sending reset alert"
            )
            await self._send_reset_alert()

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Race condition may cause duplicate reset alerts.

Multiple pods polling simultaneously can all observe active=False and seen_active=True, then each delete the sentinel and send the reset alert. Unlike activation (which uses NX), the reset path has no atomic guard.

🐛 Proposed fix: Use atomic getdel or NX deduplication
-            # Not active — check if it *was* active (sentinel present).
-            seen = bool(await redis.exists(self._key_seen_active))
-            if not seen:
-                return  # Never activated during this server lifetime.
-
-            # TTL just expired — clear sentinel and fire the reset alert.
-            await redis.delete(self._key_seen_active)
+            # Not active — atomically check-and-clear sentinel.
+            # GETDEL returns the value if it existed and deletes it atomically.
+            seen = await redis.getdel(self._key_seen_active)
+            if not seen:
+                return  # Never activated or another pod already handled reset.
+
+            # TTL just expired — we won the race; fire the reset alert.
             await self._clear_provider_health(
                 redis, self.config.primary_provider_name.lower()
             )

If getdel is unavailable, use a Lua script or NX-based dedup key similar to _key_notified.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/services/fallback/__init__.py` around lines 402 - 416, The reset path can
race because multiple pods can see seen_active=True and each delete the sentinel
then call _clear_provider_health and _send_reset_alert; modify the block that
reads/deletes self._key_seen_active to perform an atomic check-and-delete (use
redis.getdel(self._key_seen_active) if available, or execute a small Lua script
that returns and deletes the key in one step) and only proceed with
_clear_provider_health and _send_reset_alert when the atomic operation indicates
this caller actually removed the sentinel; alternatively implement an NX-based
deduplication key (similar to _key_notified) that you set with SET NX and TTL
and only the owner proceeds to call _send_reset_alert. Ensure you reference and
update uses of self._key_seen_active, _clear_provider_health, and
_send_reset_alert accordingly.
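
The effect of an atomic check-and-delete can be illustrated with a small simulation. This is a sketch under stated assumptions: `FakeRedis` is an in-memory stand-in, `poll_reset` and the key `bb:tts:seen_active` are hypothetical, and three concurrent tasks play the role of three pods. On a real server, the `GETDEL` command requires Redis 6.2 or later.

```python
import asyncio
from typing import Optional


class FakeRedis:
    """In-memory stand-in for an async Redis client (illustration only)."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}

    async def getdel(self, key: str) -> Optional[str]:
        # Read and delete in one step; there is no await in between,
        # so no other task can observe the key after it has been read.
        return self.store.pop(key, None)


async def poll_reset(redis: FakeRedis, key: str, alerts: list[str], pod: str) -> None:
    seen = await redis.getdel(key)
    if not seen:
        return  # Never activated, or another pod already handled the reset.
    alerts.append(pod)  # Stands in for sending the reset alert.


async def demo() -> list[str]:
    redis = FakeRedis()
    key = "bb:tts:seen_active"  # hypothetical sentinel key
    redis.store[key] = "1"
    alerts: list[str] = []
    pollers = [poll_reset(redis, key, alerts, f"pod-{i}") for i in range(3)]
    await asyncio.gather(*pollers)
    return alerts


alerts = asyncio.run(demo())
```

Only the caller that actually removes the sentinel goes on to send the alert, so however many pollers run, at most one reset alert is produced per activation.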

Comment on lines +39 to +45
    elif target_type == dict:
        if isinstance(value, dict):
            return value
        try:
            return json.loads(value)
        except (ValueError, TypeError):
            return None

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

json.loads can return a non-dict for target_type == dict.

A JSON scalar or array (e.g. "5" → int, "[1,2]" → list) parses successfully but isn't a dict, so the function returns a value that violates the requested target type. Guard the parsed result with isinstance.

🛡️ Proposed guard
     elif target_type == dict:
         if isinstance(value, dict):
             return value
         try:
-            return json.loads(value)
+            parsed = json.loads(value)
+            return parsed if isinstance(parsed, dict) else None
         except (ValueError, TypeError):
             return None

Note: Ruff also flags target_type == dict (E721) on Line 39; is/isinstance is preferred, though the rest of this function uses == for consistency — align as you see fit.

🧰 Tools
🪛 Ruff (0.15.14)

[error] 39-39: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/services/live_config/utils.py` around lines 39 - 45, The branch handling
target_type == dict must guard the json.loads result against non-dict JSON;
change the block in the function that checks target_type == dict so you first
check target_type is dict (to satisfy ruff) then if value is a dict return it,
otherwise parse with parsed = json.loads(value) and return parsed only if
isinstance(parsed, dict) else return None; this ensures json scalars/arrays
don't slip through and preserves the original behavior when value is already a
dict.
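
The behaviour the guard protects against is easy to reproduce with the standard library alone. In the sketch below, `parse_dict` is a hypothetical standalone version of the dict branch, written only to show the guarded logic; it is not the function in `app/services/live_config/utils.py`.

```python
import json
from typing import Any, Optional


def parse_dict(value: Any) -> Optional[dict]:
    """Return value as a dict, or None when it is not (or does not parse to) one."""
    if isinstance(value, dict):
        return value
    try:
        parsed = json.loads(value)
    except (ValueError, TypeError):
        # ValueError covers malformed JSON; TypeError covers non-string input.
        return None
    # Valid JSON that is a scalar or an array is rejected here.
    return parsed if isinstance(parsed, dict) else None
```

Without the final `isinstance` check, inputs such as `"5"` and `"[1,2]"` would come back as an `int` and a `list` respectively, even though the caller asked for a dict.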
