Skip to content

Audio generate refactor#811

Merged
Lungsangg merged 7 commits into
developfrom
audio-generate-refactor
Jun 25, 2026
Merged

Audio generate refactor#811
Lungsangg merged 7 commits into
developfrom
audio-generate-refactor

Conversation

@Lungsangg

Copy link
Copy Markdown
Member

No description provided.

@Lungsangg Lungsangg requested a review from tenkus47 June 25, 2026 11:48
@greptile-apps

greptile-apps Bot commented Jun 25, 2026

Copy link
Copy Markdown

Confidence Score: 4/5

Safe to merge with the noted fix: the day-audio path will re-invoke the worker on every request for subtasks that haven't been individually pre-generated.

The _generate_audio_segments function generates audio via the worker for uncached subtasks but never writes the resulting s3_key back to subtask.audio_url. Because _generate_audio_segments has no db parameter, the if subtask.audio_url: short-circuit is never populated through this code path, so every call to generate_plan_audio_service re-hits the worker for each affected subtask (up to 120 s each) and creates new S3 objects that may accumulate.

pecha_api/plans/cms/cms_plans_service.py — specifically the _generate_audio_segments function and its interaction with generate_plan_audio_service.

Reviews (3): Last reviewed commit: "Add tests for worker_client.py - 100% co..." | Re-trigger Greptile

Comment on lines 257 to 262
with SessionLocal() as db:
plan_item: PlanItem = get_plan_day_by_id_any_plan(db=db, day_id=day_id)

audio_segments, subtask_refs = _generate_audio_segments(
audio_segments, subtask_refs = await _generate_audio_segments(
plan_item.tasks, audio_type, language, voice_name
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 DB connection held open during multiple external HTTP calls in generate_plan_audio_service

_generate_audio_segments is await-ed inside the with SessionLocal() as db: block. For every subtask that lacks a cached audio_url, it calls generate_audio_from_text (up to 120 s each) while holding the DB connection. A plan day with several text subtasks could hold the connection for many minutes. This is the same pool-exhaustion risk as in _generate_subtask_audio. Fetching the plan item, closing the session, running all worker calls, then re-opening for the DB writes would eliminate the problem.

Comment on lines +17 to +29
async with httpx.AsyncClient(timeout=120.0) as client:
response = await client.post(
f"{worker_url}/audio/generate",
json={
"text": text,
"language": language,
"type": audio_type.value if hasattr(audio_type, 'value') else audio_type,
"voice_name": voice_name.value if hasattr(voice_name, 'value') else voice_name,
"s3_key_prefix": s3_key_prefix,
},
)
response.raise_for_status()
return response.json()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Worker errors surface as unhandled exceptions

response.raise_for_status() throws httpx.HTTPStatusError on 4xx/5xx responses; a network failure throws httpx.RequestError. Neither is caught here or in either call site (_generate_subtask_audio, _generate_audio_segments). The previous implementation explicitly mapped ValueError → 400 and RuntimeError → 502. With this refactor, any worker failure will propagate as an unhandled exception, and callers will receive a generic 500 Internal Server Error with no actionable message. A try/except block that converts httpx.HTTPStatusError to an HTTPException(502) and httpx.RequestError to an HTTPException(503) would restore equivalent behavior.

Comment thread pecha_api/config.py
Comment on lines +98 to +99
# Worker API Configuration
WORKER_API_URL="",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 When WORKER_API_URL is not configured, its value is "". httpx will then try to POST to "/audio/generate", raise an httpx.UnsupportedProtocol or httpx.InvalidURL, and the error message will say nothing useful about a missing config value. A startup assertion or an early guard in generate_audio_from_text that raises a clear RuntimeError("WORKER_API_URL is not configured") would make misconfigured deployments immediately obvious.

Suggested change
# Worker API Configuration
WORKER_API_URL="",
# Worker API Configuration
# Must be set in the environment; left blank here so a missing config fails
# loudly (httpx InvalidURL) rather than silently sending to a wrong host.
WORKER_API_URL="",

@sonarqubecloud

Copy link
Copy Markdown

@Lungsangg Lungsangg merged commit 5050e54 into develop Jun 25, 2026
7 checks passed
@Lungsangg Lungsangg deleted the audio-generate-refactor branch June 25, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants