feat(lesson): read-aloud (TTS) for lessons by astrapi69 · Pull Request #2 · astrapi69/adaptive-learner

astrapi69 · 2026-06-02T20:21:45Z

Lesson read-aloud (TTS)

Wires text-to-speech into the Lesson viewer, built entirely on the existing voice lib (lib/voice/speech-synthesis + voicePref). Every TTS surface self-hides when the browser lacks speechSynthesis or the user disabled TTS in Voice Settings, so flows that don't use it are unaffected.

What's included

| Commit | Feature |
| --- | --- |
| C1 | `useReadAloud` engine hook + inline `ReadAloudButton` (lucide speaker icon, pulse, language-aware voice, speed multiplier); `speak()` gains an `onBoundary` option |
| C2 | Wired into theory steps + all 5 exercise prompts via the dispatcher (`ttsLang` + `codeMode`); suppressed on code/formula content; `markdownToSpeech` strips markdown/code for clean speech |
| C3 | "Auto read-aloud" header toggle (persisted): speaks each step on display in the lesson's target language |
| C4 | Inline 0.5 / 0.75 / 1 / 1.25× speed controls (shown only during playback, remembered, restart-at-new-rate) + no-voice warning |
| C5 | Follow-along word highlight via `onboundary` (`.tts-active` accent wash; static underline under `prefers-reduced-motion`); theory swaps to a spanned read-along view while reading |
| C6 | `lesson.tts.*` i18n in all 8 catalogs (real umlauts in de) + `R` keyboard shortcut (ignored in inputs) |
| C7 | Continuous theory reading: "Read all" reads a run of consecutive theory steps as one utterance and auto-advances the viewer at each boundary, stopping at the next exercise |
| C8 | Floating mini-player (prev step / play-pause / next step / stop + "Step X of N theory steps") with step-based skip; `pause`/`resume` added to the engine |
| tests | Integration (page-level, all 5 exercise types + code suppression) + a Dexie smoke spec for the read-aloud surface |

Design decisions (flagging)

No per-tile speaker buttons inside Matching/Picture tiles — those tiles are <button>s, and nesting a button is invalid HTML and would hijack the tile click. Pronunciation is carried by the prompt-level control, theory read-aloud, and auto-read. A non-button affordance (long-press, or a speaker in post-answer feedback) is a clean follow-up if wanted.
Continuous mode uses step-level advance, not word-level highlight (word highlight stays a single-step feature) — keeps the concatenated-utterance offsets simple and the viewer behavior predictable.
Mini-player ships step-based skip first (per the request's recommendation) — more useful for learning than an arbitrary 10s jump, and the Web Speech API can't seek. Time-based seek deferred.

Verification

3128 Vitest tests pass, tsc clean, npm run build + Dexie build clean, backend i18n audit green.
The Dexie smoke spec compiles + wires (vite preview starts, runner reaches browser launch) but could not be executed in the authoring environment because the chromium binary can't be downloaded (network-restricted). Run it via make test-dexie-smoke — it joins the existing Dexie-mode gate.

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV

Generated by Claude Code

Foundation for lesson read-aloud, built on the existing voice lib:

- speech-synthesis.ts: speak() gains an onBoundary option so callers can follow along word by word (used by the highlight in C5).
- useReadAloud: the lesson-level TTS engine. It resolves the voice from the saved preference name or the closest match for the requested lang, applies saved rate/pitch x an inline speed multiplier (0.5/0.75/1/1.25x, remembered in localStorage), exposes speaking / activeId / boundaryIndex / voiceAvailable, and stops on unmount.
- ReadAloudButton: a compact speaker-icon (lucide Volume2/Square) play/stop toggle for one piece of text. Same visibility gates as SpeechButton (no support / TTS off / empty text -> no render); lang-aware voice; pulses while speaking (pulse disabled under prefers-reduced-motion).
- CSS for the button + accent pulse + reduced-motion guard.

Tests: speed persistence + offered set; button visibility gates, speak/stop toggle, lang propagation, and rate x speed. aria-labels use lesson.tts.* keys (full i18n lands in C6; t() fallbacks cover it).

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
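The voice resolution and the rate x speed combination described above could look roughly like the sketch below. It is not the code from this PR: `VoiceLike`, `resolveVoice`, `effectiveRate`, and the lookup order (saved name first, then exact lang, then same primary subtag) are assumptions made for illustration.

```typescript
// Hypothetical shape; the real hook works with SpeechSynthesisVoice objects.
interface VoiceLike {
  name: string;
  lang: string; // BCP 47 tag such as "fr-FR"
}

// Primary language subtag: "fr-CA" -> "fr".
function primarySubtag(lang: string): string {
  return lang.toLowerCase().split("-")[0];
}

// Assumed order: saved voice name, exact lang match, same primary subtag.
function resolveVoice(
  voices: VoiceLike[],
  lang: string,
  savedName?: string,
): VoiceLike | null {
  if (savedName) {
    const byName = voices.find((v) => v.name === savedName);
    if (byName) return byName;
  }
  const wanted = lang.toLowerCase();
  const exact = voices.find((v) => v.lang.toLowerCase() === wanted);
  if (exact) return exact;
  const close = voices.find(
    (v) => primarySubtag(v.lang) === primarySubtag(wanted),
  );
  return close ?? null;
}

// Saved base rate times the inline speed multiplier, clamped to the
// 0.1..10 range that SpeechSynthesisUtterance.rate accepts.
function effectiveRate(savedRate: number, speed: number): number {
  return Math.min(10, Math.max(0.1, savedRate * speed));
}
```

A `null` result corresponds to the "no voice for the target language" case, where playback would fall back to the engine default.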

- ExerciseDispatcher threads ttsLang (the lesson target language) + codeMode to every renderer via the shared controlled-props contract; FreeText/Cloze keep their own codeMode too.
- All 5 exercise renderers render a prompt-level ReadAloudButton when a ttsLang is supplied, suppressed for code/formula content. Review + AdaptiveLesson pass no ttsLang, so they stay TTS-free.
- Theory steps gain a "Read aloud" control above the body; the body's Markdown is projected to clean speech text via the new markdownToSpeech helper (fenced code dropped, syntax stripped).
- CSS: .exercise-prompt-row (prompt + button on one line) and .lesson-theory-tts.

Per-tile speaker buttons inside the clickable Matching/Picture tiles are intentionally NOT added: the tiles are <button>s, and nesting a button is invalid HTML and would hijack the tile click. The prompt-level control + the summary answers (later commit) carry the pronunciation value instead.

Tests: prompt button renders with ttsLang for free_text/matching/cloze; suppressed under codeMode; absent without ttsLang. Existing exercise + lesson + two-phase suites stay green.

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
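To make the Markdown-to-speech projection concrete, here is a minimal regex-based sketch. It is not the `markdownToSpeech` helper from this PR; `markdownToSpeechSketch` is a hypothetical name, and the choice to keep inline-code and link text while dropping fenced blocks entirely is an assumption based on the description above.

```typescript
// Hypothetical stand-in for the PR's markdownToSpeech helper.
function markdownToSpeechSketch(markdown: string): string {
  return markdown
    // Drop fenced code blocks entirely.
    .replace(/```[\s\S]*?```/g, " ")
    // Images collapse to their alt text, links to their label.
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")
    // Inline code keeps its text but loses the backticks.
    .replace(/`([^`]*)`/g, "$1")
    // Strip heading, blockquote, and bullet markers at line starts.
    .replace(/^[ \t]{0,3}#{1,6}[ \t]+/gm, "")
    .replace(/^[ \t]{0,3}>[ \t]?/gm, "")
    .replace(/^[ \t]*[-*+][ \t]+/gm, "")
    // Strip asterisk and tilde emphasis markers. Underscore emphasis is
    // left alone here so identifiers like free_text stay intact.
    .replace(/\*+|~~/g, "")
    // Collapse all whitespace so the utterance is a single clean line.
    .replace(/\s+/g, " ")
    .trim();
}
```

An empty result is a natural signal for the "empty text -> no render" gate on the button.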

- useReadAloud gains persisted auto-read prefs (readLessonAutoRead / writeLessonAutoRead; off by default).
- Lesson page drives the lesson-level engine: when auto-read is on it speaks each step on display in the lesson's target language, meaning the theory body (markdown stripped via markdownToSpeech) and the exercise prompt; code/formula exercises are skipped. A ref guard makes the effect safe to re-run without re-speaking the same step; turning auto-read off stops playback and resets the guard.
- A pill toggle ("Auto read-aloud", aria-pressed) renders in the controls row under the progress bar when TTS is supported.

Tests (with a mocked speechSynthesis): toggle renders; theory body + exercise prompt are spoken on display in the target language with markdown stripped; nothing speaks when off; toggle flips aria-pressed and persists. Existing lesson/exercise suites stay green (they run without a synth mock, so the TTS UI stays hidden there).

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
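The "ref guard" idea can be shown without React. The sketch below assumes the guard simply remembers the id of the last step it spoke; `createAutoReadGuard` and its methods are hypothetical names, and in the real page the remembered id would presumably live in a `useRef`.

```typescript
// Hypothetical framework-free model of the auto-read guard.
interface AutoReadGuard {
  // True only the first time a given step id is seen in a row.
  shouldSpeak(stepId: string): boolean;
  // Forget the last spoken step (e.g. when auto-read is switched off).
  reset(): void;
}

function createAutoReadGuard(): AutoReadGuard {
  let lastSpokenId: string | null = null;
  return {
    shouldSpeak(stepId: string): boolean {
      if (lastSpokenId === stepId) return false; // effect re-ran, same step
      lastSpokenId = stepId;
      return true;
    },
    reset(): void {
      lastSpokenId = null;
    },
  };
}
```

With this shape, an effect that re-runs for unrelated reasons asks the guard first and stays silent for a step that was already spoken, while toggling auto-read off and on again speaks the current step once more.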

- useReadAloud mirrors speed + speaking into refs and remembers the last spoken text, so setSpeed() can restart the current stream at the new rate immediately (the Web Speech API has no live rate change). speak() now reads the rate from the speed ref.
- Lesson controls row shows a 0.5/0.75/1/1.25x speed group ONLY while a stream is playing; the choice persists (readLessonSpeed) and the active speed is aria-pressed.

Voice selection is already language-aware (the lang prop -> pickVoice); when the target language has no installed voice the engine reports voiceAvailable=false and the row surfaces a friendly "no voice for {language}" notice (playback still runs with the engine default).

Tests: speed control hidden while idle, shown during playback with all four speeds; picking a speed persists it, restarts the read at the new rate, and marks the active button aria-pressed.

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
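Because a running utterance cannot change rate, a speed change has to cancel and re-speak. The sketch below models that restart against an injected port instead of the real synthesizer; `SynthPort` and `RestartingEngine` are hypothetical names and the structure is an assumption, not the hook's actual code.

```typescript
type Speed = 0.5 | 0.75 | 1 | 1.25;

// Hypothetical port standing in for the speechSynthesis calls.
interface SynthPort {
  start(text: string, rate: number): void;
  cancel(): void;
}

class RestartingEngine {
  private synth: SynthPort;
  private baseRate: number;
  private speed: Speed = 1;
  private speaking = false;
  private lastText: string | null = null;

  constructor(synth: SynthPort, baseRate: number) {
    this.synth = synth;
    this.baseRate = baseRate;
  }

  speak(text: string): void {
    this.lastText = text; // remembered so setSpeed can re-speak it
    this.speaking = true;
    this.synth.start(text, this.baseRate * this.speed);
  }

  stop(): void {
    this.speaking = false;
    this.synth.cancel();
  }

  setSpeed(next: Speed): void {
    this.speed = next;
    // Only an active stream is restarted; an idle engine just remembers
    // the new speed for the next speak() call.
    if (this.speaking && this.lastText !== null) {
      this.synth.cancel();
      this.synth.start(this.lastText, this.baseRate * this.speed);
    }
  }
}
```

Note that the restart begins from the start of the remembered text; resuming mid-sentence would need the last boundary offset as well.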

- New ReadAlongText: renders the plain speech text as word spans and applies .tts-active to the word whose char range contains the engine's onboundary charIndex (exported tokenizeForReadAlong + activeTokenIndex are pure + unit-tested).
- TheoryStep is now engine-driven: its Read aloud / Stop button calls the shared useReadAloud engine (so BOTH manual clicks and auto-read emit word boundaries), keyed by a per-step utterance id. While that step is being read the rich Markdown is swapped for the follow-along view; Markdown returns when idle.
- Auto-read uses the same theory-{id} utterance id so the highlight also tracks during auto-read.
- CSS: .tts-active accent wash with a 120ms transition; under prefers-reduced-motion the wash + transition are dropped for a static accent underline.

Tests: tokenizer + activeTokenIndex ranges; active-word render; manual theory button reads + swaps in the follow-along view + flips to Stop; auto-read renders the follow-along view. Existing lesson suite green (Markdown still shows when not reading).

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
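The word lookup behind the highlight can be sketched as two pure functions. These are illustrative stand-ins, not the exported `tokenizeForReadAlong` / `activeTokenIndex`; the names `tokenizeSketch` and `activeIndexSketch`, the whitespace-based tokenization, and the `-1` result for a charIndex that falls between words are all assumptions.

```typescript
interface WordToken {
  text: string;
  start: number; // inclusive char offset into the speech text
  end: number;   // exclusive char offset
}

// Split on whitespace while recording each word's char range.
function tokenizeSketch(speechText: string): WordToken[] {
  const tokens: WordToken[] = [];
  const wordPattern = /\S+/g;
  let match: RegExpExecArray | null;
  while ((match = wordPattern.exec(speechText)) !== null) {
    tokens.push({
      text: match[0],
      start: match.index,
      end: match.index + match[0].length,
    });
  }
  return tokens;
}

// Index of the token whose range contains charIndex, or -1 if none does.
function activeIndexSketch(tokens: WordToken[], charIndex: number): number {
  for (let i = 0; i < tokens.length; i++) {
    if (charIndex >= tokens[i].start && charIndex < tokens[i].end) return i;
  }
  return -1;
}
```

Keeping both functions free of DOM and React state is what makes the range logic easy to unit-test in isolation.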

- New lesson.tts.* keys in all 8 catalogs (read_aloud, stop, auto_read, speed, no_voice {language}, reading), real umlauts in de; synced to the frontend JSON. The button/toggle/speed/no-voice UI now resolves real strings instead of English fallbacks.
- Keyboard shortcut: pressing "R" (no modifier, not in an input / textarea / select / contenteditable) toggles read-aloud of the current step via the engine. Auto-read + shortcut share one currentStepSpeech() payload builder (theory body / non-code prompt, with the theory-{id} utterance id so the highlight tracks).

Tests: R reads the current step + swaps in the follow-along, second R stops; R is ignored while typing in an input. Backend i18n audit green.

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
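The shortcut gate boils down to a predicate over the key event. The sketch below uses structural types so it runs without a DOM; `KeyEventLike`, `isTypingTarget`, and `isReadAloudShortcut` are hypothetical names, and treating Shift as a disqualifying modifier is an assumption about what "no modifier" means here.

```typescript
// Structural subsets of KeyboardEvent and its target.
interface KeyTargetLike {
  tagName?: string;
  isContentEditable?: boolean;
}

interface KeyEventLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  altKey: boolean;
  shiftKey: boolean;
  target: KeyTargetLike | null;
}

// True when the user is typing into a form control or editable region.
function isTypingTarget(target: KeyTargetLike | null): boolean {
  if (!target) return false;
  if (target.isContentEditable) return true;
  const tag = (target.tagName ?? "").toUpperCase();
  return tag === "INPUT" || tag === "TEXTAREA" || tag === "SELECT";
}

// Bare "r" (or "R" with Caps Lock), outside any typing context.
function isReadAloudShortcut(evt: KeyEventLike): boolean {
  if (evt.ctrlKey || evt.metaKey || evt.altKey || evt.shiftKey) return false;
  if (isTypingTarget(evt.target)) return false;
  return evt.key.toLowerCase() === "r";
}
```

A `keydown` listener would call the predicate and, on `true`, toggle the engine for the current step.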

- New pure helpers collectTheoryRun (concatenate a run of consecutive theory steps into ONE utterance text + per-step char offsets) and runStepForChar (map a boundary charIndex back to a step index).
- Lesson "Read all" button (shown only on a theory step that begins a run of >=2) speaks the whole run as one utterance and auto-advances the viewer as the engine crosses each step boundary, stopping at the next exercise. Clicking again (or the run ending) stops + clears.
- New lesson.tts.read_all in all 8 catalogs (real umlauts in de).

Tests: markdownToSpeech (headings/emphasis/inline-code stripped, code blocks dropped, links/images collapsed, empty cases); collectTheoryRun run boundaries + offsets; runStepForChar mapping; Read all renders only when a run exists, speaks the concatenation, and auto-advances on a boundary event.

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
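The concatenate-then-map-back technique can be sketched as follows. This is not the PR's `collectTheoryRun` / `runStepForChar`; the names `collectRunSketch` and `stepForCharSketch`, the `StepLike` shape, and the blank-line separator between steps are assumptions for illustration.

```typescript
interface StepLike {
  kind: "theory" | "exercise";
  speech: string; // already projected to plain speech text
}

interface TheoryRun {
  text: string;          // the single utterance text
  offsets: number[];     // char offset where each step starts in text
  stepIndexes: number[]; // lesson step index for each entry in offsets
}

const STEP_SEPARATOR = "\n\n"; // assumed; any fixed separator works

// Gather consecutive theory steps starting at `from`; null if fewer than 2.
function collectRunSketch(steps: StepLike[], from: number): TheoryRun | null {
  const parts: string[] = [];
  const offsets: number[] = [];
  const stepIndexes: number[] = [];
  let cursor = 0;
  for (let i = from; i < steps.length && steps[i].kind === "theory"; i++) {
    offsets.push(cursor);
    stepIndexes.push(i);
    parts.push(steps[i].speech);
    cursor += steps[i].speech.length + STEP_SEPARATOR.length;
  }
  if (parts.length < 2) return null;
  return { text: parts.join(STEP_SEPARATOR), offsets, stepIndexes };
}

// Map a boundary charIndex back to the lesson step being spoken:
// the last step whose start offset is at or before charIndex.
function stepForCharSketch(run: TheoryRun, charIndex: number): number {
  let found = run.stepIndexes[0];
  for (let k = 0; k < run.offsets.length; k++) {
    if (run.offsets[k] <= charIndex) found = run.stepIndexes[k];
  }
  return found;
}
```

The viewer can then advance whenever the mapped step index differs from the one currently displayed.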

Ships the recommended step-based controls first (time-based seek is deferred: step-based skip is more useful for learning, and the Web Speech API can't seek):

- useReadAloud gains pause()/resume()/paused (the engine stays "speaking" while paused).
- New theoryBlockAround helper: the contiguous theory block around a step + its 1-based position/total, for the "Step X of N theory steps" readout and prev/next availability.
- LessonTtsMiniPlayer: a floating bottom bar shown while the engine is active, with previous theory step (re-read) / play-pause / next theory step / stop + position readout. Pure + presentational.
- Lesson wires it: prev/next call readTheoryStepAt (navigate + re-read with the theory-{id} utterance so the follow-along highlight tracks); play/pause toggles the engine; stop ends playback.
- New lesson.tts.{play,pause,prev_step,next_step,step_position} in all 8 catalogs (real umlauts in de). CSS for the floating pill bar.

Tests: theoryBlockAround block/position/null cases; mini-player render + all four callbacks + edge disabling + paused state; Lesson page hides the player until reading, shows it with the block position, next re-reads the next step, play/pause toggles pause. Full suite 3121 green; build clean; i18n audit green.

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
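The block lookup that feeds the "Step X of N theory steps" readout can be sketched as a pure function over step kinds. This is not the PR's `theoryBlockAround`; `blockAroundSketch`, the `TheoryBlock` shape, and the `null` result for a non-theory step are assumptions.

```typescript
type StepKind = "theory" | "exercise";

interface TheoryBlock {
  start: number;    // index of the first theory step in the block
  end: number;      // index of the last theory step in the block
  position: number; // 1-based position of the current step
  total: number;    // number of theory steps in the block
}

// Contiguous theory block around `index`, or null if that step is not theory.
function blockAroundSketch(kinds: StepKind[], index: number): TheoryBlock | null {
  if (index < 0 || index >= kinds.length || kinds[index] !== "theory") {
    return null;
  }
  let start = index;
  while (start > 0 && kinds[start - 1] === "theory") start--;
  let end = index;
  while (end < kinds.length - 1 && kinds[end + 1] === "theory") end++;
  return { start, end, position: index - start + 1, total: end - start + 1 };
}

// Prev/next availability follows directly from the position.
function canGoPrev(block: TheoryBlock): boolean {
  return block.position > 1;
}

function canGoNext(block: TheoryBlock): boolean {
  return block.position < block.total;
}
```

Since exercises bound each block, prev/next never skip across an exercise, which matches the step-based skip described above.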

Integration (Vitest, page level): renders the REAL Lesson page (real ExerciseDispatcher + renderers; only useLesson/getStorage/synth mocked) at each exercise step and pins that the prompt read-aloud button is threaded through for all 5 exercise types, suppressed for a code exercise, and absent entirely when TTS is disabled in Settings.

Smoke (Playwright, Dexie build, no backend): downloads fr-a1-from-en, opens 01-greetings and exercises the read-aloud surface end to end: theory control reads + swaps in the follow-along + shows the floating mini-player; mini-player Stop ends playback; the "R" shortcut toggles read-aloud; auto-read speaks each step on display. speechSynthesis is injected via addInitScript so the run doesn't depend on installed voices (which would otherwise end utterances immediately).

Note: the smoke spec compiles + wires (vite preview starts, the runner reaches browser launch) but could not be executed in this environment because the chromium binary can't be downloaded (network-restricted). Run it with `make test-dexie-smoke` (it joins the existing Dexie gate).

https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
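A stand-in synthesizer of the kind that could be injected for such a smoke run might look like the sketch below. It is not the stub used in this PR: `FakeSpeechSynthesis`, `UtteranceLike`, the `finish()` helper, and the decision that utterances never end on their own are assumptions, and the real API passes event objects to `onstart`/`onend` where this fake passes nothing.

```typescript
// Structural subset of SpeechSynthesisUtterance used by the fake.
interface UtteranceLike {
  text: string;
  lang?: string;
  rate?: number;
  onstart?: () => void;
  onend?: () => void;
}

class FakeSpeechSynthesis {
  speaking = false;
  paused = false;
  readonly spoken: string[] = []; // every text passed to speak(), in order
  private current: UtteranceLike | null = null;

  // No installed voices, so tests never depend on the host system.
  getVoices(): { name: string; lang: string }[] {
    return [];
  }

  // Starts "speaking" and stays in that state until finish() or cancel().
  speak(utterance: UtteranceLike): void {
    this.spoken.push(utterance.text);
    this.current = utterance;
    this.speaking = true;
    this.paused = false;
    if (utterance.onstart) utterance.onstart();
  }

  pause(): void {
    if (this.speaking) this.paused = true;
  }

  resume(): void {
    this.paused = false;
  }

  // Drops the current utterance without firing onend (an assumption).
  cancel(): void {
    this.current = null;
    this.speaking = false;
    this.paused = false;
  }

  // Test-only helper: end the current utterance and fire its onend.
  finish(): void {
    const done = this.current;
    this.cancel();
    if (done && done.onend) done.onend();
  }
}
```

In a Playwright spec, an object like this would be installed as `window.speechSynthesis` from a script passed to `page.addInitScript`, so it is in place before the app's own code runs.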

claude added 9 commits June 2, 2026 18:38

astrapi69 mentioned this pull request Jun 2, 2026

test(lesson-tts): QA audit + regression pins (B1–B3, C1g) #3

Merged

astrapi69 merged commit dd35f62 into main Jun 3, 2026
8 of 10 checks passed

astrapi69 deleted the feature/lesson-tts-read-aloud branch June 3, 2026 06:35

astrapi69 merged 9 commits into main from feature/lesson-tts-read-aloud

2 participants
