feat(lesson): read-aloud (TTS) for lessons #2
Merged
Foundation for lesson read-aloud, built on the existing voice lib:
- speech-synthesis.ts: speak() gains an onBoundary option so callers can follow along word-by-word (used by the highlight in C5).
- useReadAloud: the lesson-level TTS engine. It resolves the voice from the saved preference name or the closest match for the requested lang, applies saved rate/pitch x an inline speed multiplier (0.5/0.75/1/1.25x, remembered in localStorage), exposes speaking / activeId / boundaryIndex / voiceAvailable, and stops on unmount.
- ReadAloudButton: a compact speaker-icon (lucide Volume2/Square) play/stop toggle for one piece of text. Same visibility gates as SpeechButton (no support / TTS off / empty text -> no render); lang-aware voice; pulses while speaking (pulse disabled under prefers-reduced-motion).
- CSS for the button + accent pulse + reduced-motion guard.
Tests: speed persistence + offered set; button visibility gates, speak/stop toggle, lang propagation, and rate x speed.
aria-labels use lesson.tts.* keys (full i18n lands in C6; t() fallbacks cover it).
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
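The voice resolution and rate arithmetic described in this commit could look roughly like the sketch below. It is not the code from the PR: `VoiceLike`, `pickVoiceSketch`, `parseStoredSpeed`, and `effectiveRate` are hypothetical names, and the fallback order (saved name, exact lang, same primary language) is an assumption about what "closest match" means. The clamp uses the 0.1 to 10 range that `SpeechSynthesisUtterance.rate` accepts.

```typescript
// Hypothetical minimal voice shape; the browser's SpeechSynthesisVoice has more fields.
interface VoiceLike {
  name: string;
  lang: string; // BCP 47 tag such as "fr-FR"
}

// The offered multipliers, as listed in the commit message.
const OFFERED_SPEEDS: readonly number[] = [0.5, 0.75, 1, 1.25];

// Assumed order: saved preference name, then exact lang, then same primary language.
function pickVoiceSketch(
  voices: VoiceLike[],
  lang: string,
  savedName?: string,
): VoiceLike | null {
  if (savedName) {
    const saved = voices.find((v) => v.name === savedName);
    if (saved) return saved;
  }
  const wanted = lang.toLowerCase();
  const exact = voices.find((v) => v.lang.toLowerCase() === wanted);
  if (exact) return exact;
  const primary = wanted.split("-")[0];
  const close = voices.find((v) => v.lang.toLowerCase().split("-")[0] === primary);
  return close ?? null; // null would correspond to voiceAvailable = false
}

// Parse a persisted speed; anything outside the offered set falls back to 1x.
function parseStoredSpeed(raw: string | null): number {
  const n = raw === null ? NaN : Number(raw);
  return OFFERED_SPEEDS.indexOf(n) !== -1 ? n : 1;
}

// Effective utterance rate = saved rate x speed multiplier, clamped to 0.1..10.
function effectiveRate(savedRate: number, speed: number): number {
  return Math.min(10, Math.max(0.1, savedRate * speed));
}
```

A caller would read the stored string from localStorage, pass it through `parseStoredSpeed`, and hand `effectiveRate(savedRate, speed)` to the utterance.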
- ExerciseDispatcher threads ttsLang (the lesson target language) + codeMode to every renderer via the shared controlled-props contract; FreeText/Cloze keep their own codeMode too.
- All 5 exercise renderers render a prompt-level ReadAloudButton when a ttsLang is supplied, suppressed for code/formula content. Review + AdaptiveLesson pass no ttsLang, so they stay TTS-free.
- Theory steps gain a "Read aloud" control above the body; the body's Markdown is projected to clean speech text via the new markdownToSpeech helper (fenced code dropped, syntax stripped).
- CSS: .exercise-prompt-row (prompt + button on one line) and .lesson-theory-tts.
Per-tile speaker buttons inside the clickable Matching/Picture tiles are intentionally NOT added: the tiles are <button>s, and nesting a button is invalid HTML and would hijack the tile click. The prompt-level control + the summary answers (later commit) carry the pronunciation value instead.
Tests: prompt button renders with ttsLang for free_text/matching/cloze; suppressed under codeMode; absent without ttsLang. Existing exercise + lesson + two-phase suites stay green.
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
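A Markdown-to-speech projection of the kind described here (fenced code dropped, syntax stripped) could be sketched with a handful of regex passes. This is an illustration only: `markdownToSpeechSketch` is a hypothetical name, the real `markdownToSpeech` helper may well use a proper Markdown parser, and the rule set is an assumption.

```typescript
// Regex-based sketch; not a full Markdown parser.
// Known limitation: the italic rule also strips underscores inside snake_case words.
function markdownToSpeechSketch(md: string): string {
  return md
    .replace(/`{3}[\s\S]*?`{3}/g, " ") // drop fenced code blocks entirely
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1") // images collapse to their alt text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1") // links collapse to their link text
    .replace(/`([^`]*)`/g, "$1") // inline code keeps its content, loses the ticks
    .replace(/^ {0,3}#{1,6}[ \t]+/gm, "") // heading markers
    .replace(/^[ \t]*[-*+][ \t]+/gm, "") // bullet markers
    .replace(/(\*\*|__)(.*?)\1/g, "$2") // bold
    .replace(/(\*|_)(.*?)\1/g, "$2") // italic
    .replace(/\s+/g, " ") // collapse whitespace so the synth gets one clean line
    .trim();
}
```

An input that is empty or consists only of a fenced block comes out as the empty string, which a caller can treat as "nothing to read".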
- useReadAloud gains persisted auto-read prefs (readLessonAutoRead /
writeLessonAutoRead; off by default).
- Lesson page drives the lesson-level engine: when auto-read is on it
speaks each step on display in the lesson's target language —
theory body (markdown stripped via markdownToSpeech) and exercise
prompt; code/formula exercises are skipped. A ref guard makes the
effect safe to re-run without re-speaking the same step; turning
auto-read off stops playback and resets the guard.
- A pill toggle ("Auto read-aloud", aria-pressed) renders in the
controls row under the progress bar when TTS is supported.
Tests (with a mocked speechSynthesis): toggle renders; theory body +
exercise prompt are spoken on display in the target language with
markdown stripped; nothing speaks when off; toggle flips aria-pressed
and persists. Existing lesson/exercise suites stay green (they run
without a synth mock, so the TTS UI stays hidden there).
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
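The ref guard mentioned in this commit can be pictured as a small piece of state that remembers which step was last handled. The sketch below uses a closure variable in place of a React ref; `createAutoReadGuard`, `StepSpeech`, and the rule that a skipped (code/formula) step still updates the guard are all assumptions, not the PR's code.

```typescript
// Hypothetical payload for the step on display; text is null for code/formula steps.
interface StepSpeech {
  id: string;
  text: string | null;
}

function createAutoReadGuard(
  speak: (step: StepSpeech) => void,
  stop: () => void,
): (enabled: boolean, step: StepSpeech) => void {
  let lastId: string | null = null; // stands in for the ref in the real effect

  // Safe to call on every effect re-run.
  return function onStepShown(enabled: boolean, step: StepSpeech): void {
    if (!enabled) {
      // Turning auto-read off stops playback and resets the guard.
      if (lastId !== null) stop();
      lastId = null;
      return;
    }
    if (step.id === lastId) return; // same step, effect re-ran: do not re-speak
    lastId = step.id;
    if (step.text === null || step.text.trim() === "") return; // skipped step
    speak(step);
  };
}
```

Recording the id even for skipped steps means that returning to a theory step after a code exercise reads it again, which seems to match "speaks each step on display".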
- useReadAloud mirrors speed + speaking into refs and remembers the
last spoken text, so setSpeed() can restart the current stream at
the new rate immediately (the Web Speech API has no live rate
change). speak() now reads the rate from the speed ref.
- Lesson controls row shows a 0.5/0.75/1/1.25x speed group ONLY while
a stream is playing; the choice persists (readLessonSpeed) and the
active speed is aria-pressed. Voice selection is already language-
aware (the lang prop -> pickVoice); when the target language has no
installed voice the engine reports voiceAvailable=false and the row
surfaces a friendly "no voice for {language}" notice (playback still
runs with the engine default).
Tests: speed control hidden while idle, shown during playback with all
four speeds; picking a speed persists it, restarts the read at the new
rate, and marks the active button aria-pressed.
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
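Because an utterance's rate is fixed once it is queued, a speed change during playback has to cancel and re-speak. A minimal sketch of that restart logic, with closure variables standing in for the refs, might look like this. `SynthLike` and `createSpeedEngine` are hypothetical, and restarting from the beginning of the remembered text is an assumption.

```typescript
// Hypothetical thin wrapper over the synth; the real hook talks to speechSynthesis.
interface SynthLike {
  speak(text: string, rate: number): void;
  cancel(): void;
}

function createSpeedEngine(synth: SynthLike, savedRate: number = 1) {
  let speed = 1; // mirrors the speed ref
  let speaking = false; // mirrors the speaking ref
  let lastText: string | null = null; // last spoken text, kept for restarts

  function speak(text: string): void {
    lastText = text;
    speaking = true;
    synth.speak(text, savedRate * speed); // rate is read at speak time
  }

  function setSpeed(next: number): void {
    speed = next;
    if (speaking && lastText !== null) {
      synth.cancel(); // no live rate change, so restart the stream
      speak(lastText);
    }
  }

  function stop(): void {
    speaking = false;
    synth.cancel();
  }

  return { speak, setSpeed, stop };
}
```

Changing the speed while idle only updates the stored value; the next `speak()` picks it up.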
- New ReadAlongText: renders the plain speech text as word spans and
applies .tts-active to the word whose char range contains the
engine's onboundary charIndex (exported tokenizeForReadAlong +
activeTokenIndex are pure + unit-tested).
- TheoryStep is now engine-driven: its Read aloud / Stop button calls
the shared useReadAloud engine (so BOTH manual clicks and auto-read
emit word boundaries), keyed by a per-step utterance id. While that
step is being read the rich Markdown is swapped for the follow-along
view; Markdown returns when idle.
- Auto-read uses the same theory-{id} utterance id so the highlight
also tracks during auto-read.
- CSS: .tts-active accent wash with a 120ms transition; under
prefers-reduced-motion the wash + transition are dropped for a
static accent underline.
Tests: tokenizer + activeTokenIndex ranges; active-word render;
manual theory button reads + swaps in the follow-along view + flips to
Stop; auto-read renders the follow-along view. Existing lesson suite
green (Markdown still shows when not reading).
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
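The tokenizer and active-word lookup behind the follow-along view can be illustrated as two pure functions. The PR exports `tokenizeForReadAlong` and `activeTokenIndex`; the versions below are stand-ins with hypothetical names and an assumed token shape (whitespace kept as its own tokens so the text re-renders unchanged).

```typescript
// Assumed token shape: half-open char range [start, end) into the speech text.
interface ReadAlongToken {
  text: string;
  start: number;
  end: number;
  isWord: boolean;
}

// Split into alternating word and whitespace tokens, recording char ranges.
function tokenizeSketch(text: string): ReadAlongToken[] {
  const tokens: ReadAlongToken[] = [];
  const re = /\s+|\S+/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    tokens.push({
      text: m[0],
      start: m.index,
      end: m.index + m[0].length,
      isWord: /\S/.test(m[0]),
    });
  }
  return tokens;
}

// Index of the word token whose range contains the boundary charIndex, or -1.
function activeTokenIndexSketch(
  tokens: ReadAlongToken[],
  charIndex: number | null,
): number {
  if (charIndex === null || charIndex < 0) return -1;
  for (let i = 0; i < tokens.length; i++) {
    const t = tokens[i];
    if (t.isWord && charIndex >= t.start && charIndex < t.end) return i;
  }
  return -1;
}
```

A renderer would map each token to a span and add `.tts-active` to the one at the returned index.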
- New lesson.tts.* keys in all 8 catalogs (read_aloud, stop,
auto_read, speed, no_voice {language}, reading), real umlauts in de;
synced to the frontend JSON. The button/toggle/speed/no-voice UI now
resolves real strings instead of English fallbacks.
- Keyboard shortcut: pressing "R" (no modifier, not in an input /
textarea / select / contenteditable) toggles read-aloud of the
current step via the engine. Auto-read + shortcut share one
currentStepSpeech() payload builder (theory body / non-code prompt,
with the theory-{id} utterance id so the highlight tracks).
Tests: R reads the current step + swaps in the follow-along, second R
stops; R is ignored while typing in an input. Backend i18n audit green.
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
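The shortcut gate described above (plain "R", ignored while typing) reduces to a small predicate. The sketch below works on a minimal event shape so it runs without a DOM; `KeyEventLike` and `isReadAloudShortcut` are hypothetical names, and treating only Ctrl/Meta/Alt as modifiers (so both "r" and "R" trigger) is an assumption.

```typescript
// Minimal stand-in for the parts of KeyboardEvent the check needs.
interface KeyEventLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  altKey: boolean;
  target: { tagName?: string; isContentEditable?: boolean } | null;
}

function isReadAloudShortcut(e: KeyEventLike): boolean {
  if (e.key.toLowerCase() !== "r") return false;
  if (e.ctrlKey || e.metaKey || e.altKey) return false; // e.g. Ctrl+R must still reload
  const t = e.target;
  if (!t) return true;
  if (t.isContentEditable) return false;
  const tag = (t.tagName ?? "").toUpperCase();
  // Typing "r" into a form control must not toggle playback.
  return tag !== "INPUT" && tag !== "TEXTAREA" && tag !== "SELECT";
}
```

A keydown handler would call this first and, when it returns true, toggle read-aloud for the current step.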
- New pure helpers collectTheoryRun (concatenate a run of consecutive theory steps into ONE utterance text + per-step char offsets) and runStepForChar (map a boundary charIndex back to a step index).
- Lesson "Read all" button (shown only on a theory step that begins a run of >=2) speaks the whole run as one utterance and auto-advances the viewer as the engine crosses each step boundary, stopping at the next exercise. Clicking again (or the run ending) stops + clears.
- New lesson.tts.read_all in all 8 catalogs (real umlauts in de).
Tests: markdownToSpeech (headings/emphasis/inline-code stripped, code blocks dropped, links/images collapsed, empty cases); collectTheoryRun run boundaries + offsets; runStepForChar mapping; Read all renders only when a run exists, speaks the concatenation, and auto-advances on a boundary event.
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
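The run concatenation and the reverse mapping from a boundary charIndex to a step can be sketched as follows. These are stand-ins for `collectTheoryRun` and `runStepForChar`, not their actual implementations: the `StepLike` shape, the single-space separator, and returning null for runs shorter than 2 are assumptions.

```typescript
// Hypothetical step shape: speech is the already-stripped text for the step.
interface StepLike {
  kind: "theory" | "exercise";
  speech: string;
}

interface TheoryRun {
  text: string; // the ONE utterance text
  offsets: number[]; // char offset where each step starts inside text
  firstIndex: number; // lesson index of the run's first step
}

const RUN_SEPARATOR = " ";

// Collect consecutive theory steps starting at `from`; stops at the next exercise.
function collectTheoryRunSketch(steps: StepLike[], from: number): TheoryRun | null {
  if (from < 0) return null;
  const parts: string[] = [];
  const offsets: number[] = [];
  let cursor = 0;
  for (let i = from; i < steps.length && steps[i].kind === "theory"; i++) {
    offsets.push(cursor);
    parts.push(steps[i].speech);
    cursor += steps[i].speech.length + RUN_SEPARATOR.length;
  }
  if (parts.length < 2) return null; // "Read all" needs a run of >= 2
  return { text: parts.join(RUN_SEPARATOR), offsets, firstIndex: from };
}

// Map a boundary charIndex back to the lesson index of the step being read.
function runStepForCharSketch(run: TheoryRun, charIndex: number): number {
  let k = 0;
  for (let i = 0; i < run.offsets.length; i++) {
    if (run.offsets[i] <= charIndex) k = i;
  }
  return run.firstIndex + k;
}
```

On each boundary event the page would compare `runStepForCharSketch(run, charIndex)` with the step on display and advance when they differ.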
Ships the recommended step-based controls first. Time-based seek is
deferred: step-based navigation is more useful for learning, and the
Web Speech API can't seek.
- useReadAloud gains pause()/resume()/paused (the engine stays
"speaking" while paused).
- New theoryBlockAround helper: the contiguous theory block around a
step + its 1-based position/total, for the "Step X of N theory
steps" readout and prev/next availability.
- LessonTtsMiniPlayer: a floating bottom bar shown while the engine is
active — previous theory step (re-read) / play-pause / next theory
step / stop + position readout. Pure + presentational.
- Lesson wires it: prev/next call readTheoryStepAt (navigate + re-read
with the theory-{id} utterance so the follow-along highlight tracks);
play/pause toggles the engine; stop ends playback.
- New lesson.tts.{play,pause,prev_step,next_step,step_position} in all
8 catalogs (real umlauts in de). CSS for the floating pill bar.
Tests: theoryBlockAround block/position/null cases; mini-player render
+ all four callbacks + edge disabling + paused state; Lesson page hides
the player until reading, shows it with the block position, next
re-reads the next step, play/pause toggles pause. Full suite 3121
green; build clean; i18n audit green.
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
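The block lookup behind the "Step X of N theory steps" readout is a simple scan outward from the current step. The sketch below is a stand-in for `theoryBlockAround`; the name `theoryBlockAroundSketch`, the `TheoryBlock` shape, and the input being a plain array of step kinds are assumptions.

```typescript
type StepKind = "theory" | "exercise";

interface TheoryBlock {
  start: number; // lesson index of the block's first theory step
  end: number; // lesson index of the block's last theory step
  position: number; // 1-based position of the current step in the block
  total: number; // number of theory steps in the block
}

// Contiguous theory block around `index`, or null when that step is not theory.
function theoryBlockAroundSketch(kinds: StepKind[], index: number): TheoryBlock | null {
  if (index < 0 || index >= kinds.length || kinds[index] !== "theory") return null;
  let start = index;
  while (start > 0 && kinds[start - 1] === "theory") start--;
  let end = index;
  while (end < kinds.length - 1 && kinds[end + 1] === "theory") end++;
  return { start, end, position: index - start + 1, total: end - start + 1 };
}
```

With this shape, the mini-player's previous button would be enabled when `position > 1` and the next button when `position < total`.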
Integration (Vitest, page level): renders the REAL Lesson page (real ExerciseDispatcher + renderers; only useLesson/getStorage/synth mocked) at each exercise step and pins that the prompt read-aloud button is threaded through for all 5 exercise types, suppressed for a code exercise, and absent entirely when TTS is disabled in Settings.
Smoke (Playwright, Dexie build, no backend): downloads fr-a1-from-en, opens 01-greetings and exercises the read-aloud surface end to end: theory control reads + swaps in the follow-along + shows the floating mini-player; mini-player Stop ends playback; the "R" shortcut toggles read-aloud; auto-read speaks each step on display. speechSynthesis is injected via addInitScript so the run doesn't depend on installed voices (a browser without voices would otherwise end utterances immediately).
Note: the smoke spec compiles + wires (vite preview starts, the runner reaches browser launch) but could not be executed in this environment because the chromium binary can't be downloaded (network-restricted). Run it with `make test-dexie-smoke` (it joins the existing Dexie gate).
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
Lesson read-aloud (TTS)
Wires text-to-speech into the Lesson viewer, built entirely on the existing voice lib (`lib/voice/speech-synthesis` + `voicePref`). Every TTS surface self-hides when the browser lacks `speechSynthesis` or the user disabled TTS in Voice Settings, so flows that don't use it are unaffected.
What's included
- `useReadAloud` engine hook + inline `ReadAloudButton` (lucide speaker icon, pulse, language-aware voice, speed multiplier); `speak()` gains an `onBoundary` option
- `ttsLang` + `codeMode`); suppressed on code/formula content; `markdownToSpeech` strips markdown/code for clean speech
- `onboundary` (`.tts-active` accent wash; static underline under `prefers-reduced-motion`); theory swaps to a spanned read-along view while reading
- `lesson.tts.*` i18n in all 8 catalogs (real umlauts in de) + `R` keyboard shortcut (ignored in inputs)
- `pause`/`resume` added to the engine
Design decisions (flagging)
Per-tile speaker buttons inside the clickable Matching/Picture tiles are intentionally not added: the tiles are <button>s, and nesting a button is invalid HTML and would hijack the tile click. Pronunciation is carried by the prompt-level control, theory read-aloud, and auto-read. A non-button affordance (long-press, or a speaker in post-answer feedback) is a clean follow-up if wanted.
Verification
- `tsc` clean, `npm run build` + Dexie build clean, backend i18n audit green.
- `make test-dexie-smoke`: it joins the existing Dexie-mode gate.
https://claude.ai/code/session_019JZ1Ridnhg4hmv6AcSVdcV
Generated by Claude Code