89 changes: 30 additions & 59 deletions .agents/skills/deepgram-python-audio-intelligence/SKILL.md
@@ -7,16 +7,11 @@ description: Use when writing or reviewing Python code in this repo that calls D

Analytics overlays applied to `/v1/listen` transcription: summarize, topics, intents, sentiment, language detection, diarization, redaction, entities. Same endpoint / same client methods as STT — enable features via params.

## When to use this product

- You have **audio** (file, URL, or live stream) and want analytics alongside the transcript.
- REST is the primary path — most analytics are REST-only.

**Use a different skill when:**
- You want a pure transcript with no analytics → `deepgram-python-speech-to-text`.
- Your input is already transcribed text → `deepgram-python-text-intelligence` (`/v1/read`).
- You need conversational turn-taking → `deepgram-python-conversational-stt`.
- You need a full interactive agent → `deepgram-python-voice-agent`.
- Pure transcript with no analytics → `deepgram-python-speech-to-text`.
- Input is already transcribed text → `deepgram-python-text-intelligence` (`/v1/read`).
- Conversational turn-taking → `deepgram-python-conversational-stt`.
- Full interactive agent → `deepgram-python-voice-agent`.

## Feature availability: REST vs WSS

@@ -55,13 +50,13 @@ response = client.listen.v1.media.transcribe_url(
model="nova-3",
smart_format=True,
punctuate=True,
diarize=True, # speaker separation
summarize="v2", # "v2" for the current model; True also accepted on /v1/listen
diarize=True,
summarize="v2",
topics=True,
intents=True,
sentiment=True,
detect_language=True,
redact=["pci", "pii"], # or Sequence[str]
redact=["pci", "pii"],
language="en-US",
)
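After a request like the one above, it is worth probing which analytics actually came back before relying on them. A minimal sketch — the attribute names follow the documented response layout, but the exact SDK response type is an assumption, so the helper accepts either attribute-style objects or plain dicts:

```python
def analytics_present(results) -> dict:
    """Report which analytics blocks a transcription response contains.

    `results` may be a response object (attribute access) or a plain dict;
    the concrete SDK type is an assumption here.
    """
    names = ("summary", "topics", "intents", "sentiment")
    if isinstance(results, dict):
        return {n: results.get(n) is not None for n in names}
    return {n: getattr(results, n, None) is not None for n in names}
```

Call it as `analytics_present(response.results)` and re-run via REST for any requested feature that came back empty, since most analytics are REST-only.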

@@ -98,44 +93,23 @@ response = client.listen.v1.media.transcribe_file(

## Quick start — diarization with word-level timings

Enable speaker separation and word-level timestamps in a single request, then iterate the per-word objects to build a speaker-labelled transcript with timing.

```python
response = client.listen.v1.media.transcribe_url(
url="https://dpgr.am/spacewalk.wav",
model="nova-3",
diarize=True, # tag each word with a speaker id
smart_format=True, # punctuated_word for cleaner output
diarize=True,
smart_format=True,
punctuate=True,
)

words = response.results.channels[0].alternatives[0].words or []

# Per-word: speaker, timestamps, confidence
for w in words:
speaker = getattr(w, "speaker", None)
text = w.punctuated_word or w.word
print(f"[speaker {speaker}] {text} ({w.start:.2f}s–{w.end:.2f}s, conf={w.confidence:.2f})")

# Group consecutive words by speaker into utterances
from itertools import groupby
for speaker, group in groupby(words, key=lambda w: getattr(w, "speaker", None)):
text = " ".join((w.punctuated_word or w.word) for w in group)
print(f"Speaker {speaker}: {text}")
```

Per-word fields available on each entry:

| Field | Type | Description |
|---|---|---|
| `word` | `str` | Lowercase token |
| `punctuated_word` | `str \| None` | Token with smart-formatted casing/punctuation (when `smart_format=True`) |
| `start`, `end` | `float` | Audio timestamps in seconds |
| `confidence` | `float` | 0.0–1.0 confidence |
| `speaker` | `int \| None` | Speaker id (when `diarize=True`); `None` if diarization disabled |
| `speaker_confidence` | `float \| None` | Speaker-id confidence |

For a higher-level breakdown, set `utterances=True` to get pre-grouped speaker turns at `response.results.utterances`. Set `paragraphs=True` for a `paragraphs` view organised by speaker turn boundaries.
Each word object has: `word`, `punctuated_word`, `start`/`end` (float seconds), `confidence`, `speaker` (int, when `diarize=True`), `speaker_confidence`. For pre-grouped speaker turns use `utterances=True` (`response.results.utterances`) or `paragraphs=True`.
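If you need the grouping client-side instead — for example to post-process the per-word list yourself — a small helper can reproduce the turn grouping. This is a sketch over plain dicts mirroring the per-word fields listed above; adapt the access pattern to the SDK's actual word objects:

```python
from itertools import groupby

def speaker_turns(words):
    """Collapse consecutive same-speaker words into (speaker, text) turns."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w.get("speaker")):
        # Prefer the smart-formatted token when present, fall back to the raw word.
        text = " ".join(w.get("punctuated_word") or w["word"] for w in group)
        turns.append((speaker, text))
    return turns
```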

## Quick start — WSS subset (diarize / redact / entities only)

@@ -151,27 +125,32 @@ with client.listen.v1.connect(model="nova-3", diarize=True, redact=["pii"]) as c
conn.send_finalize()
```

## Validation & recovery

After transcription, verify analytics fields are populated:

```python
r = response.results
if r.summary is None and summarize_was_requested:
# Feature silently ignored -- likely passed on WSS (REST-only).
# Recovery: re-run via REST instead of WSS.
response = client.listen.v1.media.transcribe_url(url=..., summarize="v2", ...)
```

For `redact`, confirm redacted markers appear in the transcript (e.g., search for `[REDACTED]`). A missing marker means encoding mismatch or unsupported redact value.
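A hedged sketch of that check — `[REDACTED]` is the common marker, but exact marker text can vary by redact mode, so treat this as best-effort validation rather than a guarantee:

```python
import re

def redaction_applied(transcript: str) -> bool:
    """True if the transcript contains a bracketed redaction marker."""
    return re.search(r"\[[A-Z_]+\]", transcript) is not None
```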

## Key parameters

`summarize`, `topics`, `intents`, `sentiment`, `detect_language`, `diarize`, `redact`, `custom_topic`, `custom_topic_mode`, `custom_intent`, `custom_intent_mode`, `detect_entities`, plus all the standard STT params (`model`, `language`, `encoding`, `sample_rate`, ...).

`redact` is typed as `Optional[str]` in the current generated SDK (`src/deepgram/listen/v1/media/client.py`). Pass a single redaction mode such as `"pci"`, `"pii"`, `"numbers"`, or `"phi"`. Multi-mode redaction at the transport level is supported by sending `redact` as a repeated query parameter — check `src/deepgram/types/listen_v1redact.py` for the current type and fall back to raw query-param construction (or multiple calls) if you need several modes. The earlier `Union[str, Sequence[str]]` override is no longer carried in `.fernignore`.
`redact` is typed as `Optional[str]` in the generated SDK. Pass a single mode (`"pci"`, `"pii"`, `"numbers"`, `"phi"`). For multi-mode, use repeated query params or multiple calls -- see `src/deepgram/types/listen_v1redact.py`.
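If you do need several modes in one request, the transport-level shape is `redact` repeated as a query parameter. A sketch of constructing that query string — how to hand raw params to your SDK call is version-dependent, so this stops at the string:

```python
from urllib.parse import urlencode

def listen_query(modes, **params):
    """Build a /v1/listen query string with redact repeated once per mode."""
    pairs = list(params.items()) + [("redact", m) for m in modes]
    return urlencode(pairs)
```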

## API reference (layered)

1. **In-repo reference**: `reference.md` — "Listen V1 Media" (REST params include all analytics flags), "Listen V1 Connect" (WSS-supported subset).
2. **OpenAPI (REST)**: https://developers.deepgram.com/openapi.yaml
3. **AsyncAPI (WSS)**: https://developers.deepgram.com/asyncapi.yaml
4. **Context7**: library ID `/llmstxt/developers_deepgram_llms_txt`.
5. **Product docs**:
- https://developers.deepgram.com/docs/stt-intelligence-feature-overview
- https://developers.deepgram.com/docs/summarization
- https://developers.deepgram.com/docs/topic-detection
- https://developers.deepgram.com/docs/intent-recognition
- https://developers.deepgram.com/docs/sentiment-analysis
- https://developers.deepgram.com/docs/language-detection
- https://developers.deepgram.com/docs/redaction
- https://developers.deepgram.com/docs/diarization
1. **In-repo reference**: `reference.md` -- "Listen V1 Media" (REST), "Listen V1 Connect" (WSS subset).
2. **OpenAPI / AsyncAPI**: https://developers.deepgram.com/openapi.yaml, https://developers.deepgram.com/asyncapi.yaml
3. **Context7**: library ID `/llmstxt/developers_deepgram_llms_txt`.
4. **Product docs**: https://developers.deepgram.com/docs/stt-intelligence-feature-overview (overview); per-feature pages at `/docs/summarization`, `/docs/topic-detection`, `/docs/intent-recognition`, `/docs/sentiment-analysis`, `/docs/language-detection`, `/docs/redaction`, `/docs/diarization`.

## Gotchas

@@ -195,12 +174,4 @@
- `deepgram-python-conversational-stt` — Flux for turn-taking
- `deepgram-python-voice-agent` — interactive assistants

## Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```bash
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
For cross-language Deepgram product knowledge, install the central skills: `npx skills add deepgram/skills`.
42 changes: 24 additions & 18 deletions .agents/skills/deepgram-python-conversational-stt/SKILL.md
@@ -7,16 +7,10 @@ description: Use when writing or reviewing Python code in this repo that calls D

Turn-aware streaming STT at `/v2/listen` — optimized for conversational audio (end-of-turn detection, eager EOT, barge-in scenarios).

## When to use this product

- You're building a **conversational UI** and need explicit turn boundaries.
- You want **Flux models** (optimized for human-to-human or human-to-agent conversation).
- You want lower latency turn signals than v1 utterance_end.

**Use a different skill when:**
- You want general-purpose transcription (captions, batch, non-conversational) → `deepgram-python-speech-to-text`.
- You want a full interactive agent (STT + LLM + TTS) → `deepgram-python-voice-agent`.
- You want analytics (summarize/sentiment) → `deepgram-python-audio-intelligence`.
- General-purpose transcription (captions, batch, non-conversational) → `deepgram-python-speech-to-text`.
- Full interactive agent (STT + LLM + TTS) → `deepgram-python-voice-agent`.
- Analytics (summarize/sentiment) → `deepgram-python-audio-intelligence`.

## Authentication

@@ -74,6 +68,26 @@ with client.listen.v2.connect(
conn.start_listening()
```

## Error recovery

On `ListenV2FatalError`, the connection is terminal -- open a new one. For transient disconnects (`EventType.CLOSE` without a prior fatal), reconnect with exponential backoff:

```python
import time

def run_with_reconnect(max_retries=5):
for attempt in range(max_retries):
try:
with client.listen.v2.connect(model="flux-general-en", encoding="linear16", sample_rate="16000") as conn:
# ... register handlers, send audio ...
conn.start_listening()
break # clean exit
except Exception as e:
wait = min(2 ** attempt, 30)
print(f"Disconnected ({e}), retrying in {wait}s...")
time.sleep(wait)
```

## Key parameters

| Param | Notes |
@@ -143,12 +157,4 @@ async with client.listen.v2.connect(model="flux-general-en", ...) as conn:
- `deepgram-python-speech-to-text` — v1 general-purpose STT (REST + WSS)
- `deepgram-python-voice-agent` — full interactive assistant

## Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```bash
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
For cross-language Deepgram product knowledge, install the central skills: `npx skills add deepgram/skills`.
68 changes: 26 additions & 42 deletions .agents/skills/deepgram-python-management-api/SKILL.md
@@ -7,18 +7,9 @@ description: Use when writing or reviewing Python code in this repo that calls D

Administrative REST endpoints at `api.deepgram.com/v1/projects`, `/v1/models`, and reusable agent configuration storage. Project-scoped resources live under `client.manage.v1.projects.*` (keys, members, members.invites, usage, billing, models, requests). Global models at `client.manage.v1.models`. Think-model discovery at `client.agent.v1.settings.think.models`. Reusable agent configs at `client.voice_agent.configurations.*`.

## When to use this product

- **Discover / pin models**: `client.manage.v1.models.list()` returns the active STT/TTS set.
- **Project admin**: list/get/update/delete/leave projects.
- **API key lifecycle**: list/create/delete project keys.
- **Member + invite management**: add/remove members, manage roles, send/revoke invites.
- **Usage + billing**: query request volume, balances.
- **Reusable Voice Agent configs**: persist the **`agent` block** of a Settings message on the server, reference by `agent_id`. The stored blob is the `agent` object only (listen / think / speak providers + prompt), not the full `AgentV1Settings`.

**Use a different skill when:**
- You want to actually talk to an agent → `deepgram-python-voice-agent`.
- You want to transcribe or synthesize → STT/TTS skills.
- Running an agent interactively → `deepgram-python-voice-agent`.
- Transcribing or synthesizing → STT/TTS skills.
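For the model-pinning case, here is a sketch of picking a model UUID out of a `models.list()` payload. The `{"stt": [...], "tts": [...]}` layout with per-model `name`/`uuid` fields is an assumption — verify against your SDK's response type:

```python
def find_model_uuid(listing: dict, name: str):
    """Return the uuid of the first STT/TTS model matching `name`, else None."""
    for kind in ("stt", "tts"):
        for model in listing.get(kind, []):
            if model.get("name") == name:
                return model.get("uuid")
    return None
```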

## Authentication

@@ -85,33 +76,24 @@ See `examples/51-55` for each sub-module.

## Quick start — Voice Agent configurations

**Important:** The stored config is the `agent` block only (listen/think/speak providers + prompt) as a JSON string, NOT the full `AgentV1Settings`. Top-level fields like `audio` go in the live Settings message at connect time. The returned `agent_id` replaces the inline `agent` object in future Settings messages. Configs are immutable -- create a new one to change behavior; only metadata is mutable.

```python
# List reusable configs
import json
configs = client.voice_agent.configurations.list(project_id=pid)

# Create: `config` is a JSON string of the `agent` BLOCK ONLY — not the full
# Settings message. Do NOT include top-level Settings fields like `audio`;
# those are sent at connect-time in the live Settings message. The stored
# `agent_id` later replaces the inline `agent` object in a Settings message.
import json
config_json = json.dumps({
"listen": {"provider": {"type": "deepgram", "model": "nova-3"}},
"think": {"provider": {"type": "open_ai", "model": "gpt-4o-mini"}, "prompt": "..."},
"speak": {"provider": {"type": "deepgram", "model": "aura-2-asteria-en"}},
})
created = client.voice_agent.configurations.create(
project_id=pid,
config=config_json,
metadata={"label": "support-en"},
project_id=pid, config=config_json, metadata={"label": "support-en"},
)
print(created.agent_id)

# Update metadata (immutable config body — create a new one to change behavior)
client.voice_agent.configurations.update(project_id=pid, agent_id=created.agent_id, metadata={"label": "v2"})

# Get / delete
one = client.voice_agent.configurations.get(project_id=pid, agent_id=created.agent_id)
# client.voice_agent.configurations.delete(project_id=pid, agent_id=...)
```

Think-provider model discovery (which LLMs Agent supports):
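As a sketch, the resulting listing can be flattened into provider/model pairs for validation or prompting. The `models` key and per-entry fields here are assumptions about the response shape — check the return type of `client.agent.v1.settings.think.models.list()` in your SDK version:

```python
def think_model_ids(payload: dict):
    """Flatten a think-models listing into 'provider/model' strings."""
    return [
        f"{m.get('provider')}/{m.get('model')}"
        for m in payload.get("models", [])
    ]
```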
@@ -140,18 +122,28 @@ projects = await client.manage.v1.projects.list()
- https://developers.deepgram.com/reference/voice-agent/agent-configurations/create-agent-configuration
- https://developers.deepgram.com/reference/voice-agent/think-models

## Destructive operation guard

Delete operations (projects, keys, agent configs) are **irreversible**. Always verify the resource before deleting:

```python
# Confirm before deleting a key
key = client.manage.v1.projects.keys.list(project_id=pid)
target = next((k for k in key.api_keys if k.api_key_id == kid), None)
assert target is not None, f"Key {kid} not found"
print(f"Deleting key: {target.comment}")
client.manage.v1.projects.keys.delete(project_id=pid, key_id=kid)
```

## Gotchas

1. **`Token` auth, not `Bearer`.**
2. **Project-scoped resources are nested under `.projects.*`.** There is no top-level `client.manage.v1.keys` / `.members` / `.invites` / `.usage` / `.billing`. Use `client.manage.v1.projects.keys`, `...projects.members`, `...projects.members.invites`, `...projects.usage`, `...projects.billing.balances`, and `...projects.requests` for request logs. The only top-level `client.manage.v1.*` namespaces are `projects` and `models`.
3. **Think-model discovery is on the Agent client**, not Manage: `client.agent.v1.settings.think.models.list()`. There is no `client.manage.v1.agent.*`.
4. **Agent config body is a JSON STRING on create**, not a nested object. Pass `config=json.dumps(...)`.
5. **Agent config is the `agent` block only**, not the full Settings message. Do not include top-level fields like `audio` — those go in the live Settings message at connect time.
6. **Agent configs are immutable** — you cannot edit the config body. Create a new one to change behavior. Only metadata is mutable.
7. **Use `include_outdated=True`** on `models.list()` when pinning older models.
8. **Delete is irreversible.** Wire tests typically comment out destructive calls.
9. **Project-scoped vs global models**: `client.manage.v1.models.list()` returns all; `client.manage.v1.projects.models.list(project_id=...)` returns what the project can access.
10. **Returned agent configs are uninterpolated** — raw stored JSON string. Parse before use.
2. **Project-scoped resources are nested under `.projects.*`.** No top-level `client.manage.v1.keys` etc. Use `client.manage.v1.projects.keys`, `...projects.members`, `...projects.members.invites`, `...projects.usage`, `...projects.billing.balances`, `...projects.requests`.
3. **Think-model discovery is on the Agent client**, not Manage: `client.agent.v1.settings.think.models.list()`.
4. **Agent config body is a JSON STRING on create**: pass `config=json.dumps(...)`. See the Voice Agent configurations section above for full details.
5. **Use `include_outdated=True`** on `models.list()` when pinning older models.
6. **Project-scoped vs global models**: `client.manage.v1.models.list()` returns all; `client.manage.v1.projects.models.list(project_id=...)` returns what the project can access.
7. **Returned agent configs are uninterpolated** -- raw stored JSON string. Parse before use.
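The last gotcha in practice — a small parser for the raw stored config string that validates the three provider blocks from the create example above rather than assuming defaults:

```python
import json

def load_agent_config(raw: str) -> dict:
    """Parse a stored agent-config JSON string and check required blocks."""
    cfg = json.loads(raw)
    missing = [k for k in ("listen", "think", "speak") if k not in cfg]
    if missing:
        raise ValueError(f"agent config missing blocks: {missing}")
    return cfg
```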

## Example files in this repo

@@ -170,12 +162,4 @@

- `deepgram-python-voice-agent` — run an agent (use a config created here)

## Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```bash
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
For cross-language Deepgram product knowledge, install the central skills: `npx skills add deepgram/skills`.