138 changes: 138 additions & 0 deletions api-reference/server/services/stt/moonshine.mdx
@@ -0,0 +1,138 @@
---
title: "Moonshine"
description: "Speech-to-text service implementation using locally downloaded Moonshine ONNX models"
---

## Overview

`MoonshineSTTService` provides offline speech recognition using Moonshine's small, fast ASR models, which run locally on the CPU via ONNX Runtime. It needs no GPU and no API key: models download once on first use and are cached locally, so transcription stays on your machine.

<CardGroup cols={2}>
<Card
title="Moonshine STT API Reference"
icon="code"
href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.moonshine.stt.html"
>
Pipecat's API methods for Moonshine STT integration
</Card>
<Card
title="Moonshine Example"
icon="play"
href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-moonshine.py"
>
Complete example with Moonshine STT
</Card>
<Card
title="Moonshine Documentation"
icon="book"
href="https://github.com/moonshine-ai/moonshine"
>
Moonshine ASR model details and research
</Card>
<Card
title="Moonshine Voice Package"
icon="microphone"
href="https://pypi.org/project/moonshine-voice/"
>
Python package for Moonshine models
</Card>
</CardGroup>

## Installation

```bash
uv add "pipecat-ai[moonshine]"
```

## Prerequisites

### Local Model Setup

Before using the Moonshine STT service, consider the following:

1. **Model Selection**: Choose an appropriate Moonshine model (`TINY`, `BASE`, `TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING`, or `MEDIUM_STREAMING`)
2. **Storage Space**: Ensure sufficient disk space for model downloads (models are cached after first use)
3. **CPU Resources**: Moonshine runs on the CPU via ONNX Runtime, so no GPU is needed

### Configuration Options

- **Model Size**: Balance between accuracy and performance based on your needs
- **Language Support**: Moonshine supports English, Spanish, and other languages
- **No API Key**: Runs entirely locally for complete privacy

<Tip>
No API keys or GPU required - Moonshine runs efficiently on CPU for complete privacy.
</Tip>

## Configuration

<ParamField path="settings" type="MoonshineSTTService.Settings" default="None">
Runtime-configurable settings for the STT service. See [MoonshineSTTService Settings](#moonshinesttsettings) below.
</ParamField>

## MoonshineSTTSettings

Runtime-configurable settings passed via the `settings` constructor argument using `MoonshineSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter | Type | Default | Description |
| ---------- | ----------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model` | `str \| Model` | `Model.SMALL_STREAMING` | Moonshine model architecture. Available models: `TINY`, `BASE`, `TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING` (default), `MEDIUM_STREAMING`. |
| `language` | `Language \| str` | `Language.EN` | Language for transcription. Moonshine supports English, Spanish, and other languages. The base language code is used (e.g., "en" from "en-US"). _(Inherited from base STT settings.)_ |
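
The base-code reduction mentioned in the `language` row can be sketched as follows. This is a minimal illustration of the idea, not Pipecat's implementation; `base_language_code` is a hypothetical helper name, and treating `_` as a separator is an assumption.

```python
def base_language_code(code: str) -> str:
    """Return the base language part of a locale-style code.

    Hypothetical helper for illustration only; it is not part of Pipecat.
    """
    # Normalize "_" to "-", split on the first "-", and keep the
    # leading part in lowercase.
    return code.replace("_", "-").split("-", 1)[0].lower()


print(base_language_code("en-US"))  # en
print(base_language_code("es"))     # es
```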

## Usage

### Basic Setup

```python
from pipecat.services.moonshine.stt import MoonshineSTTService

stt = MoonshineSTTService()
```

### With Custom Model

```python
from pipecat.services.moonshine.stt import MoonshineSTTService, Model

stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(
        model=Model.MEDIUM_STREAMING,
    ),
)
```

### With Custom Language

```python
from pipecat.services.moonshine.stt import MoonshineSTTService, Model
from pipecat.transcriptions.language import Language

stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(
        model=Model.SMALL_STREAMING,
        language=Language.ES,
    ),
)
```

### With Model as String

```python
from pipecat.services.moonshine.stt import MoonshineSTTService

stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(
        model="base",
    ),
)
```
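
Because `model` accepts either a string or a `Model` member, a service has to normalize the value at some point. The sketch below shows one way that could work. It uses a stand-in enum rather than the real `Model` class, `resolve_model` is a hypothetical helper, and every string value other than `"base"` is an assumption made for illustration.

```python
from enum import Enum
from typing import Union


class Model(str, Enum):
    """Stand-in for the service's Model enum; string values are assumed."""

    TINY = "tiny"
    BASE = "base"
    TINY_STREAMING = "tiny-streaming"
    BASE_STREAMING = "base-streaming"
    SMALL_STREAMING = "small-streaming"
    MEDIUM_STREAMING = "medium-streaming"


def resolve_model(model: Union[str, Model]) -> Model:
    """Accept an enum member or its string value and return the member."""
    if isinstance(model, Model):
        return model
    # Enum lookup by value raises ValueError for unknown model names.
    return Model(model.strip().lower())


print(resolve_model("base").name)                  # BASE
print(resolve_model(Model.MEDIUM_STREAMING).name)  # MEDIUM_STREAMING
```

Passing the enum member gives editor completion and catches typos early, while the string form is convenient when the model name comes from configuration.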

## Notes

- **First run downloads**: The selected model downloads from the Moonshine model hub on first use and is cached locally. Later runs load it from the cache.
- **Segmented transcription**: `MoonshineSTTService` extends `SegmentedSTTService`, meaning it processes complete audio segments after VAD detects the user has stopped speaking.
- **CPU-only**: Moonshine runs efficiently on CPU via ONNX Runtime, so no GPU is required. This makes it ideal for resource-constrained environments.
- **Audio format**: Expects 16-bit mono PCM audio at a 16 kHz sample rate.
- **Model variants**: The streaming-capable models (`TINY_STREAMING`, `BASE_STREAMING`, `SMALL_STREAMING`, `MEDIUM_STREAMING`) can be run in batch mode just like the non-streaming variants (`TINY`, `BASE`). The larger models (`SMALL_STREAMING`, `MEDIUM_STREAMING`) are only available in streaming form.
- **Language support**: Moonshine supports multiple languages (English, Spanish, and others). The service uses the base language code (e.g., "en" from "en-US").
- **No external services**: Unlike API-based STT services, Moonshine requires no API keys and no network connectivity after the initial model download.
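
The segmented behavior and audio format described in the notes above can be pictured with a toy buffer. This is a rough sketch and not Pipecat code: `SegmentBuffer` and its method names are hypothetical, and the real `SegmentedSTTService` handles this internally. It only shows the idea of collecting 16-bit mono PCM at 16 kHz while the user speaks and handing over one complete segment when speech stops.

```python
SAMPLE_RATE = 16_000   # Hz, per the audio format note
BYTES_PER_SAMPLE = 2   # 16-bit PCM
CHANNELS = 1           # mono


class SegmentBuffer:
    """Toy accumulator for one utterance; hypothetical, not a Pipecat class."""

    def __init__(self) -> None:
        self._chunks = []
        self._speaking = False

    def on_speech_started(self) -> None:
        # Start a fresh segment when VAD reports speech.
        self._chunks.clear()
        self._speaking = True

    def on_audio(self, chunk: bytes) -> None:
        # Audio outside a speech segment is ignored.
        if self._speaking:
            self._chunks.append(chunk)

    def on_speech_stopped(self) -> bytes:
        # Return the whole segment at once, ready for transcription.
        self._speaking = False
        segment = b"".join(self._chunks)
        self._chunks.clear()
        return segment


def segment_duration_seconds(segment: bytes) -> float:
    """Duration implied by the byte length of a PCM segment."""
    return len(segment) / (SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)


buf = SegmentBuffer()
buf.on_audio(b"\x00\x00" * 160)      # ignored: no speech in progress
buf.on_speech_started()
for _ in range(50):
    buf.on_audio(b"\x00\x00" * 320)  # 320 samples = 20 ms per chunk
segment = buf.on_speech_stopped()
print(segment_duration_seconds(segment))  # 1.0
```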
1 change: 1 addition & 0 deletions api-reference/server/services/supported-services.mdx
@@ -51,6 +51,7 @@ Speech-to-Text services receive and audio input and output transcriptions.
| [Gradium](/api-reference/server/services/stt/gradium) | `uv add "pipecat-ai[gradium]"` |
| [Groq (Whisper)](/api-reference/server/services/stt/groq) | `uv add "pipecat-ai[groq]"` |
| [Mistral](/api-reference/server/services/stt/mistral) | `uv add "pipecat-ai[mistral]"` |
| [Moonshine](/api-reference/server/services/stt/moonshine) | `uv add "pipecat-ai[moonshine]"` |
| [NVIDIA](/api-reference/server/services/stt/nvidia) | `uv add "pipecat-ai[nvidia]"` |
| [OpenAI](/api-reference/server/services/stt/openai) | `uv add "pipecat-ai[openai]"` |
| [Sarvam](/api-reference/server/services/stt/sarvam) | `uv add "pipecat-ai[sarvam]"` |
1 change: 1 addition & 0 deletions docs.json
@@ -348,6 +348,7 @@
"api-reference/server/services/stt/gradium",
"api-reference/server/services/stt/groq",
"api-reference/server/services/stt/mistral",
"api-reference/server/services/stt/moonshine",
"api-reference/server/services/stt/nvidia",
"api-reference/server/services/stt/openai",
"api-reference/server/services/stt/sarvam",