pipecat-ai · markbackman · Jun 3, 2026
diff --git a/api-reference/server/services/s2s/openai.mdx b/api-reference/server/services/s2s/openai.mdx
@@ -126,6 +126,15 @@ _Deprecated in v0.0.105. Use `settings=OpenAIRealtimeLLMService.Settings(session
   `"high"` provides more detail.
 </ParamField>
 
+<ParamField path="user_audio_preroll_secs" type="float | None" default="None">
+  Seconds of recent user audio to replay after an interruption clears the input audio buffer, so
+  that the onset of the user's speech is not lost. Used only in manual turn-detection mode
+  (`turn_detection=False`, locally driven turns); it has no effect when server-side turn detection
+  is enabled. When `None` (the default), the duration is auto-sized to the upstream VAD's
+  `start_secs` plus a small margin, or `0.5` seconds if no VAD is present. Auto-sizing assumes VAD
+  drives turn starts (the default `VADUserTurnStartStrategy`); set this explicitly for a non-VAD turn-start strategy.
+</ParamField>
+
 <ParamField path="**kwargs" type="Any">
   Additional arguments passed to parent LLMService.
 </ParamField>
@@ -350,6 +359,7 @@ await task.queue_frame(
 - **Model is connection-level**: The `model` parameter is set via the WebSocket URL at connection time and cannot be changed during a session.
 - **Output modalities are single-mode**: The API supports either `["text"]` or `["audio"]` output, not both simultaneously.
 - **Turn detection options**: Use `TurnDetection` for traditional VAD, `SemanticTurnDetection` for AI-based turn detection, or `False` to disable server-side detection and manage turns manually.
+- **Manual turn detection pre-roll**: When server-side turn detection is disabled (`turn_detection=False`), the service maintains a rolling audio buffer that is replayed after interruptions to preserve speech onsets. Configure the buffer duration with `user_audio_preroll_secs` or let it auto-size from the upstream VAD's `start_secs`.
 - **Audio output format**: The service outputs 24kHz PCM audio by default.
 - **Video support**: Video frames can be sent to the model for multimodal input. Control the detail level with `video_frame_detail` and pause/resume with `set_video_input_paused()`.
 - **Transcription frames**: User speech transcription frames are always emitted upstream when input audio transcription is configured.
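
For readers of this change, the pre-roll behaviour described in the new `user_audio_preroll_secs` docs can be pictured with a rough sketch. This is not the service's implementation: `resolve_preroll_secs`, `PrerollBuffer`, the `0.2`-second margin, and the 16 kHz / 16-bit input format are all hypothetical names and values chosen for illustration. Only the `0.5`-second fallback and the "`start_secs` plus a small margin" rule come from the text above.

```python
from collections import deque
from typing import Deque, Optional

DEFAULT_PREROLL_SECS = 0.5  # fallback stated in the docs when no VAD is present
ASSUMED_MARGIN_SECS = 0.2   # hypothetical value; the docs only say "a small margin"


def resolve_preroll_secs(
    configured: Optional[float], vad_start_secs: Optional[float]
) -> float:
    """Pick the pre-roll duration: explicit value, else VAD-derived, else fallback."""
    if configured is not None:
        return configured
    if vad_start_secs is not None:
        return vad_start_secs + ASSUMED_MARGIN_SECS
    return DEFAULT_PREROLL_SECS


class PrerollBuffer:
    """Rolling buffer that keeps roughly the last `preroll_secs` of PCM audio."""

    def __init__(
        self, preroll_secs: float, sample_rate: int = 16000, sample_width: int = 2
    ) -> None:
        # The sample rate and width are assumptions made for this sketch.
        self._max_bytes = int(preroll_secs * sample_rate) * sample_width
        self._chunks: Deque[bytes] = deque()
        self._size = 0

    def append(self, chunk: bytes) -> None:
        self._chunks.append(chunk)
        self._size += len(chunk)
        # Drop the oldest chunks as long as the remainder still covers the window.
        while self._chunks and self._size - len(self._chunks[0]) >= self._max_bytes:
            self._size -= len(self._chunks.popleft())

    def snapshot(self) -> bytes:
        """Return the most recent audio to replay after the input buffer is cleared."""
        if self._max_bytes == 0:
            return b""
        return b"".join(self._chunks)[-self._max_bytes:]


# Usage: size the buffer from a VAD start_secs of 0.2, then feed 100 ms chunks.
secs = resolve_preroll_secs(None, vad_start_secs=0.2)
buf = PrerollBuffer(secs)
for i in range(10):
    buf.append(bytes([i]) * 3200)  # 100 ms of 16 kHz, 16-bit mono audio
replay = buf.snapshot()
```

In this sketch an explicit `user_audio_preroll_secs` always wins, which mirrors the documented advice to set the value by hand when turn starts are not driven by VAD.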