
Commit 1fe3166

docs(ai-chat): warn against chat.defer for onTurnStart message persistence
`chat.defer(db.chat.update(...))` in `onTurnStart` is fire-and-forget — the hook resolves and streaming begins before the write lands. A mid-stream page refresh then reads `[]` from the DB, the resumed SSE stream pushes the assistant into an empty array, and the user's message disappears from the rendered conversation.

- patterns/database-persistence.mdx: replace the misleading "optionally use chat.defer" line with an awaited persist + a Warning showing wrong/right examples and the failure mode. Update the minimal pseudocode to use await.
- features.mdx (chat.defer reference): swap the misleading example (db.chat.update inside onTurnStart) for an analytics-tracking example. Add a Warning cross-linking back to the persistence doc. Reserve chat.defer for writes whose timing has no resume implication.
1 parent 1a8b08e
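To make the failure mode concrete, here is a toy sketch of the race the message above describes (the `defer`, `slowWrite`, and in-memory `db` helpers are hypothetical stand-ins, not the library's API): the hook resolves and streaming begins while the write is still in flight, so a mid-stream refresh reads an empty history.

```ts
// Toy sketch of the race. `defer`, `slowWrite`, and `db` are hypothetical
// stand-ins for illustration, not the library's actual API.
const db: { messages: string[] } = { messages: [] };

// Stand-in for chat.defer: schedule the write, never await it in the hook.
function defer(promise: Promise<void>): void {
  void promise;
}

async function slowWrite(messages: string[]): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 50)); // slow DB driver
  db.messages = messages;
}

async function onTurnStart(uiMessages: string[]): Promise<void> {
  defer(slowWrite(uiMessages)); // hook resolves immediately; write still pending
}

async function main(): Promise<void> {
  await onTurnStart(["user: hi"]); // streaming would start here
  // Mid-stream refresh: the next page load reads the DB before the write lands.
  console.log(db.messages); // logs [] because the write has not landed yet
}

main();
```

Awaiting the write inside `onTurnStart` closes the race, which is exactly the change this commit makes in the docs' examples.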

2 files changed, 29 additions & 7 deletions

docs/ai-chat/features.mdx (8 additions & 4 deletions)
@@ -162,14 +162,14 @@ onTurnComplete: async ({ chatId }) => {

 Use `chat.defer()` to run background work in parallel with streaming. The deferred promise runs alongside the LLM response and is awaited (with a 5s timeout) before `onTurnComplete` fires.

-This moves non-blocking work (DB writes, analytics, etc.) out of the critical path:
+This moves non-blocking work (analytics, audit logs, search-index writes, cache warming) out of the critical path:

 ```ts
 export const myChat = chat.agent({
   id: "my-chat",
-  onTurnStart: async ({ chatId, uiMessages }) => {
-    // Persist messages without blocking the LLM call
-    chat.defer(db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }));
+  onTurnStart: async ({ chatId, runId }) => {
+    // Analytics — fire-and-forget, irrelevant to resume.
+    chat.defer(analytics.track("turn_started", { chatId, runId }));
   },
   run: async ({ messages, signal }) => {
     return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });

@@ -179,6 +179,10 @@ export const myChat = chat.agent({

 `chat.defer()` can be called from anywhere during a turn — hooks, `run()`, or nested helpers. All deferred promises are collected and awaited together before `onTurnComplete`.

+<Warning>
+**Don't use `chat.defer()` for the message-history write in `onTurnStart`.** That write must land *before* the model starts streaming, otherwise a mid-stream page refresh will read `[]` from your DB and lose the user's message from the rendered conversation. See [Database persistence — `onTurnStart`](/ai-chat/patterns/database-persistence#onturnstart). Reserve `chat.defer` for writes whose timing has no resume implication.
+</Warning>
+
 ---

 ## Custom data parts
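As an aside, the collect-and-await semantics described in the diff above ("all deferred promises are collected and awaited together, with a 5s timeout, before `onTurnComplete`") can be pictured with a short sketch; the internals here are hypothetical, not the library's actual implementation:

```ts
// Toy sketch of collect-and-await with a timeout. Internals are hypothetical.
const deferred: Promise<unknown>[] = [];

function defer(promise: Promise<unknown>): void {
  deferred.push(promise); // collected during the turn, not awaited here
}

function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T | void> {
  return Promise.race([
    promise,
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  ]);
}

async function finishTurn(onTurnComplete: () => Promise<void>): Promise<void> {
  // Wait for every deferred promise, but never longer than 5s in total.
  await withTimeout(Promise.allSettled(deferred), 5_000);
  await onTurnComplete();
}
```

Under these assumed semantics, a stuck deferred promise delays `onTurnComplete` by at most five seconds instead of hanging the turn.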

docs/ai-chat/patterns/database-persistence.mdx (21 additions & 3 deletions)
@@ -48,8 +48,25 @@ If you skip preload, do the equivalent in **`onChatStart`** when **`preloaded`**

 ### `onTurnStart`

-- Persist **`uiMessages`** (full accumulated history including the new user turn) **before** streaming starts — so a mid-stream refresh still shows the user’s message.
-- Optionally use [`chat.defer()`](/ai-chat/features#chat-defer) so the write does not block the model if your driver is slow.
+- **`await`** the **`uiMessages`** write (full accumulated history including the new user turn) **before** the hook returns — `chat.agent` does not begin streaming until `onTurnStart` resolves, so awaiting here is what guarantees the user's message is durable before the stream starts.
+
+<Warning>
+**Don't use [`chat.defer()`](/ai-chat/features#chat-defer) for the message write here.** `chat.defer` is fire-and-forget — the hook resolves before the write lands and the stream starts immediately. If the user refreshes mid-stream, the next page load reads `[]` from your DB, the resumed SSE stream pushes the assistant into an empty array, and the user's message disappears from the rendered conversation forever.
+
+```ts
+// ❌ Bad — non-blocking write, mid-stream refresh drops the user message.
+onTurnStart: async ({ chatId, uiMessages }) => {
+  chat.defer(db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }));
+},
+
+// ✅ Good — awaited, durable before the model starts.
+onTurnStart: async ({ chatId, uiMessages }) => {
+  await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } });
+},
+```
+
+`chat.defer` is for writes whose timing doesn't matter for resume — analytics, audit logs, search-index updates, etc. Anything the next page load reads needs to land before the stream begins.
+</Warning>

 ### `onTurnComplete`


@@ -128,7 +145,8 @@ chat.agent({
   },

   onTurnStart: async ({ chatId, uiMessages }) => {
-    chat.defer(saveConversationMessages(chatId, uiMessages));
+    // Awaited, not chat.defer — see the warning in `onTurnStart` above.
+    await saveConversationMessages(chatId, uiMessages);
   },

   onTurnComplete: async ({ chatId, uiMessages, chatAccessToken, lastEventId }) => {
