
Commit 0813087

docs(ai-chat): large-payloads pattern + ChatChunkTooLargeError reference
The chat output stream caps each record at ~1 MiB, and `chat.agent` now throws a typed `ChatChunkTooLargeError` when a chunk overruns. Document both the typed error and the two workaround patterns:

- ID-reference: persist large values to your store, emit only an id + preview through the chat stream, fetch the full payload on demand.
- Out-of-band `streams.writer()`: a separate run-scoped channel for transient/per-turn data the chat stream shouldn't carry.

Pages:

- New `patterns/large-payloads.mdx` covering the cause, the typed error, both patterns, and what doesn't trigger the cap (`chat.history`, `chat.inject`, `chat.defer`).
- `error-handling.mdx` gains a short `ChatChunkTooLargeError` section that cross-links the new patterns page.
- `docs.json` adds the new patterns page to the Agents → Patterns sidebar.
1 parent 61933d4 commit 0813087

3 files changed

Lines changed: 198 additions & 0 deletions

File tree

docs/ai-chat/error-handling.mdx
docs/ai-chat/patterns/large-payloads.mdx
docs/docs.json

Lines changed: 9 additions & 0 deletions
@@ -393,10 +393,19 @@ To add retry-like behavior:
6. **Use `onFailure` for run-level monitoring** (Sentry, monitoring dashboards).
7. **For known transient errors (rate limits, network)**, consider a fallback model inside `run()` instead of failing the turn.

## `ChatChunkTooLargeError`

A specific run-failing error worth flagging on its own. Anything written through the chat output is one record on the underlying realtime stream, capped at ~1 MiB per record. A single chunk over the cap throws `ChatChunkTooLargeError` (named export from `@trigger.dev/sdk`). The most common trigger is a tool whose result object is large enough to overflow as one `tool-output-available` chunk.

The error carries `chunkType`, `chunkSize`, and `maxSize`. Catch it with the `isChatChunkTooLargeError` guard and route oversized values out-of-band.

See [Large payloads in chat.agent](/ai-chat/patterns/large-payloads) for the two patterns that work around the cap (ID-reference + run-scoped `streams.writer()`).

## See also

- [`uiMessageStreamOptions.onError`](/ai-chat/backend#error-handling-with-onerror) — stream error handler details
- [Custom actions](/ai-chat/backend#actions) — implement undo/retry actions
- [`chat.history`](/ai-chat/backend#chat-history) — rollback to a previous message
- [Large payloads](/ai-chat/patterns/large-payloads) — handling the ~1 MiB per-chunk cap
- [Database persistence](/ai-chat/patterns/database-persistence) — saving conversation state
- [Standard task hooks](/tasks/overview) — `onFailure`, `onComplete`, `onWait`, etc.
docs/ai-chat/patterns/large-payloads.mdx

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
---
title: "Large payloads in chat.agent"
sidebarTitle: "Large payloads"
description: "Why a single chunk on the chat stream is capped at ~1 MiB, what error you'll see, and the two patterns that work around it: ID references and out-of-band run streams."
---

The realtime stream that backs `chat.agent` enforces a **per-record cap of ~1 MiB** (`1048576` bytes minus a small envelope reserve). Anything written through the chat output — auto-piped LLM chunks, `chat.response.write`, `chat.store.set`, custom `writer.write` parts — counts as one record per chunk and is rejected if it crosses the cap.

This is a platform-level limit and cannot be raised per project or per stream.
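A chunk is measured as the UTF-8 byte length of its JSON-serialized record. A minimal sketch of that arithmetic (the helper names and the raw `CHUNK_CAP` constant are illustrative, not SDK APIs; the effective cap is slightly lower because of the envelope reserve):

```typescript
// Illustrative helpers, not part of the SDK.
const CHUNK_CAP = 1_048_576; // raw 1 MiB; the effective per-record cap is a bit lower

// Mirrors how a chunk's size is reported: UTF-8 bytes of the serialized record.
function chunkBytes(chunk: unknown): number {
  return new TextEncoder().encode(JSON.stringify(chunk) ?? "").length;
}

function fitsOnStream(chunk: unknown): boolean {
  return chunkBytes(chunk) <= CHUNK_CAP;
}

// A small data part fits; a 2 MB tool output clearly does not.
console.log(fitsOnStream({ type: "data-report", data: { id: "doc_1" } })); // true
console.log(fitsOnStream({ output: "x".repeat(2_000_000) })); // false
```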
## What you'll see

When a chunk crosses the cap, the run fails with a typed [`ChatChunkTooLargeError`](/ai-chat/error-handling):

```
ChatChunkTooLargeError: chat.agent chunk of type "tool-output-available" is 2000126 bytes,
over the realtime stream's per-record cap of 1047552 bytes. For oversized payloads
(e.g. large tool outputs), write the value to your own store and emit only an id/url
through the chat stream — see https://trigger.dev/docs/ai-chat/patterns/large-payloads.
```

The error includes:

- `chunkType` — discriminant on the chunk that failed (e.g. `tool-output-available`, `data-handover`, `text-delta`).
- `chunkSize` — UTF-8 byte count of the JSON-serialized record.
- `maxSize` — the effective cap.

You can catch and re-throw / log it explicitly:

```ts
import { ChatChunkTooLargeError, isChatChunkTooLargeError, logger } from "@trigger.dev/sdk";

try {
  await someWrite();
} catch (err) {
  if (isChatChunkTooLargeError(err)) {
    logger.error("Oversized chunk", { type: err.chunkType, size: err.chunkSize });
  }
  throw err;
}
```

## Most common cause: large tool outputs

If you return a `streamText` result from `run()`, the AI SDK auto-pipes its `UIMessageStream` into the chat output. A tool whose result object is large (a fetched HTML body, a CSV blob, an image as base64, a deep DB row dump) gets emitted as one `tool-output-available` chunk — and that's the chunk that overruns.

**Diagnose first**: log tool sizes during development.

```ts
import { logger } from "@trigger.dev/sdk";
import { tool } from "ai";
import { z } from "zod";

const fetchPage = tool({
  inputSchema: z.object({ url: z.string().url() }),
  execute: async ({ url }) => {
    const html = await (await fetch(url)).text();
    if (html.length > 500_000) {
      // html.length counts UTF-16 code units, close enough to bytes for a dev-time warning.
      logger.warn("Large tool output", { tool: "fetchPage", bytes: html.length });
    }
    return { html };
  },
});
```

If the size is unbounded by input, fix the tool — not the stream.
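One way to bound an unbounded tool is to truncate at the source, before the value ever becomes a `tool-output-available` chunk. A minimal sketch (the `truncate` helper and the character budget are hypothetical, not SDK APIs):

```typescript
// Illustrative helper, not an SDK API: cap a tool result at the source.
const MAX_TOOL_OUTPUT_CHARS = 100_000; // well under the ~1 MiB record cap

function truncate(
  text: string,
  max: number = MAX_TOOL_OUTPUT_CHARS
): { text: string; truncated: boolean } {
  if (text.length <= max) return { text, truncated: false };
  return { text: text.slice(0, max), truncated: true };
}

// e.g. inside a tool's execute():
//   const { text, truncated } = truncate(html);
//   return { html: text, truncated };
```

Returning the `truncated` flag lets the model (and the UI) know the value was cut, so it can ask for the rest via the ID-reference pattern below.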
## Pattern 1: ID-reference (recommended)

Store the large value in your own database (or object store) and emit only an identifier through the chat stream. The frontend fetches the full payload separately on demand.

This keeps the chat stream small, predictable, and resumable, and lets you reuse the value across turns or sessions without re-streaming it.

<CodeGroup>

```ts task.ts
import { chat } from "@trigger.dev/sdk/ai";
import { tool } from "ai";
import { z } from "zod";

const fetchPage = tool({
  description: "Fetch a URL and store the HTML for later inspection.",
  inputSchema: z.object({ url: z.string().url() }),
  execute: async ({ url }) => {
    const html = await (await fetch(url)).text();
    const docId = await db.documents.create({
      data: { url, html, byteSize: html.length },
    });

    // Tool result is small — just an id and metadata.
    // The model and the UI both work with this lightweight handle.
    return {
      docId,
      url,
      byteSize: html.length,
      preview: html.slice(0, 500),
    };
  },
});
```

```ts api/document/[id]/route.ts
// Frontend fetches the full document on demand.
import { currentUser } from "@/lib/auth";

export async function GET(_req: Request, { params }: { params: { id: string } }) {
  const user = await currentUser();
  const doc = await db.documents.findUniqueOrThrow({
    where: { id: params.id, userId: user.id },
  });
  return new Response(doc.html, { headers: { "content-type": "text/html" } });
}
```

```tsx component.tsx
function ToolResultCard({ part }: { part: ToolUIPart<"fetchPage"> }) {
  const { docId, url, byteSize, preview } = part.output;
  return (
    <div>
      <p>
        {url} ({(byteSize / 1024).toFixed(0)} KB)
      </p>
      <pre>{preview}…</pre>
      <a href={`/api/document/${docId}`}>Open full HTML</a>
    </div>
  );
}
```

</CodeGroup>

The same pattern works for `chat.response.write` — push the heavy value to your DB, then emit a small data part with the id:

```ts
const id = await db.attachments.create({ data: { content: hugeReport } });
chat.response.write({ type: "data-report", data: { id, summary: shortSummary } });
```
<Tip>
Persist the large value **before** you emit the id chunk. If the chunk reaches the UI before the row is written, the frontend gets a 404 on the follow-up fetch.
</Tip>
## Pattern 2: Out-of-band `streams.writer()`

If the value is **only useful for the lifetime of the run** (a long log tail, a transient progress dump, a per-turn debug trace) and you don't want to persist it, write it to a **separate run-scoped stream** instead. Run-scoped `streams.writer()` is its own channel — chunks go through the same per-record cap, but the chat stream stays untouched, and `useRealtimeRunWithStreams` consumes them independently of the chat UI.

```ts
import { streams } from "@trigger.dev/sdk";
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";

const debugLog = streams.define<{ line: string }>("debug-log");

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) => {
    // Heavy diagnostic stream lives on its own channel.
    const log = debugLog.writer();
    log.write({ line: "starting turn" });

    return streamText({ /* ... */ });
  },
});
```

Frontend:

```tsx
import { useRealtimeRunWithStreams } from "@trigger.dev/react-hooks";

function DebugPanel({ runId }: { runId: string }) {
  const { streams } = useRealtimeRunWithStreams<typeof myChat>(runId);
  return (
    <pre>{streams?.["debug-log"]?.map((c) => c.line).join("\n")}</pre>
  );
}
```

The same ~1 MiB cap applies per record, so split long content across multiple writes (one record per line, per page, per progress tick) rather than one large blob.
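The line-per-record approach can be sketched as a tiny helper. This is a hypothetical sketch, not an SDK API: `writeLines` stands in for looping over a run-scoped writer like the `debugLog.writer()` above.

```typescript
// Hypothetical helper: emit one record per line instead of one large blob.
// `writer` stands in for the run-scoped stream writer from streams.define().
function writeLines(
  writer: { write: (chunk: { line: string }) => void },
  blob: string
): number {
  const lines = blob.split("\n");
  for (const line of lines) {
    // Each write is its own realtime record, so no single record nears the cap.
    writer.write({ line });
  }
  return lines.length;
}
```

Splitting on a natural boundary (line, page, tick) also means the frontend can render progressively instead of waiting for one giant chunk.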
## What does **not** trigger the cap

These calls don't go through the realtime stream and have no per-record cap:

- [`chat.history.set` / `slice` / `replace` / `remove`](/ai-chat/features#chathistory) — local-only mutations on the in-memory message list.
- [`chat.inject`](/ai-chat/features#chatinject) — appends to the run's pending message queue, not the stream.
- [`chat.defer`](/ai-chat/features#chatdefer) — promise registry; awaited at turn boundaries, never serialized to the stream.

The control markers `chat.agent` emits internally (`trigger:turn-complete`, `trigger:upgrade-required`) are tiny by construction.

## See also

- [Error handling](/ai-chat/error-handling) — how `ChatChunkTooLargeError` flows through the layers.
- [Database persistence](/ai-chat/patterns/database-persistence) — your own store as the durable backing for ID references.
- [Client protocol](/ai-chat/client-protocol) — chunk shapes that travel on the chat stream.

docs/docs.json

Lines changed: 1 addition & 0 deletions
@@ -126,6 +126,7 @@
      "ai-chat/patterns/branching-conversations",
      "ai-chat/patterns/code-sandbox",
      "ai-chat/patterns/human-in-the-loop",
      "ai-chat/patterns/large-payloads",
      "ai-chat/patterns/skills"
    ]
  },
