Skip to content

fix(batch-queue): Batch items that hit the environment queue size limit now fast-fail#3352

Merged
ericallam merged 1 commit intomainfrom
ea-branch-124
Apr 13, 2026
Merged

fix(batch-queue): Batch items that hit the environment queue size limit now fast-fail#3352
ericallam merged 1 commit intomainfrom
ea-branch-124

Conversation

@ericallam
Copy link
Copy Markdown
Member

No description provided.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 9, 2026

⚠️ No Changeset found

Latest commit: f32dc72

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 9, 2026

Walkthrough

This pull request implements a fast-fail mechanism for batch items exceeding queue size limits. It introduces a new QueueSizeLimitExceededError that extends ServiceValidationError with a maximumSize property. When triggered during task validation, this error is caught in batch processing, logs a warning, sets tracing attributes, and returns skipRetries: true. The batch completion logic detects when all failures share this error code and aggregates them into a single failure record rather than creating per-item entries. A new skipRetries flag in the batch item callback result type prevents retry scheduling in the queue, ensuring no retries occur regardless of attempt count.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Rationale

The changes span 7 files with mixed complexity. The core additions include: a new error type definition, special-case error handling with aggregation logic in batch processing, type signature updates for the callback result, and retry scheduling modifications. While the changes follow a cohesive pattern for a single feature, the aggregation condition logic in runEngineHandlers.server.ts (checking if all failures match a specific error code) and the coordination across multiple layers (error throwing, batch processing, queue scheduling, and testing) require careful review to ensure correctness of the fast-fail and skip-retries behavior.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning No pull request description was provided; the required template sections (Testing, Changelog, Checklist) are entirely missing. Add a description following the template: include the issue number, complete the checklist, describe testing steps, and provide a changelog entry explaining the batch fast-fail behavior.
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: batch items hitting queue size limits now fast-fail instead of retrying.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ea-branch-124

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
apps/webapp/app/services/realtime/mintRunToken.server.ts (1)

9-10: Extract the default expiration to a constant to avoid literal drift.

This is minor, but it helps keep the default value consistent if it changes later.

♻️ Suggested refactor
 import { generateJWT as internal_generateJWT } from "@trigger.dev/core/v3";
 import { extractJwtSigningSecretKey } from "./jwtAuth.server";
 
 type Environment = Parameters<typeof extractJwtSigningSecretKey>[0];
+const DEFAULT_RUN_TOKEN_EXPIRATION = "1h";
 
 export type MintRunTokenOptions = {
   /** Include the input-stream write scope (needed for steering messages from the playground). */
   includeInputStreamWrite?: boolean;
   /** Token expiration. Defaults to "1h". */
@@
   return internal_generateJWT({
     secretKey: extractJwtSigningSecretKey(environment),
     payload: {
       sub: environment.id,
       pub: true,
       scopes,
     },
-    expirationTime: options.expirationTime ?? "1h",
+    expirationTime: options.expirationTime ?? DEFAULT_RUN_TOKEN_EXPIRATION,
   });
 }

Also applies to: 39-39

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/realtime/mintRunToken.server.ts` around lines 9 -
10, Extract the default expiration literal into a single constant (e.g.,
DEFAULT_EXPIRATION = '1h') and use that constant wherever the "1h" default
appears: replace any inline default for the expirationTime property and the
default used in mintRunToken (or related function) so both reference
DEFAULT_EXPIRATION; update the JSDoc/comment to reference the constant rather
than a hardcoded string to keep defaults consistent if changed later.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@apps/webapp/app/services/realtime/mintRunToken.server.ts`:
- Around line 9-10: Extract the default expiration literal into a single
constant (e.g., DEFAULT_EXPIRATION = '1h') and use that constant wherever the
"1h" default appears: replace any inline default for the expirationTime property
and the default used in mintRunToken (or related function) so both reference
DEFAULT_EXPIRATION; update the JSDoc/comment to reference the constant rather
than a hardcoded string to keep defaults consistent if changed later.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0cd3e7b4-4381-46e6-987e-076f71e2097e

📥 Commits

Reviewing files that changed from the base of the PR and between bd41bb2 and 9f71b2d.

📒 Files selected for processing (8)
  • .server-changes/batch-fast-fail-queue-size-limit.md
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: typecheck / typecheck
🧰 Additional context used
📓 Path-based instructions (15)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

**/*.ts: Use typecheck to verify changes in apps and internal packages (apps/*, internal-packages/*), not build - building proves almost nothing about correctness
When writing Trigger.dev tasks, always import from @trigger.dev/sdk. Never use @trigger.dev/sdk/v3 or deprecated client.defineJob
Add crumbs as you write code - mark lines with // @Crumbs or wrap blocks in `// `#region` `@crumbs for agentcrumbs debug tracing, then strip before merge

Files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
apps/**/*

📄 CodeRabbit inference engine (CLAUDE.md)

When modifying only server components (apps/webapp/, apps/supervisor/, etc.) with no package changes, add a .server-changes/ file instead of a changeset

Files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
apps/webapp/**/*.server.{ts,tsx}

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

apps/webapp/**/*.server.{ts,tsx}: Environment variables must be accessed via the env export from app/env.server.ts and never use process.env directly
Always use findFirst instead of findUnique in Prisma queries due to implicit DataLoader batching issues and performance concerns

Files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
apps/webapp/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

Use named constants for sentinel/placeholder values instead of raw string literals scattered across comparisons

Files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
apps/webapp/app/v3/services/**/*.server.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Organize services in the webapp following the pattern app/v3/services/*/*.server.ts

Files:

  • apps/webapp/app/v3/services/common.server.ts
**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
**/*.test.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.test.{ts,tsx,js,jsx}: Test files should live beside the files under test and use descriptive describe and it blocks
Tests should avoid mocks or stubs and use the helpers from @internal/testcontainers when Redis or Postgres are needed
Use vitest for running unit tests

Files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
**/*.test.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Use vitest exclusively for testing and never mock anything - use testcontainers instead for Redis/PostgreSQL

Files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
apps/webapp/app/services/**/*.server.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Separate testable services from configuration files; follow the pattern of realtimeClient.server.ts (testable service) and realtimeClientGlobal.server.ts (configuration) in the webapp

Files:

  • apps/webapp/app/services/realtime/mintRunToken.server.ts
🧠 Learnings (36)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Use `tasks.batchTrigger()` to trigger multiple runs of a single task from backend code with different payloads
📚 Learning: 2026-03-30T22:25:33.107Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-30T22:25:33.107Z
Learning: Applies to apps/webapp/{app/v3/services/triggerTask.server.ts,app/v3/services/batchTriggerV3.server.ts} : Do NOT add database queries to `triggerTask.server.ts` or `batchTriggerV3.server.ts`; instead piggyback on the existing `backgroundWorkerTask.findFirst()` query in `queues.server.ts`

Applied to files:

  • .server-changes/batch-fast-fail-queue-size-limit.md
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-03T13:08:03.862Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: packages/redis-worker/src/fair-queue/index.ts:1114-1121
Timestamp: 2026-03-03T13:08:03.862Z
Learning: In packages/redis-worker/src/fair-queue/index.ts, it's acceptable for the worker queue depth cap check to allow overshooting by up to batchClaimSize messages per iteration, as the next iteration will recheck and prevent sustained growth beyond the limit.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-04-07T14:12:59.018Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/app/runEngine/concerns/batchPayloads.server.ts:112-136
Timestamp: 2026-04-07T14:12:59.018Z
Learning: In `apps/webapp/app/runEngine/concerns/batchPayloads.server.ts`, the `pRetry` call wrapping `uploadPacketToObjectStore` intentionally retries **all** error types (no `shouldRetry` filter / `AbortError` guards). The maintainer explicitly prefers over-retrying to under-retrying because multiple heterogeneous object store backends are supported and it is impractical to enumerate all permanent error signatures. Do not flag this as an issue in future reviews.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-03T13:07:33.177Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: internal-packages/run-engine/src/batch-queue/tests/index.test.ts:711-713
Timestamp: 2026-03-03T13:07:33.177Z
Learning: In `internal-packages/run-engine/src/batch-queue/tests/index.test.ts`, test assertions for rate limiter stubs can use `toBeGreaterThanOrEqual` rather than exact equality (`toBe`) because the consumer loop may call the rate limiter during empty pops in addition to actual item processing, and this over-calling is acceptable in integration tests.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-31T06:25:47.761Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3299
File: internal-packages/run-engine/src/run-queue/index.ts:683-699
Timestamp: 2026-03-31T06:25:47.761Z
Learning: In `internal-packages/run-engine/src/run-queue/index.ts`, message scores passed to `enqueueMessage` are always <= now (computed as `(queueTimestamp || createdAt) - priorityMs`). Future-scored messages (nack retries) never go through `enqueueMessage`; they go directly through `nackMessage` → ZADD to the queue sorted set. Do not flag fast-path enqueue for allowing future-timestamped messages.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
📚 Learning: 2026-03-02T12:43:43.173Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: packages/redis-worker/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:43.173Z
Learning: Applies to packages/redis-worker/**/redis-worker/src/queue.ts : Job queue abstraction should be Redis-backed in src/queue.ts

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/index.ts
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/services/realtime/mintRunToken.server.ts
📚 Learning: 2026-02-10T16:18:48.654Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 2980
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx:512-515
Timestamp: 2026-02-10T16:18:48.654Z
Learning: In apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx, environment.queueSizeLimit is a per-queue maximum that is configured at the environment level, not a shared limit across all queues. Each queue can have up to environment.queueSizeLimit items queued independently.

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Configure concurrency using the `queue` option with `concurrencyLimit` property

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-30T22:25:33.107Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-30T22:25:33.107Z
Learning: Applies to apps/webapp/app/v3/queues.server.ts : When adding new task-level defaults, add to the existing `select` clause in `backgroundWorkerTask.findFirst()` query in `queues.server.ts` instead of adding a second query

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Limit task execution time using the `maxDuration` option (in seconds)

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-31T06:25:47.761Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3299
File: internal-packages/run-engine/src/run-queue/index.ts:683-699
Timestamp: 2026-03-31T06:25:47.761Z
Learning: In `internal-packages/run-engine/src/run-queue/index.ts`, the fast-path enqueue (when `enableFastPath` is true) intentionally skips the TTL sorted set. The `expireRun` worker job is scheduled independently *before* `enqueueMessage` is called (in `engine.trigger()`), so TTL expiry is handled regardless of which path the message takes. The slow-path dequeue Lua removes from the TTL set on dequeue (not on enqueue), making the fast-path skip equivalent. Do not suggest adding a `!ttlInfo` guard to the fast-path check — it would prevent nearly all production runs from using the fast path since dev environments default to a 10m TTL.

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schemaTask()` from `trigger.dev/sdk` with a Zod schema for payload validation

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `task()` from `trigger.dev/sdk` for basic task definitions with `id` and `run` properties

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-31T21:37:31.732Z
Learnt from: isshaddad
Repo: triggerdotdev/trigger.dev PR: 3283
File: docs/migration-n8n.mdx:19-21
Timestamp: 2026-03-31T21:37:31.732Z
Learning: In the trigger.dev SDK (`packages/trigger-sdk/src/v3`), `tasks.triggerAndWait()` and `tasks.batchTriggerAndWait()` are real, valid exported APIs defined in `shared.ts` and re-exported via the `tasks` object in `tasks.ts`. They accept a task ID string as their first argument (not a task instance). These are distinct from the instance methods `yourTask.triggerAndWait()` and `yourTask.batchTriggerAndWait()`. Do not flag `tasks.triggerAndWait()` or `tasks.batchTriggerAndWait()` as non-existent APIs.

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `task.triggerAndWait()` to trigger a task and wait for its result from inside another task

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to trigger multiple different tasks and wait for all results

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `task.batchTriggerAndWait()` to batch trigger a task and wait for all results from inside another task

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger multiple tasks by instance and wait for results

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger multiple tasks by passing task instances

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `task.trigger()` to trigger a single run of a task from inside another task

Applied to files:

  • apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2026-03-10T17:56:20.938Z
Learnt from: samejr
Repo: triggerdotdev/trigger.dev PR: 3201
File: apps/webapp/app/v3/services/setSeatsAddOn.server.ts:25-29
Timestamp: 2026-03-10T17:56:20.938Z
Learning: Do not implement local userId-to-organizationId authorization checks inside org-scoped service classes (e.g., SetSeatsAddOnService, SetBranchesAddOnService) in the web app. Rely on route-layer authentication (requireUserId(request)) and org membership enforcement via the _app.orgs.$organizationSlug layout route. Any userId/organizationId that reaches these services from org-scoped routes has already been validated. Apply this pattern across all org-scoped services to avoid redundant auth checks and maintain consistency.

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
📚 Learning: 2026-03-29T19:16:28.864Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3291
File: apps/webapp/app/v3/featureFlags.ts:53-65
Timestamp: 2026-03-29T19:16:28.864Z
Learning: When reviewing TypeScript code that uses Zod v3, treat `z.coerce.*()` schemas as their direct Zod type (e.g., `z.coerce.boolean()` returns a `ZodBoolean` with `_def.typeName === "ZodBoolean"`) rather than a `ZodEffects`. Only `.preprocess()`, `.refine()`/`.superRefine()`, and `.transform()` are expected to wrap schemas in `ZodEffects`. Therefore, in reviewers’ logic like `getFlagControlType`, do not flag/unblock failures that require unwrapping `ZodEffects` when the input schema is a `z.coerce.*` schema.

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-03T13:07:27.810Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: internal-packages/run-engine/src/batch-queue/tests/index.test.ts:711-713
Timestamp: 2026-03-03T13:07:27.810Z
Learning: In test files under internal-packages/run-engine/src/batch-queue/tests, prefer asserting using toBeGreaterThanOrEqual for rate limiter behavior when the consumer loop may call the limiter during empty polls as well as during actual processing. This avoids flaky failures due to strict equality, while still validating expected non-decreasing behavior.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
📚 Learning: 2026-03-02T12:43:25.254Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: Applies to internal-packages/run-engine/src/engine/tests/**/*.test.ts : Implement tests for RunEngine in `src/engine/tests/` using testcontainers for Redis and PostgreSQL containerization

Applied to files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
📚 Learning: 2026-04-07T14:12:18.946Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/test/engine/batchPayloads.test.ts:5-24
Timestamp: 2026-04-07T14:12:18.946Z
Learning: In `apps/webapp/test/engine/batchPayloads.test.ts`, using `vi.mock` for `~/v3/objectStore.server` (stubbing `hasObjectStoreClient` and `uploadPacketToObjectStore`), `~/env.server` (overriding offload thresholds), and `~/v3/tracer.server` (stubbing `startActiveSpan`) is intentional and acceptable. Simulating controlled transient upload failures (e.g., fail N times then succeed) to verify `p-retry` behavior cannot be reproduced with real services or testcontainers. This file is an explicit exception to the repo's general no-mocks policy.

Applied to files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Configure retry behavior using the `retry` option with properties like `maxAttempts`, `factor`, `minTimeoutInMs`, `maxTimeoutInMs`, and `randomize`

Applied to files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
📚 Learning: 2026-03-02T12:43:43.173Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: packages/redis-worker/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:43.173Z
Learning: Applies to packages/redis-worker/**/redis-worker/**/*.{test,spec}.{ts,tsx} : Use testcontainers for Redis in test files for redis-worker

Applied to files:

  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
📚 Learning: 2026-03-30T22:25:33.107Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-30T22:25:33.107Z
Learning: Services should call engine methods from `app/v3/runEngine.server.ts` singleton for all run lifecycle operations (triggering, completing, cancelling, etc.)

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-14T10:06:33.506Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3219
File: internal-packages/run-engine/src/run-queue/index.ts:3264-3353
Timestamp: 2026-03-14T10:06:33.506Z
Learning: In `internal-packages/run-engine/src/run-queue/index.ts`, the `dequeueMessagesFromCkQueue` Lua command intentionally dequeues only 1 message per CK sub-queue per iteration (round-robin). This is by design for fairness across concurrency keys (CKs). The `actualMaxCount * 3` scan limit for `ZRANGEBYSCORE` on the CK index is also intentional — it is adequate because each CK independently gets the full queue concurrency limit, so per-CK concurrency blocks rarely occur.

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-02T12:43:25.254Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: All new run lifecycle logic should be implemented in the `internal/run-engine` package (`src/engine/`), not directly in `apps/webapp/app/v3/services/`

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/services/**/*.server.{ts,tsx} : Separate testable services from configuration files; follow the pattern of `realtimeClient.server.ts` (testable service) and `realtimeClientGlobal.server.ts` (configuration) in the webapp

Applied to files:

  • apps/webapp/app/services/realtime/mintRunToken.server.ts
📚 Learning: 2026-03-26T09:02:07.973Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 3274
File: apps/webapp/app/services/runsReplicationService.server.ts:922-924
Timestamp: 2026-03-26T09:02:07.973Z
Learning: When parsing Trigger.dev task run annotations in server-side services, keep `TaskRun.annotations` strictly conforming to the `RunAnnotations` schema from `trigger.dev/core/v3`. If the code already uses `RunAnnotations.safeParse` (e.g., in a `#parseAnnotations` helper), treat that as intentional/necessary for atomic, schema-accurate annotation handling. Do not recommend relaxing the annotation payload schema or using a permissive “passthrough” parse path, since the annotations are expected to be written atomically in one operation and should not contain partial/legacy payloads that would require a looser parser.

Applied to files:

  • apps/webapp/app/services/realtime/mintRunToken.server.ts
🔇 Additional comments (16)
.server-changes/batch-fast-fail-queue-size-limit.md (1)

1-7: LGTM!

The server-changes file correctly documents the fast-fail behavior for batch items hitting queue size limits, following the project convention for server-only changes.

internal-packages/run-engine/src/batch-queue/index.ts (2)

868-877: LGTM!

The skipRetries extraction and the updated retry condition correctly implement the fast-fail behavior. The === true comparison properly handles cases where skipRetries is undefined, and the span attribute provides useful observability.


901-901: LGTM!

Comment accurately reflects both code paths (final attempt exhausted OR retries skipped).

apps/webapp/app/runEngine/services/triggerTask.server.ts (2)

44-44: LGTM!

Import correctly includes the new QueueSizeLimitExceededError alongside ServiceValidationError.


274-279: LGTM!

The change to throw QueueSizeLimitExceededError enables downstream batch queue handling to identify this specific failure type and short-circuit retries. The ?? 0 fallback is defensive coding, though maximumSize should always be defined when !queueSizeGuard.ok.

apps/webapp/app/v3/services/common.server.ts (1)

13-31: LGTM!

The new QueueSizeLimitExceededError class is well-designed:

  • Correctly extends ServiceValidationError for backward compatibility
  • Adds the maximumSize property for downstream use
  • Sets this.name for proper instanceof checks
  • JSDoc clearly documents the purpose and cross-references the handler
internal-packages/run-engine/src/batch-queue/types.ts (1)

271-286: LGTM!

The type extension is well-designed:

  • Optional skipRetries field maintains backward compatibility
  • JSDoc clearly explains the intended use case (deterministic failures like queue size limits)
  • Discriminated union structure with success boolean remains clean
apps/webapp/app/v3/runEngineHandlers.server.ts (3)

641-648: LGTM!

The constant follows the coding guideline for named constants for sentinel values, and the JSDoc clearly documents its purpose and usage across the codebase.


821-850: LGTM!

The QueueSizeLimitExceededError handling is well-implemented:

  • Catches the specific error type for targeted handling
  • Logs at warn level with appropriate context (no high-cardinality fields)
  • Sets span attributes for observability
  • Returns skipRetries: true to bypass the retry ladder
  • Uses the consistent error code constant

933-977: LGTM!

The aggregation logic for queue-size-limit failures is well-designed:

  • Correctly identifies when all failures share the same error code using .every()
  • Uses the first failure's index as a stable anchor for the unique constraint (enabling idempotent retries)
  • Aggregated error message includes the count of affected items
  • Falls back to per-item error creation for mixed failure types
  • skipDuplicates: true ensures callback retry idempotency

This keeps DB write volume bounded to O(batches) instead of O(items) during queue overload scenarios.

internal-packages/run-engine/src/batch-queue/tests/index.test.ts (4)

779-807: LGTM!

The helper function createBatchQueueWithRetry is well-designed:

  • Short retry delays (20-100ms) ensure tests complete quickly even if regression causes full retry exhaustion
  • Disabling randomization (randomize: false) makes test behavior deterministic
  • Comment explains the rationale for the tiny retry ladder

809-857: LGTM!

Excellent test coverage for the primary skipRetries behavior:

  • Configures 6 retry attempts but verifies each item is processed exactly once
  • Uses the same QUEUE_SIZE_LIMIT_EXCEEDED error code as production
  • Verifies both attempt counts AND completion result structure
  • Uses Redis testcontainers (no mocks) per project guidelines

859-901: LGTM!

Important regression guard test ensuring the normal retry path isn't broken:

  • Verifies items without skipRetries still exhaust the retry ladder
  • Clear comment explaining the test's purpose

903-957: LGTM!

Excellent per-item behavior test:

  • Verifies skipRetries is honored on a per-item basis within the same batch
  • Even-indexed items fast-fail (1 attempt), odd-indexed items retry fully (4 attempts)
  • Covers the real-world scenario where only some items in a batch hit the queue limit
apps/webapp/app/services/realtime/mintRunToken.server.ts (2)

1-5: Nice contract alignment via derived Environment type.

Using Parameters<typeof extractJwtSigningSecretKey>[0] keeps this service’s input type synced with the auth helper and avoids drift.

Based on learnings, this follows “Separate testable services from configuration files; follow the pattern of realtimeClient.server.ts (testable service) and realtimeClientGlobal.server.ts (configuration) in the webapp”.


22-41: JWT scope construction and payload shaping look solid.

Read scope is always present, write scope is correctly gated, and token minting uses the expected signing key path.

devin-ai-integration[bot]

This comment was marked as resolved.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
apps/webapp/app/v3/runEngineHandlers.server.ts (1)

941-963: Aggregation discards per-item details when collapsing errors.

When all failures share QUEUE_SIZE_LIMIT_EXCEEDED, this creates a single error row using only the first failure's taskIdentifier, payload, and options. If the batch contained items targeting different tasks, that detail is lost.

This is likely acceptable given the design goal (avoid DB amplification during queue overload), but worth confirming this trade-off is intentional. If users need to know which specific tasks failed in a multi-task batch, they'd only see the first item's task identifier.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8d105b62-8cb1-4961-bd06-fa9fd0b80a87

📥 Commits

Reviewing files that changed from the base of the PR and between 9f71b2d and f32dc72.

📒 Files selected for processing (7)
  • .server-changes/batch-fast-fail-queue-size-limit.md
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/v3/services/common.server.ts
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
✅ Files skipped from review due to trivial changes (3)
  • .server-changes/batch-fast-fail-queue-size-limit.md
  • internal-packages/run-engine/src/batch-queue/index.ts
  • internal-packages/run-engine/src/batch-queue/tests/index.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/webapp/app/runEngine/services/triggerTask.server.ts
  • internal-packages/run-engine/src/batch-queue/types.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: typecheck / typecheck
🧰 Additional context used
📓 Path-based instructions (11)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/services/**/*.server.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Organize services in the webapp following the pattern app/v3/services/*/*.server.ts

Files:

  • apps/webapp/app/v3/services/common.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

**/*.ts: Use typecheck to verify changes in apps and internal packages (apps/*, internal-packages/*), not build - building proves almost nothing about correctness
When writing Trigger.dev tasks, always import from @trigger.dev/sdk. Never use @trigger.dev/sdk/v3 or deprecated client.defineJob
Add crumbs as you write code - mark lines with // @Crumbs or wrap blocks in `// `#region` `@crumbs for agentcrumbs debug tracing, then strip before merge

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
apps/**/*

📄 CodeRabbit inference engine (CLAUDE.md)

When modifying only server components (apps/webapp/, apps/supervisor/, etc.) with no package changes, add a .server-changes/ file instead of a changeset

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/**/*.server.{ts,tsx}

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

apps/webapp/**/*.server.{ts,tsx}: Environment variables must be accessed via the env export from app/env.server.ts and never use process.env directly
Always use findFirst instead of findUnique in Prisma queries due to implicit DataLoader batching issues and performance concerns

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

Use named constants for sentinel/placeholder values instead of raw string literals scattered across comparisons

Files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
🧠 Learnings (15)
📓 Common learnings
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: packages/redis-worker/src/fair-queue/index.ts:1114-1121
Timestamp: 2026-03-03T13:08:03.862Z
Learning: In packages/redis-worker/src/fair-queue/index.ts, it's acceptable for the worker queue depth cap check to allow overshooting by up to batchClaimSize messages per iteration, as the next iteration will recheck and prevent sustained growth beyond the limit.
📚 Learning: 2026-02-10T16:18:48.654Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 2980
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx:512-515
Timestamp: 2026-02-10T16:18:48.654Z
Learning: In apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.queues/route.tsx, environment.queueSizeLimit is a per-queue maximum that is configured at the environment level, not a shared limit across all queues. Each queue can have up to environment.queueSizeLimit items queued independently.

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-10T17:56:20.938Z
Learnt from: samejr
Repo: triggerdotdev/trigger.dev PR: 3201
File: apps/webapp/app/v3/services/setSeatsAddOn.server.ts:25-29
Timestamp: 2026-03-10T17:56:20.938Z
Learning: Do not implement local userId-to-organizationId authorization checks inside org-scoped service classes (e.g., SetSeatsAddOnService, SetBranchesAddOnService) in the web app. Rely on route-layer authentication (requireUserId(request)) and org membership enforcement via the _app.orgs.$organizationSlug layout route. Any userId/organizationId that reaches these services from org-scoped routes has already been validated. Apply this pattern across all org-scoped services to avoid redundant auth checks and maintain consistency.

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-29T19:16:28.864Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3291
File: apps/webapp/app/v3/featureFlags.ts:53-65
Timestamp: 2026-03-29T19:16:28.864Z
Learning: When reviewing TypeScript code that uses Zod v3, treat `z.coerce.*()` schemas as their direct Zod type (e.g., `z.coerce.boolean()` returns a `ZodBoolean` with `_def.typeName === "ZodBoolean"`) rather than a `ZodEffects`. Only `.preprocess()`, `.refine()`/`.superRefine()`, and `.transform()` are expected to wrap schemas in `ZodEffects`. Therefore, in reviewers’ logic like `getFlagControlType`, do not flag/unblock failures that require unwrapping `ZodEffects` when the input schema is a `z.coerce.*` schema.

Applied to files:

  • apps/webapp/app/v3/services/common.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-30T22:25:33.107Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-30T22:25:33.107Z
Learning: Applies to apps/webapp/{app/v3/services/triggerTask.server.ts,app/v3/services/batchTriggerV3.server.ts} : Do NOT add database queries to `triggerTask.server.ts` or `batchTriggerV3.server.ts`; instead piggyback on the existing `backgroundWorkerTask.findFirst()` query in `queues.server.ts`

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-04-07T14:12:59.018Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/app/runEngine/concerns/batchPayloads.server.ts:112-136
Timestamp: 2026-04-07T14:12:59.018Z
Learning: In `apps/webapp/app/runEngine/concerns/batchPayloads.server.ts`, the `pRetry` call wrapping `uploadPacketToObjectStore` intentionally retries **all** error types (no `shouldRetry` filter / `AbortError` guards). The maintainer explicitly prefers over-retrying to under-retrying because multiple heterogeneous object store backends are supported and it is impractical to enumerate all permanent error signatures. Do not flag this as an issue in future reviews.

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-03T13:08:03.862Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: packages/redis-worker/src/fair-queue/index.ts:1114-1121
Timestamp: 2026-03-03T13:08:03.862Z
Learning: In packages/redis-worker/src/fair-queue/index.ts, it's acceptable for the worker queue depth cap check to allow overshooting by up to batchClaimSize messages per iteration, as the next iteration will recheck and prevent sustained growth beyond the limit.

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-30T22:25:33.107Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-30T22:25:33.107Z
Learning: Applies to apps/webapp/app/v3/queues.server.ts : When adding new task-level defaults, add to the existing `select` clause in `backgroundWorkerTask.findFirst()` query in `queues.server.ts` instead of adding a second query

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-14T10:06:33.506Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3219
File: internal-packages/run-engine/src/run-queue/index.ts:3264-3353
Timestamp: 2026-03-14T10:06:33.506Z
Learning: In `internal-packages/run-engine/src/run-queue/index.ts`, the `dequeueMessagesFromCkQueue` Lua command intentionally dequeues only 1 message per CK sub-queue per iteration (round-robin). This is by design for fairness across concurrency keys (CKs). The `actualMaxCount * 3` scan limit for `ZRANGEBYSCORE` on the CK index is also intentional — it is adequate because each CK independently gets the full queue concurrency limit, so per-CK concurrency blocks rarely occur.

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-03T13:07:33.177Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: internal-packages/run-engine/src/batch-queue/tests/index.test.ts:711-713
Timestamp: 2026-03-03T13:07:33.177Z
Learning: In `internal-packages/run-engine/src/batch-queue/tests/index.test.ts`, test assertions for rate limiter stubs can use `toBeGreaterThanOrEqual` rather than exact equality (`toBe`) because the consumer loop may call the rate limiter during empty pops in addition to actual item processing, and this over-calling is acceptable in integration tests.

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-30T22:25:33.107Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-30T22:25:33.107Z
Learning: Services should call engine methods from `app/v3/runEngine.server.ts` singleton for all run lifecycle operations (triggering, completing, cancelling, etc.)

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2026-03-25T15:29:25.889Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2026-03-25T15:29:25.889Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Configure concurrency using the `queue` option with `concurrencyLimit` property

Applied to files:

  • apps/webapp/app/v3/runEngineHandlers.server.ts
🔇 Additional comments (4)
apps/webapp/app/v3/services/common.server.ts (1)

14-31: LGTM!

The new error class is well-designed. It properly extends ServiceValidationError, exposes maximumSize for consumers to access queue limit information, and the JSDoc clearly explains its purpose for batch queue retry short-circuiting.

apps/webapp/app/v3/runEngineHandlers.server.ts (3)

16-16: LGTM!

Import correctly references the new error class from the internal services module.


641-648: LGTM!

Good use of a named constant for the error code, following the coding guideline for sentinel values. The JSDoc clearly documents the integration with BatchQueue and completion callback.


821-850: LGTM!

Excellent error handling design:

  • Specific catch before generic error handling
  • Appropriate warn log level for a customer-overload scenario
  • skipRetries: true correctly signals the BatchQueue to fast-fail
  • Span attributes provide good observability without high-cardinality metric concerns

@ericallam ericallam merged commit 417ab87 into main Apr 13, 2026
41 checks passed
@ericallam ericallam deleted the ea-branch-124 branch April 13, 2026 13:24
@github-actions github-actions Bot mentioned this pull request Apr 17, 2026
ericallam pushed a commit that referenced this pull request May 1, 2026
## Summary
8 new features, 18 improvements, 11 bug fixes.

## Breaking changes
- Add server-side deprecation gate for deploys from v3 CLI versions
(gated by `DEPRECATE_V3_CLI_DEPLOYS_ENABLED`). v4 CLI deploys are
unaffected.
([#3415](#3415))

## Improvements
- Add `--no-browser` flag to `init` and `login` to skip auto-opening the
browser during authentication. Also error loudly when `init` is run
without `--yes` under non-TTY stdin (previously default-and-exited
silently, leaving the project half-initialized). Both commands now show
an `Examples` section in `--help`.
([#3483](#3483))
- Add `isReplay` boolean to the run context (`ctx.run.isReplay`),
derived from the existing `replayedFromTaskRunFriendlyId` database
field. Defaults to `false` for backwards compatibility.
([#3454](#3454))
- Redact the `resolveWaitpoint` runtime log so it only emits `id` and
`type` instead of the full completed waitpoint. Previously the log
printed the entire waitpoint (including `output`) to stdout in
production runs, which could leak sensitive payloads. The value returned
by `wait.forToken()` is unchanged.
([#3490](#3490))
- Add `SessionId` friendly ID generator and schemas for the new durable
Session primitive. Exported from `@trigger.dev/core/v3/isomorphic`
alongside `RunId`, `BatchId`, etc. Ships the
`CreateSessionStreamWaitpoint` request/response schemas alongside the
main Session CRUD.
([#3417](#3417))
- Truncate large error stacks and messages to prevent OOM crashes. Stack
traces are capped at 50 frames (keeping top 5 + bottom 45 with an
omission notice), individual stack lines at 1024 chars, and error
messages at 1000 chars. Applied in parseError, sanitizeError, and OTel
span recording.
([#3405](#3405))

## Server changes

These changes affect the self-hosted Docker image and Trigger.dev Cloud:

- Add a "Back office" tab to `/admin` and a per-organization detail page
at `/admin/back-office/orgs/:orgId`. The first action available on that
page is editing the org's API rate limit: admins can save a
`tokenBucket` override (refill rate, interval, max tokens) and see a
plain-English preview of the resulting sustained rate and burst
allowance. Writes are audit-logged via the server logger.
([#3434](#3434))
- Optional `DEPLOY_REGISTRY_ECR_DEFAULT_REPOSITORY_POLICY` env var to
apply a default repository policy when the webapp creates new ECR repos
([#3467](#3467))
- Ship the Errors page to all users, with a polish + bug-fix pass:
pinned "No channel" item in the Slack alert channel picker,
viewer-timezone alert timestamps via Slack's `<!date^>` token, Activity
sparkline peak tooltip, centered loading spinner and bug-icon empty
state on the error detail page, ellipsis on the Configure alerts
trigger.
([#3477](#3477))
- Configure the set of machine presets to build boot snapshots for at
deploy time via `COMPUTE_TEMPLATE_MACHINE_PRESETS` (CSV of preset names,
default `small-1x`). Use `COMPUTE_TEMPLATE_MACHINE_PRESETS_REQUIRED`
(CSV, default = full PRESETS list) to scope which preset failures fail a
required-mode deploy. Optional preset failures are logged and don't
block the deploy.
([#3492](#3492))
- Regenerating a RuntimeEnvironment API key no longer invalidates the
previous key immediately. The old key is recorded in a new
`RevokedApiKey` table with a 24 hour grace window, and
`findEnvironmentByApiKey` falls back to it when the submitted key
doesn't match any live environment. The grace window can be ended early
(or extended) by updating `expiresAt` on the row.
([#3420](#3420))
- Add the `Session` primitive — a durable, task-bound, bidirectional I/O
channel that outlives a single run and acts as the run manager for
`chat.agent`. Ships the Postgres `Session` + `SessionRun` tables,
ClickHouse `sessions_v1` + replication service, the `sessions` JWT
scope, and the public CRUD + realtime routes (`/api/v1/sessions`,
`/realtime/v1/sessions/:session/:io`) including `end-and-continue` for
server-orchestrated run handoffs and session-stream waitpoints.
([#3417](#3417))
- Add `KUBERNETES_POD_DNS_NDOTS_OVERRIDE_ENABLED` flag (off by default)
that overrides the cluster default and sets `dnsConfig.options.ndots` on
runner pods (defaulting to 2, configurable via
`KUBERNETES_POD_DNS_NDOTS`). Kubernetes defaults pods to `ndots: 5`, so
any name with fewer than 5 dots — including typical external domains
like `api.example.com` — is first walked through every entry in the
cluster search list (`<ns>.svc.cluster.local`, `svc.cluster.local`,
`cluster.local`) before being tried as-is, turning one resolution into
4+ CoreDNS queries (×2 with A+AAAA). Using a lower `ndots` value reduces
DNS query amplification in the `cluster.local` zone.
  
Note: before enabling, make sure no code path relies on search-list
expansion for names with dots ≥ the configured value — those names will
hit their as-is form first and could resolve externally before falling
back to the cluster search path.
([#3441](#3441))
- Vercel integration option to disable auto promotions
([#3376](#3376))
- Make it clear in the admin that feature flags are global and should
rarely be changed.
([#3408](#3408))
- Admin worker groups API: add GET loader and expose more fields on
POST. ([#3390](#3390))
- Add 60s fresh / 60s stale SWR cache to `getEntitlement` in
`platform.v3.server.ts`. Eliminates a synchronous billing-service HTTP
round trip on every trigger. Reuses the existing `platformCache` (LRU
memory + Redis) pattern already used for `limits` and `usage`. Cache key
is `${orgId}`. Errors return a permissive `{ hasAccess: true }` fallback
(existing behavior) and are also cached to prevent thundering-herd on
billing outages.
([#3388](#3388))
- Show a `MicroVM` badge next to the region name on the regions page.
([#3407](#3407))
- Increase default maximum project count per organization from 10 to 25
([#3409](#3409))
- Merge execution snapshot creation into the dequeue taskRun.update
transaction, reducing 2 DB commits to 1 per dequeue operation
([#3395](#3395))
- Add per-worker Node.js heap metrics to the OTel meter —
`nodejs.memory.heap.used`, `nodejs.memory.heap.total`,
`nodejs.memory.heap.limit`, `nodejs.memory.external`,
`nodejs.memory.array_buffers`, `nodejs.memory.rss`. Host-metrics only
publishes RSS, which overstates V8 heap by the external + native
footprint; these give direct heap visibility per cluster worker so
`NODE_MAX_OLD_SPACE_SIZE` can be sized against observed heap peaks
rather than RSS.
([#3437](#3437))
- Tag Prisma spans with `db.datasource: "writer" | "replica"` so
monitors and trace queries can distinguish the writer pool from the
replica pool. Applies to all `prisma:engine:*` spans (including
`prisma:engine:connection` used by the connection-pool monitors) and the
outer `prisma:client:operation` span.
([#3422](#3422))
- Clarify the cross-region intent in the Terraform and AI-prompt helpers
on the Add Private Connection page. Both already default
`supported_regions` to `["us-east-1", "eu-central-1"]`; added an inline
comment / parenthetical so the user understands why both regions are
listed (Trigger.dev runs in both, so the service must be consumable from
either).
([#3465](#3465))
- Add `RUN_ENGINE_READ_REPLICA_SNAPSHOTS_SINCE_ENABLED` flag (default
off) to route the Prisma reads inside `RunEngine.getSnapshotsSince`
through the read-only replica client. Offloads the snapshot polling
queries (fired by every running task runner) from the primary. When
disabled, behavior is unchanged.
([#3423](#3423))
- Stop creating TaskRunTag records and _TaskRunToTaskRunTag join table
entries during task triggering. The denormalized runTags string array on
TaskRun already stores tag names, making the M2M relation redundant
write overhead.
([#3369](#3369))
- Stop writing per-tick state (`lastScheduledTimestamp`,
`nextScheduledTimestamp`, `lastRunTriggeredAt`) on `TaskSchedule` and
`TaskScheduleInstance`. The schedule engine now carries the previous
fire time forward via the worker queue payload, eliminating ~270K
dead-tuple-driven autovacuums per year on these hot tables and the
associated `IO:XactSync` mini-spikes on the writer. Customer-facing
`payload.lastTimestamp` semantics are unchanged.
([#3476](#3476))
- Replace the expensive DISTINCT query for task filter dropdowns with a
dedicated TaskIdentifier registry table backed by Redis. Environments
migrate automatically on their next deploy, with a transparent fallback
to the legacy query for unmigrated environments. Also fixes duplicate
dropdown entries when a task changes trigger source, and adds
active/archived grouping for removed tasks. Moves BackgroundWorkerTask
reads in the trigger hot path to the read replica.
([#3368](#3368))
- Public Access Tokens (PATs) minted before an API key rotation now keep
working during the 24h grace window. `validatePublicJwtKey` falls back
to any non-expired `RevokedApiKey` rows for the signing environment when
the primary signature check against the env's current `apiKey` fails.
The fallback query only runs on the failure path, so the hot success
path is unchanged.
([#3464](#3464))
- Batch items that hit the environment queue size limit now fast-fail
without
retries and without creating pre-failed TaskRuns.
([#3352](#3352))
- Show the cancel button in the runs list for runs in `DEQUEUED` status.
`DEQUEUED` was missing from `NON_FINAL_RUN_STATUSES` so the list hid the
button even though the single run page allowed it.
([#3421](#3421))
- Reduce 5xx feedback loops on hot debounce keys by quantizing
`delayUntil`,
  adding an unlocked fast-path skip, and gracefully handling redlock
contention in `handleDebounce` so the SDK no longer retries into a herd.
([#3453](#3453))
- Fix RSS memory leak in the realtime proxy routes. `/realtime/v1/runs`,
`/realtime/v1/runs/:id`, and `/realtime/v1/batches/:id` called `fetch()`
into Electric with no abort signal, so when a client disconnected mid
long-poll, undici kept the upstream socket open and buffered response
chunks that would never be consumed — retained only in RSS, invisible to
V8 heap tooling. Thread `getRequestAbortSignal()` through
`RealtimeClient.streamRun/streamRuns/streamBatch` to `longPollingFetch`
and cancel the upstream body in the error path. Isolated reproducer
showed ~44 KB retained per leaked request; signal propagation releases
it cleanly.
([#3442](#3442))
- Fix memory leak where every aborted SSE connection pinned the full
request/response graph on Node 20, caused by `AbortSignal.any()` in
`sse.ts` retaining its source signals indefinitely (see
nodejs/node#54614, nodejs/node#55351). Also clear the
`setTimeout(abort)` timer in `entry.server.tsx` so successful HTML
renders don't pin the React tree for 30s per request.
([#3430](#3430))
- Preserve filters on the queues page when submitting modal actions.
([#3471](#3471))
- Fix Redis connection leak in realtime streams and broken abort signal
propagation.
  
**Redis connections**: Non-blocking methods (ingestData, appendPart,
getLastChunkIndex) now share a single Redis connection instead of
creating one per request. streamResponse still uses dedicated
connections (required for XREAD BLOCK) but now tears them down
immediately via disconnect() instead of graceful quit(), with a 15s
inactivity fallback.
  
**Abort signal**: request.signal is broken in Remix/Express due to a
Node.js undici GC bug (nodejs/node#55428) that severs the signal chain
when Remix clones the Request internally. Added getRequestAbortSignal()
wired to Express res.on("close") via httpAsyncStorage, which fires
reliably on client disconnect. All SSE/streaming routes updated to use
it. ([#3399](#3399))
- Prevent dashboard crash (React error #31) when span accessory item
text is not a string. Filters out malformed accessory items in
SpanCodePathAccessory instead of passing objects to React as children.
([#3400](#3400))
- Upgrade Remix packages from 2.1.0 to 2.17.4 to address security
vulnerabilities in React Router
([#3372](#3372))
- Fix Vercel integration settings page (remove redundant section
toggles) and improve the Vercel onboarding flow so the modal closes
after connecting a GitHub repo and the marketplace `next` URL is
preserved across the GitHub app install redirect.
([#3424](#3424))

<details>
<summary>Raw changeset output</summary>

# Releases
## @trigger.dev/build@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## trigger.dev@4.4.5

### Patch Changes

- Add `--no-browser` flag to `init` and `login` to skip auto-opening the
browser during authentication. Also error loudly when `init` is run
without `--yes` under non-TTY stdin (previously default-and-exited
silently, leaving the project half-initialized). Both commands now show
an `Examples` section in `--help`.
([#3483](#3483))
-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`
    -   `@trigger.dev/build@4.4.5`
    -   `@trigger.dev/schema-to-json@4.4.5`

## @trigger.dev/core@4.4.5

### Patch Changes

- Add `isReplay` boolean to the run context (`ctx.run.isReplay`),
derived from the existing `replayedFromTaskRunFriendlyId` database
field. Defaults to `false` for backwards compatibility.
([#3454](#3454))
- Redact the `resolveWaitpoint` runtime log so it only emits `id` and
`type` instead of the full completed waitpoint. Previously the log
printed the entire waitpoint (including `output`) to stdout in
production runs, which could leak sensitive payloads. The value returned
by `wait.forToken()` is unchanged.
([#3490](#3490))
- Add `SessionId` friendly ID generator and schemas for the new durable
Session primitive. Exported from `@trigger.dev/core/v3/isomorphic`
alongside `RunId`, `BatchId`, etc. Ships the
`CreateSessionStreamWaitpoint` request/response schemas alongside the
main Session CRUD.
([#3417](#3417))
- Truncate large error stacks and messages to prevent OOM crashes. Stack
traces are capped at 50 frames (keeping top 5 + bottom 45 with an
omission notice), individual stack lines at 1024 chars, and error
messages at 1000 chars. Applied in parseError, sanitizeError, and OTel
span recording.
([#3405](#3405))

## @trigger.dev/python@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`
    -   `@trigger.dev/build@4.4.5`
    -   `@trigger.dev/sdk@4.4.5`

## @trigger.dev/react-hooks@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/redis-worker@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/rsc@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/schema-to-json@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/sdk@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants