Fix: Attempt at improving worker actions query#3774
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
04abf76 to
0692172
Compare
| ; | ||
|
|
||
| -- name: GetWorkerActionsByWorkerActionHash :many | ||
| SELECT DISTINCT ON (w."actionHash") |
There was a problem hiding this comment.
goal here is we'll just move to this completely in a little bit, once all the workers have hashes
There was a problem hiding this comment.
Pull request overview
This PR aims to reduce slow traces in the worker actions lookup path by adding an actionHash to workers and using it to de-duplicate action lookups across identical worker replicas, falling back to worker-id lookups for backward compatibility.
Changes:
- Add
Worker.actionHash(BYTEA) plus a(tenantId, actionHash)index (schema + migrations). - Introduce a new repository method to fetch worker actions using action-hash de-duplication with worker-id fallback.
- Update worker list/get handlers to call the new repository API and pass full worker rows.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| sql/schema/v0.sql | Adds actionHash column and a (tenantId, actionHash) index to the baseline schema. |
| cmd/hatchet-migrate/migrate/migrations/20260428204451_v1_0_103.sql | Adds the actionHash column for existing installations. |
| cmd/hatchet-migrate/migrate/migrations/20260428213451_v1_0_104.sql | Adds the (tenantId, actionHash) index concurrently for existing installations. |
| pkg/repository/worker.go | Implements action-hash-based action lookup and hashes actions on worker creation. |
| pkg/repository/sqlcv1/workers.sql | Updates worker-actions queries and adds a new query by actionHash. |
| pkg/repository/sqlcv1/workers.sql.go | Regenerated sqlc output for updated queries/columns. |
| pkg/repository/sqlcv1/models.go | Adds ActionHash field to the Worker model. |
| api/v1/server/handlers/workers/list.go | Uses the new worker-actions repository method and passes worker rows. |
| api/v1/server/handlers/workers/get.go | Uses the new worker-actions repository method for single-worker fetch. |
| JOIN "_ActionToWorker" aw ON w.id = aw."B" | ||
| JOIN "Action" a ON aw."A" = a.id | ||
| WHERE | ||
| a."tenantId" = @tenantId::UUID |
There was a problem hiding this comment.
GetWorkerActionsByWorkerActionHash doesn’t constrain w."tenantId", so the new (tenantId, actionHash) index won’t be used and (if any cross-tenant links ever exist) results could bleed across tenants. Add AND w."tenantId" = @tenantId::UUID to the WHERE clause to match the index and keep the join tenant-scoped.
| a."tenantId" = @tenantId::UUID | |
| a."tenantId" = @tenantId::UUID | |
| AND w."tenantId" = @tenantId::UUID |
| func hashActions(actions []string) []byte { | ||
| h := sha256.New() | ||
|
|
||
| for _, action := range actions { | ||
| h.Write([]byte(action)) |
There was a problem hiding this comment.
hashActions is not collision-safe as written: concatenating action strings without a delimiter/length prefix means different action lists can produce identical byte streams (e.g., ["ab","c"] vs ["a","bc"]). Also, the hash is order-dependent, which reduces de-duplication if action order varies. Consider hashing a canonical encoding (sort + delimiter or length-prefix each action) to make the hash represent the set/list unambiguously.
| if len(worker.ActionHash) == 0 { | ||
| // if the worker has no action hash, we have no choice but to look | ||
| // it up by its id | ||
| workerIds = append(workerIds, worker.ID) | ||
| continue |
There was a problem hiding this comment.
GetWorkerActionsForWorkers assumes Worker.ActionHash accurately reflects the worker’s current action set. However, actionHash is only set on worker creation right now, so if actions can be updated later (e.g., via UpdateWorker), the stored hash can become stale and cause incorrect action-sharing between workers. Recommend recomputing/persisting actionHash whenever a worker’s actions are modified, or avoid using the hash unless it’s guaranteed up-to-date.
| @@ -88,7 +89,7 @@ type WorkerRepository interface { | |||
| ListActiveSDKsPerTenant(ctx context.Context) (map[TenantIdSDKTuple]int64, error) | |||
|
|
|||
| // GetWorkerActionsByWorkerId returns a list of actions for a worker | |||
There was a problem hiding this comment.
The doc comment still refers to GetWorkerActionsByWorkerId, but the method is now GetWorkerActionsForWorkers. Update the comment to match the new API name/behavior (especially since it now takes full worker rows to use actionHash).
| // GetWorkerActionsByWorkerId returns a list of actions for a worker | |
| // GetWorkerActionsForWorkers returns actions keyed by worker action hash. | |
| // It takes full worker rows so callers can reuse each worker's persisted actionHash | |
| // instead of recomputing it from the worker ID or action list. |
| records, err := w.queries.GetWorkerActionsByWorkerId(ctx, w.pool, sqlcv1.GetWorkerActionsByWorkerIdParams{ | ||
| func (w *workerRepository) GetWorkerActionsForWorkers(ctx context.Context, tenantId uuid.UUID, workers []sqlcv1.Worker) (map[string][]string, error) { | ||
| actionHashSet := make(map[string]struct{}) | ||
| workerIds := make([]uuid.UUID, len(workers)) |
There was a problem hiding this comment.
workerIds is initialized with len(workers) but later populated via append, which leaves len(workers) leading uuid.Nil entries in the slice. This will send a bunch of zero UUIDs to GetWorkerActionsByWorkerId and can significantly bloat/slow the query; initialize with length 0 (and capacity len(workers)) instead, or fill by index without appending.
| workerIds := make([]uuid.UUID, len(workers)) | |
| workerIds := make([]uuid.UUID, 0, len(workers)) |
| ; | ||
|
|
||
| -- name: GetWorkerActionsByWorkerActionHash :many | ||
| SELECT DISTINCT ON (w."actionHash") |
There was a problem hiding this comment.
This SELECT DISTINCT ON (w."actionHash") query only returns one actionId per actionHash (the most recent by createdAt), but GetWorkerActionsForWorkers expects to build the full list of actions for each hash. Remove the DISTINCT ON or change it to de-duplicate at the (actionHash, actionId) level so all actions are returned.
| SELECT DISTINCT ON (w."actionHash") | |
| SELECT |
Description
Lots of slow traces, which I think are happening because we're pulling a ton of rows down out of the db. I figured we could hash the action names and store them with the worker, and then try to only look up actions for each hash once, and then map them back to the relevant worker that way, which should cut down the number of rows we need to pull out of the db by a factor of the number of replicas of a worker any given tenant has
Unfortunately we can't make the action hash non-null and backfill so we need to also have a lookup by worker id for backwards compat for now
Type of change