Skip to content

Add #342 background-jobs integration test (and unbreak rule-anomaly query)#643

Open
julietshen wants to merge 5 commits into
mainfrom
background-jobs-integ-test
Open

Add #342 background-jobs integration test (and unbreak rule-anomaly query)#643
julietshen wants to merge 5 commits into
mainfrom
background-jobs-integ-test

Conversation

@julietshen
Copy link
Copy Markdown
Member

@julietshen julietshen commented May 29, 2026

Closes #342 (per the planning comment on that issue).

Summary

Adds an integ test for two of the three jobs called out by #342: RefreshMRTDecisionsMaterializedViewJob and DetectRulePassRateAnomaliesJob. RetryFailedNcmecDecisionsJob is deferred to a follow-up — its invariants live partly in NCMEC's HTTP response and need an outbound-HTTP simulator we don't have today.

Each scenario runs the job twice against the real stack and asserts the no-double-fire, no-skip invariants directly.

What surfaced along the way

Building the rule-anomaly test required fixing getRuleAnomalyDetectionStatistics first — the query was carried over verbatim from the pre-#133 Snowflake era and fails against ClickHouse in five distinct ways, all of which compound to silently make every rule's computed alarm state INSUFFICIENT_DATA in production:

  1. array_size(passes_distinct_user_ids) — Snowflake ARRAY_SIZE has no ClickHouse equivalent. Column is a JSON String, not a native Array. Use JSONLength(...).
  2. SYSDATE() — not a ClickHouse function. Replace with now64(3) (UTC DateTime64(3)).
  3. ts_start_inclusive >= ? with a Date bind — the warehouse adapter formats Dates as ISO 8601 with Z\, which ClickHouse rejects when implicitly converting to DateTime64(3). Wrap with parseDateTime64BestEffort(?)`.
  4. row.RULE_ID, row.NUM_PASSES, etc — Snowflake defaults to uppercase identifiers; ClickHouse returns column names as written. Every column access was reading undefined, sogetCurrentPeriodRuleAlarmStatuses collapsed every rule into a single { undefined: INSUFFICIENT_DATA } bucket.
  5. passCount /passingUsersCount / runsCount cast as number — ClickHouse Int64 / UInt64 columns deserialise to JS BigInt. binomialTest throws on 1n even though it accepts 1. Coerce with Number(...).

Net: the rule-anomaly detection job is effectively a no-op against the current ClickHouse setup until this lands. The integ test demonstrates an end-to-end alarm transition with the fix in place.

AGENTS.md callouts

Per coop/AGENTS.md > Human-approval-required actions:

  • Rewiring server/iocContainer in a way that changes service lifecycles — no, the integ test reuses the existing makeIntegrationServer() harness without changes.
  • Multi-thousand-line diffs — the diff is ~500 lines (one test file + the query fix + an updated unit-test mock). Reviewable in one pass.
  • Deleting or renaming a GraphQL type or field — not applicable.

Changes

  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts — five Snowflake-isms ported to ClickHouse.
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts — inline snapshots updated; mock fixture updated to use the real ClickHouse output shape (lowercase keys, BigInt ints).
  • server/test/integ/background-jobs.integ.test.ts — new integ test for both jobs.
  • lint-staged.config.mjs — same shell-quoting fix as julietshen/user-strikes-signal; needed to commit through the hook on this branch since that fix hasn't merged to main yet.

Test plan

  • Unit: (cd server && NODE_OPTIONS="--no-warnings --loader ts-node/esm --require dotenv/config" npx jest services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts) — 5 passed, 5 snapshots.
  • Integ: (cd server && npm run test:integ -- background-jobs.integ.test.ts) against a real stack (npm run up && npm run db:update) — both scenarios green.
  • Reviewer: confirm the Snowflake → ClickHouse port is correct (JSONLength for the JSON-array column, now64(3) for the UTC DateTime64 comparison).

The integ suite currently can't compile against today's main due to the unmerged MEDIA wiring in #632 (fieldTypeHandlers.ts:83). I ran it locally by merging #632 into this branch and dropping the merge before commit — the test itself is independent of MEDIA. Once #632 lands, no extra setup is needed.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Fixed ESLint argument construction to ensure all intended files are properly included in linting operations.
    • Updated rule anomaly detection calculations to align with ClickHouse-specific data formats and time-handling functions for improved accuracy.
  • Tests

    • Added integration tests for background job idempotency, verifying that materialization and anomaly detection jobs produce consistent results across multiple runs without creating duplicates.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

Warning

Review limit reached

@julietshen, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 39 minutes and 43 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f4c86157-f4e8-4c8e-a49d-5eeded4784d1

📥 Commits

Reviewing files that changed from the base of the PR and between 0774559 and b0e16a7.

📒 Files selected for processing (4)
  • lint-staged.config.mjs
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts
  • server/test/integ/background-jobs.integ.test.ts
📝 Walkthrough

Walkthrough

Fix ESLint per-file argument quoting; migrate rule-anomaly ClickHouse SQL and result mapping to use ClickHouse time/parsing functions and JSONLength; add integration tests for two background jobs with helpers to seed Postgres and ClickHouse.

Changes

Configuration, Service, and Test Updates

Layer / File(s) Summary
ESLint argument quoting fix
lint-staged.config.mjs
Replaced JSON.stringify-based ESLint argument construction with single-quote wrapping and embedded single-quote escaping to produce bash -c-safe per-file args.
ClickHouse query and result format migration
server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts, server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts
Service now builds WHERE using now64(3) and parseDateTime64BestEffort(?), uses JSONLength() for distinct-user aggregation, and maps lowercase ClickHouse fields with BigInt coercion to number. Tests and SQL snapshots updated across filter scenarios.
Integration tests for background jobs
server/test/integ/background-jobs.integ.test.ts
New integration suite: MRT decisions materialized-view idempotency and DetectRulePassRateAnomalies job (alarm transition and single-notification behavior), plus helpers for Postgres decision seeding, ClickHouse DateTime64(3) formatting, and SQL-safe row construction.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • vinaysrao1
  • cassidyjames
  • dom-notion
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the main change: adding background-jobs integration test and fixing the rule-anomaly query.
Description check ✅ Passed The description comprehensively covers context, testing, and changes, aligning with the template's structure and requirements.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch background-jobs-integ-test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/test/integ/background-jobs.integ.test.ts`:
- Around line 61-64: The test currently creates a user via createUser(...) but
still seeds decisions using a fresh uid(), which bypasses the real reviewer
path; update the call sites that seed manual_review_decisions (the
insertDecisions(...) invocation and any direct seeded rows around where uid() is
used) to accept and use the created user's id (from the userFixture or
coopUserCleanup setup) as the reviewer_id so the seeded rows reference the real
coop user and satisfy manual_review_decisions.reviewer_id joins/constraints.
- Around line 359-362: The test's "current" period uses the Node clock (now) and
only subtracts 1 second for currentEnd, which can be ahead of ClickHouse's
now64(3) and cause flaky assertions; update the seeding to use a wider buffer or
derive the reference time from ClickHouse: either subtract a larger margin
(e.g., 5–60 seconds) when computing currentEnd (symbol: currentEnd,
currentStart, HOUR_MS, now) or query ClickHouse for its current time and base
currentEnd/currentStart off that value so the ts_end_exclusive <= now64(3)
filter reliably includes the row.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3634c864-a7ce-405d-9334-e714c8dbd329

📥 Commits

Reviewing files that changed from the base of the PR and between 071a4b2 and 5cc8972.

📒 Files selected for processing (4)
  • lint-staged.config.mjs
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts
  • server/test/integ/background-jobs.integ.test.ts

Comment thread server/test/integ/background-jobs.integ.test.ts Outdated
Comment thread server/test/integ/background-jobs.integ.test.ts Outdated
@julietshen julietshen force-pushed the background-jobs-integ-test branch from 5cc8972 to 0774559 Compare May 29, 2026 18:01
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts`:
- Around line 32-33: The test mocks DateTime64 fields inconsistently:
ts_start_inclusive and rule_version must both use ClickHouse string format
produced by toClickHouseDateTime() (space-separated "YYYY-MM-DD HH:MM:SS.sss",
no trailing Z) instead of mixing ISO Z formats; update the mock rows in
getRuleAnomalyDetectionStatistics.test.ts to set both ts_start_inclusive and
rule_version to the same ClickHouse-style timestamp string, and change the
assertions for windowStart and approxRuleVersion to compare against a Date
constructed from a fixed UTC ISO instant (e.g. "2022-05-07T23:00:00.000Z") so
the test verifies Date parsing rather than mirroring the mock representation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5fbe6823-b18f-4338-9375-ac56ae07328b

📥 Commits

Reviewing files that changed from the base of the PR and between 5cc8972 and 0774559.

📒 Files selected for processing (4)
  • lint-staged.config.mjs
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts
  • server/test/integ/background-jobs.integ.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts
  • lint-staged.config.mjs

julietshen added a commit that referenced this pull request May 29, 2026
Three threads from PR #643 review.

1. **Seeded decisions referenced fake reviewer ids** (integ test:64). The
   beforeAll already provisioned a real coop user but `insertDecisions`
   was still generating fresh `uid()` for `reviewer_id` on every row.
   Threads the created `coopUserId` through `insertDecisions(...)` so
   the seeded rows reference the real user — matches the comment intent
   ("reviewer for the seeded decisions") and won't break if
   `manual_review_decisions.reviewer_id` gains a FK constraint later.

2. **`currentEnd` margin too tight** (integ test:362). The seeded
   "current" period used `now - 1s` against the Node clock, but the
   job's filter (`ts_end_exclusive <= now64(3)`) is evaluated against
   ClickHouse's wall clock. Even small container vs host clock skew can
   land the row slightly in the future from ClickHouse's POV and make
   the alarm-status assertion flaky. Widens the margin to 30s — well
   past any plausible skew, still firmly inside the job's
   `oneWeekAgo` startTime window so the alarm math is unaffected.

3. **ClickHouse `DateTime64` parsing was timezone-dependent**
   (unit test:33). The mock had `ts_start_inclusive` in CH-native form
   (`"YYYY-MM-DD HH:MM:SS.sss"`, no Z) and `rule_version` in ISO with
   Z — inconsistent, and the assertions just `new Date(...)`'d the same
   strings, so the test couldn't catch a parsing regression. More
   importantly, the production code's `new Date(row.ts_start_inclusive)`
   interpreted the CH-native string as local time, so the parsed
   instant differed from the stored UTC value by whatever timezone the
   runtime happened to be in (UTC on prod containers, local TZ on dev
   machines).

   Adds a tiny `parseClickhouseTimestamp` helper in
   getRuleAnomalyDetectionStatistics.ts that composes a proper ISO
   string with explicit Z before parsing. Updates the unit-test mock
   to use the CH-native shape on both columns and asserts against
   fixed UTC ISO instants — so the test now actually exercises the
   parser instead of mirroring the mock representation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
julietshen and others added 5 commits May 29, 2026 17:09
`JSON.stringify(filename)` produced double-quoted args that collapsed
against the outer `bash -c "..."` wrapper, so the shell parsed the
command as `cd <pkg> && .../eslint --fix` with no file args. eslint
then linted the entire package directory and surfaced every pre-existing
warning/error in the repo, blocking any commit whose staged files were
clean but whose tree wasn't.

Switch to single-quoted args (with the standard `'\''` escape for any
embedded single quote) so the file paths survive the outer
double-quoted `bash -c` intact.

Same fix as julietshen/user-strikes-signal — applying here too since
that branch hasn't merged to main yet and this PR needs to commit
through the same hook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The rule anomaly detection query was carried over verbatim from the
pre-#133 Snowflake era. Against ClickHouse it fails in five distinct
ways, all of which compound to silently make every rule's computed alarm
state INSUFFICIENT_DATA so the detection job is a no-op in practice.

Surfaced while building the #342 integ test (which couldn't see any
alarm transitions until each of these was fixed in turn):

1. `array_size(passes_distinct_user_ids)` — Snowflake `ARRAY_SIZE` has no
   ClickHouse equivalent under that name. The column is also a JSON
   `String`, not a native `Array`, so `length()` wouldn't help either.
   Switch to `JSONLength(...)`, which parses the JSON array and counts.

2. `SYSDATE()` — not a ClickHouse function. Replace with `now64(3)`,
   which returns `DateTime64(3)` in UTC (matching the column type and
   matching the original Snowflake-comment claim that we wanted UTC).

3. `ts_start_inclusive >= ?` with a `Date` bind — the warehouse adapter
   formats Dates as ISO 8601 with `Z`, which ClickHouse rejects when
   implicitly converting to `DateTime64(3)`. Wrap the bind with
   `parseDateTime64BestEffort(?)` so the format is accepted explicitly.

4. `row.RULE_ID`, `row.NUM_PASSES`, etc — Snowflake defaults to uppercase
   identifiers, ClickHouse returns column names as written. Every column
   access was reading back `undefined`, so the rule-id-keyed grouping in
   `getCurrentPeriodRuleAlarmStatuses` collapsed to a single
   `{ undefined: { status: INSUFFICIENT_DATA } }` bucket.

5. `passCount`/`passingUsersCount`/`runsCount` cast as `number` — but
   ClickHouse `Int64`/`UInt64` columns deserialise to JS `BigInt`. The
   downstream `binomialTest` validator throws on `1n` even though it
   accepts `1`. Coerce with `Number(...)` at the read site.

Snapshots in the existing unit tests updated to match the new SQL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers two of the three jobs listed in #342 (per the planning comment
on that issue): `RefreshMRTDecisionsMaterializedViewJob` and
`DetectRulePassRateAnomaliesJob`. `RetryFailedNcmecDecisionsJob` is
deferred to a follow-up — its retry/double-fire/skip invariants live
partly in NCMEC's HTTP response and need an outbound-HTTP simulator
we don't have in `server/test/integ/` today.

Each scenario runs the job twice against the real stack and asserts the
no-double-fire, no-skip invariants directly:

- **MRT refresh.** Seeds N decisions into `manual_review_tool.manual_review_decisions`
  (which `dim_mrt_decisions` derives from), runs the job, asserts N
  rows in `dim_mrt_decisions_materialized`. Runs again: still N (the
  job's `ON CONFLICT DO NOTHING` against the high-water-mark cutoff
  carries the load). Seeds M more, runs again: N+M. Filters every
  count by `org_id` so other tests' decisions can't perturb the result.

- **Rule pass-rate anomalies.** Seeds a rule with `alarm_status = OK`
  in Postgres plus 26 hours of ClickHouse `RULE_EXECUTION_STATISTICS`
  rows (25 historical periods with a low distinct-user pass rate, 1
  current period with a high distinct-user pass rate) that together
  clear the data-adequacy bar in `getRuleAlarmStatus` and trip the
  25%-above-historical + binomial-test thresholds. Runs the job:
  asserts `rules.alarm_status` flips to ALARM and a notification row
  lands in `public.notifications` for the rule's creator. Runs again
  with no new stats: notification count and alarm_status are both
  unchanged (the job's compare-then-update path is a no-op when the
  computed status matches what's already stored).

`passes_distinct_user_ids` has to carry real id arrays for the math to
work — the alarm logic counts _distinct passing users_, not raw pass
count (see the long comment in `getCurrentPeriodRuleAlarmStatuses`).

ClickHouse seeded rows aren't cleaned up in afterAll: cleaning would
require `ALTER TABLE … DELETE` against a partitioned table, and each
test run uses a fresh `ruleId` so leftovers can't collide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…output

The mock returned uppercase keys and plain `Number` values — both
Snowflake-era assumptions. The earlier query-fix commit changed the
code to read lowercase keys and coerce `BigInt` → `Number` at the read
site; the mock fixture needed to match the real ClickHouse shape for
the result-formatting test to actually exercise the code path it
claims to.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three threads from PR #643 review.

1. **Seeded decisions referenced fake reviewer ids** (integ test:64). The
   beforeAll already provisioned a real coop user but `insertDecisions`
   was still generating fresh `uid()` for `reviewer_id` on every row.
   Threads the created `coopUserId` through `insertDecisions(...)` so
   the seeded rows reference the real user — matches the comment intent
   ("reviewer for the seeded decisions") and won't break if
   `manual_review_decisions.reviewer_id` gains a FK constraint later.

2. **`currentEnd` margin too tight** (integ test:362). The seeded
   "current" period used `now - 1s` against the Node clock, but the
   job's filter (`ts_end_exclusive <= now64(3)`) is evaluated against
   ClickHouse's wall clock. Even small container vs host clock skew can
   land the row slightly in the future from ClickHouse's POV and make
   the alarm-status assertion flaky. Widens the margin to 30s — well
   past any plausible skew, still firmly inside the job's
   `oneWeekAgo` startTime window so the alarm math is unaffected.

3. **ClickHouse `DateTime64` parsing was timezone-dependent**
   (unit test:33). The mock had `ts_start_inclusive` in CH-native form
   (`"YYYY-MM-DD HH:MM:SS.sss"`, no Z) and `rule_version` in ISO with
   Z — inconsistent, and the assertions just `new Date(...)`'d the same
   strings, so the test couldn't catch a parsing regression. More
   importantly, the production code's `new Date(row.ts_start_inclusive)`
   interpreted the CH-native string as local time, so the parsed
   instant differed from the stored UTC value by whatever timezone the
   runtime happened to be in (UTC on prod containers, local TZ on dev
   machines).

   Adds a tiny `parseClickhouseTimestamp` helper in
   getRuleAnomalyDetectionStatistics.ts that composes a proper ISO
   string with explicit Z before parsing. Updates the unit-test mock
   to use the CH-native shape on both columns and asserts against
   fixed UTC ISO instants — so the test now actually exercises the
   parser instead of mirroring the mock representation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Background jobs integration test

1 participant