Add #342 background-jobs integration test (and unbreak rule-anomaly query)#643
Add #342 background-jobs integration test (and unbreak rule-anomaly query)#643julietshen wants to merge 5 commits into
Conversation
|
Warning Review limit reached
More reviews will be available in 39 minutes and 43 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughFix ESLint per-file argument quoting; migrate rule-anomaly ClickHouse SQL and result mapping to use ClickHouse time/parsing functions and JSONLength; add integration tests for two background jobs with helpers to seed Postgres and ClickHouse. ChangesConfiguration, Service, and Test Updates
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/test/integ/background-jobs.integ.test.ts`:
- Around line 61-64: The test currently creates a user via createUser(...) but
still seeds decisions using a fresh uid(), which bypasses the real reviewer
path; update the call sites that seed manual_review_decisions (the
insertDecisions(...) invocation and any direct seeded rows around where uid() is
used) to accept and use the created user's id (from the userFixture or
coopUserCleanup setup) as the reviewer_id so the seeded rows reference the real
coop user and satisfy manual_review_decisions.reviewer_id joins/constraints.
- Around line 359-362: The test's "current" period uses the Node clock (now) and
only subtracts 1 second for currentEnd, which can be ahead of ClickHouse's
now64(3) and cause flaky assertions; update the seeding to use a wider buffer or
derive the reference time from ClickHouse: either subtract a larger margin
(e.g., 5–60 seconds) when computing currentEnd (symbol: currentEnd,
currentStart, HOUR_MS, now) or query ClickHouse for its current time and base
currentEnd/currentStart off that value so the ts_end_exclusive <= now64(3)
filter reliably includes the row.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 3634c864-a7ce-405d-9334-e714c8dbd329
📒 Files selected for processing (4)
lint-staged.config.mjsserver/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.tsserver/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.tsserver/test/integ/background-jobs.integ.test.ts
5cc8972 to
0774559
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts`:
- Around line 32-33: The test mocks DateTime64 fields inconsistently:
ts_start_inclusive and rule_version must both use ClickHouse string format
produced by toClickHouseDateTime() (space-separated "YYYY-MM-DD HH:MM:SS.sss",
no trailing Z) instead of mixing ISO Z formats; update the mock rows in
getRuleAnomalyDetectionStatistics.test.ts to set both ts_start_inclusive and
rule_version to the same ClickHouse-style timestamp string, and change the
assertions for windowStart and approxRuleVersion to compare against a Date
constructed from a fixed UTC ISO instant (e.g. "2022-05-07T23:00:00.000Z") so
the test verifies Date parsing rather than mirroring the mock representation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 5fbe6823-b18f-4338-9375-ac56ae07328b
📒 Files selected for processing (4)
lint-staged.config.mjsserver/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.tsserver/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.tsserver/test/integ/background-jobs.integ.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts
- lint-staged.config.mjs
Three threads from PR #643 review. 1. **Seeded decisions referenced fake reviewer ids** (integ test:64). The beforeAll already provisioned a real coop user but `insertDecisions` was still generating fresh `uid()` for `reviewer_id` on every row. Threads the created `coopUserId` through `insertDecisions(...)` so the seeded rows reference the real user — matches the comment intent ("reviewer for the seeded decisions") and won't break if `manual_review_decisions.reviewer_id` gains a FK constraint later. 2. **`currentEnd` margin too tight** (integ test:362). The seeded "current" period used `now - 1s` against the Node clock, but the job's filter (`ts_end_exclusive <= now64(3)`) is evaluated against ClickHouse's wall clock. Even small container vs host clock skew can land the row slightly in the future from ClickHouse's POV and make the alarm-status assertion flaky. Widens the margin to 30s — well past any plausible skew, still firmly inside the job's `oneWeekAgo` startTime window so the alarm math is unaffected. 3. **ClickHouse `DateTime64` parsing was timezone-dependent** (unit test:33). The mock had `ts_start_inclusive` in CH-native form (`"YYYY-MM-DD HH:MM:SS.sss"`, no Z) and `rule_version` in ISO with Z — inconsistent, and the assertions just `new Date(...)`'d the same strings, so the test couldn't catch a parsing regression. More importantly, the production code's `new Date(row.ts_start_inclusive)` interpreted the CH-native string as local time, so the parsed instant differed from the stored UTC value by whatever timezone the runtime happened to be in (UTC on prod containers, local TZ on dev machines). Adds a tiny `parseClickhouseTimestamp` helper in getRuleAnomalyDetectionStatistics.ts that composes a proper ISO string with explicit Z before parsing. Updates the unit-test mock to use the CH-native shape on both columns and asserts against fixed UTC ISO instants — so the test now actually exercises the parser instead of mirroring the mock representation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`JSON.stringify(filename)` produced double-quoted args that collapsed against the outer `bash -c "..."` wrapper, so the shell parsed the command as `cd <pkg> && .../eslint --fix` with no file args. eslint then linted the entire package directory and surfaced every pre-existing warning/error in the repo, blocking any commit whose staged files were clean but whose tree wasn't. Switch to single-quoted args (with the standard `'\''` escape for any embedded single quote) so the file paths survive the outer double-quoted `bash -c` intact. Same fix as julietshen/user-strikes-signal — applying here too since that branch hasn't merged to main yet and this PR needs to commit through the same hook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The rule anomaly detection query was carried over verbatim from the pre-#133 Snowflake era. Against ClickHouse it fails in five distinct ways, all of which compound to silently make every rule's computed alarm state INSUFFICIENT_DATA so the detection job is a no-op in practice. Surfaced while building the #342 integ test (which couldn't see any alarm transitions until each of these was fixed in turn): 1. `array_size(passes_distinct_user_ids)` — Snowflake `ARRAY_SIZE` has no ClickHouse equivalent under that name. The column is also a JSON `String`, not a native `Array`, so `length()` wouldn't help either. Switch to `JSONLength(...)`, which parses the JSON array and counts. 2. `SYSDATE()` — not a ClickHouse function. Replace with `now64(3)`, which returns `DateTime64(3)` in UTC (matching the column type and matching the original Snowflake-comment claim that we wanted UTC). 3. `ts_start_inclusive >= ?` with a `Date` bind — the warehouse adapter formats Dates as ISO 8601 with `Z`, which ClickHouse rejects when implicitly converting to `DateTime64(3)`. Wrap the bind with `parseDateTime64BestEffort(?)` so the format is accepted explicitly. 4. `row.RULE_ID`, `row.NUM_PASSES`, etc — Snowflake defaults to uppercase identifiers, ClickHouse returns column names as written. Every column access was reading back `undefined`, so the rule-id-keyed grouping in `getCurrentPeriodRuleAlarmStatuses` collapsed to a single `{ undefined: { status: INSUFFICIENT_DATA } }` bucket. 5. `passCount`/`passingUsersCount`/`runsCount` cast as `number` — but ClickHouse `Int64`/`UInt64` columns deserialise to JS `BigInt`. The downstream `binomialTest` validator throws on `1n` even though it accepts `1`. Coerce with `Number(...)` at the read site. Snapshots in the existing unit tests updated to match the new SQL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers two of the three jobs listed in #342 (per the planning comment on that issue): `RefreshMRTDecisionsMaterializedViewJob` and `DetectRulePassRateAnomaliesJob`. `RetryFailedNcmecDecisionsJob` is deferred to a follow-up — its retry/double-fire/skip invariants live partly in NCMEC's HTTP response and need an outbound-HTTP simulator we don't have in `server/test/integ/` today. Each scenario runs the job twice against the real stack and asserts the no-double-fire, no-skip invariants directly: - **MRT refresh.** Seeds N decisions into `manual_review_tool.manual_review_decisions` (which `dim_mrt_decisions` derives from), runs the job, asserts N rows in `dim_mrt_decisions_materialized`. Runs again: still N (the job's `ON CONFLICT DO NOTHING` against the high-water-mark cutoff carries the load). Seeds M more, runs again: N+M. Filters every count by `org_id` so other tests' decisions can't perturb the result. - **Rule pass-rate anomalies.** Seeds a rule with `alarm_status = OK` in Postgres plus 26 hours of ClickHouse `RULE_EXECUTION_STATISTICS` rows (25 historical periods with a low distinct-user pass rate, 1 current period with a high distinct-user pass rate) that together clear the data-adequacy bar in `getRuleAlarmStatus` and trip the 25%-above-historical + binomial-test thresholds. Runs the job: asserts `rules.alarm_status` flips to ALARM and a notification row lands in `public.notifications` for the rule's creator. Runs again with no new stats: notification count and alarm_status are both unchanged (the job's compare-then-update path is a no-op when the computed status matches what's already stored). `passes_distinct_user_ids` has to carry real id arrays for the math to work — the alarm logic counts _distinct passing users_, not raw pass count (see the long comment in `getCurrentPeriodRuleAlarmStatuses`). ClickHouse seeded rows aren't cleaned up in afterAll: cleaning would require `ALTER TABLE … DELETE` against a partitioned table, and each test run uses a fresh `ruleId` so leftovers can't collide. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…output The mock returned uppercase keys and plain `Number` values — both Snowflake-era assumptions. The earlier query-fix commit changed the code to read lowercase keys and coerce `BigInt` → `Number` at the read site; the mock fixture needed to match the real ClickHouse shape for the result-formatting test to actually exercise the code path it claims to. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three threads from PR #643 review. 1. **Seeded decisions referenced fake reviewer ids** (integ test:64). The beforeAll already provisioned a real coop user but `insertDecisions` was still generating fresh `uid()` for `reviewer_id` on every row. Threads the created `coopUserId` through `insertDecisions(...)` so the seeded rows reference the real user — matches the comment intent ("reviewer for the seeded decisions") and won't break if `manual_review_decisions.reviewer_id` gains a FK constraint later. 2. **`currentEnd` margin too tight** (integ test:362). The seeded "current" period used `now - 1s` against the Node clock, but the job's filter (`ts_end_exclusive <= now64(3)`) is evaluated against ClickHouse's wall clock. Even small container vs host clock skew can land the row slightly in the future from ClickHouse's POV and make the alarm-status assertion flaky. Widens the margin to 30s — well past any plausible skew, still firmly inside the job's `oneWeekAgo` startTime window so the alarm math is unaffected. 3. **ClickHouse `DateTime64` parsing was timezone-dependent** (unit test:33). The mock had `ts_start_inclusive` in CH-native form (`"YYYY-MM-DD HH:MM:SS.sss"`, no Z) and `rule_version` in ISO with Z — inconsistent, and the assertions just `new Date(...)`'d the same strings, so the test couldn't catch a parsing regression. More importantly, the production code's `new Date(row.ts_start_inclusive)` interpreted the CH-native string as local time, so the parsed instant differed from the stored UTC value by whatever timezone the runtime happened to be in (UTC on prod containers, local TZ on dev machines). Adds a tiny `parseClickhouseTimestamp` helper in getRuleAnomalyDetectionStatistics.ts that composes a proper ISO string with explicit Z before parsing. Updates the unit-test mock to use the CH-native shape on both columns and asserts against fixed UTC ISO instants — so the test now actually exercises the parser instead of mirroring the mock representation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d2db7bf to
b0e16a7
Compare
Closes #342 (per the planning comment on that issue).
Summary
Adds an integ test for two of the three jobs called out by #342:
RefreshMRTDecisionsMaterializedViewJobandDetectRulePassRateAnomaliesJob.RetryFailedNcmecDecisionsJobis deferred to a follow-up — its invariants live partly in NCMEC's HTTP response and need an outbound-HTTP simulator we don't have today.Each scenario runs the job twice against the real stack and asserts the no-double-fire, no-skip invariants directly.
What surfaced along the way
Building the rule-anomaly test required fixing
getRuleAnomalyDetectionStatisticsfirst — the query was carried over verbatim from the pre-#133 Snowflake era and fails against ClickHouse in five distinct ways, all of which compound to silently make every rule's computed alarm stateINSUFFICIENT_DATAin production:array_size(passes_distinct_user_ids)— SnowflakeARRAY_SIZEhas no ClickHouse equivalent. Column is a JSONString, not a nativeArray. UseJSONLength(...).SYSDATE()— not a ClickHouse function. Replace withnow64(3)(UTCDateTime64(3)).ts_start_inclusive >= ?with aDatebind — the warehouse adapter formats Dates as ISO 8601 withZ\, which ClickHouse rejects when implicitly converting toDateTime64(3). Wrap withparseDateTime64BestEffort(?)`.row.RULE_ID,row.NUM_PASSES, etc — Snowflake defaults to uppercase identifiers; ClickHouse returns column names as written. Every column access was readingundefined, sogetCurrentPeriodRuleAlarmStatusescollapsed every rule into a single{ undefined: INSUFFICIENT_DATA }bucket.passCount/passingUsersCount/runsCountcast asnumber— ClickHouseInt64/UInt64columns deserialise to JSBigInt.binomialTestthrows on1neven though it accepts1. Coerce withNumber(...).Net: the rule-anomaly detection job is effectively a no-op against the current ClickHouse setup until this lands. The integ test demonstrates an end-to-end alarm transition with the fix in place.
AGENTS.md callouts
Per
coop/AGENTS.md> Human-approval-required actions:server/iocContainerin a way that changes service lifecycles — no, the integ test reuses the existingmakeIntegrationServer()harness without changes.Changes
server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.ts— five Snowflake-isms ported to ClickHouse.server/services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts— inline snapshots updated; mock fixture updated to use the real ClickHouse output shape (lowercase keys,BigIntints).server/test/integ/background-jobs.integ.test.ts— new integ test for both jobs.lint-staged.config.mjs— same shell-quoting fix as julietshen/user-strikes-signal; needed to commit through the hook on this branch since that fix hasn't merged to main yet.Test plan
(cd server && NODE_OPTIONS="--no-warnings --loader ts-node/esm --require dotenv/config" npx jest services/ruleAnomalyDetectionService/getRuleAnomalyDetectionStatistics.test.ts)— 5 passed, 5 snapshots.(cd server && npm run test:integ -- background-jobs.integ.test.ts)against a real stack (npm run up && npm run db:update) — both scenarios green.JSONLengthfor the JSON-array column,now64(3)for the UTC DateTime64 comparison).🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests