feat(contrail): add refresh phase to contrail:sync #584
Merged
Adds a third phase to `npm run contrail:sync`: after discover + backfill, walk every known DID's PDS and apply records we're missing or have stale locally. Closes drift caused by jetstream cursor expiry (multi-day outages) or transient ingest failures the live-ingest pod didn't recover from. Records inside the lib's ignore window (default 60s) are skipped so this stays cheap to run frequently. Intended to run as a daily CronJob alongside the live-ingest Deployment that handles the continuous case. Re-running is idempotent and bounded by the relay's reverse-index for our NSIDs. Also adds the missing Refresh* type declarations to contrail.d.ts so the local Contrail class typing matches the actual lib.
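The missing/stale/ignore-window decision described above can be sketched roughly as follows. This is an illustrative sketch only, not the contrail lib's implementation: the `RemoteRecord` shape, the `classifyRecord` helper, the use of a CID comparison to detect staleness, and the choice of timestamp the ignore window is measured against are all assumptions made for the example.

```typescript
// Hypothetical record shape for illustration; not the contrail lib's real type.
interface RemoteRecord {
  uri: string;         // AT URI of the record as listed by the PDS
  cid: string;         // content hash reported by the PDS
  updatedAtMs: number; // assumed timestamp the ignore window is measured against
}

type Verdict = "missing" | "stale" | "current" | "ignored";

// Decide what a refresh pass would do with one remote record.
// localCids maps record URI -> CID currently stored in the local DB.
function classifyRecord(
  remote: RemoteRecord,
  localCids: Map<string, string>,
  nowMs: number,
  ignoreWindowMs: number = 60_000, // 60s default, as stated in the description
): Verdict {
  // Very recent writes are left to the live-ingest path.
  if (nowMs - remote.updatedAtMs < ignoreWindowMs) return "ignored";

  const localCid = localCids.get(remote.uri);
  if (localCid === undefined) return "missing"; // never ingested locally
  return localCid === remote.cid ? "current" : "stale"; // assumed staleness test
}
```

Under these assumptions, only `missing` and `stale` verdicts lead to a write, which is why re-running the phase is idempotent: a second pass over unchanged data classifies everything as `current` or `ignored`.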
Summary
Extends `npm run contrail:sync` from `discover → backfill` to `discover → backfill → refresh`. The third phase re-walks every known DID's PDS, compares records against our local DB, and applies anything missing or stale.
Why now: dev confirmed today that the live-ingest path catches new writes but has no mechanism to repair gaps from Jetstream cursor expiry (multi-day outage scenarios) or transient ingest failures. The contrail library already exposes `Contrail.refresh()` for exactly this; we just weren't calling it.
Cost: with ~1,200 known DIDs × 2 collections at the lib's default concurrency=50, an end-to-end run is single-digit minutes. Records inside `ignoreWindowMs` (default 60s) are skipped so the bulk of recent writes don't cause repeat PDS reads.
Test plan
Functional verification will land via the infra PR that:

- `contrail-sync` CronJob in the dev overlay

After that:

- `kubectl create job --from=cronjob/contrail-sync contrail-sync-verify-$(date +%s) -n dev`
- (`Discovery`, `Backfill`, `Refresh`) with summary stats
- `Refreshed: N missing + M stale (X users scanned)` line appears
- `contrail.records_event`, re-run → expect 1 missing, record restored

Prod opt-in is gated behind separate Track B work (monitoring + image pin + `CONTRAIL_DATABASE_URL` secret).
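The summary-line check in the test plan could be automated with a small parser over the job's log output. The sketch below is a hypothetical helper, not part of this PR; it assumes the line format is exactly `Refreshed: N missing + M stale (X users scanned)` as quoted above, and the `RefreshStats` and `parseRefreshSummary` names are invented for illustration.

```typescript
// Hypothetical helper for checking job logs; not part of this PR.
interface RefreshStats {
  missing: number;
  stale: number;
  usersScanned: number;
}

// Assumes the summary line format quoted in the test plan.
const REFRESH_LINE = /Refreshed: (\d+) missing \+ (\d+) stale \((\d+) users scanned\)/;

// Returns the parsed counts, or null if no summary line is present in the logs.
function parseRefreshSummary(logs: string): RefreshStats | null {
  const match = REFRESH_LINE.exec(logs);
  if (match === null) return null;
  return {
    missing: Number(match[1]),
    stale: Number(match[2]),
    usersScanned: Number(match[3]),
  };
}
```

For the restore check in the last step, a verification script would assert that `parseRefreshSummary(logs)?.missing` equals 1 on the re-run.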
What this PR does NOT do
- `Contrail.refresh()` is exercised by the lib's own test suite; this PR is glue code around the existing public API.
- `openmeet-skip-typecheck` because the activity-feed/atproto-identity spec baseline is on a parked cleanup list; production `src/` typechecks clean for this change.
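
To give a sense of what "glue code" means here, the three-phase sequence might look roughly like the sketch below. It is written against a hypothetical `SyncPhases` interface rather than the real `Contrail` class: only `refresh()` is named in this PR, so the `discover()` and `backfill()` method names, the return shape of `refresh()`, and the `runSync` and `formatRefreshSummary` helpers are all assumptions for illustration.

```typescript
// Hypothetical result shape; the real lib's Refresh* types may differ.
interface RefreshSummary {
  missing: number;
  stale: number;
  usersScanned: number;
}

// Hypothetical stand-in for the client; not the contrail lib's actual API.
interface SyncPhases {
  discover(): Promise<void>;
  backfill(): Promise<void>;
  refresh(): Promise<RefreshSummary>;
}

// Builds the summary line in the format quoted in the test plan.
function formatRefreshSummary(s: RefreshSummary): string {
  return `Refreshed: ${s.missing} missing + ${s.stale} stale (${s.usersScanned} users scanned)`;
}

// Runs the phases strictly in order; a failure in any phase rejects the
// returned promise so the job exits non-zero and later phases are skipped.
async function runSync(
  phases: SyncPhases,
  log: (line: string) => void = console.log,
): Promise<RefreshSummary> {
  log("Discovery");
  await phases.discover();
  log("Backfill");
  await phases.backfill();
  log("Refresh");
  const summary = await phases.refresh();
  log(formatRefreshSummary(summary));
  return summary;
}
```

Keeping the phases sequential means refresh only walks DIDs that discovery and backfill have already recorded.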