feat(contrail): live-ingest entrypoint + fork pin#582

Merged: tompscanlan merged 12 commits into main from feat/contrail-live-ingest on May 16, 2026
@tompscanlan

Summary

Adds a long-running contrail:ingest entrypoint that streams ATProto records from Jetstream into a Contrail Postgres appview continuously. Will replace the suspended contrail-sync CronJob as the primary writer post-merge (Phase E, separate infra repo); CronJob stays around as cold-start fallback.

  • New src/contrail/ingest.ts: runPersistent loop with SIGTERM-bound AbortController.
  • Env-driven networkOverrides so the same image runs in our dev cluster (private PLC + Jetstream + relays + SSRF allowlist) and prod (defaults). Spec: 2026-05-15-pr44-growth-and-live-ingest-design in the vault.
  • New Dockerfile target ingest — same artifact as release, CMD overridden to npm run contrail:ingest. COPY vendor/ added to the base stage so the file: tarball deps resolve during npm ci.
  • Pinned to fork tarballs of @atmo-dev/contrail* (flo-bit/contrail#44, head 583f5e8) until upstream merges + publishes. Tarballs are committed under vendor/ (~351 KB across 5 files) so CI checkout has them on disk; regenerated via scripts/prepare-contrail-deps.sh on each fork bump.
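
A minimal sketch of the SIGTERM-bound AbortController pattern named in the first bullet. This is not the contents of src/contrail/ingest.ts: `SignalSource`, `bindSigterm`, and `runUntilAborted` are hypothetical names, and `runUntilAborted` only stands in for the library's runPersistent loop, whose real signature is not shown in this PR.

```ts
// Hypothetical sketch; only AbortController/AbortSignal are real platform APIs here.

/** Minimal shape of a signal emitter (in Node this role is played by `process`). */
interface SignalSource {
  once(event: string, listener: () => void): unknown;
}

/** Returns an AbortController that aborts the first time `source` emits SIGTERM. */
function bindSigterm(source: SignalSource): AbortController {
  const controller = new AbortController();
  source.once("SIGTERM", () => controller.abort());
  return controller;
}

/** Stand-in for a persistent loop: runs `step` repeatedly until the signal aborts. */
async function runUntilAborted(
  signal: AbortSignal,
  step: () => Promise<void>,
): Promise<number> {
  let iterations = 0;
  while (!signal.aborted) {
    await step(); // the in-flight step finishes before the loop re-checks the signal
    iterations += 1;
  }
  return iterations;
}
```

Because the loop checks the signal only between steps, a step that is already running completes before shutdown, which is one way to get the clean close described in the test plan.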

The DELETE PATH

PR #44's L3 (IF NOT EXISTS schema) + L3-followup (runIdempotentDdl for concurrent CREATE on Postgres) means the library now owns init idempotency. The consumer-side init lock (contrail-init-lock.ts, from early dev) is DELETED in commit 7a35daf. No coordination needed between API replicas at boot.

Operator runbook (Phase F preview — not executed in this PR)

Single-replica writer Deployment (Phase E manifest, separate repo). For schema-reset operations:

  1. Scale any readers depending on the schema to 0.
  2. DROP SCHEMA contrail CASCADE; in the appview DB.
  3. Let the next contrail-ingest pod re-init the schema (idempotent under concurrent boots; see contrail-init-idempotency.spec.ts).
  4. Scale readers back; verify XRPC reads.
  5. If a full backfill is needed, manually unsuspend the existing contrail-sync CronJob (kept as cold-start fallback).

Test plan

  • npm run build succeeds with tarball-pinned deps.
  • Phase C local validation (manual; vault daily 2026-05-16 §09:00):
    • C2: ingest connects to local devnet (PLC :2582, Jetstream :6008, PDS :4000), reads cursor=none, flushes 18 events (6 events / 2 RSVPs / 8 identities / profiles), advances cursor.
    • C3: SIGTERM → Persistent ingestion stopped, then Disconnected: 1000 (WebSocket clean close).
    • C4: two consecutive restarts produce identical startup pattern; cursor resumed; no DDL errors round 2.
  • Phase D smoke (tarball-pinned build): identical behavior to the file:-linked dev build — same 8 known DIDs loaded, same cursor resumed, SIGTERM clean.
  • Containerized image test (openmeet-ecr/openmeet-api:contrail-live-ingest-7a425f6): runs against host devnet via --add-host=host.docker.internal:host-gateway, connects, flushes events, exits 0 on docker stop.
  • contrail-init-idempotency.spec.ts: concurrent test PASSES (load-bearing for L3-followup); sequential test blocked on a Jest --experimental-vm-modules linker quirk with the linked fork's hybrid ESM/CJS bundle — separate test-infra issue, tracked under follow-ups.

What this PR does NOT do

  • No infra changes. The contrail-ingest Deployment manifest + dev-overlay env vars land in openmeet-infrastructure after this merges and the ECR image is published (Phase E, Gate G3). Image SHA from main post-merge feeds the image transformer there.
  • No operator action. Runbook above is documentation only; Phase F unsuspends the CronJob and performs first prod activation in a separate change.

Pre-existing baseline NOT introduced here

*.spec.ts files in activity-feed/, atproto-identity/, and auth/auth.service.event-date-validation.spec.ts carry ~461 pre-existing tsc --noEmit errors. Confirmed pre-existing in the plan (2026-05-15-pr44-growth-and-live-ingest-plan) §Phase 1 follow-ups and again in daily 2026-05-16 §09:00. Deferred to a separate hygiene PR per the plan.

Follow-ups (separate PRs / issues)

  • Drop vendor/ (all 5 .tgz files + scripts/prepare-contrail-deps.sh) + the Dockerfile COPY vendor/ step once upstream PR flo-bit/contrail#44 ("Tom/bring tests") merges and @atmo-dev/contrail* publishes to npm.
  • Fix Jest ESM linker for sequential idempotency spec (@atmo-dev/contrail-base ... not linked on --experimental-vm-modules).
  • Pre-existing tsc spec-baseline cleanup (project hygiene).
  • Phase E (infra manifest) + Phase F (prod activation runbook execution).

References

  • Upstream PR: flo-bit/contrail#44 (head 583f5e8)
  • Local image tag tested: openmeet-ecr/openmeet-api:contrail-live-ingest-7a425f6

Converts contrail.config.ts to an async builder (buildContrailConfig)
because @atcute/identity-resolver is ESM-only and must be loaded via the
same Function-import trick already used in contrail-loader.ts. Provider
and sync.ts updated to await the builder. Adds an ambient declare for
@atcute/identity-resolver to the shim so TS resolves the classes.
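
For readers unfamiliar with the Function-import trick, a rough sketch follows. It assumes the project compiles with `module: commonjs`, where TypeScript rewrites a literal `import()` into a `require()` call that cannot load ESM-only packages. `importEsm` and `buildConfigSketch` are illustrative names, not the helpers in contrail-loader.ts or contrail.config.ts.

```ts
// Sketch only. Building the call with `new Function` hides the dynamic import
// from the TypeScript compiler, so the emitted JavaScript keeps a native import().
const importEsm = new Function(
  "specifier",
  "return import(specifier);",
) as (specifier: string) => Promise<Record<string, unknown>>;

/**
 * Async builder shape: load the ESM-only module first, then assemble config.
 * The returned object is a placeholder, not the real ContrailConfig.
 */
async function buildConfigSketch(
  load: (specifier: string) => Promise<Record<string, unknown>> = importEsm,
): Promise<{ resolverModuleKeys: string[] }> {
  const mod = await load("@atcute/identity-resolver");
  return { resolverModuleKeys: Object.keys(mod) };
}
```

Injecting `load` keeps the builder testable without the real package installed; callers such as a provider or sync script would simply `await` the builder.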

Env vars:
- CONTRAIL_PLC_URL — point PLC resolver at a private mirror
- CONTRAIL_SLINGSHOT_URL — replace public slingshot endpoint
- CONTRAIL_ALLOWED_HOSTS — comma-separated SSRF allowlist additions
- CONTRAIL_JETSTREAM_URLS, CONTRAIL_RELAYS — comma-separated overrides
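
One way the env vars above could be folded into an overrides object is sketched below. The env var names come from this commit message; the `NetworkOverridesSketch` field names and both helper functions are assumptions, and the fork's real networkOverrides type may differ.

```ts
// Hypothetical shape; field names are illustrative, not the fork's API.
interface NetworkOverridesSketch {
  plcUrl?: string;
  slingshotUrl?: string;
  allowedHosts?: string[];
  jetstreamUrls?: string[];
  relays?: string[];
}

type Env = Record<string, string | undefined>;

/** Splits a comma-separated value, dropping blanks; undefined when nothing is left. */
function splitList(value: string | undefined): string[] | undefined {
  const items = (value ?? "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
  return items.length > 0 ? items : undefined;
}

/** Reads only the variables that are set, so an empty env yields library defaults. */
function readNetworkOverrides(env: Env): NetworkOverridesSketch {
  const overrides: NetworkOverridesSketch = {};
  const plcUrl = env.CONTRAIL_PLC_URL?.trim();
  const slingshotUrl = env.CONTRAIL_SLINGSHOT_URL?.trim();
  const allowedHosts = splitList(env.CONTRAIL_ALLOWED_HOSTS);
  const jetstreamUrls = splitList(env.CONTRAIL_JETSTREAM_URLS);
  const relays = splitList(env.CONTRAIL_RELAYS);
  if (plcUrl) overrides.plcUrl = plcUrl;
  if (slingshotUrl) overrides.slingshotUrl = slingshotUrl;
  if (allowedHosts) overrides.allowedHosts = allowedHosts;
  if (jetstreamUrls) overrides.jetstreamUrls = jetstreamUrls;
  if (relays) overrides.relays = relays;
  return overrides;
}
```

Taking `env` as a parameter instead of reading `process.env` directly keeps the function pure, so the same image can be checked against both a dev-style and an empty prod-style environment.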
…d fork

- Add @atcute/identity-resolver@^1.2.3 as direct dep. The consumer
  constructs the resolver and passes it in (per fork's PR #44 API);
  was transitive-only in the fork's pnpm workspace and not reachable
  from OM API's npm node_modules.
- Extend Jest transformIgnorePatterns with contrail-pr30/packages/.*/dist/
  so ts-jest stops transforming the linked fork's ESM bundle as CJS.

Surfaced by Phase C / Gate G1 local validation 2026-05-16.
Replaces filesystem path deps (file:../contrail-pr30/packages/*) with
vendored tarballs under vendor/. Image build (Phase D2) consumes these
without needing the sibling fork checkout.

Pins:
  @atmo-dev/contrail          vendor/atmo-dev-contrail.tgz
  @atmo-dev/contrail-base     vendor/atmo-dev-contrail-base.tgz       (override)
  @atmo-dev/contrail-appview  vendor/atmo-dev-contrail-appview.tgz    (override)
  @atmo-dev/contrail-authority    vendor/atmo-dev-contrail-authority.tgz  (override)
  @atmo-dev/contrail-record-host  vendor/atmo-dev-contrail-record-host.tgz (override)

Packed from tompscanlan/pr30-network-overrides @ 583f5e8 (PR #44 head)
which includes the L3-followup runIdempotentDdl helper. -authority and
-record-host are pinned to satisfy transitive deps of -appview even
though they aren't used at runtime (β path: PR #30 core, no community,
no record-host). -community deliberately not vendored.

vendor/ contents are gitignored; regenerate via:
  scripts/prepare-contrail-deps.sh

Drop the script + vendor pins once @atmo-dev/contrail* publishes to npm.

Validated: tarball-pinned ingest runs identically to file:-linked build
(Phase A.b smoke; 8 known DIDs loaded, cursor resumed, SIGTERM clean).
…rget

- base stage: COPY vendor/ before npm ci so file:./vendor/atmo-dev-contrail*.tgz
  deps resolve. Drops out when upstream publishes to npm (PR #44 follow-up).
- new ingest target: same artifact as release, CMD overridden to
  `npm run contrail:ingest`. k8s Deployment for contrail-ingest builds against
  --target ingest; existing API Deployment continues to use release.

Validated D3 smoke: containerized ingest connects to devnet via host-gateway,
loads 8 known DIDs, resumes cursor, flushes 6 events, SIGTERM clean (exit 0).
vendor/*.tgz needs to exist on disk for `npm ci` and `docker build` in CI
(deploy-to-dev.yml). Gitignoring the tarballs broke that — CI checkout
has an empty vendor/ and the file: deps fail to resolve.

5 tarballs, ~351 KB total. All packed from the fork at PR #44 head
(tompscanlan/pr30-network-overrides @ 583f5e8). Cleanup PR removes
vendor/ entirely once upstream publishes @atmo-dev/contrail* to npm.
vendor/*.tgz is tracked now (prior commit); reflect that in the
regeneration workflow comment so future bumps remember to git add
the new tarballs.
CI's e2e-test job builds via relational.e2e.Dockerfile which copies only
package*.json into /tmp/app/, then runs `npm install` — but vendor/
isn't there, so the file:./vendor/atmo-dev-contrail*.tgz deps ENOENT
on @atmo-dev/contrail-record-host.tgz.

Same fix for relational.test.Dockerfile (same pattern, used by
docker-compose.relational.test.yaml).

Drops out alongside the production Dockerfile change when upstream
publishes (PR #44 follow-up).
tompscanlan merged commit b518567 into main on May 16, 2026
4 checks passed
tompscanlan added a commit that referenced this pull request May 17, 2026
…583)

PR #582 appended `FROM release AS ingest` at the end of the Dockerfile.
deploy-to-dev.yml's `docker build` runs without `--target`, so Docker
defaults to the LAST stage in the Dockerfile — silently flipped the API
image to the ingest variant (CMD `npm run contrail:ingest`). New API
pods on b518567 crashloop because no HTTP server binds to port 3000 and
the readiness probe gets connection refused.

The `ingest` target was never load-bearing: the Phase E Deployment
manifest overrides `command: ["npm", "run", "contrail:ingest"]` on the
single-replica ingest pod, so the same `release` image works for both
roles. Dropping the extra target restores `release` as the default and
unblocks the dev rollout.
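
The failure mode above (a build without `--target` picks the last stage) could be caught by a small guard in CI. The sketch below is hypothetical and not part of this repository. It uses a simplified `FROM` parser that ignores line continuations and build-arg substitution, and the base image in the usage example is made up.

```ts
/** Returns stage names in file order; unnamed stages get their numeric index. */
function listStages(dockerfile: string): string[] {
  const stages: string[] = [];
  for (const line of dockerfile.split(/\r?\n/)) {
    // Matches: FROM [--flag=value ...] <image> [AS <name>]
    const match = /^\s*FROM\s+(?:--\S+\s+)*\S+(?:\s+AS\s+(\S+))?\s*$/i.exec(line);
    if (match) stages.push(match[1] ?? String(stages.length));
  }
  return stages;
}

/** The stage a build without --target produces: the last one in the file. */
function defaultStage(dockerfile: string): string | undefined {
  const stages = listStages(dockerfile);
  return stages[stages.length - 1];
}
```

For example, with a hypothetical two-stage file ending in `FROM base AS release`, `defaultStage` returns `release`; appending `FROM release AS ingest` changes the result to `ingest`, which is the silent flip this commit reverts.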
tompscanlan added a commit that referenced this pull request May 21, 2026
* fix(docker): drop ingest target so release stays default build stage

PR #582 appended `FROM release AS ingest` at the end of the Dockerfile.
deploy-to-dev.yml's `docker build` runs without `--target`, so Docker
defaults to the LAST stage in the Dockerfile — silently flipped the API
image to the ingest variant (CMD `npm run contrail:ingest`). New API
pods on b518567 crashloop because no HTTP server binds to port 3000 and
the readiness probe gets connection refused.

The `ingest` target was never load-bearing: the Phase E Deployment
manifest overrides `command: ["npm", "run", "contrail:ingest"]` on the
single-replica ingest pod, so the same `release` image works for both
roles. Dropping the extra target restores `release` as the default and
unblocks the dev rollout.

* chore(contrail): vendor contrail-community from PR30+PR31 integration

Rebuild the contrail tarballs from the fork's feat/pr30-pr31-integration
branch and add @atmo-dev/contrail-community as a sixth vendored package.
prepare-contrail-deps.sh now builds and packs contrail-community alongside
the other five; package.json pins it as a file: dependency. Verified at
runtime against both sqlite (in-memory) and postgres: contrail + community
schemas apply and createApp wires the spaces/community routes.

* feat(contrail): extend type shim for community integration

Add CredentialKeyMaterial, AuthorityConfig, SpacesConfig, and CommunityIntegration
types to the @atmo-dev/contrail ambient module. Extend ContrailConfig with spaces?
and community?, ContrailOptions with communityIntegration?, and add
generateAuthoritySigningKey() function declaration. Add new
@atmo-dev/contrail-community module declaration with createCommunityIntegration().

* feat(contrail): add env-gated spaces.authority + community config

Adds two opt-in sections to buildContrailConfig(): spaces.authority
(enabled by CONTRAIL_AUTHORITY_SIGNING_KEY) and community (enabled by
CONTRAIL_COMMUNITY_MASTER_KEY), both absent by default so existing
ingest-only deployments are unaffected.

* feat(contrail): add loadContrailCommunity ESM loader

* test(contrail): postgres verification for community wiring

* fix(contrail): import ESM subpaths sequentially to avoid jest vm-modules link race

* refactor(contrail): rename community master key env/locals to encryption key

Keep the vendor 'masterKey' config property (contrail-community API);
feed it from CONTRAIL_COMMUNITY_ENCRYPTION_KEY. Avoids 'master' terminology
in our env vars, locals, and docs.

* test(contrail): capture process.env snapshot in beforeEach for test-order resilience

* feat(contrail): wire community integration into ContrailProvider

* docs(contrail): document community/spaces env vars + wire dev compose

Append a Contrail block to env-example-relational covering
CONTRAIL_DATABASE_URL, CONTRAIL_COMMUNITY_ENCRYPTION_KEY,
CONTRAIL_AUTHORITY_SIGNING_KEY and the optional network vars, noting
that community XRPC routes mount only when both the community block and
spaces.authority are configured.

Wire the same CONTRAIL_* vars into the api service in
docker-compose-dev.yml: CONTRAIL_DATABASE_URL reuses the local Postgres
(schema-isolated), the two keys interpolate from the shell at `up` time
so no key material lands in git. SERVICE_DID is intentionally omitted —
contrail.config defaults it and it is a shared var read elsewhere.

* test(contrail): e2e probe that community.getHealth mounts

Assert GET /xrpc/net.openmeet.community.getHealth returns 401
(auth-gated, mounted) and explicitly not 404 (unmounted) or 503
(provider down). Gated on the community env vars since the routes only
register when both community and spaces.authority are configured.
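
The status-code reasoning in this commit can be summarized as a small classifier. This is an illustrative sketch, not the e2e spec itself; `MountState` and `classifyProbe` are hypothetical names.

```ts
type MountState = "mounted" | "unmounted" | "provider-down" | "unexpected";

/** Maps the unauthenticated probe's HTTP status to what it implies about the route. */
function classifyProbe(status: number): MountState {
  switch (status) {
    case 401:
      return "mounted"; // route exists and rejected the call for lack of auth
    case 404:
      return "unmounted"; // no handler registered for the XRPC method
    case 503:
      return "provider-down"; // the provider behind the route is unavailable
    default:
      return "unexpected";
  }
}
```

An e2e assertion would then expect `classifyProbe(response.status)` to equal `"mounted"`, which fails with a readable label rather than a bare status number.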

* test(contrail): enable contrail-xrpc e2e in CI + drop flaky loader spec

- Add CONTRAIL_* (DB URL on the CI Postgres, schema-isolated, plus two
  throwaway CI test keys) to env-example-relational-ci so the
  contrail-xrpc e2e suite runs in CI instead of skipping. Verified end to
  end in the layered CI stack: 5/5 pass, including community.getHealth.
- Delete contrail-community-loader.spec.ts: it only asserted an export is
  a function (covered by the wiring spec + e2e + runtime) and triggered a
  jest vm-modules teardown race when sharing a worker with sibling contrail
  specs (om-mcv0) — a real risk to the maxWorkers=2 unit job.

* fix(contrail): enforce community both-keys invariant; drop dead method

Address PR #585 review feedback:

- buildContrailConfig now assembles the community block only when BOTH
  CONTRAIL_COMMUNITY_ENCRYPTION_KEY and CONTRAIL_AUTHORITY_SIGNING_KEY are
  set. Community routes are service-auth gated against credentials the
  authority signs/verifies, so an encryption key alone is a half-configured
  mount the vendor integration can't serve. Previously the partial-config
  path (encryption key, no authority) reached createCommunityIntegration at
  startup untested; now it is dropped with a warning. This matches the
  documented "both keys" behavior in env-example-relational.
- Remove unused ContrailProvider.isCommunityReady() (no callers; the /xrpc
  middleware forwards through handle() unconditionally). communityEnabled
  is retained for the init log line.
- Mark the committed CI test keys as DO-NOT-REUSE / inert outside CI.
- Add trailing newline to env-example-relational.

Tests: src/contrail config suite 4/4 (adds a partial-config drop+warn case);
wiring/idempotency unchanged.
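
A rough sketch of the both-keys invariant described above, assuming the community block is derived purely from the two env vars. `buildCommunityBlock` and `CommunityBlockSketch` are hypothetical; only the env var names and the vendor's `masterKey` property come from these commits, and the real block likely carries more fields.

```ts
type Env = Record<string, string | undefined>;

// Placeholder for the vendor config block.
interface CommunityBlockSketch {
  masterKey: string;
}

/**
 * Returns the community block only when BOTH keys are set.
 * An encryption key without an authority signing key is dropped with a warning.
 */
function buildCommunityBlock(
  env: Env,
  warn: (message: string) => void,
): CommunityBlockSketch | undefined {
  const encryptionKey = env.CONTRAIL_COMMUNITY_ENCRYPTION_KEY;
  const signingKey = env.CONTRAIL_AUTHORITY_SIGNING_KEY;
  if (!encryptionKey) return undefined; // community not requested at all
  if (!signingKey) {
    warn(
      "CONTRAIL_COMMUNITY_ENCRYPTION_KEY is set without CONTRAIL_AUTHORITY_SIGNING_KEY; skipping community block",
    );
    return undefined;
  }
  return { masterKey: encryptionKey };
}
```

Passing `warn` as a parameter makes the partial-config case easy to assert on in a unit test, in the spirit of the drop+warn case this commit adds.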