fix(docker): drop ingest target so release stays default build stage #583

Merged: tompscanlan merged 1 commit into main from fix/dockerfile-default-target-release on May 17, 2026
@tompscanlan

Summary

Hotfix for dev API crashloop introduced by PR #582.

PR #582 appended `FROM release AS ingest` as the last stage of the Dockerfile. `.github/workflows/deploy-to-dev.yml` runs `docker build` with no `--target`, so Docker builds the last stage in the file. The API image at tag b518567 is therefore the ingest variant, with `CMD ["npm", "run", "contrail:ingest"]`.
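To make the "last stage wins" behaviour concrete, here is a minimal sketch of a guard that reads a Dockerfile and prints the stage an untargeted `docker build` would produce. The function name `last_stage` and the idea of running it in CI are assumptions for illustration, not part of this PR.

```shell
#!/bin/sh
# Print the final stage of a Dockerfile, i.e. what `docker build` produces
# when no --target is given. An unnamed stage is reported as its base image.
# `last_stage` is a hypothetical helper, not something in the repo.
last_stage() {
  awk '
    toupper($1) == "FROM" {
      i = 2
      while (i <= NF && $i ~ /^--/) i++      # skip flags such as --platform=...
      stage = $i                             # base image, used when there is no AS
      if (toupper($(i + 1)) == "AS") stage = $(i + 2)
    }
    END { print stage }
  ' "$1"
}
```

A CI step could then fail fast with something like `[ "$(last_stage Dockerfile)" = "release" ] || exit 1`. Passing `--target release` to `docker build` in the workflow would be another way to pin the stage explicitly.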

The API Deployment doesn't override CMD, so the new pod runs the ingest entrypoint, never binds port 3000, readiness probe hits connection refused, kubelet restarts → crashloop.

> openmeet-api@1.5.0 contrail:ingest
> node dist/contrail/ingest.js
[ingest] starting; namespace=net.openmeet, schema=contrail, jetstreams=(default)
Loaded 1205 known DIDs from database
Starting persistent ingestion. Cursor: ...

Dev is degraded but not down — the prior 44h pod (49c35a4) keeps serving.

Fix

Delete the FROM release AS ingest block. The release stage becomes the default again; npm run start:prod runs by default; readiness probe passes.

The ingest target was not load-bearing: Phase E's contrail-ingest Deployment overrides command: ["npm", "run", "contrail:ingest"] on the single-replica pod, so the same release image works for both roles.
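As a rough local analogue of "one image, two roles", the sketch below maps a role to an optional command override. The role names `api` and `ingest`, the helper names, and the image argument are hypothetical; only the two npm scripts come from this PR. Trailing arguments to `docker run` replace the image's CMD, which is close to, but not identical to, a Kubernetes `command:` override (that one replaces the ENTRYPOINT).

```shell
#!/bin/sh
# Map a role to the command override it needs (empty means "use the image CMD").
# Role names are hypothetical; the npm scripts are the ones named in the PR.
role_command() {
  case "$1" in
    api)    echo "" ;;                         # image default: npm run start:prod
    ingest) echo "npm run contrail:ingest" ;;  # same override the ingest Deployment sets
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Run the given image in the given role.
run_role() {
  image=$1; role=$2
  override=$(role_command "$role") || return 1
  # Left unquoted on purpose so the override splits into separate arguments.
  docker run --rm "$image" $override
}
```

For example, `run_role "$IMAGE" ingest` would start the ingest process from the same `release` image that `run_role "$IMAGE" api` uses for the HTTP server.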

Why this slipped past CI

  • test and e2e-test jobs build relational.test.Dockerfile / relational.e2e.Dockerfile, not the production Dockerfile.
  • build-and-deploy job's docker build . "succeeds" — it just builds the wrong stage. No probe runs in CI to catch the no-HTTP-server image.

Follow-up issue worth filing: add a CI smoke that runs the built image with the same readiness probe as the Deployment.
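One possible shape for that smoke test, as a sketch only: start the built image, poll an HTTP endpoint until it answers, and fail the job otherwise. The probe URL, the port mapping, the retry budget (30 tries, 3 seconds apart, roughly the ~90s readiness window mentioned in the test plan), and both function names are assumptions. The real check should reuse the Deployment's readiness path and whatever environment the API needs to boot.

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out.
# usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts=$1; delay=$2; shift 2
  n=1
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Hypothetical smoke test: runs the image and probes it over HTTP.
smoke_test() {
  image=$1
  probe_url=${2:-http://localhost:3000/}   # assumed; use the Deployment's readiness path
  cid=$(docker run -d -p 3000:3000 "$image") || return 1
  if wait_until 30 3 curl -fsS -o /dev/null "$probe_url"; then
    rc=0
  else
    rc=1
    docker logs "$cid" >&2                 # show why nothing answered on the port
  fi
  docker rm -f "$cid" >/dev/null
  return "$rc"
}
```

With an image whose default command is the ingest worker, nothing would bind port 3000, so `smoke_test` would time out and fail the job instead of letting the image reach dev.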

Test plan

  • git diff shows only the 9-line ingest block removed; release stage and its CMD npm run start:prod untouched.
  • Post-merge: build-and-deploy produces a new ECR image; gitops bumps dev overlay; ArgoCD rolls API; new pod passes readiness within ~90s.
  • kubectl get pods -n dev -l app=api shows 2/2 Ready, no restart loop.

PR #582 appended `FROM release AS ingest` at the end of the Dockerfile.
deploy-to-dev.yml's `docker build` runs without `--target`, so Docker
defaults to the LAST stage in the Dockerfile, which silently flipped the
API image to the ingest variant (CMD `npm run contrail:ingest`). New API
pods on b518567 crashloop because no HTTP server binds to port 3000 and
the readiness probe gets connection refused.

The `ingest` target was never load-bearing: the Phase E Deployment
manifest overrides `command: ["npm", "run", "contrail:ingest"]` on the
single-replica ingest pod, so the same `release` image works for both
roles. Dropping the extra target restores `release` as the default and
unblocks the dev rollout.
@tompscanlan merged commit 09c0e20 into main on May 17, 2026
4 checks passed