
feat(tools): prod Release flow (initiate/approve/promote) — spec + plan#404

Open
prasanna-anchorage wants to merge 18 commits into main from feat/prod-release-flow


@prasanna-anchorage

Contributor

Stacked on #395. Adds the design + implementation plan for prod parser_app deploys with separation of duties: CI initiates and promotes holding only the Turnkey API key; a human operator approves the manifest out-of-band with the quorum key. A pivot-hash pin (verify --expected-pivot-hash) lets the post-promote smoke assert the live enclave runs exactly the released binary.
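
As a rough illustration of the pivot-hash pin, the check reduces to comparing the hash the live enclave reports against the value pinned at release time. The sketch below is in Rust for consistency with the deploy helper; it is not the Go implementation planned for `verify --expected-pivot-hash`, and the function name and the assumption that both values are hex strings are hypothetical.

```rust
/// Hypothetical helper: returns true only when the pivot hash reported for the
/// live enclave equals the pinned release value.
/// Assumption: both values are hex strings, so case and surrounding
/// whitespace are ignored. An empty pin never matches.
fn pivot_hash_matches(expected_hex: &str, reported_hex: &str) -> bool {
    let expected = expected_hex.trim();
    let reported = reported_hex.trim();
    !expected.is_empty() && expected.eq_ignore_ascii_case(reported)
}
```

Rejecting an empty pin keeps a missing input from silently turning the pinned smoke into an unpinned one.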

In this PR so far

  • docs/specs/2026-06-29-prod-release-initiate-approve-design.md — approved design
  • docs/specs/2026-06-29-prod-release-initiate-approve-plan.md — phased TDD implementation plan

Planned (per the plan, to be added here)

  • Phase 1 (visualsign-turnkeyclient, stacked on PR #23 "Add ethereum networks"): verify --expected-pivot-hash + surface the real pivot hash
  • Phase 2: TvcOps trait + initiate/approve/promote subcommands (dev deploy composes them)
  • Phase 3: smoke.sh --canonical + --expected-pivot-hash
  • Phase 4: rename TVC Deploy -> TVC Deploy (Dev); add Release (initiate-only) and Promote (set-live + pinned smoke) workflows; runbook update
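
One way to picture Phase 2 is a small trait over the TVC actions, with the three subcommands as thin functions and the dev deploy composing them. This is only a sketch: the method names, signatures, error type, and the `RecordingOps` test double are all assumptions, not the trait defined in the plan.

```rust
/// Hypothetical shape for the TvcOps trait; the real method set may differ.
trait TvcOps {
    fn create_deployment(&mut self, image_url: &str) -> Result<String, String>;
    fn approve_manifest(&mut self, deployment_id: &str) -> Result<(), String>;
    fn set_live(&mut self, deployment_id: &str) -> Result<(), String>;
}

/// CI step: create the deployment and return its id.
fn initiate(ops: &mut impl TvcOps, image_url: &str) -> Result<String, String> {
    ops.create_deployment(image_url)
}

/// Operator step: approve the manifest (quorum key held by a human in prod).
fn approve(ops: &mut impl TvcOps, deployment_id: &str) -> Result<(), String> {
    ops.approve_manifest(deployment_id)
}

/// CI step: make the approved deployment live.
fn promote(ops: &mut impl TvcOps, deployment_id: &str) -> Result<(), String> {
    ops.set_live(deployment_id)
}

/// Dev deploy composes the three steps in one process.
fn dev_deploy(ops: &mut impl TvcOps, image_url: &str) -> Result<String, String> {
    let id = initiate(ops, image_url)?;
    approve(ops, &id)?;
    promote(ops, &id)?;
    Ok(id)
}

/// Test double that records the calls it receives instead of calling TVC.
#[derive(Default)]
struct RecordingOps {
    calls: Vec<String>,
}

impl TvcOps for RecordingOps {
    fn create_deployment(&mut self, image_url: &str) -> Result<String, String> {
        self.calls.push(format!("create {image_url}"));
        Ok("deploy-1".to_string())
    }
    fn approve_manifest(&mut self, deployment_id: &str) -> Result<(), String> {
        self.calls.push(format!("approve {deployment_id}"));
        Ok(())
    }
    fn set_live(&mut self, deployment_id: &str) -> Result<(), String> {
        self.calls.push(format!("set-live {deployment_id}"));
        Ok(())
    }
}
```

Putting the API actions behind a trait lets the subcommands be unit-tested against a recording double, which fits the TDD framing of the plan.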

Prerequisites (not blocking review of the above)

  • Prod Turnkey org + API key (TVC_PROD_*), prod app id, prod operator id + its 1Password item, and a prod parse key/fixture for the Promote smoke.

🤖 Generated with Claude Code

prasanna-anchorage and others added 18 commits June 23, 2026 07:17
Lean, fast-compiling Rust binary (own workspace; deps: qos_p256 + serde_json
only, no visualsign crates) that owns the parser_app TVC deploy flow:

- gen-operator-key: mint a qos_p256 operator key; seed -> 0600 file, only the
  public key is printed (never the seed).
- deploy: assemble tvc-deploy.json (gRPC health), create the deployment, gate
  on the manifest pivot hash matching --expected-digest before approving,
  approve, poll until replicas are healthy, then set live. Retries the
  transient TVC states (status-not-ready, zero-healthy-replicas) and tolerates
  the fresh-app auto-target. Shells out to the `tvc` CLI for API actions.

Proven end-to-end against a throwaway dev app (create -> digest gate -> approve
-> 3/3 -> live), which was then deleted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
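
The poll-until-healthy step with retries on transient states could be sketched roughly as below. The `ReplicaState` enum, the attempt limit, and the fixed delay are assumptions for illustration; the helper's actual retry policy is not shown in this PR.

```rust
use std::{thread, time::Duration};

/// Hypothetical classification of one status check.
#[derive(Debug, PartialEq)]
enum ReplicaState {
    Healthy,
    /// Retryable, e.g. status-not-ready or zero-healthy-replicas.
    Transient(String),
    /// Anything else: stop immediately.
    Fatal(String),
}

/// Calls `check` until it reports Healthy, retrying transient states up to
/// `max_attempts` times. Returns the attempt number that succeeded.
fn poll_until_healthy<F>(mut check: F, max_attempts: u32, delay: Duration) -> Result<u32, String>
where
    F: FnMut() -> ReplicaState,
{
    for attempt in 1..=max_attempts {
        match check() {
            ReplicaState::Healthy => return Ok(attempt),
            ReplicaState::Fatal(msg) => return Err(msg),
            ReplicaState::Transient(_) if attempt < max_attempts => thread::sleep(delay),
            ReplicaState::Transient(msg) => {
                return Err(format!("not healthy after {max_attempts} attempts: {msg}"))
            }
        }
    }
    Err("max_attempts must be at least 1".to_string())
}
```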
A workflow_dispatch workflow to exercise dev/staging parser_app deploys on
demand: builds the standalone tvc-deploy helper and runs create -> digest
safety-gate -> approve -> poll-to-healthy -> set-live with operator/API-key
secrets. Inputs (app_id, image_url, expected_digest, operator_id, qos/host/port)
let an operator trigger a deploy with args. Dev/staging only; prod stays manual.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…code

Drop the hand-rolled hex_encode; use P256Pair::to_master_seed_hex() and
P256Public::to_hex_bytes() (qos_p256 owns these formats). No new deps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gate

Reorganize the deploy helper on the idiomatic shell-orchestration stack
(xshell cmd! + anyhow + lexopt), dropping the hand-rolled process wrappers,
String errors, HashMap flag parser, and the to_str()-lossy path() helper.

Replace the fragile `approve --dry-run` manifest-hash parse with an
image-derived digest gate: extract /parser_app from the image and sha256 it,
asserting it equals --expected-digest before deploying (ties the digest to the
actual binary).

Addresses review on #395:
- workflow: add minimal `permissions: contents: read`; pin `cargo install tvc
  --version 0.6.2 --locked` (reproducible, no drift).
- deploy(): move config-write inside the cleanup closure so an early error
  never leaves the operator seed temp file on disk.
- path() (lossy non-UTF-8 -> "") removed; xshell takes Path directly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…label

Add a label-gated pull_request trigger (mirrors the stagex pattern) so the
deploy workflow can be exercised from a PR before merge: applying the
`tvc-deploy-test` label runs a deploy against the dev TEST app using repo
vars.TVC_TEST_* (never the live app). workflow_dispatch keeps explicit inputs;
the job `if` gates to dispatch or the specific label, and a concurrency group
prevents overlapping deploys.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crates.io tvc 0.6.2 takes `deploy create <CONFIG_FILE>` positionally, but the
helper (and the locally-built tvc) use `--config-file`. 0.7.0 is the published
release that carries the `--config-file` flag interface; pin to it so CI matches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a smoke step to the TVC deploy workflow that parses a known Solana V0+ALT
transaction through the deployed dev parser (/visualsign-dev) and asserts it
renders — a regression guard for the "Cannot render V0" failure. Drives the
published turnkey-client container (no Go), reconstructing the parse API key
from the existing TVC_API_KEY_* secrets; abort guard skips on a transport
outage. Requires the turnkey-client image on GHCR (visualsign-turnkeyclient).

- scripts/smoke.sh: container-driven parse + jq assertions.
- testdata/solana_v0_alt.b64: the V0+ALT fixture.
- tvc-deploy.yml: Smoke step (+ packages: read) and key scrub.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- write_secret_file: explicitly chmod 0600 after open (mode() only applies on
  create; a pre-existing broader-perm file could leave the seed world-readable).
- verify_image_digest: always remove the extracted temp binary + rm the
  container, even when docker cp / sha256sum fails early.
- temp_path: add a per-process atomic counter so same-tick calls can't collide.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
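
The two file-handling fixes above can be sketched with the standard library alone (Unix only). The function names follow the commit message, but the bodies are an assumed reconstruction rather than the helper's code, and the temp-file naming scheme is hypothetical.

```rust
use std::fs::{OpenOptions, Permissions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Per-process counter so two calls in the same clock tick still differ.
static TEMP_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Builds a temp path from pid + timestamp + counter (naming is an assumption).
fn temp_path(prefix: &str) -> PathBuf {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let seq = TEMP_COUNTER.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("{prefix}-{}-{nanos}-{seq}", std::process::id()))
}

/// Writes `secret` to `path` with owner-only permissions.
fn write_secret_file(path: &Path, secret: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600) // only honoured when the file is newly created
        .open(path)?;
    // A pre-existing file keeps its old mode, so tighten it explicitly
    // before any secret bytes are written.
    file.set_permissions(Permissions::from_mode(0o600))?;
    file.write_all(secret)
}
```

Setting permissions before `write_all` means the seed is never on disk under a broader mode, even briefly.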
Bound the job (cargo install tvc + poll-to-healthy + smoke) so a hung step
can't run indefinitely.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve the approval seed as flag -> env TVC_CI_OPERATOR_SEED -> none. When
none is given, omit --operator-seed so `tvc deploy approve` falls back to the
logged-in org operator key (the local `tvc login` path), making local deploys
work without materializing a seed file. CI still sets the env, so it continues
to write and scrub a temp seed; the flag is splatted as OsString args (no lossy
path conversion) and only the env-sourced temp file is cleaned up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
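
The flag -> env -> none resolution order might look like the sketch below. It assumes the flag carries a path to a seed file and the env var carries the seed material itself; the `SeedSource` type and both function names are hypothetical, and the argument list is trimmed to the part relevant here (a real `tvc deploy approve` call takes further arguments).

```rust
use std::ffi::OsString;

/// Where the approval seed comes from (hypothetical type).
#[derive(Debug, PartialEq)]
enum SeedSource {
    /// Path given on the command line; never deleted by the helper.
    Flag(OsString),
    /// Seed material from TVC_CI_OPERATOR_SEED; written to a temp file and scrubbed.
    Env(String),
    /// No seed given: fall back to the logged-in org operator key.
    LoggedInOperator,
}

/// The flag wins over the env var; an empty env value counts as unset.
/// A caller would pass `std::env::var("TVC_CI_OPERATOR_SEED").ok()` as `env_seed`.
fn resolve_seed(flag: Option<OsString>, env_seed: Option<String>) -> SeedSource {
    match (flag, env_seed) {
        (Some(path), _) => SeedSource::Flag(path),
        (None, Some(seed)) if !seed.is_empty() => SeedSource::Env(seed),
        _ => SeedSource::LoggedInOperator,
    }
}

/// Builds the approve arguments, adding --operator-seed only when a seed file exists.
/// OsString is kept end to end so non-UTF-8 paths are not mangled.
fn approve_args(seed_file: Option<OsString>) -> Vec<OsString> {
    let mut args: Vec<OsString> = vec!["deploy".into(), "approve".into()];
    if let Some(path) = seed_file {
        args.push("--operator-seed".into());
        args.push(path);
    }
    args
}
```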
…sion

Switch the post-deploy smoke from `parse` to `verify --dev-path`, asserting the
tx both RENDERS (Cannot-render-V0 guard) and cryptographically VERIFIES (AWS
Nitro attestation + enclave signature via .valid/.attestationValid/
.signatureValid). Also:

- Failure classification defaults to a hard error (exit 2); only a recognized
  endpoint outage skips, so an unpullable image can't masquerade as a pass.
- --turnkey-client-path falls back to a local binary or source dir (built) when
  the container image is unavailable.
- --turnkey-client-version / VSP_SMOKE_CLIENT_VERSION pins an approved client
  image instead of :latest; wired through the workflow (input -> repo var).
- Pass the client's verification log through by default; --quiet suppresses it.
- Fix a set -e bug where the FAIL-branch summary jq could mask exit 1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
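
The default-to-hard-error classification is implemented in bash in scripts/smoke.sh; the Rust sketch below only illustrates the decision logic. The outage marker strings, the type names, and the exit code used for a skip are assumptions; exit 1 for a failed assertion and exit 2 for a hard error follow the commit message.

```rust
/// Fields read from the client's verify output (.valid, .attestationValid, .signatureValid).
struct VerifyResult {
    valid: bool,
    attestation_valid: bool,
    signature_valid: bool,
}

#[derive(Debug, PartialEq)]
enum SmokeOutcome {
    Pass,
    Fail,
    Skip,
    HardError,
}

impl SmokeOutcome {
    fn exit_code(&self) -> i32 {
        match self {
            SmokeOutcome::Pass => 0,
            SmokeOutcome::Skip => 0, // assumption: a skipped smoke does not fail the job
            SmokeOutcome::Fail => 1,
            SmokeOutcome::HardError => 2,
        }
    }
}

/// Hypothetical markers for a recognized endpoint outage.
const OUTAGE_MARKERS: &[&str] = &["connection refused", "503 service unavailable"];

/// `result` is None when the client produced no parseable output.
fn classify(result: Option<&VerifyResult>, client_log: &str) -> SmokeOutcome {
    match result {
        Some(r) if r.valid && r.attestation_valid && r.signature_valid => SmokeOutcome::Pass,
        Some(_) => SmokeOutcome::Fail,
        None => {
            let log = client_log.to_ascii_lowercase();
            if OUTAGE_MARKERS.iter().any(|m| log.contains(*m)) {
                SmokeOutcome::Skip
            } else {
                // Unknown failures (e.g. an unpullable image) must not pass.
                SmokeOutcome::HardError
            }
        }
    }
}
```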
Design for prod separation-of-duties deploys: CI initiates (digest gate +
create) and promotes (set-live + pinned smoke) holding only the Turnkey API
key, while a human operator approves the manifest out-of-band with the quorum
key. Adds a pivot-hash pin (verify --expected-pivot-hash) so the post-promote
smoke asserts the live enclave runs exactly the released binary.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phased, TDD, bite-sized plan: verify --expected-pivot-hash (Go), TvcOps trait +
initiate/approve/promote subcommands (Rust), smoke.sh --canonical + pin (bash),
and the Release/Promote/renamed-dev workflows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…prove)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…leanup

Resolves seed leak when initiate fails (digest gate / create) by resolving the
env-sourced seed only after initiate succeeds, routed through a shared
approve_and_cleanup helper used by both do_deploy and approve. Doc header now
lists all subcommands.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
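
The ordering fix in this last commit can be illustrated with a drop guard: the seed file is created only after initiate succeeds, and it is removed however approval ends. This is a sketch under assumptions, not the actual `approve_and_cleanup` helper; `TempSeed`, `deploy_with_late_seed`, and the closure-based signature are hypothetical.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Removes the temp seed file when dropped, on success and on error alike.
struct TempSeed(PathBuf);

impl Drop for TempSeed {
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.0);
    }
}

/// Runs initiate first, materializes the seed only afterwards, then approves.
fn deploy_with_late_seed<I, S, A>(initiate: I, materialize_seed: S, approve: A) -> Result<(), String>
where
    I: FnOnce() -> Result<String, String>,
    S: FnOnce() -> Result<PathBuf, String>,
    A: FnOnce(&str, &Path) -> Result<(), String>,
{
    // If the digest gate or create fails here, no seed file was ever written.
    let deployment_id = initiate()?;
    // From this point the guard owns the file and deletes it on scope exit.
    let seed = TempSeed(materialize_seed()?);
    approve(&deployment_id, &seed.0)
}
```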
Base automatically changed from feat/tvc-deploy-helper to main July 2, 2026 20:38