feat: #216 agent-side wire — vault-fetched LLM key into Hermes (verified live, real LLM) #218

Open
hanwencheng wants to merge 15 commits into main from claude/216-agent-wire

@hanwencheng hanwencheng commented Jun 6, 2026

Context

Closes the core of #216 (agent-side wire): the agent runs Hermes on an LLM key it fetched from the master's vault — never an ambient OPENROUTER_API_KEY. The cred-fetch wire primitive landed via #217; this PR adds the agent-facing consumer, the store half, real e2es, and the env→vault swap in the wire demo — proven end-to-end against the LIVE broker + cred worker + aiosandbox with a real LLM call.

What landed

1. agentkeys cred fetch <service> (crates/agentkeys-cli/src/cred_admin.rs, plus main.rs/lib.rs). Mints a CredFetch cap → BackendClient::cred_fetch → per-actor STS → cred worker → decrypts → prints the plaintext. Routes through the shared agentkeys-backend-client (the #204 one-owner path; no re-typed wire shapes).

2. agentkeys cred store <service> --secret|--secret-env — the symmetric store half (master-self by default). Adds CredStore{Body,Resp,Input,Result} + BackendClient::cred_store to the crate, and folds the daemon's hand-rolled cred-store body into the crate type (store_master_credential_inner — closes a #204 drift gap; a drifted field is now a compile error). --secret-env NAME keeps the plaintext off argv / out of the shell history.

3. harness/cred-fetch-demo.sh — the cred-fetch real e2e (STEP_TOTAL=4): master vaults a probe via the daemon (web path), agent fetches via the CLI (agent path), asserting the EXACT secret round-trips.

4. harness/cred-wire-demo.sh is the FULL #216 wire e2e (STEP_TOTAL=6, headless): master vaults the LLM key → agent cred-fetches it → plant into the sandbox Hermes (~/.hermes/.env + hermes config set model.*) → Hermes runs on the vault key (real LLM smoke). Asserts the planted key == the vaulted key (sha) with no OPENROUTER_API_KEY in the agent env.

5. phase1-wire-demo.sh Phase 4.0b — the env→vault swap (#216's named target, :1072). Phase 4.0b now resolves the wire key vault-first (agentkeys cred fetch, agent identity) and plants THAT; $OPENROUTER_API_KEY becomes a clearly-labelled dev-only fallback. SEED_SCOPE_SERVICES also grants the agent its cred scope (bare $SERVICE). Fallback-safe: with no vaulted key / no cred scope, it degrades to the env key exactly as before. Honest labelling in the 0.6 step, header, overview, and the 4.0 ok line.

keep-docs-in-sync: harness/CLAUDE.md orchestrator table + docs/operator-runbook-harness.md (both demos).
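
The "drifted field is now a compile error" point in item 2 can be illustrated with a minimal sketch. It assumes a hypothetical two-field body (`service`, `secret`); the real `CredStoreBody` fields are not listed here, and the real crate presumably serializes with serde rather than the hand-rolled `to_json` used below to stay dependency-free. The function names `daemon_store_body` and `cli_store_body` are illustrative only.

```rust
/// Hypothetical stand-in for the crate-owned request body. The real
/// `CredStoreBody` fields are not shown in this PR description; `service`
/// and `secret` are assumptions for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CredStoreBody {
    pub service: String,
    pub secret: String,
}

impl CredStoreBody {
    /// The single place where the wire shape is produced.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"service\":{},\"secret\":{}}}",
            json_string(&self.service),
            json_string(&self.secret)
        )
    }
}

/// Minimal JSON string escaping (quotes, backslashes, newlines only).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out.push('"');
    out
}

/// Daemon-side caller: builds the shared type instead of an inline JSON literal.
pub fn daemon_store_body(service: &str, secret: &str) -> CredStoreBody {
    CredStoreBody {
        service: service.to_string(),
        secret: secret.to_string(),
    }
}

/// CLI-side caller: uses the same type, so renaming or removing a field
/// breaks both callers at compile time rather than at the worker.
pub fn cli_store_body(service: &str, secret: &str) -> CredStoreBody {
    CredStoreBody {
        service: service.to_string(),
        secret: secret.to_string(),
    }
}
```

With an inline JSON literal, a misspelled key would only surface when the worker rejects the request; with a shared struct, both callers stop compiling as soon as the field set changes.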

Verified live (this PR, real OpenRouter key)

harness/cred-wire-demo.sh against the live broker + cred worker + aiosandbox, release binaries:

step 4  ok agent fetched the vaulted key from the vault (len=73, sha fddff3ff…) — no env read
step 5  ok planted the vault-fetched key into ~/.hermes/.env + hermes config
step 6  ok 6.1 vault-sourced — the key Hermes will use == the master-vaulted key, NOT an env var
step 6  ok 6.2 llm smoke — Hermes answered using the VAULT-FETCHED key: "OK"

A real deepseek-v4-flash call via OpenRouter answered "OK" on the vault-fetched key — #216's headline acceptance proven with real data, end-to-end through the live chain.

cred store verified with a CLI-only store→fetch round-trip:

stored `cred-store-probe` → bots/941…/credentials/cred-store-probe.enc
✅ CLI store→fetch ROUND-TRIP PASS — agentkeys cred store works end-to-end

cargo clippy --workspace --all-targets -- -D warnings clean; cargo test --workspace green.

What did NOT land (tracked, remaining in #216)

🤖 Generated with Claude Code

…st live infra)

The agent-facing consumer of the #216 cred-fetch primitive, verified end-to-end
against the LIVE broker + cred worker:

- agentkeys-cli: `agentkeys cred fetch <service>` (cred_admin.rs) — mints a
  master-self/agent CredFetch cap → BackendClient.cred_fetch → STS → cred worker
  → decrypt → prints the plaintext. Adds the agentkeys-backend-client dep (the
  #204 one-owner path; no re-typed wire shapes).
- harness/cred-fetch-demo.sh — the real e2e: a master VAULTS a probe cred via the
  daemon (web path), then the agent FETCHES it via the CLI (agent path), asserting
  the EXACT secret round-trips through cap-mint → STS → cred worker → S3 → decrypt.
  Idempotent (fixed `cred-e2e-probe`), --ci-tolerant, real-only. Contract-compliant
  (STEP_TOTAL=4, ok/skip/fail, EXIT-trap daemon cleanup).
- keep-docs-in-sync: harness/CLAUDE.md orchestrator table + operator-runbook-harness.md.

VERIFIED LIVE (this run): master vaulted via daemon (HTTP 200), agent
`cred fetch` returned the EXACT key (len matched) — broker.litentry.org +
cred.litentry.org. #216's cred half is proven, not just compiled.

Remaining #216: the Hermes wire (phase1-wire Phase 4.0) — plant the fetched key
into Hermes instead of $OPENROUTER_API_KEY (the full sandbox surprise).
… live, real LLM)

Carries the #216 cred-fetch through the Hermes wire — the complete agent-side
guarantee, proven end-to-end against the LIVE broker + cred worker + aiosandbox:

  master VAULTS the LLM key  (daemon: cap-mint cred-store → STS → cred worker → S3)
    → agent CRED-FETCHES it  (agentkeys cred fetch: cap-mint cred-fetch → STS → decrypt)
    → plant into Hermes      (~/.hermes/.env + hermes config set model.*) IN THE SANDBOX
    → Hermes RUNS on the vault key (real LLM smoke) — NO OPENROUTER_API_KEY in the agent env

harness/cred-wire-demo.sh (STEP_TOTAL=6, contract-compliant, headless): asserts
the key Hermes uses == the master-vaulted key (sha), and that it arrived via the
vault fetch, not an ambient env var (the sandbox shell has no OPENROUTER_API_KEY;
the .env value is the cred-fetch result). The durable, no-Touch-ID complement to
phase1-wire-demo.sh Phase 4.0b — same wire result without the interactive gates.
Routes through the shared agentkeys-backend-client (#204).

VERIFIED LIVE (this run, real OpenRouter key):
  step 4  ok agent fetched the vaulted key from the vault (len=73, sha fddff3ff…) — no env read
  step 5  ok planted the vault-fetched key into ~/.hermes/.env + hermes config
  step 6  ok 6.1 vault-sourced — the key Hermes will use == the master-vaulted key, NOT an env var
  step 6  ok 6.2 llm smoke — Hermes answered using the VAULT-FETCHED key: "OK"
Exit 0. A REAL deepseek-v4-flash call via OpenRouter answered "OK" on the
vault-fetched key — #216's acceptance ("the agent runs on MY authorized key, not
the operator's env") proven with real data.

Idempotent (FIXED openrouter service; the .env key-line is rewritten not appended);
daemon killed on exit; --ci-tolerant. keep-docs-in-sync: harness/CLAUDE.md +
docs/operator-runbook-harness.md.
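
The "rewritten not appended" idempotency for the .env key-line can be sketched as follows. This is an illustration in Rust, not the harness script's actual implementation; the function name `upsert_env_line` is hypothetical, and the variable name used in ~/.hermes/.env is left as a parameter because it is not stated above.

```rust
/// Returns `contents` with exactly one `KEY=value` line for `key`.
/// An existing line is rewritten in place, duplicates are dropped, and the
/// line is appended only when the key is absent. Hypothetical helper.
pub fn upsert_env_line(contents: &str, key: &str, value: &str) -> String {
    let prefix = format!("{key}=");
    let new_line = format!("{key}={value}");
    let mut replaced = false;
    let mut out: Vec<String> = Vec::new();

    for line in contents.lines() {
        if line.starts_with(&prefix) {
            // Keep only the first occurrence, rewritten with the new value.
            if !replaced {
                out.push(new_line.clone());
                replaced = true;
            }
        } else {
            out.push(line.to_string());
        }
    }
    if !replaced {
        out.push(new_line);
    }

    let mut result = out.join("\n");
    result.push('\n');
    result
}
```

Applying the function twice with the same arguments yields the same file, which is the property that lets the demo be re-run without accumulating stale key lines.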
…→ dev fallback)

Replaces the operator-env-key write (#216's named target: phase1-wire-demo.sh:1072)
with the vault path: Phase 4.0b now fetches the agent's LLM key from the master's
VAULT via `agentkeys cred fetch cred:<service>` and plants THAT into the sandbox
Hermes — the $OPENROUTER_API_KEY/$LLM_API_KEY env becomes a clearly-labelled
DEV-ONLY fallback.

- Phase 4.0b resolves WIRE_KEY VAULT-FIRST (the agent-identity cred-fetch: operator
  session authorizes, actor=agent device — mirrors the memory cap-mint identity
  model), env-fallback only when the vault is unavailable. Backward-compatible: with
  no vaulted key / no cred scope the fetch fails and it degrades to the env key
  exactly as before, so the change is fallback-safe.
- SEED_SCOPE_SERVICES also grants the agent its cred scope (bare `$SERVICE` — the
  cred-fetch cap-mint hashes the bare service, unlike memory's `memory:<ns>`) so the
  P.3 pairing grant authorizes the vault fetch.
- Honest labelling throughout: the 0.6 step, the header, and the top overview now
  state the env key is the dev fallback and the vault is primary; the 4.0 ok line
  prints which source the planted key came from.
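
The vault-first, env-fallback resolution described in these bullets can be sketched as below. This is a Rust illustration of the control flow only (Phase 4.0b itself is a shell script); `resolve_wire_key`, `KeySource`, and the closure-based fetch are hypothetical names, and treating an empty key as "unavailable" is an assumption.

```rust
/// Where the planted key came from, so the caller can label it honestly.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeySource {
    Vault,
    EnvFallback,
}

/// Tries the vault fetch first; falls back to the env key only when the
/// fetch fails or returns nothing. Returns `None` when neither is available.
pub fn resolve_wire_key<F>(
    fetch_from_vault: F,
    env_key: Option<String>,
) -> Option<(String, KeySource)>
where
    F: FnOnce() -> Result<String, String>,
{
    match fetch_from_vault() {
        Ok(key) if !key.is_empty() => Some((key, KeySource::Vault)),
        // No vaulted key or no cred scope: degrade to the dev-only env key.
        _ => env_key
            .filter(|k| !k.is_empty())
            .map(|k| (k, KeySource::EnvFallback)),
    }
}
```

Returning the source alongside the key mirrors the "4.0 ok line prints which source the planted key came from" behaviour: the caller never has to guess whether it is running on the vault key or the fallback.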

The full vault chain (master vaults → agent cred-fetches → plant → Hermes runs on
it, real LLM smoke) is proven headless + live by harness/cred-wire-demo.sh (this
PR). The interactive agent-identity path additionally needs the operator's Touch ID
cred-scope grant (P.3) + a seeded vault — until then Phase 4.0b labels + uses the
dev fallback.
@hanwencheng hanwencheng changed the title feat: #216 agent cred-fetch — CLI consumer + real e2e (verified live) feat: #216 agent-side wire — vault-fetched LLM key into Hermes (verified live, real LLM) Jun 6, 2026
…n fix (verified live)

Completes the CLI cred surface with the store half of `cred fetch`, and folds the
daemon's hand-rolled cred-store body into the crate (closing a #204 drift gap):

- agentkeys-backend-client: `CredStoreBody`/`CredStoreResp`/`CredStoreInput`/
  `CredStoreResult` (mirror the CredFetch types) + `BackendClient::cred_store`
  (cap-mint CredStore → per-actor STS under the VAULT role → cred worker
  `/v1/cred/store` → encrypt + S3 PUT). Exported from the crate.
- agentkeys-daemon: `store_master_credential_inner` now builds the worker body from
  the crate-owned `CredStoreBody` instead of an inline `serde_json::json!({...})`
  (#204 — "broker/worker request shapes have ONE owner"; a drifted field is now a
  compile error, matching the memory-put path).
- agentkeys-cli: `agentkeys cred store <service> --secret|--secret-env` (master-self
  by default). `--secret-env NAME` keeps the plaintext off argv / out of the shell
  history + process list. Prints the worker S3 key.
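
How `--secret` and `--secret-env NAME` might resolve to a plaintext can be sketched as below. This is an assumption-laden illustration, not the CLI's actual code: `resolve_secret` is a hypothetical name, the error strings are invented, and treating the two flags as mutually exclusive is a guess based on the `--secret|--secret-env` notation. The environment lookup is injected so the logic is testable; a real caller would pass `|name| std::env::var(name).ok()`.

```rust
/// Resolves the secret from either the literal flag value or a named
/// environment variable. With `--secret-env`, only the variable NAME appears
/// on argv, so the plaintext stays out of shell history and the process list.
pub fn resolve_secret<F>(
    secret: Option<String>,
    secret_env: Option<&str>,
    lookup: F,
) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match (secret, secret_env) {
        (Some(_), Some(_)) => {
            Err("pass either --secret or --secret-env, not both".to_string())
        }
        (Some(value), None) => Ok(value),
        (None, Some(name)) => lookup(name)
            .ok_or_else(|| format!("environment variable {name} is not set")),
        (None, None) => {
            Err("one of --secret or --secret-env is required".to_string())
        }
    }
}
```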

VERIFIED LIVE (CLI-only store→fetch round-trip, master-self):
  stored `cred-store-probe` → bots/941…/credentials/cred-store-probe.enc
  ✅ CLI store→fetch ROUND-TRIP PASS — agentkeys cred store works end-to-end

Scope note: this is the master-self vault primitive. The master provisioning a key
INTO the agent's S3 prefix (so the agent fetches with actor=agent) needs dual
bearers (operator session for cap-mint + agent session for the STS PrincipalTag)
and is #214's authorization-side job — deliberately out of #216 scope.

clippy -D warnings clean; cargo check green.
…web app + CLI, fresh start)

Restructures the wire runbook from a CLI/sandbox + memory-only "run the demo" doc
into the single fresh-start guide for testing the WHOLE wire — both the #216
vault-fetched LLM key and the permissioned memory — two ways:

- New top: the two guarantees, a two-paths table (web app vs CLI, same agent side),
  the fastest test (`harness/cred-wire-demo.sh`), and a fresh-start checklist
  (3 setup scripts + sandbox + OpenRouter key + master identity).
- Path A — Web app: `bash dev.sh` → onboard → vault the key (credentials page) →
  pair+authorize (pairing page, Touch ID). Honest "wired vs pending" note: the web
  vault + #214 pairing are real/on-chain today; the agent-identity vault-fetch needs
  #214's dual-bearer master-provisioning (not wired yet), so the master-self
  cred-wire-demo is the end-to-end proof.
- Path B — CLI: the existing phase1-wire-demo walkthrough, reframed.
- LLM-key gate now documents Phase 4.0b vault-first/env-fallback; "Verifying it
  worked" splits into the two deterministic checks; +3 web/cred troubleshooting rows;
  Appendix B gains the `cred store`/`cred fetch` primitives; cross-refs add the new
  demos + #216/#214 + dev.sh.

keep-docs-in-sync: folds back the cred-wire-demo + cred-store + Phase 4.0b changes
from this PR into the operator runbook.
Caught in review: Path A had the agent run in the sandbox (agentkeys-daemon
--request-pairing → cred fetch → wire hermes) but never said how the compiled
agentkeys / agentkeys-daemon / agentkeys-mcp-server binaries get INTO the sandbox.
They can't run there unless cross-built for the sandbox's Linux arch and uploaded
(the sandbox is aarch64/x86 Linux, not the operator's Mac) — which is what Path B /
phase1-wire-demo.sh Phase 1 does (target/sandbox-linux cross-build → sbx_put).

Rewrote Path A to be honest:
- The web app is ONLY the master's console; it does not provision the agent device.
- A. Vault the LLM key — fully standalone (no sandbox).
- B. Pair — needs the agent binaries in the sandbox first; and phase1-wire's Phase 1
  bundles the cross-build/upload WITH the CLI pairing (Phase P lives inside Phase 1),
  so there's no clean "binaries only" command and no one-command web-pairing flow yet
  (drive the web claim by hand: upload binaries, open a request, claim in the UI).
- C. End-to-end is the headless cred-wire-demo.sh / Path B.
Also corrected my own first attempt, which suggested `--skip-2..5` to "stage only the
sandbox" — that still runs Phase 1 and therefore CLI-pairs the agent.
…t + add sandbox-build-push.sh

Per review: the runbook treated Path A as leaning on Path B's harness for the agent
side. Now each path is a self-contained quick-start.

- NEW harness/sandbox-build-push.sh — Path A's standalone "compile agentkeys + push to
  the sandbox" command. Cross-builds the 3 binaries (agentkeys / -mcp-server / -daemon)
  for the sandbox's aarch64-Linux arch in the SAME cached arm64 builder image + cargo
  volumes phase1-wire-demo uses (warm tree re-pushes in seconds), uploads them to
  ~/.local/bin. Build + push ONLY — never pairs/wires. Re-run after any local change so
  the in-sandbox agent runs current source. VERIFIED live: pushed to the sandbox, and
  `agentkeys cred --help` there confirms the current #216 source.
- operator-runbook-wire.md restructured: "Two independent paths — pick one" with BRIEF
  quick-starts for each (Path A = sandbox-build-push.sh + dev.sh + 3 UI actions; Path B =
  one phase1-wire-demo command) + a "neither path" headless check (cred-wire-demo). Path A
  details now use sandbox-build-push.sh (dropped the phase1-wire dependence + the
  now-moot "harness bundles pairing" caveat); kept the honest #214 wired-vs-pending note.
- keep-docs-in-sync: harness/CLAUDE.md inventory + operator-runbook-harness.md.
…broker-url

Operator hit `Error: --broker-url (or AGENTKEYS_BROKER_URL) required for
--request-pairing` running the runbook command in the sandbox — my Path A command
dropped the required flag. Verified the corrected invocation in the live sandbox
(produces a pairing_code). Folded the complete, correct flow into Path A:

  1. sandbox: agentkeys-daemon --request-pairing --broker-url https://broker.litentry.org
     → prints pairing_code + a state_file (the request_id lives in the file, not stdout)
  2. web UI: claim the pairing_code (Touch ID)
  3. sandbox: agentkeys-daemon --retrieve-pairing --request-id <from state file> --broker-url …

Matches phase1-wire-demo.sh Phase P.0/P.1b exactly. Fixed both the quick-start and the
Path A — details command.
…needed)

`agentkeys-daemon --request-pairing` / `--retrieve-pairing` required --broker-url
(or AGENTKEYS_BROKER_URL) and errored without it — friction for the Path-A operator
running them in the sandbox. These commands ALWAYS need a broker, so default it:

- main.rs: new `const DEFAULT_PAIRING_BROKER_URL = "https://broker.litentry.org"`;
  run_request_pairing + run_retrieve_pairing now `unwrap_or_else(default)` instead of
  erroring. `--broker-url` / `AGENTKEYS_BROKER_URL` still override (e.g. a test broker).
  Deliberately NOT a global arg default — `--ui-bridge`'s unset broker_url keeps its
  "fall back to pre-sourced AWS creds" meaning (the §191 pre-Stage-7 path).

VERIFIED live: cross-built + pushed the daemon to the sandbox; `agentkeys-daemon
--request-pairing` (no flag) now defaults to prod + opens a §10.2 request (code
9ZpC8nwu…) — the "--broker-url required" error is gone.
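
The scoped default can be sketched roughly as follows. The constant name and URL are taken from the description above; the function names and the flag-over-env precedence are assumptions (the description only says both still override the default).

```rust
pub const DEFAULT_PAIRING_BROKER_URL: &str = "https://broker.litentry.org";

/// Pairing commands always need a broker, so an unset value falls back to
/// the production default instead of erroring. Flag-before-env precedence
/// is an assumption here.
pub fn pairing_broker_url(flag: Option<String>, env: Option<String>) -> String {
    flag.or(env)
        .unwrap_or_else(|| DEFAULT_PAIRING_BROKER_URL.to_string())
}

/// The `--ui-bridge` path keeps the `Option`: `None` still means "no broker,
/// use pre-sourced AWS creds", which a global default would have erased.
pub fn ui_bridge_broker_url(flag: Option<String>, env: Option<String>) -> Option<String> {
    flag.or(env)
}
```

Keeping the default inside the pairing-specific resolver, rather than on the argument itself, is what preserves the distinct meaning of an unset broker URL for `--ui-bridge`.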

Runbook (Path A quick-start + details) simplified to drop the flag; notes the prod
default + the override. clippy -D warnings clean; daemon tests green.
…-create.sh

`accept pairing · Touch ID` POSTed /v1/agent/pairing/register and got 502. Root
cause: register_pairing derived the agent-register script as a SIBLING of
--register-master-script, but the two are NOT co-located — dev.sh's master register
is harness/scripts/heima-register-first-master.sh while heima-agent-create.sh lives
in <repo>/scripts/. The sibling path (harness/scripts/heima-agent-create.sh) doesn't
exist, so `bash <missing>` exited non-zero → register_agent_device errored → 502.

Fix: resolve heima-agent-create.sh from candidates — the sibling (co-located case)
AND <repo>/scripts/ derived from the master script path — picking the first that
exists; fail with a clear SERVICE_UNAVAILABLE message if neither is found.
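
The candidate resolution can be sketched as below. This is an illustration under stated assumptions, not the daemon's code: the function names are hypothetical, the repo root is derived by walking two levels up from the master script's directory (which matches the `harness/scripts/` layout described above but may not be how the fix derives it), and the existence check is injected so the logic can be tested without touching the filesystem. A real caller would pass `|p| p.exists()` and map the error to SERVICE_UNAVAILABLE.

```rust
use std::path::{Path, PathBuf};

const AGENT_CREATE_SCRIPT: &str = "heima-agent-create.sh";

/// Candidate locations, in priority order: the sibling of the master script
/// (co-located case), then `<repo>/scripts/` (assumed two levels above the
/// master script's directory).
pub fn agent_create_candidates(master_script: &Path) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    if let Some(dir) = master_script.parent() {
        candidates.push(dir.join(AGENT_CREATE_SCRIPT));
        // <repo>/harness/scripts -> <repo>/harness -> <repo>
        if let Some(repo) = dir.parent().and_then(Path::parent) {
            candidates.push(repo.join("scripts").join(AGENT_CREATE_SCRIPT));
        }
    }
    candidates
}

/// Picks the first candidate that exists; otherwise returns a message that
/// lists every path tried, so the failure is diagnosable.
pub fn resolve_agent_create_script<F>(
    master_script: &Path,
    exists: F,
) -> Result<PathBuf, String>
where
    F: Fn(&Path) -> bool,
{
    let candidates = agent_create_candidates(master_script);
    candidates
        .iter()
        .find(|p| exists(p.as_path()))
        .cloned()
        .ok_or_else(|| {
            format!("{AGENT_CREATE_SCRIPT} not found; tried: {candidates:?}")
        })
}
```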

Verified: scripts/heima-agent-create.sh accepts exactly the args register_agent_device
passes (--label/--agent-address/--actor-omni/--device-key-hash/--pop-sig, from-pubkey
mode auto-detected), and a dry-run with the live agent details returns
{"ok":true,"skipped":"already-registered"} → register_agent_device → Ok(None) → 200.
The "no Touch ID" is expected (browser passkey UserOp is the E7-pending frontend item;
the register goes through the daemon script shell-out today). clippy -D warnings clean;
daemon tests green.
…ull request_id (slice 1)

The master pairing card showed a truncated "PAIR-CODE" that was actually the
request_id (never the agent's one-time code), with no value the operator could
cross-check against the agent — a confused-deputy surface (#224). Slice 1 surfaces
the values that ARE on both sides today, with no broker change/deploy:

- daemon (pending_binding_to_request): map the broker's device_key_hash →
  `deviceKeyHash` (+ short); keep `id` (the full request_id). The agent's
  `--request-pairing` already prints device_key_hash + D_pub, so these are the
  cross-verifiable identity.
- agent (run_request_pairing): print device_key_hash on the human-facing line so the
  operator reads it off the agent to compare.
- frontend (PairingRequest type + pairing card): replace the misleading "pair-code"
  with **device key hash · verify on agent** + **D_pub · verify on agent** (full) +
  **request id** (full handle). Operator confirms the device matches before
  accept · Touch ID.
- test: pending_binding_maps_to_pairing_request asserts the full deviceKeyHash.
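
The daemon-side mapping can be sketched as below. The struct shapes are hypothetical (the broker row and the frontend type are not shown in this description), and the 10-character length of the short form is an assumption chosen only for illustration.

```rust
/// Hypothetical shape of the broker's pending row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingBinding {
    pub request_id: String,
    pub device_key_hash: String,
}

/// Hypothetical shape of what the pairing card receives.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PairingRequest {
    /// The full request_id, no longer presented as a "pair-code".
    pub id: String,
    /// Full hash, for cross-checking against the agent's output.
    pub device_key_hash: String,
    /// Truncated form for compact display (length is an assumption).
    pub device_key_hash_short: String,
}

pub fn pending_binding_to_request(binding: &PendingBinding) -> PairingRequest {
    const SHORT_LEN: usize = 10;
    let short: String = binding.device_key_hash.chars().take(SHORT_LEN).collect();
    PairingRequest {
        id: binding.request_id.clone(),
        device_key_hash: binding.device_key_hash.clone(),
        device_key_hash_short: short,
    }
}
```

Carrying the full hash alongside the short form means the operator can always compare the complete value against what the agent printed, rather than trusting a truncated prefix.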

Deferred to slice 2 (needs a broker change + deploy): created_at/expires_at
timestamps on the card (the broker pending row has no timestamps today) and the
`--force` supersede-prior-requests behavior. clippy/fmt clean; daemon tests + frontend
typecheck green.
…ual reload

acceptPairing did registerPairing + refreshPairing but never re-fetched the
actor tree, so a freshly-registered agent only appeared in the device/permission
views after the operator reloaded the page. Re-fetch listActors after a
successful register (matches finishPairingCeremony), surfacing the paired device
immediately.