Skip to content

feat: add e3 key forward secrecy [skip-line-limit]#1591

Draft
0xjei wants to merge 8 commits into
mainfrom
keys/fw-sec
Draft

feat: add e3 key forward secrecy [skip-line-limit]#1591
0xjei wants to merge 8 commits into
mainfrom
keys/fw-sec

Conversation

@0xjei

@0xjei 0xjei commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

WIP, will continue after critical path

--

Each E3 round gets a randomly-generated AES-256-GCM key ("E3 key"). All sensitive cryptographic material for that round (BFV keyshares, threshold shares, decryption shares) is encrypted with this per-E3 key rather than the node's master cipher.

The E3 key itself is stored in the KV store encrypted under the master cipher, so the node can survive restarts mid-round without user interaction. When the round finishes (or times out), the E3 key is deleted from the store. From that point on, the encrypted blobs in the commit log are permanently irrecoverable (even if the master passphrase is later leaked).

The master cipher is a key-encryption key (KEK); the per-E3 keys are data-encryption keys (DEKs). The DEKs are ephemeral: they exist only for the lifetime of a round.

Note that previously, a timeout mid-round (DKG, committee formation, compute, decryption) would leave the E3 context and its per-E3 key alive indefinitely; resulting in a resource/memory leak. Now, timeout failures trigger E3RequestComplete the same way a successful round does, which tears down the context and purges the E3 key.

Summary by CodeRabbit

  • New Features

    • Per-request forward-secrecy ciphers: generate, encrypt-to-node, recover from snapshots, auto-purge on completion, and cached per-request resolution with master-cipher fallback.
    • New cipher APIs to generate and export raw key bytes for persistence/export workflows.
    • Multithread actor now resolves per-request ciphers from datastore and caches them for a round.
  • Bug Fixes

    • Routing ignores additional late terminal events and treats timeout-driven failures as completed.
    • DKG/decryption collection failures map to timeout/decryption-timeout reasons.
  • Tests

    • Added tests for per-request cipher boundaries, storage round-trips, hydrate/purge behavior, and updated failure-reason expectations.
  • Documentation

    • Clarified comments about how sensitive bytes relate to per-request ciphers.
  • Other

    • Added helper for constructing per-request store keys and a helper to identify timeout failures.

@0xjei 0xjei requested a review from hmzakhalid June 9, 2026 09:36
@vercel

vercel Bot commented Jun 9, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
crisp Ready Ready Preview, Comment Jun 9, 2026 10:11pm
enclave-docs Ready Ready Preview, Comment Jun 9, 2026 10:11pm
enclave-enclave-dashboard Ready Ready Preview, Comment Jun 9, 2026 10:11pm
interfold-dashboard Error Error Jun 9, 2026 10:11pm

Request Review

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds per-E3 forward-secrecy cipher support: new Cipher key APIs, E3CipherExtension to generate/encrypt/store per-E3 AES-GCM keys under a master cipher, lifecycle purge and hydrate restore, keyshare changes to use per-E3/multithread ciphers, routing updates for timeout-terminal events, and builder wiring/tests.

Changes

Per-E3 Cipher Extension Workflow

Layer / File(s) Summary
Cipher API Foundation
crates/crypto/src/cipher.rs
Cipher::from_key_bytes(), Cipher::generate(), and Cipher::key_bytes() provide raw 32-byte key validation, generation, and secure access for persistence workflows.
Event Classification & Store Keys
crates/events/src/enclave_event/e3_failed.rs, crates/events/src/store_keys.rs
FailureReason::is_timeout() classifies deadline-expiration variants; StoreKeys::e3_key() constructs //e3_keys/{e3_id} storage paths.
Per-E3 Cipher Extension Core Implementation
crates/request/src/e3_cipher.rs, crates/request/Cargo.toml, crates/request/src/lib.rs
Adds E3_CIPHER_KEY, E3CipherRepositoryFactory, and E3CipherExtension that generates per-E3 AES-GCM keys on E3Requested, encrypts keys under master_cipher, persists encrypted bytes to KV, injects cipher into E3Context, purges on E3RequestComplete, and restores via hydrate; includes unit tests for round-trip, missing keys, purge, wrong-master decryption failure, idempotency, and snapshot-driven restore; workspace dependency e3-crypto added and module re-exported.
Routing Logic and Timeout-Aware Terminal Event Handling
crates/request/src/domain/routing.rs
RequestRouter::route now ignores late-terminal E3StageChanged(Failed) and E3Failed with timeout when e3_id is already completed; timeout E3Failed maps to PostForward::PublishComplete while non-timeout failures do not; tests and helpers added to validate behavior and idempotence.
Keyshare Extension & ThresholdKeyshare Changes
crates/keyshare/src/actors/threshold_keyshare.rs, crates/keyshare/src/ext.rs, crates/keyshare/src/domain/share_generation.rs
Adds multithread_cipher to ThresholdKeyshareParams and ThresholdKeyshare, uses multithread_cipher to decrypt certain shares and for transport-local cipher separation in build_shares_generated_plan, updates three DKG-related failure handlers to emit DKGTimeout/DecryptionTimeout, updates tests, and changes ThresholdKeyshareExtension to resolve per-E3 cipher from E3Context with fallback to master_cipher.
Extension Registration in CiphernodeBuilder
crates/ciphernode-builder/src/ciphernode_builder.rs
E3CipherExtension imported and registered first in CiphernodeBuilder::setup_extensions so subsequent extensions can consume per-E3 cipher from E3Context.

Sequence Diagram

sequenceDiagram
  participant RequestRouter
  participant E3CipherExtension
  participant MasterCipher
  participant KVStore
  participant E3Context
  participant ThresholdKeyshareExtension

  RequestRouter->>E3CipherExtension: E3Requested(e3_id)
  E3CipherExtension->>E3CipherExtension: generate AES-256-GCM Cipher (per-E3)
  E3CipherExtension->>MasterCipher: encrypt(per-E3 key bytes)
  E3CipherExtension->>KVStore: store encrypted bytes at //e3_keys/{e3_id}
  E3CipherExtension->>E3Context: inject per-E3 Cipher (E3_CIPHER_KEY)
  RequestRouter->>ThresholdKeyshareExtension: create ThresholdKeyshareParams
  ThresholdKeyshareExtension->>E3Context: resolve per-E3 cipher (or fallback master)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A cipher born per E3, wrapped tight under master’s care,
Hopped into context, stored away, then purged when duty’s there,
Timeouts whisper kindly — the router lets them pass,
Keyshare asks the rabbit: “Which cipher works?” — it answers fast,
Forward secrecy hops along, safeguarded everywhere.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 67.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add e3 key forward secrecy' accurately and concisely summarizes the main change—implementing forward secrecy for E3 round keys by generating per-E3 ephemeral ciphers.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch keys/fw-sec

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@0xjei 0xjei linked an issue Jun 9, 2026 that may be closed by this pull request
@0xjei 0xjei self-assigned this Jun 9, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/request/src/domain/routing.rs (1)

204-219: 💤 Low value

Minor test duplication: consolidate or remove redundant test.

Both stage_changed_to_failed_ignored_when_completed (line 204) and stage_changed_to_failed_ignored_when_already_completed (line 390) test the same routing behavior: E3StageChanged(Failed) is ignored when the E3 is already completed. The only difference is the previous_stage value (CiphertextReady vs CommitteeFinalized), but the router doesn't branch on previous_stage—only new_stage.

Consider consolidating into a single test or removing one.

Also applies to: 390-403

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/request/src/domain/routing.rs` around lines 204 - 219, Two tests,
stage_changed_to_failed_ignored_when_completed and
stage_changed_to_failed_ignored_when_already_completed, duplicate the same
assertion that RequestRouter::route returns RoutingDecision::Ignore for
E3StageChanged with new_stage E3Stage::Failed when the e3_id is in the completed
set; remove one of them (or merge into a single test) so only one test asserts
that behavior using from_data(E3StageChanged { e3_id, new_stage: E3Stage::Failed
}) with completed containing the id and asserting RequestRouter::route(&msg,
&completed) == RoutingDecision::Ignore; keep the most clearly named test and
update its previous_stage value if you want to still exercise a different prior
stage for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/request/src/e3_cipher.rs`:
- Around line 101-104: The current Err branch in the create_and_store result
swallows the error (tracing::error!(e3_id = %data.e3_id, "failed to create E3
cipher: {e}")) which allows silent fallback to the master cipher and breaks
forward secrecy; change this handling either to propagate the failure (return
Err(...) / propagate the create_and_store error from the enclosing function so
the round aborts) or, if degraded operation is acceptable, explicitly downgrade
to tracing::warn! and set a clearly-documented fallback flag/event (emit a bus
error or status entry) so downstream code knows the per-E3 cipher is missing;
update the create_and_store error branch accordingly and reference the same
e3_id in the new return/error/event so callers can correlate the failure.

---

Nitpick comments:
In `@crates/request/src/domain/routing.rs`:
- Around line 204-219: Two tests, stage_changed_to_failed_ignored_when_completed
and stage_changed_to_failed_ignored_when_already_completed, duplicate the same
assertion that RequestRouter::route returns RoutingDecision::Ignore for
E3StageChanged with new_stage E3Stage::Failed when the e3_id is in the completed
set; remove one of them (or merge into a single test) so only one test asserts
that behavior using from_data(E3StageChanged { e3_id, new_stage: E3Stage::Failed
}) with completed containing the id and asserting RequestRouter::route(&msg,
&completed) == RoutingDecision::Ignore; keep the most clearly named test and
update its previous_stage value if you want to still exercise a different prior
stage for clarity.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0040c354-515a-4f3c-8b12-d80f8c63a2da

📥 Commits

Reviewing files that changed from the base of the PR and between 209a130 and 6cb9c6c.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • crates/ciphernode-builder/src/ciphernode_builder.rs
  • crates/crypto/src/cipher.rs
  • crates/events/src/enclave_event/e3_failed.rs
  • crates/events/src/store_keys.rs
  • crates/keyshare/src/actors/threshold_keyshare.rs
  • crates/keyshare/src/ext.rs
  • crates/request/Cargo.toml
  • crates/request/src/domain/routing.rs
  • crates/request/src/e3_cipher.rs
  • crates/request/src/lib.rs

Comment thread crates/request/src/e3_cipher.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/keyshare/src/actors/threshold_keyshare.rs (1)

2228-2232: ⚡ Quick win

Exercise the actor path with distinct ciphers.

These tests prove the Cipher primitive rejects the wrong key, but they never drive ThresholdKeyshare with different cipher and multithread_cipher values. A regression back to self.cipher in handle_shares_generated would still pass here because start_actor wires the same key into both fields.

Also applies to: 2350-2422

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/keyshare/src/actors/threshold_keyshare.rs` around lines 2228 - 2232,
Create and use two distinct Cipher instances when constructing
ThresholdKeyshareParams so tests exercise the actor path with different keys:
call Cipher::from_password twice (e.g., for "test-password" and a different
password) and pass one into the cipher field and the other into
multithread_cipher before calling ThresholdKeyshare::new (and likewise update
the other test block referenced around the 2350-2422 region); this ensures
start_actor can't mask a regression where handle_shares_generated incorrectly
uses self.cipher instead of the multithreaded cipher.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/keyshare/src/actors/threshold_keyshare.rs`:
- Around line 2228-2232: Create and use two distinct Cipher instances when
constructing ThresholdKeyshareParams so tests exercise the actor path with
different keys: call Cipher::from_password twice (e.g., for "test-password" and
a different password) and pass one into the cipher field and the other into
multithread_cipher before calling ThresholdKeyshare::new (and likewise update
the other test block referenced around the 2350-2422 region); this ensures
start_actor can't mask a regression where handle_shares_generated incorrectly
uses self.cipher instead of the multithreaded cipher.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3b8f2384-5c52-47fc-b4e3-211bfb965739

📥 Commits

Reviewing files that changed from the base of the PR and between b827b9e and d785c16.

📒 Files selected for processing (2)
  • crates/keyshare/src/actors/threshold_keyshare.rs
  • crates/keyshare/src/ext.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/keyshare/src/ext.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/keyshare/src/domain/share_generation.rs (1)

63-67: ⚡ Quick win

Make the two cipher roles non-positional.

Lines 64-67 take two adjacent &Cipher values with opposite downstream contracts. A caller can swap them without a compiler error and break either Multithread proof handling or local own-share recovery at runtime. A tiny wrapper struct or distinct newtypes would make this API much harder to misuse.

♻️ Proposed refactor
+pub(crate) struct SharePlanCiphers<'a> {
+    pub transport: &'a Cipher,
+    pub local: &'a Cipher,
+}
+
 pub(crate) fn build_shares_generated_plan(
-    transport_cipher: &Cipher,
-    local_cipher: &Cipher,
+    ciphers: SharePlanCiphers<'_>,
     share_enc_preset: BfvPreset,
     party_id: u64,
     threshold_m: u64,
@@
-        transport_cipher,
+        ciphers.transport,
@@
-        local_cipher,
+        ciphers.local,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/keyshare/src/domain/share_generation.rs` around lines 63 - 67, The
parameters transport_cipher and local_cipher of build_shares_generated_plan are
two plain &Cipher references whose roles can be accidentally swapped; introduce
distinct wrapper types (e.g., TransportCipher(Cipher) and LocalCipher(Cipher) or
a small struct with named fields) and change the function signature to accept
&TransportCipher and &LocalCipher (or the struct) instead of two &Cipher to make
the roles non-positional; update all callers to wrap their Cipher values in the
new types and adjust any pattern matching or method accessors to unwrap the
inner Cipher where needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/keyshare/src/domain/share_generation.rs`:
- Around line 63-67: The parameters transport_cipher and local_cipher of
build_shares_generated_plan are two plain &Cipher references whose roles can be
accidentally swapped; introduce distinct wrapper types (e.g.,
TransportCipher(Cipher) and LocalCipher(Cipher) or a small struct with named
fields) and change the function signature to accept &TransportCipher and
&LocalCipher (or the struct) instead of two &Cipher to make the roles
non-positional; update all callers to wrap their Cipher values in the new types
and adjust any pattern matching or method accessors to unwrap the inner Cipher
where needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3a951f0f-c68e-4848-b9e4-b526566e757f

📥 Commits

Reviewing files that changed from the base of the PR and between d785c16 and 01efc70.

📒 Files selected for processing (2)
  • crates/keyshare/src/actors/threshold_keyshare.rs
  • crates/keyshare/src/domain/share_generation.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/keyshare/src/actors/threshold_keyshare.rs

@0xjei

0xjei commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

still working on this!

@0xjei 0xjei marked this pull request as draft June 9, 2026 15:11

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/keyshare/src/actors/threshold_keyshare.rs (1)

2130-2136: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve the phase-specific timeout event here too.

Unlike the encryption-key and threshold-share timeout handlers above, this branch never republishes DecryptionKeySharedCollectionFailed before emitting E3Failed. That drops the structured missing_parties / reason payload for the decryption phase and makes this timeout path behave differently in history/debug consumers.

🧾 Keep timeout telemetry consistent
             self.decryption_key_shared_collector = None;
+            self.bus.publish_without_context(msg.clone())?;

             self.bus.publish_without_context(E3Failed {
                 e3_id: msg.e3_id.clone(),
                 failed_at_stage: E3Stage::CommitteeFinalized,
                 reason: FailureReason::DecryptionTimeout,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/keyshare/src/actors/threshold_keyshare.rs` around lines 2130 - 2136,
This branch currently clears self.decryption_key_shared_collector and publishes
E3Failed without first emitting the phase-specific timeout event; restore parity
by publishing DecryptionKeySharedCollectionFailed (including the collector's
missing_parties/related payload and reason FailureReason::DecryptionTimeout)
before publishing E3Failed, using the same data sourced from
self.decryption_key_shared_collector (or a snapshot of it) so consumers receive
the structured timeout telemetry for the decryption phase; ensure you still
clear self.decryption_key_shared_collector afterwards and keep the E3Failed
publish as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/keyshare/src/actors/threshold_keyshare.rs`:
- Around line 2130-2136: This branch currently clears
self.decryption_key_shared_collector and publishes E3Failed without first
emitting the phase-specific timeout event; restore parity by publishing
DecryptionKeySharedCollectionFailed (including the collector's
missing_parties/related payload and reason FailureReason::DecryptionTimeout)
before publishing E3Failed, using the same data sourced from
self.decryption_key_shared_collector (or a snapshot of it) so consumers receive
the structured timeout telemetry for the decryption phase; ensure you still
clear self.decryption_key_shared_collector afterwards and keep the E3Failed
publish as-is.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d48bfb54-ef43-4eda-a75f-e6d144682f52

📥 Commits

Reviewing files that changed from the base of the PR and between 01efc70 and 8c02a62.

📒 Files selected for processing (5)
  • crates/ciphernode-builder/src/ciphernode_builder.rs
  • crates/keyshare/src/actors/threshold_keyshare.rs
  • crates/keyshare/src/domain/share_generation.rs
  • crates/keyshare/src/ext.rs
  • crates/multithread/src/multithread.rs
💤 Files with no reviewable changes (1)
  • crates/keyshare/src/ext.rs

Comment thread crates/multithread/src/multithread.rs
@0xjei

0xjei commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

WIP WIP WIP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement forward secrecy by creating E3 keys

1 participant