test(devnet): cross-RPC nonce-replay e2e test for nonceEnforcement (batch E)#899
Conversation
|
Warning Review limit reached
More reviews will be available in 48 minutes and 25 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughAdds a paired devnet e2e test suite (shell + JavaScript) that boots a funded genesis, submits a single signed transaction to two RPC nodes concurrently, and validates nonce-enforcement deduplication: receiver credit equals exactly the transfer amount and sender nonce advances by exactly 1. ChangesDouble-broadcast nonce-replay protection e2e test
Possibly related PRs
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@greptile review |
Review Summary by QodoAdd cross-RPC nonce-replay e2e test for nonceEnforcement
WalkthroughsDescription• Add end-to-end test for cross-RPC nonce-replay protection • Verifies nonceEnforcement fork prevents double-spend attacks • Broadcasts same signed transaction through two nodes simultaneously • Asserts receiver balance increases by exactly one transfer amount • Validates sender nonce advances by exactly one (not two) Diagramflowchart LR
A["Devnet Setup<br/>with nonceEnforcement"] --> B["Build + Sign<br/>Single TX"]
B --> C["Submit to<br/>Node-1 & Node-2<br/>in Parallel"]
C --> D["Broadcast<br/>ValidityData<br/>from Both"]
D --> E["Poll Chain State<br/>for 60s"]
E --> F["Assert:<br/>Receiver Delta = 1x<br/>Sender Nonce Delta = 1"]
F --> G{Test Result}
G -->|Pass| H["Exit 0: OK"]
G -->|Fail| I["Exit 1: FAIL<br/>with Diagnostics"]
File Changes1. testing/devnet/scripts/test-double-broadcast-e2e.sh
|
Code Review by Qodo
1. Missing await on getAddress
|
Greptile SummaryAdds a new devnet e2e test for the
Confidence Score: 3/5The test harness is structurally sound and addresses many issues from prior reviews, but it still depends on a docker-compose.fixture.yml overlay that does not exist in the repository, making the script non-runnable as-is. The mjs script has been well hardened — dual-RPC polling via sampleDeltas, the settling window, the both-confirm gate, and the zero-amount guard are all present. The remaining functional gaps are the missing fixture overlay (flagged in a previous review, still absent) that every docker compose invocation references, and the lack of transient-error recovery in sampleDeltas which could cause false-negative CI failures during node startup. Both files warrant attention: test-double-broadcast-e2e.sh is blocked by the missing docker-compose.fixture.yml, and double_broadcast_replay.mjs should add null-guards around the sampleDeltas() return value to survive transient network errors during polling. Important Files Changed
Sequence DiagramsequenceDiagram
participant SH as test-double-broadcast-e2e.sh
participant N1 as node-1 RPC
participant N2 as node-2 RPC
participant BFT as Consensus (4-of-5 BFT)
participant MJS as double_broadcast_replay.mjs
SH->>N1: docker compose up (all 5 validators)
SH->>MJS: bun double_broadcast_replay.mjs
MJS->>N1: demos1.pay() build+sign single tx
N1-->>MJS: "signedTx (nonce=N)"
par confirm in parallel
MJS->>N1: DemosTransactions.confirm(signedTx)
N1-->>MJS: confirmed1
and
MJS->>N2: DemosTransactions.confirm(signedTx)
N2-->>MJS: confirmed2
end
Note over MJS: Both confirms must succeed
par broadcast in parallel
MJS->>N1: DemosTransactions.broadcast(confirmed1)
and
MJS->>N2: DemosTransactions.broadcast(confirmed2)
end
N1->>BFT: "propose tx nonce=N"
N2->>BFT: "propose same tx nonce=N REPLAY"
BFT->>BFT: block N apply tx nonce advances
BFT->>BFT: block N+1 GCRNonceRoutines rejects replay
loop Poll up to 70s
MJS->>N1: getAddressInfo
MJS->>N2: getAddressInfo
Note over MJS: MAX delta across both nodes
end
MJS->>MJS: "Assert delta==amountOs and nonce_delta==1"
Reviews (13): Last reviewed commit: "fix(e2e): address CodeRabbit findings on..." | Re-trigger Greptile |
Greptile SummaryThis PR adds an end-to-end devnet test for the
Confidence Score: 2/5The test cannot run as written — the devnet fixture compose file it depends on is not present in the repository. The bash harness fails at docker compose up because docker-compose.fixture.yml is absent from the repository, so every invocation exits before any node starts and the test provides zero coverage. The poll loop in the bun script only queries node-1 despite the comment claiming both RPCs are cross-checked, risking silent false negatives. An unguarded property access on the pay() return value would surface as an unhandled TypeError rather than a clean diagnostic exit. Both files need attention: test-double-broadcast-e2e.sh needs the missing fixture compose file resolved, and double_broadcast_replay.mjs needs the poll-loop RPC cross-check and the null guard corrected. Important Files Changed
Sequence DiagramsequenceDiagram
participant SH as test-double-broadcast-e2e.sh
participant DC as Docker Compose
participant BUN as double_broadcast_replay.mjs
participant N1 as node-1 RPC
participant N2 as node-2 RPC
participant Chain as Consensus / Chain
SH->>DC: docker compose up node-1 node-2 postgres tlsnotary
SH->>N1: poll GET / until Hello World
SH->>N2: poll GET / until Hello World
SH->>N1: getAddressInfo(sender)
SH->>BUN: bun double_broadcast_replay.mjs
BUN->>N1: connect + connectWallet(mnemonic)
BUN->>N2: connect + connectWallet(mnemonic)
BUN->>N1: getAddressInfo snapshot pre-state
BUN->>N1: pay() build and sign ONE tx
par Parallel confirm
BUN->>N1: DemosTransactions.confirm(signedTx)
and
BUN->>N2: DemosTransactions.confirm(signedTx)
end
par Parallel broadcast
BUN->>N1: DemosTransactions.broadcast(validityData1)
and
BUN->>N2: DemosTransactions.broadcast(validityData2)
end
Chain->>Chain: GCRNonceRoutines expectedPrior check rejects duplicate
loop Poll up to 60s node-1 only
BUN->>N1: getAddressInfo(receiver) and getAddressInfo(sender)
BUN->>BUN: "check delta == expectedAmount and nonceDelta == 1"
end
BUN-->>SH: exit 0 OK or exit 1 FAIL
Reviews (1): Last reviewed commit: "test(devnet): cross-RPC nonce-replay e2e..." | Re-trigger Greptile |
| const sender = demos1.getAddress() | ||
| console.log(`[double-broadcast] SENDER=${sender}`) | ||
|
|
||
| // ----------------------------------------------------------------------------- | ||
| // 2. Snapshot pre-state | ||
| // ----------------------------------------------------------------------------- | ||
| const senderInfoBefore = await demos1.getAddressInfo(sender) | ||
| const receiverInfoBefore = await demos1.getAddressInfo(RECEIVER_PUBKEY) |
There was a problem hiding this comment.
1. Missing await on getaddress 🐞 Bug ≡ Correctness
double_broadcast_replay.mjs assigns sender = demos1.getAddress() without awaiting, but getAddress() is used elsewhere in the repo as an async API. This makes sender a Promise and breaks subsequent getAddressInfo(sender) calls, causing the e2e test to fail before it can validate nonce-replay protection.
Agent Prompt
### Issue description
`sender` is assigned from `demos1.getAddress()` without `await`, so it is likely a Promise rather than an address string. The script then passes `sender` into `getAddressInfo()`, which will fail or behave incorrectly.
### Issue Context
Another repo script uses `await demos.getAddress()` when retrieving the wallet address, indicating `getAddress()` is async.
### Fix Focus Areas
- testing/devnet/scripts/double_broadcast_replay.mjs[61-80]
### Suggested fix
- Change to `const sender = await demos1.getAddress()`.
- (Optional hardening) Validate `sender` matches expected hex address format before calling `getAddressInfo(sender)` and exit with a clear error if not.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
@greptileai review |
|
@greptile review |
|
@greptile review |
…atch E) Closes the e2e-test follow-up flagged in PR #886's description. What this verifies Builds + signs ONE transaction via the SDK against node-1, then submits the SAME signed tx to BOTH node-1 AND node-2 in parallel through `DemosTransactions.confirm()` + `DemosTransactions. broadcast()`. Asserts that after the chain advances: 1. Receiver balance increased by EXACTLY one transfer amount (not 2× — no double-spend). 2. Sender account nonce advanced by EXACTLY 1 (not 2 — both did not apply at consensus). Failure modes (any of these → FAIL with diagnostic dump): - `observed_receiver_delta > amount`: cross-RPC double-spend, both broadcasts applied (consensus-side `expectedPrior` reject in `GCRNonceRoutines` failed). - `observed_nonce_delta > 1`: same as above. - `observed_receiver_delta < amount`: neither broadcast landed (test couldn't verify the safety net engaged). Files - `testing/devnet/scripts/double_broadcast_replay.mjs` — the dual-broadcast bun script. Reads NODE1_URL / NODE2_URL / IDENTITY_PATH / RECEIVER_PUBKEY / AMOUNT_OS from env so the bash harness can wire it to devnet without source edits. - `testing/devnet/scripts/test-double-broadcast-e2e.sh` — bash harness. Mirrors `test-transfer-e2e.sh` shape: boots devnet fixture stack, waits for both RPCs, runs the dual broadcast, asserts via the bun script's exit code. How to run bash testing/devnet/scripts/test-double-broadcast-e2e.sh bash testing/devnet/scripts/test-double-broadcast-e2e.sh --keep --no-build Exercises - PR #886's consensus-side `expectedPrior` apply-time reject in `GCRNonceRoutines`. - PR #887's same-node TOCTOU advisory lock in `Mempool. addTransaction` (the second broadcast hitting node-1 may be caught at the mempool boundary before reaching consensus, depending on race ordering). - PR #888's vote-race fix in `broadcastBlockHash` (both txs going through consensus correctly tally and the second one fails BFT once `expectedPrior` rejects the second nonce edit). Out of scope - Performance / load test variants (1000 concurrent replays): tracked separately as the "mempool stress test" item. - Cross-RPC vote-relay scenarios (TODO at end of `broadcastBlockHash.ts`): separate consensus design follow-up.
Greptile P1 (mjs:103): `signedTx?.hash` was guarded but the next line read `signedTx.content.nonce` unguarded. If `pay()` returns null/undefined the script would crash with `TypeError` instead of the clean `process.exit(1)` used elsewhere. Added a single guard on `signedTx?.content` covering both fields. Greptile P1 (mjs:183): post-state polling only queried node-1, despite the comment claiming both RPCs were checked. If the credit committed on node-2 first and node-1 lagged on replication, all 30 polls saw stale state and the test reported "neither tx landed" — a false negative that hid replay-protection failures behind apparent network lag. Now polls BOTH RPCs in parallel and takes the max delta across nodes — if either side observed the credit, the credit landed. The double-spend assertion (`observedDelta > expectedDelta`) still catches real replays because the inflated balance would show up *somewhere*. Per-node deltas are logged at each poll for debuggability. Greptile P2 (sh:147): if the sender balance is between 1 and 9 OS, integer-divided 10% rounds to 0 and the test silently runs as a zero-value transfer, which doesn't exercise the double-spend path meaningfully. Now fails fast with a clear error before invoking the bun script. This is a cheap safety net for a corner case that shouldn't happen against the fixture, but would silently produce a green test if it did. Rejected (verified false positives): - Greptile P1 (sh:22): claimed `docker-compose.fixture.yml` doesn't exist. The file is present at `testing/devnet/docker-compose. fixture.yml` — Greptile read the wrong directory listing. - qodo (mjs:76): claimed `getAddress()` needs `await` because it "is used elsewhere as async". Per the SDK type defs (demosclass.ts:195), `getAddress(): string`. The companion script `transfer_10pct.mjs` uses the same sync form. No fix needed.
Greptile iter-2 P1: the script started only node-1, node-2, postgres, tlsnotary — matching what `test-transfer-e2e.sh` does. But this test needs the chain to actually advance past a block-finalize so the sender's nonce + receiver's balance change become observable. The fixture genesis (`genesis.devnet.json`) registers 5 validators with status=2 each, so consensus requires 2/3+1 = 4-of-5 votes per block. With only 2 validators live, BFT can never reach quorum, every poll sees `observedDelta=0`, and the test reports "neither tx landed within 60s" — a false negative that looks identical to a replay-protection failure. Now starts the full validator set. `test-transfer-e2e.sh` should probably be updated too, but that's outside this PR's scope (and the fact that it currently passes with 2 nodes suggests its assertion fires before block finalization, or some other path I haven't traced; leaving it for a separate audit).
de10a07 to
71f4617
Compare
Greptile iter-3 P1: the poll loop broke on the first observation of `observedDelta >= expectedDelta` and evaluated all assertions against that single snapshot. If the consensus `expectedPrior` check is broken or absent, the replayed tx could land in block N+1 — one or two blocks after the legitimate one — by which point the script has already printed "OK" and exited 0. A real double-spend would pass the test undetected. Split the polling into two phases: - Phase 1: wait up to 60s for the first credit. Unchanged logic, just records `firstObservedAt` instead of breaking. - Phase 2: settling window. Keep polling for ~10s (5 × 2s) after the first observation. Do NOT early-break — always wait the full window so the test takes the same wall-clock time on success and failure. If the replay applied late, the inflated delta surfaces here and the `observedDelta > expectedDelta` assertion at the bottom catches it. Same wall-clock cost on success (~+10s), but now a regression in `GCRNonceRoutines.expectedPrior` that allows sequenced re-application fails the test instead of passing it.
|
@greptile review |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@testing/devnet/scripts/double_broadcast_replay.mjs`:
- Around line 117-147: The current logic lets the test continue if only one of
the confirm() calls (confirmResults / confirm1 / confirm2) succeeds, which
allows a single-node fallback; update the post-confirmation check to require
both confirmed1 and confirmed2 to be non-null (i.e., both confirmResults entries
have status "fulfilled") or, if one is rejected, assert that its rejection
reason explicitly matches the expected replay/nonce-enforcement error string
before allowing progress; otherwise log an error and exit (process.exit(1)) so
the test fails fast instead of silently proceeding with a single successful leg.
In `@testing/devnet/scripts/test-double-broadcast-e2e.sh`:
- Around line 118-125: The pipeline inside get_balance_os can fail under set -e
and cause the whole script to exit; wrap the risky command substitutions that
call get_balance_os (the balance-probing assignments) so failures don't trigger
an immediate exit — either run the subshell with set +e temporarily or append ||
true to the command substitution, then explicitly test the result and on
empty/failure print a clear error and dump node logs (e.g., docker/journalctl
output) before exiting; ensure you update get_balance_os callers to check the
returned value and emit the diagnostic block instead of letting the pipeline
abort the script.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1d763eee-fccf-431c-86bd-4f0e6bb82b2c
📒 Files selected for processing (2)
testing/devnet/scripts/double_broadcast_replay.mjstesting/devnet/scripts/test-double-broadcast-e2e.sh
CodeRabbit Major (mjs:147): the gate allowed the test to proceed if EITHER confirm() succeeded. That degrades the test to single-node happy-path the moment one RPC rejects (e.g. RPC-side nonce check fires) — CI passes green without ever exercising the cross-RPC consensus dedup path the test exists to cover. Tightened to require BOTH confirms fulfilled; either rejection now fails fast with both sides' rejection reasons logged. Simplified the broadcast block since confirmed1/confirmed2 are now guaranteed non-null past the gate. CodeRabbit Minor (sh:125): under `set -euo pipefail`, a curl timeout or empty grep inside `$(get_balance_os ...)` made the command substitution exit non-zero and killed the harness before the explicit diagnostic block could run. Wrapped both probe assignments in `if !` so failures produce a clear error + node logs instead of a silent abort.
|
@greptile review |
Plain-English summary
What your code does
The `nonceEnforcement` fork landed across PRs #884-#887 to close cross-RPC replay attacks. The protection is supposed to work like this:
But until this PR, there was no automated test that exercised that scenario end-to-end. PRs #886 + #887's plain-English summaries described the protection, manual traces argued it works, but nothing actually ran the attack against running nodes.
What this PR adds
A devnet e2e test that simulates the attack and asserts the protection works:
Why this matters
Three different layers of protection landed across four PRs (RPC validation, mempool advisory lock, consensus `expectedPrior` reject). Without an integration test exercising all three in concert against real running nodes, a future refactor could silently regress any one of them and the protection would degrade quietly — no unit test would catch it.
This test is the canary. Run it after any change to `assignNonce`, `Mempool.addTransaction`, `GCRNonceRoutines`, `endpointValidation` hash-strip, or `broadcastBlockHash` vote tally.
Technical detail
How to run
```bash
bash testing/devnet/scripts/test-double-broadcast-e2e.sh
or with --keep / --no-build flags (same as test-transfer-e2e.sh)
```
What it exercises
Failure modes the test catches
Out of scope
Test plan
Summary by CodeRabbit
Release Notes