Skip to content

test(devnet): cross-RPC nonce-replay e2e test for nonceEnforcement (batch E)#899

Merged
tcsenpai merged 6 commits into
stabilisationfrom
feat/audit-sweep-batch-e-nonce-e2e-test-2026-06-01
Jun 2, 2026
Merged

test(devnet): cross-RPC nonce-replay e2e test for nonceEnforcement (batch E)#899
tcsenpai merged 6 commits into
stabilisationfrom
feat/audit-sweep-batch-e-nonce-e2e-test-2026-06-01

Conversation

@tcsenpai
Copy link
Copy Markdown
Contributor

@tcsenpai tcsenpai commented Jun 1, 2026

Plain-English summary

What your code does

The `nonceEnforcement` fork landed across PRs #884-#887 to close cross-RPC replay attacks. The protection is supposed to work like this:

  1. Sign a transaction once. It has a specific nonce (sequence number).
  2. Try to broadcast that same signed transaction through two different nodes simultaneously.
  3. Both nodes may validate it independently and pass it forward.
  4. At block-application time, only ONE survives — the second sees `accountGCR.nonce !== expectedPrior` and is rejected.
  5. Receiver gets paid exactly once. Sender's nonce advances by exactly 1.

But until this PR, there was no automated test that exercised that scenario end-to-end. PRs #886 + #887's plain-English summaries described the protection, manual traces argued it works, but nothing actually ran the attack against running nodes.

What this PR adds

A devnet e2e test that simulates the attack and asserts the protection works:

  • New bash harness `test-double-broadcast-e2e.sh` boots the devnet fixture stack (with `nonceEnforcement.activationHeight: 0` active from block 0, set by PR feat(forks): register nonceEnforcement fork + ship assignNonce validation infra (batch C PR 1) #884's genesis fixture).
  • New bun script `double_broadcast_replay.mjs` signs one transaction, fires it at BOTH node-1 and node-2 in parallel via `DemosTransactions.confirm()` and `broadcast()`.
  • Polls for up to 60s, then asserts:
    • Receiver balance went up by exactly one transfer amount (not 2×).
    • Sender's nonce advanced by exactly 1 (not 2).
  • Exits 0 on OK, 1 with diagnostic dump on FAIL.

Why this matters

Three different layers of protection landed across four PRs (RPC validation, mempool advisory lock, consensus `expectedPrior` reject). Without an integration test exercising all three in concert against real running nodes, a future refactor could silently regress any one of them and the protection would degrade quietly — no unit test would catch it.

This test is the canary. Run it after any change to `assignNonce`, `Mempool.addTransaction`, `GCRNonceRoutines`, `endpointValidation` hash-strip, or `broadcastBlockHash` vote tally.


Technical detail

How to run

```bash
bash testing/devnet/scripts/test-double-broadcast-e2e.sh

or with --keep / --no-build flags (same as test-transfer-e2e.sh)

```

What it exercises

Layer Defence PR
RPC validation `assignNonce` checks `tx.content.nonce === account.nonce + 1 + pendingCount` #884 + #885
Mempool insert Advisory lock + in-lock re-check, fork-gated #887
Consensus apply `GCRNonceRoutines` rejects when `accountGCR.nonce !== expectedPrior` #886
BFT vote tally Vote race fixed, signature-count semantics removed #888

Failure modes the test catches

Symptom Root cause
`observed_receiver_delta > amount` Cross-RPC double-spend; both broadcasts applied at consensus
`observed_nonce_delta > 1` Same as above
`observed_receiver_delta < amount` after 60s Neither broadcast landed (network issue / safety net too eager / test infra broken)
Both confirms rejected Test cannot verify replay protection — fails with exit 1

Out of scope

  • Performance / load-test variants (1000 concurrent replays from one sender). Tracked separately.
  • Cross-RPC vote-relay scenarios (`broadcastBlockHash.ts` TODO). Separate consensus design.

Test plan

Summary by CodeRabbit

Release Notes

  • Tests
    • Added end-to-end tests for validating nonce-replay protection across multiple RPC nodes.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

Warning

Review limit reached

@tcsenpai, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 48 minutes and 25 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1de59945-fd33-48a2-970a-2819a0b5f723

📥 Commits

Reviewing files that changed from the base of the PR and between fcb7a1f and 34e88d7.

📒 Files selected for processing (2)
  • testing/devnet/scripts/double_broadcast_replay.mjs
  • testing/devnet/scripts/test-double-broadcast-e2e.sh

Walkthrough

Adds a paired devnet e2e test suite (shell + JavaScript) that boots a funded genesis, submits a single signed transaction to two RPC nodes concurrently, and validates nonce-enforcement deduplication: receiver credit equals exactly the transfer amount and sender nonce advances by exactly 1.

Changes

Double-broadcast nonce-replay protection e2e test

Layer / File(s) Summary
Shell script initialization and devnet bootstrap
testing/devnet/scripts/test-double-broadcast-e2e.sh
Initializes strict bash, parses --keep and --no-build CLI flags, defines cleanup trap for devnet teardown, materializes .env and identity files, boots docker-compose with fresh volumes, and implements RPC health polling via curl until "Hello, World" marker appears.
Sender and receiver identification and balance verification
testing/devnet/scripts/test-double-broadcast-e2e.sh
Fetches sender and receiver public keys from devnet identity file and JSON-RPC nodeCall, queries sender balance from both nodes, validates genesis applied (non-zero balance), and detects inter-node balance disagreement.
JavaScript client initialization and transaction building
testing/devnet/scripts/double_broadcast_replay.mjs
Validates required environment variables (RPC URLs, sender identity, receiver key, transfer amount), connects two SDK clients to separate nodes using shared mnemonic, snapshots pre-state sender nonce and balances, validates sender has sufficient funds, and builds and signs exactly one payment transaction.
Concurrent transaction submission to both nodes
testing/devnet/scripts/double_broadcast_replay.mjs
Submits the same signed transaction to both nodes concurrently via confirm(), handles promise settlement with allSettled, and proceeds only if at least one confirmation succeeds.
Concurrent validity broadcast to both nodes
testing/devnet/scripts/double_broadcast_replay.mjs
Broadcasts per-node ValidityData from successful confirmations to both nodes concurrently, records per-node broadcast status, and skips broadcast for nodes whose confirmations were not fulfilled.
Polling and nonce-enforcement deduplication validation
testing/devnet/scripts/double_broadcast_replay.mjs
Polls both RPCs concurrently for up to 60 seconds using two-phase logic: first waits until receiver credit appears (max delta across nodes), then runs settling window to detect late replay. Asserts receiver delta equals transfer amount and sender nonce delta equals exactly 1; emits FAIL diagnostics for double-spend, missing landing, or unreachable branches.
Test runner invocation and success reporting
testing/devnet/scripts/test-double-broadcast-e2e.sh
Locates and runs double_broadcast_replay.mjs under bun, computes transfer amount as 10% of sender balance (failing if zero), passes node URLs, identity path, receiver key, and amount as environment variables, and reports success on zero exit.

Possibly related PRs

  • kynesyslabs/node#886: The new double_broadcast_replay test directly exercises the nonce-enforcement logic implemented via expectedPrior in transaction confirmation and endpointValidation.
  • kynesyslabs/node#884: The test explicitly validates the nonceEnforcement fork behavior and devnet genesis activation that gates nonce deduplication.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through devnet lanes,
Broadcast twice, but nonce constrains!
One credit flows, one bump ahead—
Replay's stopped—the test is fed. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title directly and clearly describes the main change: a cross-RPC nonce-replay end-to-end test for the nonceEnforcement feature. It is specific, concise, and accurately reflects the primary purpose of the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/audit-sweep-batch-e-nonce-e2e-test-2026-06-01

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tcsenpai
Copy link
Copy Markdown
Contributor Author

tcsenpai commented Jun 1, 2026

@greptile review

@qodo-code-review
Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Add cross-RPC nonce-replay e2e test for nonceEnforcement

🧪 Tests

Grey Divider

Walkthroughs

Description
• Add end-to-end test for cross-RPC nonce-replay protection
• Verifies nonceEnforcement fork prevents double-spend attacks
• Broadcasts same signed transaction through two nodes simultaneously
• Asserts receiver balance increases by exactly one transfer amount
• Validates sender nonce advances by exactly one (not two)
Diagram
flowchart LR
  A["Devnet Setup<br/>with nonceEnforcement"] --> B["Build + Sign<br/>Single TX"]
  B --> C["Submit to<br/>Node-1 & Node-2<br/>in Parallel"]
  C --> D["Broadcast<br/>ValidityData<br/>from Both"]
  D --> E["Poll Chain State<br/>for 60s"]
  E --> F["Assert:<br/>Receiver Delta = 1x<br/>Sender Nonce Delta = 1"]
  F --> G{Test Result}
  G -->|Pass| H["Exit 0: OK"]
  G -->|Fail| I["Exit 1: FAIL<br/>with Diagnostics"]

Loading

Grey Divider

File Changes

1. testing/devnet/scripts/test-double-broadcast-e2e.sh 🧪 Tests +159/-0

Bash harness for devnet double-broadcast e2e test

• Bash harness that boots devnet fixture stack with nonceEnforcement active from block 0
• Waits for both node-1 and node-2 RPCs to become available
• Verifies funded-genesis overlay applied to both nodes by checking sender balance
• Runs the dual-broadcast bun script with environment variables for node URLs, identity, receiver
 pubkey, and transfer amount
• Cleans up devnet on exit (with --keep flag option to leave running for manual inspection)

testing/devnet/scripts/test-double-broadcast-e2e.sh


2. testing/devnet/scripts/double_broadcast_replay.mjs 🧪 Tests +233/-0

Bun script for parallel cross-RPC transaction broadcast test

• Connects two Demos SDK clients to node-1 and node-2 with same sender identity
• Snapshots pre-state: sender/receiver balances and sender nonce
• Builds and signs one transaction via node-1 SDK (single nonce assignment)
• Submits same signed transaction to both nodes in parallel via DemosTransactions.confirm()
• Broadcasts resulting ValidityData from both nodes simultaneously
• Polls post-state for up to 60 seconds, checking receiver balance delta and sender nonce
 advancement
• Asserts receiver balance increased by exactly one transfer amount and sender nonce advanced by
 exactly one
• Exits 0 on success, 1 on double-spend or failure with diagnostic output

testing/devnet/scripts/double_broadcast_replay.mjs


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown
Contributor

qodo-code-review Bot commented Jun 1, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0)

Grey Divider


Action required

1. Missing await on getAddress 🐞 Bug ≡ Correctness
Description
double_broadcast_replay.mjs assigns sender = demos1.getAddress() without awaiting, but
getAddress() is used elsewhere in the repo as an async API. This makes sender a Promise and
breaks subsequent getAddressInfo(sender) calls, causing the e2e test to fail before it can
validate nonce-replay protection.
Code

testing/devnet/scripts/double_broadcast_replay.mjs[R69-76]

Evidence
The new script calls getAddress() without awaiting and then immediately uses the result as an
address; existing repo usage awaits getAddress(), showing the intended async contract.

testing/devnet/scripts/double_broadcast_replay.mjs[61-76]
scripts/send-l2-batch.ts[300-310]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`sender` is assigned from `demos1.getAddress()` without `await`, so it is likely a Promise rather than an address string. The script then passes `sender` into `getAddressInfo()`, which will fail or behave incorrectly.

### Issue Context
Another repo script uses `await demos.getAddress()` when retrieving the wallet address, indicating `getAddress()` is async.

### Fix Focus Areas
- testing/devnet/scripts/double_broadcast_replay.mjs[61-80]

### Suggested fix
- Change to `const sender = await demos1.getAddress()`.
- (Optional hardening) Validate `sender` matches expected hex address format before calling `getAddressInfo(sender)` and exit with a clear error if not.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Post-state polled from one node ✓ Resolved 🐞 Bug ☼ Reliability
Description
The poll loop claims it re-checks post-state via both RPCs but actually queries only node-1
(demos1) for receiver balance and sender nonce. This can produce false failures (e.g., reporting
"neither tx landed") if node-1 is temporarily stale while node-2 has the updated state.
Code

testing/devnet/scripts/double_broadcast_replay.mjs[R168-185]

Evidence
The code comment explicitly promises cross-RPC polling, but the implementation reads only from
demos1 in the loop, contradicting the intended behavior.

testing/devnet/scripts/double_broadcast_replay.mjs[167-185]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The poll loop says it checks both RPCs to avoid stale reads, but it only calls `demos1.getAddressInfo(...)`. This undermines the test's stated robustness goal.

### Issue Context
The test is explicitly exercising a cross-RPC scenario; post-state should be sampled from both nodes (or require convergence) to avoid relying on a single node's freshness.

### Fix Focus Areas
- testing/devnet/scripts/double_broadcast_replay.mjs[167-193]

### Suggested fix
- In each poll iteration, query BOTH `demos1` and `demos2` for receiver info and sender info (e.g., `Promise.all`).
- Compute deltas using either:
 - the max observed receiver balance / sender nonce across nodes, or
 - require both nodes to reflect the expected deltas before breaking and asserting.
- Update the log output to show both nodes’ observed deltas to aid debugging when nodes disagree.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Brittle balance JSON parsing 🐞 Bug ⚙ Maintainability
Description
test-double-broadcast-e2e.sh extracts balance using a regex over the raw JSON response, which is
fragile to response shape/format changes and can cause the script to exit early under `set -euo
pipefail`. This can break the test before it runs the double-broadcast scenario even when the RPC is
healthy.
Code

testing/devnet/scripts/test-double-broadcast-e2e.sh[R112-119]

Evidence
The harness uses regex parsing of the response body; the server-side response serialization is plain
JSON.stringify, so robust extraction should use JSON parsing to avoid brittle coupling to
formatting.

testing/devnet/scripts/test-double-broadcast-e2e.sh[112-119]
src/libs/network/bunServer.ts[288-303]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The script parses JSON with `grep|sed`, assuming an exact substring match for `"balance":"<digits>"`. Any small output variation (or different envelope nesting) can make this return empty and abort the harness.

### Issue Context
The RPC server returns responses via `JSON.stringify(...)`, so the balance is part of a structured JSON envelope; it should be extracted via JSON parsing rather than regex.

### Fix Focus Areas
- testing/devnet/scripts/test-double-broadcast-e2e.sh[112-119]

### Suggested fix
- Replace the `grep|sed` pipeline with a real JSON parse.
 - Since the harness already requires `bun`, a practical option is:
   - `curl ... | bun -e 'import { readFileSync } from "node:fs"; const j=JSON.parse(readFileSync(0,"utf8")); console.log(j?.response?.balance ?? j?.response?.response?.balance ?? "");'`
 - Alternatively use `jq -r` if you want an external dependency.
- Validate the parsed value is a non-empty digit string before proceeding.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

Adds a new devnet e2e test for the nonceEnforcement cross-RPC replay-protection path introduced in PRs #884#888. The test signs one transaction and broadcasts it through both node-1 and node-2 in parallel, then polls both nodes with a settling window to assert the receiver's balance and the sender's nonce each advance by exactly 1.

  • double_broadcast_replay.mjs signs a single tx, confirms it against both nodes in parallel (failing fast if either confirm fails), broadcasts both ValidityData payloads simultaneously, and waits up to 70s (60s detect + 10s settling) before asserting no double-spend.
  • test-double-broadcast-e2e.sh boots all five validator nodes, waits for node-1 and node-2 RPCs, computes a non-zero transfer amount (10% of sender balance), and delegates to the mjs script.

Confidence Score: 3/5

The test harness is structurally sound and addresses many issues from prior reviews, but it still depends on a docker-compose.fixture.yml overlay that does not exist in the repository, making the script non-runnable as-is.

The mjs script has been well hardened — dual-RPC polling via sampleDeltas, the settling window, the both-confirm gate, and the zero-amount guard are all present. The remaining functional gaps are the missing fixture overlay (flagged in a previous review, still absent) that every docker compose invocation references, and the lack of transient-error recovery in sampleDeltas which could cause false-negative CI failures during node startup.

Both files warrant attention: test-double-broadcast-e2e.sh is blocked by the missing docker-compose.fixture.yml, and double_broadcast_replay.mjs should add null-guards around the sampleDeltas() return value to survive transient network errors during polling.

Important Files Changed

Filename Overview
testing/devnet/scripts/double_broadcast_replay.mjs Core replay-protection e2e script; addresses most prior review issues (dual-RPC polling via sampleDeltas, settling window, both-confirm gate) but sampleDeltas has no per-call error recovery — a transient rejection crashes the poll loop rather than skipping a cycle.
testing/devnet/scripts/test-double-broadcast-e2e.sh Bash harness that boots the devnet and invokes the mjs script; correctly starts all 5 validators and guards against zero AMOUNT_OS, but still references the non-existent docker-compose.fixture.yml which causes all docker compose calls to fail before any node starts.

Sequence Diagram

sequenceDiagram
    participant SH as test-double-broadcast-e2e.sh
    participant N1 as node-1 RPC
    participant N2 as node-2 RPC
    participant BFT as Consensus (4-of-5 BFT)
    participant MJS as double_broadcast_replay.mjs

    SH->>N1: docker compose up (all 5 validators)
    SH->>MJS: bun double_broadcast_replay.mjs
    MJS->>N1: demos1.pay() build+sign single tx
    N1-->>MJS: "signedTx (nonce=N)"
    par confirm in parallel
        MJS->>N1: DemosTransactions.confirm(signedTx)
        N1-->>MJS: confirmed1
    and
        MJS->>N2: DemosTransactions.confirm(signedTx)
        N2-->>MJS: confirmed2
    end
    Note over MJS: Both confirms must succeed
    par broadcast in parallel
        MJS->>N1: DemosTransactions.broadcast(confirmed1)
    and
        MJS->>N2: DemosTransactions.broadcast(confirmed2)
    end
    N1->>BFT: "propose tx nonce=N"
    N2->>BFT: "propose same tx nonce=N REPLAY"
    BFT->>BFT: block N apply tx nonce advances
    BFT->>BFT: block N+1 GCRNonceRoutines rejects replay
    loop Poll up to 70s
        MJS->>N1: getAddressInfo
        MJS->>N2: getAddressInfo
        Note over MJS: MAX delta across both nodes
    end
    MJS->>MJS: "Assert delta==amountOs and nonce_delta==1"
Loading

Reviews (13): Last reviewed commit: "fix(e2e): address CodeRabbit findings on..." | Re-trigger Greptile

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

This PR adds an end-to-end devnet test for the nonceEnforcement cross-RPC replay-protection: a bash harness boots the devnet, signs a single transaction, and fires it at both nodes simultaneously; a bun script then asserts the receiver balance increased by exactly one transfer amount and the sender nonce advanced by exactly one.

  • test-double-broadcast-e2e.sh: Orchestrates devnet startup, RPC readiness polling, and invokes the bun script; references docker-compose.fixture.yml which does not exist in the repository, meaning the test cannot run at all in its current state.
  • double_broadcast_replay.mjs: Implements the attack scenario and assertions; the post-state poll loop queries only node-1 despite the comment stating both RPCs are checked, and signedTx.content.nonce is accessed without a null guard immediately after an optional-chained .hash access on the same value.

Confidence Score: 2/5

The test cannot run as written — the devnet fixture compose file it depends on is not present in the repository.

The bash harness fails at docker compose up because docker-compose.fixture.yml is absent from the repository, so every invocation exits before any node starts and the test provides zero coverage. The poll loop in the bun script only queries node-1 despite the comment claiming both RPCs are cross-checked, risking silent false negatives. An unguarded property access on the pay() return value would surface as an unhandled TypeError rather than a clean diagnostic exit.

Both files need attention: test-double-broadcast-e2e.sh needs the missing fixture compose file resolved, and double_broadcast_replay.mjs needs the poll-loop RPC cross-check and the null guard corrected.

Important Files Changed

Filename Overview
testing/devnet/scripts/test-double-broadcast-e2e.sh New bash harness that boots the devnet fixture stack and orchestrates the double-broadcast test; fails immediately because it references a non-existent docker-compose.fixture.yml, and the AMOUNT_OS calculation can silently produce 0 for tiny devnet balances.
testing/devnet/scripts/double_broadcast_replay.mjs New bun script that signs one transaction, submits it to both nodes in parallel, and asserts the receiver balance incremented exactly once; has two correctness issues: an unguarded signedTx.content.nonce access that throws instead of exiting cleanly, and a poll loop that only queries node-1 despite the comment claiming it cross-checks both RPCs.

Sequence Diagram

sequenceDiagram
    participant SH as test-double-broadcast-e2e.sh
    participant DC as Docker Compose
    participant BUN as double_broadcast_replay.mjs
    participant N1 as node-1 RPC
    participant N2 as node-2 RPC
    participant Chain as Consensus / Chain

    SH->>DC: docker compose up node-1 node-2 postgres tlsnotary
    SH->>N1: poll GET / until Hello World
    SH->>N2: poll GET / until Hello World
    SH->>N1: getAddressInfo(sender)
    SH->>BUN: bun double_broadcast_replay.mjs

    BUN->>N1: connect + connectWallet(mnemonic)
    BUN->>N2: connect + connectWallet(mnemonic)
    BUN->>N1: getAddressInfo snapshot pre-state
    BUN->>N1: pay() build and sign ONE tx

    par Parallel confirm
        BUN->>N1: DemosTransactions.confirm(signedTx)
    and
        BUN->>N2: DemosTransactions.confirm(signedTx)
    end

    par Parallel broadcast
        BUN->>N1: DemosTransactions.broadcast(validityData1)
    and
        BUN->>N2: DemosTransactions.broadcast(validityData2)
    end

    Chain->>Chain: GCRNonceRoutines expectedPrior check rejects duplicate

    loop Poll up to 60s node-1 only
        BUN->>N1: getAddressInfo(receiver) and getAddressInfo(sender)
        BUN->>BUN: "check delta == expectedAmount and nonceDelta == 1"
    end

    BUN-->>SH: exit 0 OK or exit 1 FAIL
Loading

Reviews (1): Last reviewed commit: "test(devnet): cross-RPC nonce-replay e2e..." | Re-trigger Greptile

Comment thread testing/devnet/scripts/test-double-broadcast-e2e.sh
Comment thread testing/devnet/scripts/double_broadcast_replay.mjs Outdated
Comment thread testing/devnet/scripts/double_broadcast_replay.mjs
Comment thread testing/devnet/scripts/test-double-broadcast-e2e.sh
Comment on lines +69 to +76
const sender = demos1.getAddress()
console.log(`[double-broadcast] SENDER=${sender}`)

// -----------------------------------------------------------------------------
// 2. Snapshot pre-state
// -----------------------------------------------------------------------------
const senderInfoBefore = await demos1.getAddressInfo(sender)
const receiverInfoBefore = await demos1.getAddressInfo(RECEIVER_PUBKEY)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Missing await on getaddress 🐞 Bug ≡ Correctness

double_broadcast_replay.mjs assigns sender = demos1.getAddress() without awaiting, but
getAddress() is used elsewhere in the repo as an async API. This makes sender a Promise and
breaks subsequent getAddressInfo(sender) calls, causing the e2e test to fail before it can
validate nonce-replay protection.
Agent Prompt
### Issue description
`sender` is assigned from `demos1.getAddress()` without `await`, so it is likely a Promise rather than an address string. The script then passes `sender` into `getAddressInfo()`, which will fail or behave incorrectly.

### Issue Context
Another repo script uses `await demos.getAddress()` when retrieving the wallet address, indicating `getAddress()` is async.

### Fix Focus Areas
- testing/devnet/scripts/double_broadcast_replay.mjs[61-80]

### Suggested fix
- Change to `const sender = await demos1.getAddress()`.
- (Optional hardening) Validate `sender` matches expected hex address format before calling `getAddressInfo(sender)` and exit with a clear error if not.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

@tcsenpai
Copy link
Copy Markdown
Contributor Author

tcsenpai commented Jun 2, 2026

@greptileai review

@tcsenpai
Copy link
Copy Markdown
Contributor Author

tcsenpai commented Jun 2, 2026

@greptile review

Comment thread testing/devnet/scripts/test-double-broadcast-e2e.sh
@tcsenpai
Copy link
Copy Markdown
Contributor Author

tcsenpai commented Jun 2, 2026

@greptile review

tcsenpai added 4 commits June 2, 2026 15:26
…atch E)

Closes the e2e-test follow-up flagged in PR #886's description.

What this verifies
  Builds + signs ONE transaction via the SDK against node-1, then
  submits the SAME signed tx to BOTH node-1 AND node-2 in parallel
  through `DemosTransactions.confirm()` + `DemosTransactions.
  broadcast()`. Asserts that after the chain advances:

    1. Receiver balance increased by EXACTLY one transfer amount
       (not 2× — no double-spend).
    2. Sender account nonce advanced by EXACTLY 1 (not 2 — both
       did not apply at consensus).

  Failure modes (any of these → FAIL with diagnostic dump):
    - `observed_receiver_delta > amount`: cross-RPC double-spend,
      both broadcasts applied (consensus-side `expectedPrior`
      reject in `GCRNonceRoutines` failed).
    - `observed_nonce_delta > 1`: same as above.
    - `observed_receiver_delta < amount`: neither broadcast
      landed (test couldn't verify the safety net engaged).

Files
  - `testing/devnet/scripts/double_broadcast_replay.mjs` —
    the dual-broadcast bun script. Reads NODE1_URL / NODE2_URL /
    IDENTITY_PATH / RECEIVER_PUBKEY / AMOUNT_OS from env so the
    bash harness can wire it to devnet without source edits.
  - `testing/devnet/scripts/test-double-broadcast-e2e.sh` —
    bash harness. Mirrors `test-transfer-e2e.sh` shape: boots
    devnet fixture stack, waits for both RPCs, runs the dual
    broadcast, asserts via the bun script's exit code.

How to run
    bash testing/devnet/scripts/test-double-broadcast-e2e.sh
    bash testing/devnet/scripts/test-double-broadcast-e2e.sh --keep --no-build

Exercises
  - PR #886's consensus-side `expectedPrior` apply-time reject in
    `GCRNonceRoutines`.
  - PR #887's same-node TOCTOU advisory lock in `Mempool.
    addTransaction` (the second broadcast hitting node-1 may be
    caught at the mempool boundary before reaching consensus,
    depending on race ordering).
  - PR #888's vote-race fix in `broadcastBlockHash` (both txs
    going through consensus correctly tally and the second one
    fails BFT once `expectedPrior` rejects the second nonce
    edit).

Out of scope
  - Performance / load test variants (1000 concurrent replays):
    tracked separately as the "mempool stress test" item.
  - Cross-RPC vote-relay scenarios (TODO at end of
    `broadcastBlockHash.ts`): separate consensus design follow-up.
Greptile P1 (mjs:103): `signedTx?.hash` was guarded but the next line
read `signedTx.content.nonce` unguarded. If `pay()` returns
null/undefined the script would crash with `TypeError` instead of the
clean `process.exit(1)` used elsewhere. Added a single guard on
`signedTx?.content` covering both fields.

Greptile P1 (mjs:183): post-state polling only queried node-1, despite
the comment claiming both RPCs were checked. If the credit committed
on node-2 first and node-1 lagged on replication, all 30 polls saw
stale state and the test reported "neither tx landed" — a false
negative that hid replay-protection failures behind apparent network
lag. Now polls BOTH RPCs in parallel and takes the max delta across
nodes — if either side observed the credit, the credit landed. The
double-spend assertion (`observedDelta > expectedDelta`) still catches
real replays because the inflated balance would show up *somewhere*.
Per-node deltas are logged at each poll for debuggability.

Greptile P2 (sh:147): if the sender balance is between 1 and 9 OS,
integer-divided 10% rounds to 0 and the test silently runs as a
zero-value transfer, which doesn't exercise the double-spend path
meaningfully. Now fails fast with a clear error before invoking the
bun script. This is a cheap safety net for a corner case that
shouldn't happen against the fixture, but would silently produce a
green test if it did.

Rejected (verified false positives):
- Greptile P1 (sh:22): claimed `docker-compose.fixture.yml` doesn't
  exist. The file is present at `testing/devnet/docker-compose.
  fixture.yml` — Greptile read the wrong directory listing.
- qodo (mjs:76): claimed `getAddress()` needs `await` because it
  "is used elsewhere as async". Per the SDK type defs
  (demosclass.ts:195), `getAddress(): string`. The companion script
  `transfer_10pct.mjs` uses the same sync form. No fix needed.
Greptile iter-2 P1: the script started only node-1, node-2, postgres,
tlsnotary — matching what `test-transfer-e2e.sh` does. But this test
needs the chain to actually advance past a block-finalize so the
sender's nonce + receiver's balance change become observable. The
fixture genesis (`genesis.devnet.json`) registers 5 validators with
status=2 each, so consensus requires 2/3+1 = 4-of-5 votes per block.
With only 2 validators live, BFT can never reach quorum, every poll
sees `observedDelta=0`, and the test reports "neither tx landed within
60s" — a false negative that looks identical to a replay-protection
failure.

Now starts the full validator set. `test-transfer-e2e.sh` should
probably be updated too, but that's outside this PR's scope (and the
fact that it currently passes with 2 nodes suggests its assertion
fires before block finalization, or some other path I haven't traced;
leaving it for a separate audit).
@tcsenpai tcsenpai force-pushed the feat/audit-sweep-batch-e-nonce-e2e-test-2026-06-01 branch from de10a07 to 71f4617 Compare June 2, 2026 13:26
Comment thread testing/devnet/scripts/double_broadcast_replay.mjs Outdated
Greptile iter-3 P1: the poll loop broke on the first observation of
`observedDelta >= expectedDelta` and evaluated all assertions against
that single snapshot. If the consensus `expectedPrior` check is broken
or absent, the replayed tx could land in block N+1 — one or two blocks
after the legitimate one — by which point the script has already
printed "OK" and exited 0. A real double-spend would pass the test
undetected.

Split the polling into two phases:
- Phase 1: wait up to 60s for the first credit. Unchanged logic, just
  records `firstObservedAt` instead of breaking.
- Phase 2: settling window. Keep polling for ~10s (5 × 2s) after the
  first observation. Do NOT early-break — always wait the full window
  so the test takes the same wall-clock time on success and failure.
  If the replay applied late, the inflated delta surfaces here and the
  `observedDelta > expectedDelta` assertion at the bottom catches it.

Same wall-clock cost on success (~+10s), but now a regression in
`GCRNonceRoutines.expectedPrior` that allows sequenced re-application
fails the test instead of passing it.
@tcsenpai
Copy link
Copy Markdown
Contributor Author

tcsenpai commented Jun 2, 2026

@greptile review

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@testing/devnet/scripts/double_broadcast_replay.mjs`:
- Around line 117-147: The current logic lets the test continue if only one of
the confirm() calls (confirmResults / confirm1 / confirm2) succeeds, which
allows a single-node fallback; update the post-confirmation check to require
both confirmed1 and confirmed2 to be non-null (i.e., both confirmResults entries
have status "fulfilled") or, if one is rejected, assert that its rejection
reason explicitly matches the expected replay/nonce-enforcement error string
before allowing progress; otherwise log an error and exit (process.exit(1)) so
the test fails fast instead of silently proceeding with a single successful leg.

In `@testing/devnet/scripts/test-double-broadcast-e2e.sh`:
- Around line 118-125: The pipeline inside get_balance_os can fail under set -e
and cause the whole script to exit; wrap the risky command substitutions that
call get_balance_os (the balance-probing assignments) so failures don't trigger
an immediate exit — either run the subshell with set +e temporarily or append ||
true to the command substitution, then explicitly test the result and on
empty/failure print a clear error and dump node logs (e.g., docker/journalctl
output) before exiting; ensure you update get_balance_os callers to check the
returned value and emit the diagnostic block instead of letting the pipeline
abort the script.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1d763eee-fccf-431c-86bd-4f0e6bb82b2c

📥 Commits

Reviewing files that changed from the base of the PR and between 3949d2b and fcb7a1f.

📒 Files selected for processing (2)
  • testing/devnet/scripts/double_broadcast_replay.mjs
  • testing/devnet/scripts/test-double-broadcast-e2e.sh

Comment thread testing/devnet/scripts/double_broadcast_replay.mjs
Comment thread testing/devnet/scripts/test-double-broadcast-e2e.sh
CodeRabbit Major (mjs:147): the gate allowed the test to proceed if
EITHER confirm() succeeded. That degrades the test to single-node
happy-path the moment one RPC rejects (e.g. RPC-side nonce check
fires) — CI passes green without ever exercising the cross-RPC
consensus dedup path the test exists to cover. Tightened to require
BOTH confirms fulfilled; either rejection now fails fast with both
sides' rejection reasons logged. Simplified the broadcast block since
confirmed1/confirmed2 are now guaranteed non-null past the gate.

CodeRabbit Minor (sh:125): under `set -euo pipefail`, a curl timeout
or empty grep inside `$(get_balance_os ...)` made the command
substitution exit non-zero and killed the harness before the explicit
diagnostic block could run. Wrapped both probe assignments in `if !`
so failures produce a clear error + node logs instead of a silent
abort.
@tcsenpai
Copy link
Copy Markdown
Contributor Author

tcsenpai commented Jun 2, 2026

@greptile review

@tcsenpai tcsenpai merged commit 8e9ce3f into stabilisation Jun 2, 2026
4 checks passed
@tcsenpai tcsenpai deleted the feat/audit-sweep-batch-e-nonce-e2e-test-2026-06-01 branch June 2, 2026 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant