Perf/verifier recursion opts by diegokingston · Pull Request #636 · yetanotherco/lambda_vm

diegokingston · 2026-05-29T19:19:17Z

No description provided.

…fy; borrow lookup challenges

github-actions · 2026-05-29T19:22:14Z

Codex Code Review

No issues found in the PR diff.

I reviewed the changed verifier logic for security, correctness, significant performance regressions, and unnecessary complexity, and didn’t find an actionable concern limited to these changes.

claude · 2026-05-29T19:23:49Z

+        // it is the number of composition poly parts the proof advertises (the
+        // same array the consumer validates against). Hoist `z^N` once.
+        let number_of_parts = proof.composition_poly_parts_ood_evaluation.len();
+        let z_pow = challenges.z.pow(number_of_parts);


Low – silent semantic change for malformed proofs

number_of_parts is now taken from proof.composition_poly_parts_ood_evaluation.len() (the OOD header), whereas the old code took it from lde_composition_poly_parts_evaluation.len() (the LDE opening evaluations slice).

For valid proofs these are always equal, so there is no regression. For a malformed proof where the opening carries fewer parts than the header (M < N), the old code silently computed z^M (wrong denominator) and then hit None on the subsequent gammas.get(j)?; the new code computes the correct z^N from the start. This is actually a fix, but it is an implicit behavioral change worth calling out explicitly.

Consider adding a validation check so the caller fails fast and the intent is clear:

Suggested change

let z_pow = challenges.z.pow(number_of_parts);

let number_of_parts = proof.composition_poly_parts_ood_evaluation.len();

let z_pow = challenges.z.pow(number_of_parts);

// All openings must carry the same part count as the OOD header; validated

// per-entry inside reconstruct_deep_composition_poly_evaluation via the

// gammas/ood bounds checks there.

claude · 2026-05-29T19:23:59Z

+        // (it is stored owned in the returned `Challenges`). Clone exactly once
+        // here, where ownership is actually required — this removes the
+        // per-table clone that previously lived at the `multi_verify` call site.


Low – comment overstates the allocation saving

The clone is not eliminated — it moves from the call site (lookup_challenges.clone()) into verify_rounds_2_to_4 (rap_challenges.to_vec()). Total allocations per-table is unchanged.

The real wins from this PR are:

logup_alpha_powers computed once globally instead of once per table.

A single inplace_batch_inverse across all query denominators instead of one per query.

The comment should reflect that, so future readers don't expect a measurable per-table allocation reduction from this line change alone.

claude · 2026-05-29T19:24:27Z

Review: Perf/verifier recursion opts

Summary

Two orthogonal performance improvements to the STARK verifier:

Single global inplace_batch_inverse — denominators for all query/symmetric-point pairs are collected in one flat buffer and inverted in a single batch instead of one batch per query. This removes num_queries batch-inversion calls.
Global LogUp alpha powers — compute_alpha_powers is called once with the maximum bus element count across all AIRs, and each table slices the prefix it needs, instead of calling compute_alpha_powers once per table.

The two-pass refactoring (collect denominators → batch invert → reconstruct) is sound. Ordering of deep_poly_evaluations and deep_poly_evaluations_sym is preserved correctly (non-sym and sym for iota[i] land at index i in their respective Vecs, matching what step_3_verify_fri expects).

Issues

Low — z_pow basis change (see inline comment on L576)
number_of_parts is now read from proof.composition_poly_parts_ood_evaluation.len() (OOD header) instead of the opening evaluation slice's length. For valid proofs these are always equal. For malformed proofs with fewer opening parts than the header (M < N) the old code used z^M (wrong); the new code uses z^N (correct). Net effect: the new code is more correct, but the implicit behavioral change should be documented.

Low — misleading clone comment (see inline comment on L1155-1157)
The total number of Vec allocations for rap_challenges per table is unchanged; to_vec() inside verify_rounds_2_to_4 replaces the clone() at the call site. The comment should attribute the real wins (global alpha powers + single batch inverse) rather than imply the clone was eliminated.

No issues found with

Batch-inverse failure propagation (ok()? correctly rejects any zero denominator from a malformed proof, matching prior semantics).
DeepEntry lifetime validity (all borrows come from the immutably-borrowed proof, so they coexist safely).
all_denoms buffer indexing (e * stride is safe because pass 1 fills exactly num_entries * stride elements with num_entries = num_queries * 2).
LogUp prefix slicing (the .min(logup_alpha_powers.len()) guard is purely defensive and correct).
Single-table verify path (routes through multi_verify, so all improvements apply there too).

diegokingston · 2026-05-29T19:26:15Z

/bench 5

github-actions · 2026-05-29T19:28:55Z

Benchmark — fib_iterative_8M (median of 5)

_{Table parallelism: auto (cores / 3)}

Metric	main	PR	Δ
Peak heap	51689 MB	51854 MB	+165 MB (+0.3%) ⚪
Prove time	25.230s	25.021s	-0.209s (-0.8%) ⚪

✅ No significant change.

✅ Low variance (time: 3.5%, heap: 1.1%)

_{Commit: 2a4cb75 · Baseline: cached · Runner: self-hosted bench}

…ng Vec alloc)

…erify, opening short-circuit)

…ompile

Toggles the keccak-precompile [patch] in an external recursion-guest Cargo.toml, runs the verify bench both ways, prints the cycle comparison. Parameterized by GUEST_DIR + VERIFY_BENCH_CMD (the guest harness lives outside this repo).

diegokingston added 4 commits May 29, 2026 16:17

perf(verifier): batch all DEEP denominators into one inversion

afae49b

perf(verifier): drop per-query lde_base concatenation; hoist z_pow

618f97f

perf(verifier): compute LogUp alpha powers once, slice per table

1958280

refactor(verifier): consolidate common pre-fork section in multi_veri…

2a4cb75

…fy; borrow lookup challenges

claude Bot reviewed May 29, 2026

View reviewed changes

diegokingston added 4 commits May 29, 2026 16:31

perf(verifier): hash Merkle leaves from borrowed slices (no per-openi…

8f93fd8

…ng Vec alloc)

perf(verifier): fold in #626 micro-opts (Vec step-memo, slice multi_v…

d4de863

…erify, opening short-circuit)

feat(recursion): keccak crate shim routing f1600/p1600 to the VM prec…

b85fd46

…ompile

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf/verifier recursion opts#636

Perf/verifier recursion opts#636
diegokingston wants to merge 8 commits into
mainfrom
perf/verifier-recursion-opts

diegokingston commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Uh oh!

claude Bot May 29, 2026

Uh oh!

claude Bot May 29, 2026

Uh oh!

claude Bot commented May 29, 2026

Uh oh!

diegokingston commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

diegokingston commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Codex Code Review

Uh oh!

claude Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 29, 2026

Review: Perf/verifier recursion opts

Summary

Issues

No issues found with

Uh oh!

diegokingston commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026

Benchmark — fib_iterative_8M (median of 5)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant