Bench vs p3 nightly publish #590
Co-authored-by: Gabriel Bosio <38794644+gabrielbosio@users.noreply.github.com>
--steps flag for nightly 1M/2M/4M/8M benchmarks
bench_vs/run.sh by switching to sed
Review Summary
This PR adds a new
Medium
Low
/claude /codex
Codex Code Review
Findings
I attempted
Review: Bench vs Plonky3
Overall, this is a well-structured, well-documented benchmarking crate. The fairness choices (scalar MMCS, matching AIR shape, same blowup/queries) are clearly reasoned and the README is thorough. A few issues worth fixing:
Medium
Low
Info
No security or correctness issues found in the AIR, prover, or verifier wiring.
/codex /claude
Codex Code Review
Findings
Verification
I attempted
Review: bench_vs_plonky3 nightly benchmark
Overall, the crate is well-structured and the fairness considerations (scalar MMCS, matched blowup/queries, base-field AIR path, extension field differences) are clearly documented. Two actionable items:
Medium, CI: P3 steps missing
/claude
Code Review - Bench vs Plonky3 Nightly
Overall: Well-structured. Fairness docs and trace-equivalence test are solid. Issues below.
(1) BUG Low - Wrong comment path: publish_bench_vs.sh says 'Built when bench_vs_artifacts/p3/headline/metrics.txt exists' but the actual path is bench_vs_artifacts/p3/metrics.txt; there is no headline/ dir.
(2) BUG Low - Dead code: plonky3_fibonacci.rs exports pub fn rows_for_lambda_trace but it is never called in the crate. Remove if unused.
(3) MEDIUM - Silent failures: Cargo.toml pins all p3 crates to the Plonky3 git repo with no rev. Both new CI steps use continue-on-error: true, so a Plonky3 breaking change silently drops the P3 section from nightly reports with no alert. Add a Slack warning when bench_vs_artifacts/p3/metrics.txt is absent.
(4) Nit - p3_fmt_seconds uses %.0fms, which rounds to 0ms for verify times under 0.5ms; use %.1fms.
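The rounding behaviour in item (4) can be reproduced with a small shell sketch. The helper names below are hypothetical (the real p3_fmt_seconds is not shown in this thread), and the sketch assumes the input is a duration in seconds that gets converted to milliseconds before printing:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for p3_fmt_seconds; input is assumed to be seconds.
# LC_ALL=C keeps the decimal separator a dot regardless of the CI locale.
fmt_ms_rounded() {
  LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.0fms\n", s * 1000 }'
}

fmt_ms_tenths() {
  LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.1fms\n", s * 1000 }'
}

fmt_ms_rounded 0.0003   # prints "0ms": the sub-millisecond time disappears
fmt_ms_tenths  0.0003   # prints "0.3ms"
```

With one decimal place, verify times between 0.05ms and 0.5ms stay visible in the report instead of collapsing to 0ms.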
/claude |
Review: Bench vs Plonky3
Overview: Adds a
Bug
Low: Design Note
Unpinned Plonky3 git deps (inline on
No issues found in:
New bench_vs_plonky3/ proves a shared Fibonacci AIR on both Lambda and Plonky3. Plonky3's MMCS is configured scalar (no vector lanes) so Merkle compression is one Keccak-f per call on both sides.
bench_vs_plonky3/run.sh runs the bench, reports median + CV, and writes TSV + metrics to --report-dir.
Workflow runs the bench after the SP1 + ethrex steps and appends a "Lambda VM vs Plonky3" section to the same Slack post.
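A median + CV summary of the kind run.sh reports could be computed along these lines. This is a minimal sketch, not the script's actual code: median_cv is a hypothetical helper, and it assumes CV means the sample standard deviation divided by the mean, expressed in percent.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reads one timing per line on stdin and prints
# "<median> <cv_percent>". Assumes CV = sample stddev / mean * 100.
median_cv() {
  LC_ALL=C sort -n | LC_ALL=C awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1
      # Input is already sorted, so the median is the middle element(s).
      if (NR % 2) median = v[(NR + 1) / 2]
      else        median = (v[NR / 2] + v[NR / 2 + 1]) / 2
      mean = sum / NR
      for (i = 1; i <= NR; i++) ss += (v[i] - mean) ^ 2
      sd = (NR > 1) ? sqrt(ss / (NR - 1)) : 0
      cv = (mean > 0) ? 100 * sd / mean : 0
      printf "%.3f %.1f\n", median, cv
    }'
}

# Three runs of 3s, 1s and 2s: median 2.000, CV 50.0%.
printf '%s\n' 3 1 2 | median_cv
```

Reporting the median rather than the mean keeps a single slow run from skewing the headline number, while the CV shows how noisy the runs were.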