Pin asv version = "1" on every benchmark class #366

Merged
hmgaudecker merged 2 commits into main from benchmarks/pin-asv-version on May 25, 2026

Conversation

hmgaudecker (Member) commented on May 25, 2026

Summary

  • Each benchmark class (MahlerYum, PrecautionarySavingsSolve/Simulate/SimulateWithSolve/SimulateWithSolveIrreg, AcaBaseline, plus the GpuPeakMem base class) now declares version = "1".
  • Subclasses (AcaBaselineDebugLog, every *GpuPeakMem variant) inherit the pinned version via MRO.
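A minimal sketch of the pattern described above. The class and method names come from this PR, but the bodies are placeholders and not the actual pylcm benchmark code:

```python
class AcaBaseline:
    # Explicit pin: asv uses this value instead of hashing the method source.
    version = "1"

    def time_execution(self):
        pass  # placeholder body; the real benchmark runs the model


class AcaBaselineDebugLog(AcaBaseline):
    # No `version` declared here; attribute lookup falls through the MRO
    # and finds the value set on AcaBaseline.
    pass


print(AcaBaselineDebugLog.version)
```

The subclass has no `version` entry in its own namespace, so changing the pin on the base class changes it for every variant at once.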

Why

asv computes a version per benchmark by hashing the method's source code and only plots data points whose version matches the discovered current value. PR #360 removed check_initial_conditions=False (a no-op once log_level drives validation) from every time_execution / peakmem_execution / track_compilation_time body. That changed the hash of every affected method and collapsed each series on the dashboard at https://open-econ.org/pylcm-benchmarks/ to two data points.

A stable explicit version overrides the auto-hash so cosmetic edits to the same measurement keep the series intact going forward. Restoring the existing pre-#360 history is a separate one-time JSON rewrite in OpenSourceEconomics.github.io.
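To illustrate why the #360 edit broke the series and why a pin prevents it, here is a sketch under stated assumptions. `effective_version` is a hypothetical helper, SHA-256 is a stand-in for whatever hash asv uses internally, and `run` in the source strings is an invented call:

```python
import hashlib


def effective_version(source, explicit=None):
    """Return the explicit version if set, else a hash of the source text."""
    if explicit is not None:
        return str(explicit)
    # Stand-in for asv's auto-versioning; the real algorithm may differ.
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


# Method body before and after removing the no-op keyword argument.
before = "def time_execution(self):\n    run(check_initial_conditions=False)\n"
after = "def time_execution(self):\n    run()\n"

# Without a pin, the cosmetic edit yields a different version.
print(effective_version(before) == effective_version(after))

# With a pin, both revisions share the same version.
print(effective_version(before, "1") == effective_version(after, "1"))
```

The trade-off is that a pinned version must be bumped by hand whenever the measurement itself changes; otherwise incomparable results end up in the same series.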

Test plan

  • pixi run -e benchmarks-cuda12 python -m asv check --config asv.conf.json --environment existing:python → "No problems found."
  • Each affected class' version attribute resolves to "1" (verified via direct attribute access on the 11 classes / subclasses).
  • Next benchmark-main run regenerates the dashboard against version = "1".

🤖 Generated with Claude Code

asv hashes each benchmark method's source into a per-result `version`
field and only plots data points whose `version` matches the discovered
current version. The change in #360 (removing a no-op kwarg from every
`time_execution` / `peakmem_execution` / `track_compilation_time` body)
shifted the auto-hash and collapsed every affected series' history on
the dashboard.

Setting a stable `version` on the class overrides the auto-hash, so
future cosmetic edits to the same measurement don't break continuity
again.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
read-the-docs-community bot commented on May 25, 2026

github-actions bot commented on May 25, 2026

Benchmark comparison (main → HEAD)

Comparing d04bf25c (main) → 89e0ae91 (HEAD)

| Benchmark | Statistic | before | after | Ratio | Alert |
|---|---|---|---|---|---|
| aca-baseline | execution time | 15.364 s | 15.077 s | 0.98 | |
| | peak GPU mem | 915 MB | 581 MB | 0.63 | |
| | compilation time | 281.84 s | 279.98 s | 0.99 | |
| | peak CPU mem | 6.97 GB | 6.73 GB | 0.97 | |
| aca-baseline-debug | execution time | 77.532 s | 80.262 s | 1.04 | |
| | peak GPU mem | 659 MB | 581 MB | 0.88 | |
| | compilation time | 381.35 s | 374.92 s | 0.98 | |
| | peak CPU mem | 7.63 GB | 7.57 GB | 0.99 | |
| Mahler-Yum | execution time | 4.339 s | 4.380 s | 1.01 | |
| | peak GPU mem | 529 MB | 529 MB | 1.00 | |
| | compilation time | 12.90 s | 12.84 s | 1.00 | |
| | peak CPU mem | 1.68 GB | 1.68 GB | 1.00 | |
| Precautionary Savings - Solve | execution time | 26.5 ms | 26.7 ms | 1.01 | |
| | peak GPU mem | 101 MB | 101 MB | 1.00 | |
| | compilation time | 2.11 s | 2.13 s | 1.01 | |
| | peak CPU mem | 1.11 GB | 1.12 GB | 1.01 | |
| Precautionary Savings - Simulate | execution time | 101.0 ms | 95.9 ms | 0.95 | |
| | peak GPU mem | 349 MB | 349 MB | 1.00 | |
| | compilation time | 4.81 s | 4.89 s | 1.02 | |
| | peak CPU mem | 1.32 GB | 1.32 GB | 1.00 | |
| Precautionary Savings - Solve & Simulate | execution time | 137.5 ms | 141.2 ms | 1.03 | |
| | peak GPU mem | 586 MB | 586 MB | 1.00 | |
| | compilation time | 6.41 s | 6.45 s | 1.01 | |
| | peak CPU mem | 1.28 GB | 1.28 GB | 1.00 | |
| Precautionary Savings - Solve & Simulate (irreg) | execution time | 267.2 ms | 263.6 ms | 0.99 | |
| | peak GPU mem | 2.20 GB | 2.20 GB | 1.00 | |
| | compilation time | 6.68 s | 6.76 s | 1.01 | |
| | peak CPU mem | 1.34 GB | 1.33 GB | 0.99 | |

A new helper `_pad_sparse_graphs` walks `.asv/html/graphs/summary/`
and every per-environment leaf and, for each benchmark JSON that is
missing revisions covered by sibling benchmarks, appends
`[rev, null]` markers. flot's auto-fit then expands the chart x-axis
to the project-wide commit range while still drawing the line only
where data exists, so a recently added benchmark like
`AcaBaselineDebugLog.*` shows two real points at the right edge
instead of a line that visually spans the full chart.

Called from `publish()` between `asv publish` and the site copy, so
every benchmark-main run keeps the charts honest.
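
The padding idea can be sketched as follows. This is not the `_pad_sparse_graphs` implementation from the commit: `pad_sparse_series` and `pad_directory` are hypothetical names, each graph file is assumed to hold a JSON list of `[revision, value]` pairs, and the sketch re-sorts by revision rather than strictly appending.

```python
import json
from pathlib import Path


def pad_sparse_series(series_by_name):
    """Return series that all cover the union of revisions.

    Revisions a series lacks get a [rev, None] marker, which json.dumps
    writes as [rev, null].
    """
    all_revs = sorted(
        {rev for series in series_by_name.values() for rev, _ in series}
    )
    padded = {}
    for name, series in series_by_name.items():
        known = {rev: value for rev, value in series}
        # dict.get returns None for revisions this series never measured.
        padded[name] = [[rev, known.get(rev)] for rev in all_revs]
    return padded


def pad_directory(graph_dir):
    """Pad every *.json file directly inside graph_dir against its siblings."""
    graph_dir = Path(graph_dir)
    series = {
        path.name: json.loads(path.read_text())
        for path in sorted(graph_dir.glob("*.json"))
    }
    for name, data in pad_sparse_series(series).items():
        (graph_dir / name).write_text(json.dumps(data))


# A long-running benchmark next to one that only has a recent data point.
demo = {
    "old_benchmark": [[1, 2.0], [2, 2.1], [3, 2.2]],
    "new_benchmark": [[3, 5.0]],
}
print(pad_sparse_series(demo)["new_benchmark"])
```

Because the padded entries carry no value, the plotted line still starts at the first real measurement; only the x-axis range changes.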

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
hmgaudecker merged commit 913e315 into main on May 25, 2026
4 checks passed
hmgaudecker deleted the benchmarks/pin-asv-version branch on May 25, 2026 at 15:40