tenferro-benchmark

Benchmark suite for tenferro-rs.

The repository keeps the latest human-facing reports per target profile:

macOS CPU: result/mac-cpu/cpu/einsum.md, result/mac-cpu/cpu/cpu_ops.md, result/mac-cpu/cpu/linalg_jvp_vjp.md
Linux/AMD CPU: result/amd-cpu/cpu/einsum.md, result/amd-cpu/cpu/cpu_ops.md, result/amd-cpu/cpu/linalg_jvp_vjp.md
NVIDIA GPU: result/nvidia-gpu/gpu/dense.md, result/nvidia-gpu/gpu/einsum.md, result/nvidia-gpu/gpu/sparse.md, result/nvidia-gpu/gpu/tensornetwork.md
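
The report paths listed above appear to share one pattern: result/<target profile>/<device>/<suite>.md. As a minimal sketch under that assumption, the location of a report could be derived as follows. The report_path helper is hypothetical and is not part of the repository.

```python
from pathlib import PurePosixPath


# Hypothetical helper: assumes the layout result/<target profile>/<device>/<suite>.md,
# which is inferred from the report paths listed above.
def report_path(target_profile: str, device: str, suite: str) -> PurePosixPath:
    if device not in ("cpu", "gpu"):
        raise ValueError(f"unexpected device: {device!r}")
    return PurePosixPath("result") / target_profile / device / f"{suite}.md"


if __name__ == "__main__":
    print(report_path("mac-cpu", "cpu", "einsum"))
    print(report_path("nvidia-gpu", "gpu", "tensornetwork"))
```

The device segment is restricted to cpu and gpu only because those are the two values that occur in the listed paths.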

Historical reports are not archived in separate files; use git history to retrieve older results.

The GPU tensor network benchmark uses problem data from extern/TensorNetworkBenchmarks/, based on the upstream TensorNetworkBenchmarks repository. See the GPU tensor network suite entry under Workflows.

Workflows

macOS CPU workflow: native run, no Docker, Accelerate.
Linux CPU devcontainer workflow: Docker/devcontainer, OpenBLAS default for tenferro, optional oneMKL, detected PyTorch provider.
NVIDIA GPU devcontainer workflow: CUDA devcontainer.
Einsum suite and instance selection: source benchmark, selection rules, diagnostic cases, path strategies.
GPU tensor network suite: TensorNetworkBenchmarks parity on CUDA.
Result layout and metadata: target_profile, suite_id, run.yaml, latest reports.
Architecture terminology: suite, runner, backend, strategy, target profile.
PyTorch einsum dispatch notes: PyTorch source investigation notes.

Quick Smoke

macOS:

uv sync
./scripts/setup_extern_deps.sh
BENCHMARK_TARGET_PROFILE=mac-cpu \
BENCH_INSTANCE=bin_matmul_256 \
BENCH_RUNS=1 \
BENCH_WARMUPS=0 \
PUBLICATION_GATE_SUITE=small \
  ./scripts/run_all.sh 1

Linux devcontainer from the host:

devcontainer up --workspace-folder .
devcontainer exec --workspace-folder . bash -lc '\
  BENCHMARK_TARGET_PROFILE=amd-cpu \
  BENCH_INSTANCE=bin_matmul_256 \
  BENCH_RUNS=1 \
  BENCH_WARMUPS=0 \
  PUBLICATION_GATE_SUITE=small \
    ./scripts/run_all.sh 1'

To reproduce the Linux linalg AD report with the devcontainer's default OpenBLAS, first check the OpenBLAS build through its runtime API, then run the repro script:

devcontainer up --workspace-folder . --remove-existing-container
devcontainer exec --workspace-folder . bash -lc '
  python3 - <<PY
import ctypes
lib = ctypes.CDLL("/opt/openblas/lib/libopenblas.so")
lib.openblas_get_config.restype = ctypes.c_char_p
lib.openblas_get_parallel.restype = ctypes.c_int
print(lib.openblas_get_config().decode())
print(f"parallel={lib.openblas_get_parallel()}")
PY'
devcontainer exec --workspace-folder . bash -lc '
  export TENFERRO_CPU_FEATURES=system-openblas
  export PUBLICATION_GATE_FEATURES=system-openblas
  export TENFERRO_CPU_BACKEND_KIND=blas
  ./scripts/reproduce_linux_cpu_linalg_jvp_jvp.sh'

The repro writes result/linux-cpu/cpu/linalg_jvp_jvp.md. The devcontainer build installs a source-built OpenBLAS under /opt/openblas and oneMKL under /opt/intel/oneapi/mkl/latest; verify OpenBLAS through the runtime API above instead of relying on strings.
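
The runtime check above can also be wrapped in a small function that fails gracefully when the library is absent. This is a sketch, not a script shipped with the repository: describe_openblas and parallel_name are hypothetical names, and the numeric codes follow the OPENBLAS_SEQUENTIAL, OPENBLAS_THREAD, and OPENBLAS_OPENMP constants in OpenBLAS's cblas.h.

```python
import ctypes
from typing import Optional

# Values returned by openblas_get_parallel(), per OpenBLAS's cblas.h.
PARALLEL_MODES = {0: "sequential", 1: "pthreads", 2: "openmp"}


def parallel_name(mode: int) -> str:
    """Map the integer threading mode to a readable label."""
    return PARALLEL_MODES.get(mode, f"unknown({mode})")


def describe_openblas(path: str = "/opt/openblas/lib/libopenblas.so") -> Optional[dict]:
    """Return the build config and threading mode, or None if the library cannot be loaded."""
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    lib.openblas_get_config.restype = ctypes.c_char_p
    lib.openblas_get_parallel.restype = ctypes.c_int
    return {
        "config": lib.openblas_get_config().decode(),
        "parallel": parallel_name(lib.openblas_get_parallel()),
    }


if __name__ == "__main__":
    info = describe_openblas()
    print(info if info is not None else "OpenBLAS not found at the default path")
```

Returning None instead of raising lets the same helper run on hosts where /opt/openblas does not exist, such as outside the devcontainer.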

To collect the same repro with tenferro linked against oneMKL, use system-mkl:

devcontainer exec --workspace-folder . bash -lc '
  export TENFERRO_CPU_FEATURES=system-mkl
  export PUBLICATION_GATE_FEATURES=system-mkl
  export TENFERRO_CPU_BACKEND_KIND=blas
  ./scripts/reproduce_linux_cpu_linalg_jvp_jvp.sh'

GPU devcontainer from the host:

devcontainer up --workspace-folder . --config .devcontainer/cuda/devcontainer.json
devcontainer exec --workspace-folder . --config .devcontainer/cuda/devcontainer.json \
  bash -lc 'BENCHMARK_TARGET_PROFILE=nvidia-gpu ./scripts/run_gpu_suite.sh'

Comparison Backends

CPU reports compare:

tenferro-trace
tenferro-eager
pytorch-cpu
jax-cpu

GPU reports compare:

tenferro-cuda-trace
tenferro-cuda-eager
pytorch-cuda
jax-cuda
vendor-specific CUDA backends where meaningful

C++ Torch/LibTorch runners have been intentionally removed; the PyTorch Python package serves as the ATen comparison backend. The PyTorch CPU provider is detected at run time and recorded in run.yaml and in the generated reports; Linux does not source-build PyTorch to force OpenBLAS.
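
How the repository detects the PyTorch CPU provider is not described here. One possible approach, sketched below, is to read the build settings that torch.__config__.show() prints. The blas_info_from_config helper is hypothetical, and the sketch assumes the build settings contain a BLAS_INFO=<name> entry, which is typical of PyTorch builds but should be confirmed for the installed version.

```python
import re
from typing import Optional


def blas_info_from_config(config_text: str) -> Optional[str]:
    """Extract the BLAS_INFO value from text shaped like torch.__config__.show() output."""
    match = re.search(r"\bBLAS_INFO=([^,\s]+)", config_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        import torch  # optional: only needed to query a real installation
    except ImportError:
        print("PyTorch is not installed")
    else:
        print(blas_info_from_config(torch.__config__.show()))
```

Keeping the parsing separate from the torch import makes the helper usable on saved configuration text as well.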

Development Checks

Run these after changing benchmark scripts or schemas:

uv run python scripts/validate_benchmark_suite.py benchmarks/cpu/einsum.yaml
uv run python scripts/validate_benchmark_suite.py benchmarks/gpu/dense.yaml benchmarks/gpu/einsum.yaml benchmarks/gpu/sparse.yaml benchmarks/gpu/tensornetwork.yaml
bash tests/test_suite_result_layout.sh
bash tests/test_run_all_docs_outputs.sh
bash tests/test_clean_extern_deps.sh
bash tests/test_setup_extern_tenferro_checkout.sh
cmake -S cpp -B build/cpp-plan-test
cmake --build build/cpp-plan-test --target einsum_plan_test
ctest --test-dir build/cpp-plan-test --output-on-failure

License

MIT
