nano.rust

A pure-Rust, semantics-first NanoAOD analysis framework for high-energy physics, built for the agentic-coding era.

📖 API docs + notes: https://dickychant.github.io/nano.rust/

The idea

Agentic tools can write analysis code faster than anyone can review it. The bottleneck is no longer writing code but guaranteeing it is correct. Soft guardrails — prompts, skills, harnesses, human review — can steer an agent but cannot guarantee the absence of silent analysis bugs (wrong branch, dropped systematic, mixed units, stale outputs). A hard guarantee needs a mechanical enforcer:

Physicists define and review physics semantics. Agents generate implementation. The Rust compiler and a validation layer reject inconsistent states.

So the analysis is modelled as a typed state machine the compiler checks (make invalid analysis states unrepresentable), with Rust's strengths layered on: performance, FFI to legacy libraries, SIMD per-event execution, and TUI-friendly orchestration. Full rationale: docs/vision.md.

What works today

Validated on a real analysis: reproduces ROOT's Higgs→ZZ→4ℓ (df103, three channels: 4μ/4e/2e2μ) on CMS Open Data, read remotely on-demand in pure Rust. The full stacked discovery plot (signal + ZZ background + 2012 data, 11.6 fb⁻¹) is bit-identical to ROOT's, as is the df102 dimuon spectrum. Plots are in docs/site/plots/; see the blog.
Owned, pure-Rust ROOT I/O (nano-rootio, no ROOT/C++ dependency):
- reads real CMS NanoAODv9 — scalars, jagged collections, windowed reads, bounded-memory streaming (~3 MB to stream a skim of any-size file);
- reads locally and remotely on-demand over HTTPS byte-range (the first 10 events of a 2 GB open-data file fetch ~1.3 MB — only the baskets touched);
- writes ROOT/uproot-readable skims (scalars and jagged);
- validated A/B against the upstream reader and cross-checked against uproot in CI, both read and write.
Typed event model (nano-core): collections, attributes, the Prefix_attr grouping rule, Arc-shared per-event columns (Send + Sync).
Compile-enforced state machine (nano-analysis): Ev<Raw> → Baseline → InRegion<R> → Weighted<R>; filling a histogram requires a Weighted<R>, so wrong-stage / wrong-region / unweighted fills are compile errors (proven by compile-fail tests). Unit newtypes, exhaustive Systematic.
Semantic IR (nano-spec): a physics-facing YAML spec is parsed, statically validated (missing branch / wrong type / missing unit / undefined object are rejected with precise errors), and used to derive the exact read_branches for the reader.
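
To make the compile-enforced state machine above concrete, here is a minimal typestate sketch of the Ev<Raw> → Baseline → InRegion<R> → Weighted<R> progression. It is not the nano-analysis API: only the stage names come from this README, while the `Region` trait, the `MuonCr` marker, the `Hist` type, the field names, and the 25 GeV cut are hypothetical and chosen for illustration.

```rust
use std::marker::PhantomData;

// Stage markers. The names mirror the README; the rest is a hypothetical sketch.
struct Raw;
struct Baseline;
struct InRegion<R>(PhantomData<R>);
struct Weighted<R>(PhantomData<R>);

/// A region is a marker type carrying its own selection (hypothetical trait).
trait Region {
    fn accepts(muon_pt: &[f32]) -> bool;
}

/// Hypothetical muon control region: at least one muon with pT > 25 GeV.
struct MuonCr;
impl Region for MuonCr {
    fn accepts(muon_pt: &[f32]) -> bool {
        muon_pt.iter().any(|&pt| pt > 25.0)
    }
}

/// An event tagged with its analysis stage `S`; the tag exists only at compile time.
struct Ev<S> {
    muon_pt: Vec<f32>,
    weight: f64,
    _stage: PhantomData<S>,
}

impl<S> Ev<S> {
    // Private helper: move the payload into a different stage tag.
    fn retag<T>(self) -> Ev<T> {
        Ev { muon_pt: self.muon_pt, weight: self.weight, _stage: PhantomData }
    }
}

impl Ev<Raw> {
    fn new(muon_pt: Vec<f32>) -> Self {
        Ev { muon_pt, weight: 1.0, _stage: PhantomData }
    }
    /// Baseline selection: require at least one muon.
    fn baseline(self) -> Option<Ev<Baseline>> {
        if self.muon_pt.is_empty() { None } else { Some(self.retag()) }
    }
}

impl Ev<Baseline> {
    /// Region selection is only reachable from the Baseline stage.
    fn in_region<R: Region>(self) -> Option<Ev<InRegion<R>>> {
        if R::accepts(&self.muon_pt) { Some(self.retag()) } else { None }
    }
}

impl<R: Region> Ev<InRegion<R>> {
    /// Attach an event weight; the region tag is carried along.
    fn weighted(self, w: f64) -> Ev<Weighted<R>> {
        let mut ev: Ev<Weighted<R>> = self.retag();
        ev.weight = w;
        ev
    }
}

/// A histogram bound to one region; it only accepts weighted events of that region.
struct Hist<R> {
    entries: usize,
    sum_w: f64,
    _region: PhantomData<R>,
}

impl<R: Region> Hist<R> {
    fn new() -> Self {
        Hist { entries: 0, sum_w: 0.0, _region: PhantomData }
    }
    fn fill(&mut self, ev: &Ev<Weighted<R>>) {
        self.entries += 1;
        self.sum_w += ev.weight;
    }
}

fn main() {
    let mut hist: Hist<MuonCr> = Hist::new();
    let raw = Ev::<Raw>::new(vec![30.0, 12.0]);
    if let Some(ev) = raw.baseline().and_then(|e| e.in_region::<MuonCr>()) {
        hist.fill(&ev.weighted(0.5));
    }
    // hist.fill(&Ev::<Raw>::new(vec![30.0])); // would not compile: wrong stage
    println!("entries = {}, sum of weights = {}", hist.entries, hist.sum_w);
}
```

Because the stage lives in the type parameter, skipping a step (for example filling from an `Ev<Raw>` or from an event weighted in a different region) is rejected by the type checker rather than caught at run time.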

Workspace

crates/
  nano-rootio    owned ROOT TTree read + write (NanoAOD subset; pure Rust)
  nano-core      event model (Event / Collection / ObjectView, branch schema)
  nano-io        streaming reader + skim writer over nano-rootio
  nano-producers analysis channels (muon control region)
  nano-analysis  compile-enforced analysis state machine (typestate)
  nano-spec      semantic compiler: spec -> validate -> derive read_branches -> codegen
  nano-corrections  native correctionlib evaluator (typed SF inputs)
  nano-inference    backend-agnostic ML inference protocol (mock/ONNX/remote/managed)
  nano-cli       the `nano` CLI: validate / branches / inspect / codegen
  nano-mcp       MCP server exposing the same ops as agent tools
  nano-gen-demo, nano-gen-tagger-demo   codegen == hand-written equivalence proofs
  root-io        vendored upstream reader, retained only as a dev/A-B oracle
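
As an illustration of the Prefix_attr grouping rule that the nano-core event model relies on, the sketch below groups flat branch names into collections. It assumes the rule splits on the first underscore and that names without an underscore (such as `run` or `nMuon`) are not collection attributes; the actual rule in nano-core may differ, and `split_branch` and `group_branches` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Split a branch name into (collection prefix, attribute) on the first underscore.
/// Returns None for names with no underscore or with an empty half.
fn split_branch(name: &str) -> Option<(&str, &str)> {
    match name.split_once('_') {
        Some((prefix, attr)) if !prefix.is_empty() && !attr.is_empty() => Some((prefix, attr)),
        _ => None,
    }
}

/// Group branch names by collection prefix, keeping attributes in input order.
fn group_branches<'a>(names: &[&'a str]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut groups: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for &name in names {
        if let Some((prefix, attr)) = split_branch(name) {
            groups.entry(prefix).or_insert_with(Vec::new).push(attr);
        }
    }
    groups
}

fn main() {
    let names = ["run", "nMuon", "Muon_pt", "Muon_eta", "Jet_pt", "Muon_pfRelIso04_all"];
    for (collection, attrs) in group_branches(&names) {
        println!("{collection}: {attrs:?}");
    }
}
```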

The architecture, layer by layer:

physics spec (TOML/YAML)  ->  semantic IR (typed, validated) -> Rust codegen
                          ->  Rust execution kernels (typed state machine)
                          ->  Rust-native workflow DAG (planned)
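
The first arrow of this pipeline (spec, then static validation, then derived read_branches) can be sketched without any parser. The example below starts from an already-parsed, in-memory spec and checks it against a file schema. All types and names here (`Ty`, `Need`, `ObjectSpec`, `SpecError`, `derive_read_branches`) are hypothetical and not the nano-spec API, and it covers only the missing-branch and wrong-type checks.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    F32,
    I32,
}

/// One attribute an object needs, with the type the analysis expects.
struct Need {
    attr: &'static str,
    ty: Ty,
}

/// A physics object drawn from one collection (e.g. "Muon").
struct ObjectSpec {
    collection: &'static str,
    needs: Vec<Need>,
}

#[derive(Debug, PartialEq)]
enum SpecError {
    MissingBranch(String),
    WrongType { branch: String, expected: Ty, found: Ty },
}

/// Validate every needed attribute against the schema. On success, return the
/// exact set of branches to read; on failure, return all errors found.
fn derive_read_branches(
    objects: &[ObjectSpec],
    schema: &BTreeMap<String, Ty>,
) -> Result<BTreeSet<String>, Vec<SpecError>> {
    let mut branches = BTreeSet::new();
    let mut errors = Vec::new();
    for obj in objects {
        for need in &obj.needs {
            let branch = format!("{}_{}", obj.collection, need.attr);
            match schema.get(&branch) {
                None => errors.push(SpecError::MissingBranch(branch)),
                Some(&found) if found != need.ty => {
                    errors.push(SpecError::WrongType { branch, expected: need.ty, found })
                }
                Some(_) => {
                    branches.insert(branch);
                }
            }
        }
    }
    if errors.is_empty() { Ok(branches) } else { Err(errors) }
}

/// A tiny stand-in for the branch schema of an input file.
fn demo_schema() -> BTreeMap<String, Ty> {
    [("Muon_pt", Ty::F32), ("Muon_eta", Ty::F32), ("Muon_charge", Ty::I32)]
        .iter()
        .map(|&(name, ty)| (name.to_string(), ty))
        .collect()
}

fn main() {
    let spec = vec![ObjectSpec {
        collection: "Muon",
        needs: vec![Need { attr: "pt", ty: Ty::F32 }, Need { attr: "charge", ty: Ty::I32 }],
    }];
    match derive_read_branches(&spec, &demo_schema()) {
        Ok(branches) => println!("read_branches = {:?}", branches),
        Err(errors) => println!("spec rejected: {:?}", errors),
    }
}
```

Deriving the branch list from the validated spec, rather than listing branches by hand, means the reader touches only what the analysis declares it needs, and a typo in the spec fails before any event is read.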

Build, test, run

cargo build
cargo test                 # whole workspace
cargo test --features http # also exercise remote (HTTPS byte-range) reads

# write a small NanoAOD-like file and inspect it (e.g. with uproot)
cargo run -p nano-rootio --example write_demo -- /tmp/demo.root

Real-data tests read a local NanoAOD file from tests/data/muon_validation/inputs/ if present (gitignored) and skip otherwise. The uproot interop + benchmark runs in CI (scripts/bench_vs_uproot.py) against CMS Open Data over HTTPS — no checked-in data files.

Status & roadmap

Built: owned ROOT I/O (read + write, local + remote), the event model, the compile-enforced state machine, the semantic compiler including codegen (proven equal to a hand-written producer), a native correctionlib evaluator, an ML inference protocol, and an agent action space (nano CLI + MCP server). Next: golden tests against the frozen .root references, wiring real corrections/JME systematics into the channel, and a Rust-native workflow DAG orchestrator (the LAW backend is descoped). See docs/ — architecture, compiler roadmap, ADL front-end, orchestrator, vision, versioning, state machine, semantic layer, inference protocol, agent interface, reader rewrite, remote source, migration.

Acknowledgments

nano.rust grew out of, and is inspired by, prior work:

Origins — it began as a C++ port (nano.cpp / NanoAODToolsCpp) of selected NanoAOD-tools / NanoHRT-tools workflows, preserved on the cpp-snapshot branch.
root-io (cbourjau / alice-rs) — the pure-Rust ROOT reader we vendored and grew the owned nano-rootio I/O core (read + write) from; still a differential A/B oracle in tests (MPL-2.0).
uproot (with awkward-array) — for showing that ROOT can be treated as a storage format readable outside ROOT; it is also our independent read/write oracle in CI.
ROOT — the on-disk format and reference implementation; our correctness and performance baseline.
correctionlib — the corrections JSON schema and evaluation model that nano-corrections re-implements natively in Rust.

License

MPL-2.0. crates/root-io is vendored from cbourjau/alice-rs (MPL-2.0); its license and attribution are retained.
