Split pylcm into a public lcm/ package and a private _lcm/ package#361

Merged: hmgaudecker merged 120 commits into main from refactor/phase-2-api-reorganisation on May 25, 2026

@hmgaudecker hmgaudecker commented May 18, 2026

Member

Summary

I found the source code increasingly hard to navigate, and it was hard to predict what counts as public interface and what we should be free to change. A case in point that came up in the process: validate_transition_probs, which was removed in #361. I do not want to imagine how that looks to a user who inspects the source code for guidance on what she may use. After a couple of iteration rounds, I think pytask's strategy is the right one: shallow public modules and all the hard work in _lcm. See the docs page explaining the internal architecture.

After this PR, pylcm's source is two packages with a hard public/private boundary:

  • src/lcm/ — the public surface. Everything a user constructs or consumes: the user-facing classes, the @categorical decorator, as_leaf, the public type aliases, and the exception classes. lcm/__init__.py re-exports the public symbols.
  • src/_lcm/ — the private implementation. The build pipeline, the canonical engine dataclasses, the JAX-traced solve / simulate machinery, validators, I/O plumbing, and the engine-side type aliases and protocols.

Public surface — src/lcm/

__init__.py, ages.py, categorical.py, exceptions.py, grids.py, model.py, params.py, persistence.py, processes.py, regime.py, result.py, transition.py, typing.py. Users keep writing from lcm import Model, Regime, ... — the import path is unchanged.

Private implementation — src/_lcm/

Every engine internal, plainly named (no leading underscore): engine.py, model_processing.py, pandas_utils.py, state_action_space.py, variables.py, dtypes.py, transition_checks.py, the bootstrap modules (jaxtyping_patch.py, beartype_conf.py, config.py), typing.py, and the grids/, processes/, persistence/, regime/, regime_building/, solution/, simulation/, params/, utils/ subpackages.
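
As an illustration of the layout described above, here is a minimal, self-contained sketch of a thin public package that delegates to a private one. The package names `demo_public` and `_demo_impl`, the `solve_periods` helper, and the toy `Model` class are hypothetical stand-ins written to a temporary directory; they are not pylcm's actual modules or API.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build two throwaway packages on disk (hypothetical names, not pylcm's).
root = Path(tempfile.mkdtemp())

# Private implementation package: holds the "hard work".
private = root / "_demo_impl"
private.mkdir()
(private / "__init__.py").write_text("")
(private / "engine.py").write_text(
    "def solve_periods(n_periods):\n"
    "    return [f'V_{t}' for t in reversed(range(n_periods))]\n"
)

# Public package: a shallow class that delegates to the private engine.
public = root / "demo_public"
public.mkdir()
(public / "model.py").write_text(
    "from _demo_impl.engine import solve_periods\n"
    "\n"
    "class Model:\n"
    "    def __init__(self, n_periods):\n"
    "        self.n_periods = n_periods\n"
    "\n"
    "    def solve(self):\n"
    "        return solve_periods(self.n_periods)\n"
)
# The package __init__ re-exports the public symbols.
(public / "__init__.py").write_text(
    "from demo_public.model import Model\n"
    "\n"
    "__all__ = ['Model']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_public = importlib.import_module("demo_public")

# Users only ever touch the public package.
solution = demo_public.Model(n_periods=3).solve()
print(solution)
```

The point of the physical split is that anything under the underscore-prefixed package can be renamed or moved without touching the import paths users rely on.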

Typing split

  • lcm/typing.py — model-authoring aliases (jaxtyping array shapes, Period, Age) and the User* boundary aliases. Imports nothing from _lcm.
  • _lcm/typing.py — engine-side string labels, compound mapping aliases, canonical post-processing forms, and the structural Protocol classes.

Exceptions

The PyLCMError subclasses stay public in lcm/exceptions.py and are re-exported from lcm — both from lcm.exceptions import InvalidParamsError and except lcm.InvalidParamsError work. format_messages, internal validation plumbing, lives in _lcm/utils/error_messages.py.

Params

lcm/params.py exposes as_leaf and re-exports MappingLeaf, UserMappingLeaf, SequenceLeaf, UserSequenceLeaf. The leaf-class definitions and the engine params machinery live in _lcm/params/.

Bootstrap

_lcm/__init__.py applies the jaxtyping "..."-sentinel patch. lcm/__init__.py imports _lcm first, then registers beartype's package claw on both _lcm and lcm before any submodule loads.

Renames

  • The name interfaces.py has always confused me, as it sounds too much like API/UI IMO. I went for engine.py after a brainstorming session, but I am happy to adjust that.
  • The term "shocks" has become too narrow, at least since we started allowing means and variances to differ from 0/1. I think the correct vocabulary for what those objects describe is (approximations to) continuous stochastic processes. The PR updates the docs to make clear that these objects bundle grids and transitions, e.g. first sentence here. The engine vocabulary has thus become process in place of shock: the seven *Process classes (UniformIIDProcess, NormalIIDProcess, LogNormalIIDProcess, NormalMixtureIIDProcess, TauchenAR1Process, RouwenhorstAR1Process, TauchenNormalMixtureAR1Process), VariableInfo.is_process, Variables.process_names, and the ProcessName typing alias. Each *Process class bundles a discretization grid and a transition mechanism; instances go in Regime(states={...}). All of these are directly importable from lcm now.
  • Piece has been renamed to PiecewiseGridSegment (looking at the public surface told me this needed explanation), and the corresponding keyword from pieces to segments (we were already using that term in prose).
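
To make "bundles a discretization grid and a transition mechanism" concrete, here is a minimal sketch of the textbook Tauchen discretization of an AR(1) process in pure Python. It is not pylcm's TauchenAR1Process; the function name `tauchen` and its parameters (`n_points`, `rho`, `sigma`, `n_std`) are chosen for illustration only.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tauchen(n_points, rho, sigma, n_std=3.0):
    """Discretize y' = rho * y + eps, eps ~ N(0, sigma^2), on an even grid.

    Returns the grid and a row-stochastic transition matrix.
    """
    # Unconditional standard deviation of the AR(1) process.
    sigma_y = sigma / math.sqrt(1.0 - rho**2)
    upper = n_std * sigma_y
    step = 2.0 * upper / (n_points - 1)
    grid = [-upper + i * step for i in range(n_points)]

    transition = []
    for y in grid:
        row = []
        for j, y_next in enumerate(grid):
            hi = (y_next + step / 2.0 - rho * y) / sigma
            lo = (y_next - step / 2.0 - rho * y) / sigma
            if j == 0:
                # Lowest node absorbs the whole left tail.
                row.append(normal_cdf(hi))
            elif j == n_points - 1:
                # Highest node absorbs the whole right tail.
                row.append(1.0 - normal_cdf(lo))
            else:
                row.append(normal_cdf(hi) - normal_cdf(lo))
        transition.append(row)
    return grid, transition


grid, transition = tauchen(n_points=5, rho=0.9, sigma=0.1)
```

The grid answers "which values can the state take", and each row of the matrix answers "where does it go next"; an object that carries both is more than a shock.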

Build

hatch-vcs writes the generated version file to src/_lcm/version.py.

Migration guide for downstream code

  • from lcm import Model, Regime, AgeGrid, UniformIIDProcess, ... — unchanged.
  • from lcm.params import MappingLeaf, as_leaf — unchanged.
  • from lcm.typing import FloatND, ScalarInt, Period, Age, ... — unchanged for the model-authoring aliases.
  • from lcm.exceptions import InvalidParamsError, ... — unchanged.
  • from lcm._grids... import ... → from _lcm.grids... import .... The grid / process ABCs (Grid, ContinuousGrid, _ContinuousStochasticProcess) are private; import the public leaf classes from lcm.
  • Engine internals — lcm.engine, lcm.model_processing, lcm.regime_building, lcm.solution, lcm.simulation, lcm.variables, lcm.pandas_utils, lcm.state_action_space, lcm.dtypes, lcm.utils — are now under _lcm.*.
  • Explanation notebooks that demonstrate _lcm internals must import lcm before importing any _lcm submodule, so the package bootstrap completes first.
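
The path changes in this guide can be summarised as a small mapping. The following is a hypothetical helper (not shipped with pylcm) that rewrites a dotted module path according to the bullets above; `ENGINE_MODULES` is simply the list of engine internals named in this description.

```python
# Engine internals that moved from lcm.* to _lcm.* (taken from the list above).
ENGINE_MODULES = (
    "engine",
    "model_processing",
    "regime_building",
    "solution",
    "simulation",
    "variables",
    "pandas_utils",
    "state_action_space",
    "dtypes",
    "utils",
)


def migrate_module_path(path):
    """Return the post-PR location of a dotted module path (hypothetical helper)."""
    parts = path.split(".")
    if len(parts) >= 2 and parts[0] == "lcm":
        if parts[1] in ENGINE_MODULES:
            return ".".join(["_lcm", *parts[1:]])
        if parts[1] == "_grids":
            # lcm._grids... became _lcm.grids...
            return ".".join(["_lcm", "grids", *parts[2:]])
    # Public modules such as lcm.params or lcm.typing keep their path.
    return path


print(migrate_module_path("lcm.utils.error_messages"))
print(migrate_module_path("lcm._grids.base"))
print(migrate_module_path("lcm.params"))
```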

Test plan

hmgaudecker and others added 16 commits May 18, 2026 18:57
Move `collect_state_transitions`, `_make_identity_fn`, and
`_add_raw_transition` from `regime_building/validation.py` to a new
focused module. Update imports in 5 callers. Drop unused imports from
`validation.py`. Part of the Phase 1 effort to delete the
"validation" and "error_handling" umbrellas; see
`Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move 8 validators from `regime_building/validation.py` into
`user_regime.py` and privatize the two formerly-public ones:
- `validate_mapping_contents` -> `_validate_mapping_contents`
- `validate_logical_consistency` -> `_validate_logical_consistency`
- `_validate_distributed_grids`, `_validate_function_output_grid_indexing`,
  `_find_function_output_grid_indexing`, `_validate_active`,
  `_validate_state_transitions`, `_validate_per_target_dict`

The validators are sole-called from `UserRegime.__post_init__`;
co-locating them with the class eliminates a misleading umbrella
module and the cross-module delayed import in `__post_init__`. Delete
`regime_building/validation.py`. Part of Phase 1 — see
`Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move `_get_func_indexing_params`, `_slice_references_params`,
`_collect_subscripts`, `_extract_bare_names` from
`utils/error_handling.py` into a new focused module. Update imports
in `pandas_utils.py` and `tests/test_validate_array_indexing.py`.
`error_handling.py::validate_transition_probs` still uses
`_get_func_indexing_params` and now imports it from the new module
(deferred to M5' for full extraction). Part of Phase 1 — see
`Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Create `regime_building/runtime_checks.py` and absorb two families
from `utils/error_handling.py`:
- V family (`validate_V`, `_enrich_with_diagnostics`,
  `_summarize_diagnostics`, `_format_diagnostic_summary`)
- regime-prob family (`validate_regime_transition_probs` ->
  `_validate_regime_transition_probs`, `_format_sum_violation`,
  `validate_regime_transitions_all_periods`,
  `_validate_regime_transition_single`,
  `_validate_no_reachable_incomplete_targets`)

Both families fit the unifying concept "defensive checks on JAX
arrays produced during solve/simulate." Privatize
`validate_regime_transition_probs` (only tests call it directly).
Update imports in 6 callers (3 src, 3 test). `diagnostics.py` keeps
its name. Part of Phase 1 — see `Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_regime.py

Drop the regime-probs overload — that mode is redundant with
`validate_regime_transitions_all_periods` which runs unconditionally
during `model.solve()` and `model.simulate()` and additionally
checks inactive-regime probability and reachability.

Keep the state-probs mode. It is the only defence against four
silent-correctness bug classes in user-written MarkovTransition
functions for states: wrong-shape broadcasting, values outside
[0, 1], rows not summing to 1, and subscript-order swaps relative
to the function signature.
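
A minimal sketch of the first three checks (shape, range, row sums) on a plain nested list, assuming one row per conditioning state and one column per outcome. The function name `check_transition_probs` and the tolerance are illustrative and not the library's implementation; subscript-order swaps need a look at the function's source and are outside this sketch.

```python
def check_transition_probs(probs, n_outcomes, tol=1e-8):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for i, row in enumerate(probs):
        # Wrong-shape broadcasting shows up as a wrong outcome-axis length.
        if len(row) != n_outcomes:
            problems.append(
                f"row {i}: expected {n_outcomes} outcomes, got {len(row)}"
            )
            continue
        if any(p < 0.0 or p > 1.0 for p in row):
            problems.append(f"row {i}: value outside [0, 1]")
        if abs(sum(row) - 1.0) > tol:
            problems.append(f"row {i}: sums to {sum(row):.6f}, not 1")
    return problems


valid = [[0.9, 0.1], [0.3, 0.7]]
broken = [[0.9, 0.2], [1.5, -0.5], [1.0]]

print(check_transition_probs(valid, n_outcomes=2))
print(check_transition_probs(broken, n_outcomes=2))
```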

Move slimmed function + helpers (`_extract_markov_transition`,
`_build_grids`, `_build_expected_shape`) to `user_regime.py` next to
the `Regime` class they operate on. Update `lcm/__init__.py` import.
Drop three regime-probs tests from `tests/test_pandas_utils.py`.
Update `docs/user_guide/pandas_interop.md` accordingly. Part of
Phase 1 — see `Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All contents have been absorbed: V family + regime-prob family into
`regime_building/runtime_checks.py` (M4); AST helpers into
`utils/ast_inspection.py` (M3); slimmed `validate_transition_probs`
into `user_regime.py` (M5'). The "error_handling" umbrella was
misleading from the start — three unrelated concerns under one
name. Part of Phase 1 — see `Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…me.py

Three renames driven by Phase 1's src moves:
- `tests/test_error_handling_invalid_vf.py` → `tests/test_invalid_vf.py`
  (the "error_handling" concept is gone from src).
- `tests/test_validate_array_indexing.py` → `tests/test_ast_inspection.py`
  (AST helpers moved to `lcm/utils/ast_inspection.py`).
- Extract the four `validate_transition_probs` state-probs tests from
  `tests/test_pandas_utils.py` into `tests/test_regime.py` (the
  function now lives in `lcm/user_regime.py`). Duplicate the
  three-line `_make_partner_probs_array` helper into the new
  location rather than imposing a cross-file import.

Closes the Phase 1 plan in `Phase 1 — Validation Cleanup.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Validate state transition probability functions automatically — both
statically at process time and numerically at solve time — so users no
longer need to call `lcm.validate_transition_probs` manually for state
transitions. Plan: `Phase 1b — Automatic State Transition Validation.md`.

What runs when:
- **Process time** (during `process_regimes`, always on, cheap):
  AST subscript-order check on every `MarkovTransition.func` —
  permissive: skipped when the function doesn't use the
  `probs_array[...]` pattern. Outcome-axis size is derived from the
  state's `DiscreteGrid` and cached on the canonical `Regime` via the
  new `stochastic_state_transitions` field. For per-target dicts, the
  target regime's grid wins (cross-grid state spaces).
- **Solve / simulate time** (gated by `log_level != "off"`):
  new `validate_state_transitions_all_periods` evaluates each
  `MarkovTransition` function on the Cartesian product of the
  function's accepted grid args (via vmap) and checks outcome-axis
  size, [0, 1] range, and sum-to-1 along the last axis. Raises a new
  `InvalidStateTransitionProbabilitiesError` on failure.
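
One way to picture the process-time check is the sketch below, which uses the standard-library `ast` module. It assumes the check compares the order of names inside `probs_array[...]` with the order of the function's parameters, and that it passes when the pattern is absent; the function name `subscript_order_ok` and the exact rule are assumptions, not the code in this commit. It targets Python 3.9+, where `Subscript.slice` is a plain expression node.

```python
import ast


def subscript_order_ok(source, array_name="probs_array"):
    """Check that names indexing `array_name` follow the signature order."""
    tree = ast.parse(source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    signature = [a.arg for a in func.args.args if a.arg != array_name]

    for node in ast.walk(func):
        is_target = (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id == array_name
        )
        if not is_target:
            continue
        index = node.slice
        elements = index.elts if isinstance(index, ast.Tuple) else [index]
        # Only bare names that are also parameters take part in the comparison.
        used = [
            e.id
            for e in elements
            if isinstance(e, ast.Name) and e.id in signature
        ]
        expected = [name for name in signature if name in used]
        if used != expected:
            return False
    # Permissive: no `probs_array[...]` pattern means nothing to complain about.
    return True


GOOD = "def next_health(health, age, probs_array):\n    return probs_array[health, age]\n"
SWAPPED = "def next_health(health, age, probs_array):\n    return probs_array[age, health]\n"
NO_PATTERN = "def next_health(health, age):\n    return health\n"

results = [subscript_order_ok(src) for src in (GOOD, SWAPPED, NO_PATTERN)]
print(results)
```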

Fast-exits when no regime has any `MarkovTransition` state transition.

The slimmed `lcm.validate_transition_probs` (Phase 1) is deprecated
with a `DeprecationWarning` pointing at the automatic validator. It
will be removed in a subsequent phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move user-facing type aliases (UserAge, UserParams,
UserInitialConditions, UserFunction, UserFacingParamsTemplate, plus
the private _UserParamsLeaf) from lcm/typing.py to a new
lcm/api/typing.py. lcm/typing.py keeps the canonical / engine-side
aliases and adds a bottom-of-file shim re-exporting the User* names
so existing `from lcm.typing import UserParams` keeps working.

Adjust the TYPE_CHECKING-only imports in `lcm.params.mapping_leaf`
and `lcm.params.sequence_leaf` to source `_UserParamsLeaf` from its
new home.

First step of `Phase 2 — api Reorganisation.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ariableInfo

Rename `lcm/interfaces.py` to `lcm/engine.py` (canonical / engine-side
dataclasses consumed by the DP machinery). Hoist `Variables` and
`VariableInfo` dataclasses from `lcm/variables.py` into `engine.py`.
`variables.py` retains the factories (`from_regime`, `get_grids`,
`_raw_variable_info`, `_ordered_state_action_names`,
`_bind_forward_refs`), which now import `Variables` / `VariableInfo`
from `engine.py`.

`Variables.from_regime` classmethod becomes the module-level
`lcm.variables.from_regime` factory; src and test call sites are
updated. All `from lcm.interfaces` imports are rewritten to
`from lcm.engine`.

Second step of `Phase 2 — api Reorganisation.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mulateFunctionPair

Move `lcm/user_regime.py` → `lcm/api/regime.py`. The user-facing
`Regime` class (still defined as `class Regime`), `MarkovTransition`,
`_default_H`, `_IdentityTransition`, the Phase 1 absorbed validators,
and the slimmed `validate_transition_probs` ride along.

Relocate `SolveSimulateFunctionPair` from `lcm/engine.py` to
`lcm/api/regime.py` — it's user-facing (listed in `__all__`),
constructed by users for `Regime.functions` values. Its natural home
is alongside `Regime`. Update src and test importers.

Sed-rewrite `from lcm.user_regime` → `from lcm.api.regime` across
src/ and tests/. Update stale docstring references.

Third step of `Phase 2 — api Reorganisation.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move `lcm/model.py` → `lcm/api/model.py`. Heavy build machinery
(`build_regimes_and_template`, `_validate_param_types`,
`_resolve_fixed_params`, etc.) stays in `lcm/model_processing.py`.
The `Model` class + tight privates (`_merge_derived_categoricals`,
`_validate_log_args`) ride along.

Rewrite `lcm.model.Model` annotations (forward refs in
`api/regime.py`) to `lcm.api.model.Model`, plus the `TYPE_CHECKING`
import. Sed-update `from lcm.model import` / `import lcm.model`
across src and tests.

Fourth step of `Phase 2 — api Reorganisation.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `lcm/ages.py` → `lcm/api/ages.py` (AgeGrid + small validators).
- `lcm/persistence.py` → `lcm/api/persistence.py` (SolveSnapshot,
  SimulateSnapshot, load_*, save_* + I/O helpers).
- `lcm/simulation/result.py` → `lcm/api/result.py` (SimulationResult +
  _compute_metadata).

Sed-update `from lcm.ages` / `from lcm.persistence` /
`from lcm.simulation.result` imports across src and tests.
`tests/test_persistence.py` had a `from lcm import persistence as
_persistence` form requiring a hand edit to `from lcm.api`.

Fifth step of `Phase 2 — api Reorganisation.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Create thin re-export modules `lcm/api/grids.py` and
`lcm/api/categorical.py` covering the user-facing leaf grid classes
(`LinSpacedGrid`, `LogSpacedGrid`, `IrregSpacedGrid`, `DiscreteGrid`,
`PiecewiseLinSpacedGrid`, `PiecewiseLogSpacedGrid`, `Piece`) and the
`@categorical` decorator. The internal `lcm.grids` package is
unchanged for this PR — the deeper restructure
(`lcm/grids/` → `lcm/_grids/`, ABCs / validators / coordinates split
into `_base.py` / `_validators.py` / `_coordinates.py`) is deferred
to a follow-up; this PR is already large.

Sixth step of `Phase 2 — api Reorganisation.md` in slimmed form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ss names

Create `lcm/api/processes.py` exposing the user-facing leaf process
classes under their canonical `<Distribution><Kind>Process` names.
The names are import-time aliases of the internal
`lcm.shocks.{iid,ar1}` classes (`Uniform`, `Tauchen`, ...).

The full rename (`lcm/shocks/` → `lcm/_processes/`,
`_ShockGrid` → `_ProcessGrid`, `is_shock` → `is_process`,
`shock_names` → `process_names`, plus renaming the underlying
classes) is deferred to a follow-up — this PR is already large and
downstream user code (lcm_examples, tests) would need a coordinated
update.

Seventh step of `Phase 2 — api Reorganisation.md` in slimmed form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ss names

Wire `lcm/__init__.py` to import every public symbol via `lcm.api.*`:
grid classes via `lcm.api.grids`, `@categorical` via
`lcm.api.categorical`, and add the seven new `*Process` aliases from
`lcm.api.processes` (UniformIIDProcess, NormalIIDProcess,
LogNormalIIDProcess, NormalMixtureIIDProcess, TauchenAR1Process,
RouwenhorstAR1Process, TauchenNormalMixtureAR1Process).

Extend `__all__` with the new names. The old shock aliases
(`Uniform`, `Tauchen`, ...) remain reachable via `lcm.shocks`
during the deprecation grace period; the next phase removes them.

Smoke test (per the plan):

  python -c "from lcm import AgeGrid, DiscreteGrid, ..., UniformIIDProcess,
                          TauchenAR1Process, ...; print('Public API intact')"

passes.

Ninth step of `Phase 2 — api Reorganisation.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions Bot commented May 18, 2026

Benchmark comparison (main → HEAD)

Comparing 9f9483f6 (main) → f8c5d1fc (HEAD)

| Benchmark | Statistic | before | after | Ratio | Alert |
|---|---|---|---|---|---|
| aca-baseline | execution time | 16.448 s | 14.858 s | 0.90 | |
| | peak GPU mem | 2.23 GB | 1.33 GB | 0.60 | |
| | compilation time | 276.50 s | 281.84 s | 1.02 | |
| | peak CPU mem | 6.86 GB | 6.82 GB | 0.99 | |
| aca-baseline-debug | execution time | 77.351 s | 79.843 s | 1.03 | |
| | peak GPU mem | 581 MB | 773 MB | 1.33 | |
| | compilation time | 372.05 s | 380.35 s | 1.02 | |
| | peak CPU mem | 7.54 GB | 7.60 GB | 1.01 | |
| Mahler-Yum | execution time | 4.432 s | 4.285 s | 0.97 | |
| | peak GPU mem | 529 MB | 529 MB | 1.00 | |
| | compilation time | 12.74 s | 12.85 s | 1.01 | |
| | peak CPU mem | 1.68 GB | 1.69 GB | 1.00 | |
| Precautionary Savings - Solve | execution time | 25.9 ms | 25.7 ms | 0.99 | |
| | peak GPU mem | 101 MB | 101 MB | 1.00 | |
| | compilation time | 2.14 s | 2.07 s | 0.97 | |
| | peak CPU mem | 1.11 GB | 1.12 GB | 1.00 | |
| Precautionary Savings - Simulate | execution time | 94.7 ms | 96.8 ms | 1.02 | |
| | peak GPU mem | 349 MB | 349 MB | 1.00 | |
| | compilation time | 4.83 s | 4.89 s | 1.01 | |
| | peak CPU mem | 1.32 GB | 1.31 GB | 1.00 | |
| Precautionary Savings - Solve & Simulate | execution time | 135.4 ms | 131.0 ms | 0.97 | |
| | peak GPU mem | 586 MB | 586 MB | 1.00 | |
| | compilation time | 6.38 s | 6.47 s | 1.01 | |
| | peak CPU mem | 1.28 GB | 1.28 GB | 1.00 | |
| Precautionary Savings - Solve & Simulate (irreg) | execution time | 263.4 ms | 259.5 ms | 0.99 | |
| | peak GPU mem | 2.20 GB | 2.20 GB | 1.00 | |
| | compilation time | 6.83 s | 6.76 s | 0.99 | |
| | peak CPU mem | 1.33 GB | 1.33 GB | 1.00 | |

hmgaudecker and others added 4 commits May 19, 2026 08:30
`lcm/grids/` is internal grid infrastructure (ABCs, validators,
coordinate helpers, the leaf classes whose user-facing copies live
in `lcm/api/grids.py`). The leading underscore signals "private —
don't import from user code"; users keep reaching for grid classes
through `from lcm import LinSpacedGrid` or `from lcm.api.grids
import LinSpacedGrid`.

Pure rename via `git mv` to preserve blame. Sed-rewrite of `from
lcm.grids` / `import lcm.grids` / `lcm.grids.` across src and tests.
Update docstring references in `api/grids.py` and `api/categorical.py`.

Step A of `Phase 2 — api Reorganisation.md`'s deferred internal
restructure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_ProcessGrid*

`lcm/shocks/` is internal process infrastructure (the `_ShockGrid`
ABC hierarchy plus the seven leaf distribution / discretization
classes). The leading underscore matches the `lcm/_grids/` rename
from step A — these are private packages users shouldn't reach into
directly.

Renames:
- `lcm/shocks/` → `lcm/_processes/`
- `_ShockGrid` → `_ProcessGrid`
- `_ShockGridIID` → `_ProcessGridIID`
- `_ShockGridAR1` → `_ProcessGridAR1`

Sed-rewrite of `from lcm.shocks` / `import lcm.shocks` /
`lcm.shocks.` and the three internal class names across src, tests,
and lcm_examples. `lcm/__init__.py` drops `from lcm import shocks`
and the `"shocks"` entry from `__all__` — the public surface is now
exclusively `from lcm import UniformIIDProcess` (etc.) via
`api/processes.py`.

`lcm_examples/precautionary_savings.py` and
`lcm_examples/mahler_yum_2024/_model.py` had bare-attribute access
to `lcm.shocks.{iid,ar1}.{Normal,Uniform,Rouwenhorst,Tauchen}`;
those rewrite to top-level imports (`NormalIIDProcess`,
`UniformIIDProcess`, `RouwenhorstAR1Process`, `TauchenAR1Process`)
because `lcm._processes` is private and ruff (rightly) flags the
underscore access. The internal class names themselves
(`Uniform`, `Tauchen`, ...) are renamed in step D; today the
`*Process` names are aliases declared in `api/processes.py`.

Step B of `Phase 2 — api Reorganisation.md`'s deferred internal
restructure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ShockName → ProcessName

Rename the derived attribute and type-alias names so the codebase
speaks one language: `process` for any stochastic-process state.

- `VariableInfo.is_shock` → `VariableInfo.is_process`
- `Variables.shock_names` → `Variables.process_names`
- `typing.ShockName` → `typing.ProcessName`
- `non_shock_names` local var in `api/regime.py` → `non_process_names`
- `test_shock_names_filters_is_shock` → `test_process_names_filters_is_process`

Sed-rewrite across src and tests.

Step C of `Phase 2 — api Reorganisation.md`'s deferred internal
restructure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cess, ...)

Rename the seven user-facing process leaf classes in their
definition files to the canonical `<Distribution><Kind>Process`
names. `api/processes.py` becomes a plain re-export without the
`as ...` aliases. Tests that previously reached for the classes via
`lcm._processes.iid.X` qualified access now import them from the
top-level `lcm` namespace.

Last step of `Phase 2 — api Reorganisation.md`'s deferred internal
restructure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hmgaudecker hmgaudecker changed the title from "Phase 2: carve out lcm/api/ for the user-facing surface" to "Phase 2: api/ surface + internal restructure (_grids, _processes, *Process classes)" on May 19, 2026
hmgaudecker and others added 4 commits May 19, 2026 12:08
…ayout

Map the post-Phase-2 source tree (api/, _grids/, _processes/,
engine.py, model_processing.py, regime_building/, solution/,
simulation/, params/, utils/) and explain the organising principle:
proximity to user input versus proximity to JAX-traced DP machinery.

Covers:
- The api/ boundary and why physical separation, not just naming
- Why _grids/ and _processes/ are leading-underscore packages
- The engine.py (canonical / engine-side) vs api/ (boundary) split
- The two-step build pipeline (model_processing.py →
  regime_building.processing.process_regimes)
- Static (process-time) vs runtime (solve / simulate) checks
- Boundary form vs canonical form in params/
- The User* typing aliases in api/typing.py vs engine-side aliases
  in typing.py
- A suggested reading order for new contributors.

Wired into the Explanations index and the myst.yml TOC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move validators, default Bellman aggregator, and validate_transition_probs
helpers behind a leading underscore so lcm.api.regime is a thin layer of
class definitions plus the deprecated public validate_transition_probs.

New: lcm/_regime/{_helpers,_validation,_transition_probs}.py
Moved: _IdentityTransition → regime_building/transitions.py (colocated
with _make_identity_fn).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move STEP_UNITS, PSEUDO_STATE_NAMES, parse_step (→ _parse_step), and the
range/values/grid validators behind a leading underscore. api/ages.py is
now just the user-facing AgeGrid class.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nternals

Move I/O helpers (atomic_dump → _atomic_dump, _save_pkl, _save_h5, _load_h5,
_get_platform, _next_counter, _enforce_retention, _strip_V_arr_from_result,
…) and the snapshot writers (save_solve_snapshot → _save_solve_snapshot,
save_simulate_snapshot → _save_simulate_snapshot) to lcm/_persistence/.

api/persistence.py now exposes only:
- SolveSnapshot / SimulateSnapshot dataclasses
- load_snapshot / save_solution / load_solution
- _bind_forward_refs (delegator)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hmgaudecker and others added 2 commits May 22, 2026 18:34
The GPU benchmark runner's pixi 0.69 install resolves on the bare
`pixi` invocations without a $GITHUB_PATH prepend, same as before the
lockfile bump. Restore the workflows to that state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
hmgaudecker and others added 13 commits May 23, 2026 06:45
Renames `AcaBaselineDebugLog`'s display label to `aca-baseline-debug`
and places it second in the PR-comment table, right after the
`aca-baseline` block. Adds `AcaBaselineDebugLogGpuPeakMem` so the
debug-mode block carries a peak GPU mem row symmetric with
`aca-baseline`.

`AcaBaselineDebugLog.setup_for_gpu_measurement` mirrors `setup`'s
`log_path` setup so the cold-measurement subprocess exercises
snapshot writing too. The tmpdir leaks at subprocess exit — `/tmp`
gets OS-cleaned, and the subprocess doesn't run ASV's teardown path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds `paths-ignore` to the `pull_request` triggers of `main` and
`benchmark-pr`. Doc-only PRs (Markdown, notebooks under `docs/`)
no longer spin up the GPU runner pool or the self-hosted benchmark
runner. `main` also skips when the diff is benchmark-only — the
benchmark workflow covers that surface. Pushes to `main` still
exercise the full matrix; src/test changes still trigger everything.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ransition validator

`regime.resolved_fixed_params` and the per-iteration `flat_params` for a
regime both key their entries by qualified names (`next_<state>__<param>`,
or `next_<state>__<target>__<param>` for per-target dicts). The validator
calls the `MarkovTransition`'s user function with the raw parameter names
from its signature, so without the strip every transition-function param
that isn't a grid axis falls through to the "not numerically validated"
skip branch and the per-transition numerical check never runs.

Adds a `_params_callable_for_state_transition` helper that merges fixed
and flat params (same merge order as `solve`) and returns a
`FlatRegimeParams` keyed by the raw signature names accepted by one
specific transition. The state-transition validator calls into it before
dispatching to `_validate_state_transition_single`.

Adds two regression tests on a model whose `health` `MarkovTransition`
reads a parameter from `fixed_params`:
- one asserts no "not numerically validated" warning fires;
- one asserts that an invalid probability *is* surfaced at log_level=debug,
  proving the validator actually ran rather than silently skipping.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Each Python-level batch is its own `jax.jit` dispatch in the solve
loop, and on a distributed axis every dispatch carries a cross-device
collective. Batching therefore multiplies the per-period collective
count by `ceil(n_per_device / batch_size)`; for small `batch_size`
the collective overhead per kernel dwarfs the compute per kernel and
sharding becomes a regression rather than a speedup.

Adds `_fail_if_batch_size_combined_with_distributed` in `grids/base.py`
and calls it from `_init_uniform_grid` (covers Lin/LogSpacedGrid),
`IrregSpacedGrid.__init__`, and `DiscreteGrid.__init__`. Piecewise
grids inherit `batch_size=0, distributed=False` defaults from
`ContinuousGrid` and don't expose them in `__init__`, so they need
no change.

Error message points users at the right escape valves — more devices
or another distributed axis — rather than restoring batch_size.

Adds construction-time tests across all four grid types: the (bs=1,
distributed=True) combo raises, the (bs=0, distributed=True) combo
constructs cleanly.
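
A minimal sketch of the arithmetic and of a construction-time guard, assuming `batch_size=0` means "no batching". The function names, the `ValueError` type, and the message wording are illustrative; the helper added in this commit may differ in all three.

```python
import math


def collective_multiplier(n_per_device, batch_size):
    """Factor by which batching multiplies the per-period collective count."""
    if batch_size == 0:
        # Assumption: 0 disables batching, so one dispatch per period.
        return 1
    return math.ceil(n_per_device / batch_size)


def fail_if_batch_size_combined_with_distributed(batch_size, distributed):
    """Reject the combination at construction time (illustrative guard)."""
    if distributed and batch_size != 0:
        raise ValueError(
            "batch_size cannot be combined with distributed=True; use more "
            "devices or distribute another axis instead."
        )


# 1000 grid points per device, one point per batch: 1000 collectives per period.
print(collective_multiplier(n_per_device=1000, batch_size=1))
# Same axis in batches of 300: ceil(1000 / 300) = 4 collectives per period.
print(collective_multiplier(n_per_device=1000, batch_size=300))

fail_if_batch_size_combined_with_distributed(batch_size=0, distributed=True)
try:
    fail_if_batch_size_combined_with_distributed(batch_size=1, distributed=True)
    raised = False
except ValueError:
    raised = True
```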

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The cascade merge from #360 brought in the call sites under
`src/_lcm/grids/{continuous,discrete}.py` but the existing `from
_lcm.grids.base import Grid` lines on #361 weren't extended to also
import the helper. ty caught it on the post-cascade run; now both
modules import the helper alongside `Grid`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The pre-flight numerical validator's `_params_callable_for_state_transition`
strips the qualified prefix from `regime.resolved_fixed_params` /
`flat_params` so the user's transition function can be called with its
raw parameter names. The prefix for per-target dict transitions used
`next_<state>__<target>__`, but `create_regime_params_template` builds
the canonical key as `to_<target>_next_<state>__<param>`. The mismatch
made every per-target MarkovTransition with a custom param fall through
to the "not numerically validated" skip-and-warn branch — the per-target
numerical check never ran in production.

Aligning the validator's prefix with the template builder's key lets
per-target transitions exercise the same numerical-validation path that
already covers simple `next_<state>__<param>` transitions.
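
The prefix logic can be sketched as follows, using the two key shapes quoted above (`next_<state>__<param>` and `to_<target>_next_<state>__<param>`). The function name `raw_params_for_transition` and the example keys are hypothetical; the real helper returns a `FlatRegimeParams` and also merges fixed params, which this sketch leaves out.

```python
def raw_params_for_transition(flat_params, state, target=None):
    """Strip the qualified prefix so keys match the user function's signature."""
    if target is None:
        prefix = f"next_{state}__"
    else:
        # Per-target dict transitions use the template builder's key shape.
        prefix = f"to_{target}_next_{state}__"
    return {
        key[len(prefix):]: value
        for key, value in flat_params.items()
        if key.startswith(prefix)
    }


flat_params = {
    "next_health__persistence": 0.8,
    "to_retired_next_health__persistence": 0.6,
    "utility__risk_aversion": 2.0,
}

simple = raw_params_for_transition(flat_params, state="health")
per_target = raw_params_for_transition(flat_params, state="health", target="retired")
print(simple, per_target)
```

With the mismatched prefix, the per-target lookup would return an empty dict, which is exactly the situation that sent the validator into its skip-and-warn branch.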

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`solve()` and `simulate()` only dispatch a per-target MarkovTransition
for the targets in `active_regimes_next_period` at the source's period;
targets that deactivate before the source can reach them never fire at
runtime. The pre-solve validator mirrors that gate so a per-target
function whose output shape only needs to match the (always-zero-
weighted) target's outcome grid in principle is not numerically
evaluated against the source's state grid.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…factor/phase-2-api-reorganisation

# Conflicts:
#	.gitignore
…reorganisation

# Conflicts:
#	pixi.lock
#	src/_lcm/jaxtyping_patch.py
#	src/lcm/__init__.py

@timmens timmens left a comment

Member

Very nice changes; approved from my side 🚀

As you mentioned on Zulip, this PR cannot be properly reviewed, so I did not go through each change on GitHub. Hence, there may certainly be things that I missed; but of course they can be fixed easily with an agent afterwards.

I actually really like the changes. Given the state of the library and all of the features we (you and Max) have added over the last months, this refactoring was needed! I strongly believe this will make the codebase easier to work with for humans and agents, definitely for me.


Coding agents have gotten so good that if there is a need, the architecture is thought through, and the names are well chosen, the agent-generated code is usually pretty good. And in this case, I think the new src-layout and the renamings are a huge improvement. I was actually slightly confused in the beginning by the new nomenclature for the shocks, but processes is simply the better and more accurate name; well chosen! 🙂

hmgaudecker and others added 5 commits May 25, 2026 07:35
The merge of main into this branch brought in 4 tests that called
`validate_transition_probs` — a function this branch deleted as part
of the auto-validate refactor. Those tests are dead; the auto-validator
covers the same ground. Also drops the corresponding unused imports
(`jnp`, `_get_func_indexing_params`, `TYPE_CHECKING`) and the dangling
comment in `user_regime.py` that referenced the removed function. The
prior amend missed `pixi.lock` (the jaxtyping 0.3.10 bump) — re-locked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds distributed-first ordering to `_ordered_state_action_names` so the
sharded axis becomes the outermost productmap axis within its topology
group. XLA can then place the cross-device collective at the outer
loop, wrapping a purely per-device kernel.

Sort key per state is `(not distributed, batch_size)` with 0 last.
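
One reading of this sort key, as a sketch: distributed states sort first, and within each group positive batch sizes sort ascending with `batch_size == 0` placed last. The state names and the `(distributed, batch_size)` tuples are made up for illustration, and the "within its topology group" qualifier is ignored here.

```python
import math

# Hypothetical states: name -> (distributed, batch_size).
states = {
    "wealth": (False, 0),
    "health": (True, 0),
    "income": (False, 4),
    "age_bin": (False, 2),
}


def sort_key(name):
    distributed, batch_size = states[name]
    # `not distributed` is False (sorts first) for the sharded axis;
    # batch_size 0 is mapped to infinity so that it sorts last.
    return (not distributed, batch_size if batch_size > 0 else math.inf)


ordered = sorted(states, key=sort_key)
print(ordered)
```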
…factor/phase-2-api-reorganisation

# Conflicts:
#	pixi.lock
#	src/lcm/variables.py
#	tests/test_variables.py
beartype 0.22.9 leaves a debug print in
`beartype._util.func.utilfunctest:1083` that fires once per imported
pseudo-callable, polluting every pytask / test invocation. Monkey-patch
that module's `print` to a no-op before the claw runs.
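
The trick works because a module-level name shadows the builtin of the same name for every function defined in that module. A minimal sketch with a stand-in module follows; `noisy_stand_in` and its function `f` are hypothetical and have nothing to do with beartype's actual code.

```python
import contextlib
import io
import types

# Stand-in for a third-party module that contains a stray debug print.
noisy = types.ModuleType("noisy_stand_in")
exec(
    "def f():\n"
    "    print('debug output')\n"
    "    return 1\n",
    noisy.__dict__,
)

# Shadow the builtin inside that module only; other modules are unaffected.
noisy.print = lambda *args, **kwargs: None

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = noisy.f()

captured = buffer.getvalue()
```

Because the patch is scoped to one module's globals, it has to be applied before the code in question runs, which is why the commit places it ahead of the claw registration.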
Base automatically changed from feat/phase-1b-auto-state-transition-validation to main May 25, 2026 08:00
…reorganisation

# Conflicts:
#	src/lcm/_transition_checks.py
#	src/lcm/grids/base.py
#	src/lcm/user_regime.py
#	src/lcm/variables.py

@mj023 mj023 left a comment

Collaborator

Good changes. This makes it much clearer to the user where he can look for things to import; the old structure was becoming very complicated.

I'm also okay with the renamings. "Shocks" never really fit in my opinion; calling them stochastic processes is better, considering that they will often be used for income processes etc.

Regarding commit 1e702fd:
I was at first confused how this escaped my tests, but I just didn't realize that AOT-Compilation only activates when setting n_subjects. Thanks for fixing this.

@hmgaudecker hmgaudecker merged commit d04bf25 into main May 25, 2026
11 of 12 checks passed
@hmgaudecker hmgaudecker deleted the refactor/phase-2-api-reorganisation branch May 25, 2026 12:03