Status: design draft
Date: 2026-05-15
Author: Ben Mazin (with brainstorming assistance)
Supersedes: the bespoke KILO and NAYRA simulation packages, which become reference examples.
nullsim is a Python package and command-line pipeline for simulating photonic nulling instruments. A user describes an instrument and a study in a single TOML file. The pipeline composes physics modules (atmosphere, AO, injection, fiber transport, delay lines, photonic chip, detectors, post-processing) as an ordered sequence of stage instances with declared dependencies validated before execution, runs the requested simulation, optionally sweeps over parameter axes, and writes a chosen subset of standardized plots and tables. Stages, plots, and tables are pluggable: a user adds a custom thermal-drift stage or a custom output by registering one Python file. The same package targets ground-based interferometric nullers (KILO-class multi-telescope arrays), single-pupil ELTs (NAYRA-class), space-based nullers (HWO and successors), and lab-only chip characterization, by enabling or disabling stages rather than by editing code.
The KILO and NAYRA codebases each implement a photonic nulling simulator that produces noise budgets, contrast curves, integration-time estimates, and a detection-space figure that overlays the achievable contrast on the known exoplanet population. The two packages share roughly seventy-five percent of their physics: NAYRA already routes its simulations through a KILO adapter, and the two packages reimplement parallel modules for atmosphere, AO, injection, fiber transport, chip optimization, detectors, and plotting.
Continuing to maintain two packages with a partial adapter between them is the wrong shape going forward. The next instrument target, a Habitable Worlds Observatory (HWO) successor, will reuse the chip and detector physics but discard the atmospheric stages entirely. Adding it as a third sibling repository would compound the problem.
nullsim extracts the shared physics into a generic pipeline with one configurable entry point. The package was initially built guided by the KILO and NAYRA source trees, not by working from papers alone. KILO in particular encodes a large amount of debugged-by-fire knowledge — sign conventions, dtype choices, edge cases in coupling formulae, baseline-indexing fixes, calibration ordering — that does not appear in any reference. It is fine, and often desirable, to rewrite the code cleanly to match nullsim pipeline standards (typed stages, sub-budget contributions, no global state); the rule is that hard-won physical behavior should be captured by nullsim-native tests and invariants. The KILO and NAYRA repositories themselves are unmodified by this process and continue to serve as paper-reproducibility archives.
Goals
- One package and one CLI that simulate any photonic nulling instrument whose physics is expressible as a sequence of stages between a stellar field and a detector.
- TOML configuration as the user-facing surface. Hand-editable, diff-friendly, version-controllable.
- An ordered sequence of stage instances: the user lists stage instances (each with its own `id` and `type`) in execution order, the runner validates that consumes/produces dependencies are satisfied before running, then executes in order.
- First-class parameter sweeps inside the config file. A study is one TOML file.
- A thin study layer above that, for comparing across configs (e.g. KILO vs. HWO on a shared exoplanet catalog).
- A standardized catalog of plots and tables, with `detection_space` (planet/star contrast vs. angular separation overlaid with the known exoplanet population) as the headline output.
- Stage-level result caching keyed by a conservative content hash so re-runs of unchanged studies are near-instant. Distinguish `config_hash` (resolved config only) from `run_hash` (config + code + data + catalog snapshots) so output identity is honest.
- Extensibility through two paths: an in-tree extension module listed in the config, and an installable plugin discovered via Python entry points.
- Reproducibility: each run writes its resolved config, a manifest of package and dependency versions, and a human-readable timestamped output directory by default (`results/{run.name}/{timestamp}`; `{run_hash}` and `{config_hash}` remain available as alternate template variables in `output.dir`).
Non-goals
- `nullsim` is not a Fourier-optics simulator. It does not replace HCIPy, Poppy, or Prysm. If ground-truth wavefront propagation cross-checks are needed later, HCIPy is the natural fit; nothing in the core depends on it today.
- `nullsim` is not a job scheduler. Sweep parallelism is in-process via `concurrent.futures`; users running studies that exceed a single workstation use external schedulers around the CLI.
- `nullsim` does not aim for bit-identical reproduction of KILO or NAYRA paper figures. Those repositories are static design references; nullsim's CI checks its own physics, stage contracts, and example configs.
config.toml --> [config layer]    pydantic schema, preset merge, sweep expansion
                      |
                      v
                [pipeline layer]  dependency validation, runner, caching, telemetry
                      |
                      v
                [stage layer]     physics stages consume/produce SimulationState
                      |
                      v
                [output layer]    registered plots and tables, sweep-aware
                      |
                      v
                results/<run>/<YYYYMMDD-HHMMSS>/   plots, tables, manifest, resolved config
Three guiding splits:
- Math vs. pipeline. Pure mathematical primitives (Kolmogorov spectrum, Clements decomposition, Ruilier–Cassaing coupling formula, Jones matrices, photometric zero points) live in `nullsim/physics/`. They have no knowledge of `SimulationState` or TOML and can be imported and tested in isolation. Stages in `nullsim/stages/` wrap that math into the pipeline.
- Stages vs. families. Stage families (atmosphere, AO, injection, transport, delay lines, chip, detector, postprocess) are organizational conventions, not Python inheritance hierarchies. Each family directory holds multiple alternative implementations selectable by `type = "..."` in TOML.
- Catalog vs. registry. A standardized catalog of named plots and tables ships with the package. The registry that backs the catalog is the same one user-defined outputs register into; user code and built-in code are indistinguishable to the runner.
nullsim/
├── pyproject.toml
├── nullsim/
│ ├── __init__.py
│ ├── config/
│ │ ├── schema.py # pydantic v2 models per stage family + top level
│ │ ├── loader.py # TOML parse, preset merge, environment overrides
│ │ ├── sweeps.py # grid/zip expansion, sweep coordinate algebra
│ │ └── presets/ # built-in instrument presets (TOML)
│ ├── pipeline/
│ │ ├── stage.py # Stage protocol, StageParams, RunContext
│ │ ├── state.py # SimulationState and sub-budget dataclasses
│ │ ├── registry.py # stage/plot/table registry, plugin discovery
│ │ ├── runner.py # dependency validation, execution, sweep dispatch
│ │ ├── cache.py # content-hash keyed disk cache
│ │ └── _canonical.py # type-tagged canonical-bytes hashing (used by state, rng, cache)
│ ├── stages/
│ │ ├── scene/ # StandardStarPlanet, PointSource, ExozodiModel
│ │ ├── telescope/ # ArrayGeometry, SingleAperture, MultiAperture
│ │ ├── atmosphere/ # KolmogorovAtmosphere, FixedStrehl, NoAtmosphere
│ │ ├── ao/ # PyramidWFS, ShackHartmann, NoAO
│ │ ├── injection/ # SingleModeFiber, MultiModeFiber, Ideal
│ │ ├── transport/ # SMF28, PM980, ZeroLossFiber
│ │ ├── delay_lines/ # GeometricDelayLine, NoDelay
│ │ ├── chip/ # KernelMZIMesh, ClementsMesh, Identity
│ │ ├── detector/ # MKID, SNSPD, EMCCD, IdealCounter
│ │ ├── sensitivity/ # DetectionCurve (per-sep throughput + disk floor + realized null)
│ │ └── postprocess/ # ChipOptimization, FringeTracking, Calibration,
│ │ # ImageReconstruction, PerformanceMetrics
│ ├── physics/
│ │ ├── kolmogorov.py
│ │ ├── ruilier_cassaing.py
│ │ ├── clements.py
│ │ ├── jones.py
│ │ ├── photometry.py
│ │ ├── pupil_geometry.py
│ │ ├── stellar_disk.py # uniform-disk null floor integral
│ │ ├── planet_throughput.py # PA-averaged dark-port throughput vs separation
│ │ ├── realized_null.py # AO-piston Monte Carlo realized null + instability
│ │ ├── uv_coverage.py
│ │ └── fresnel.py
│ ├── sites/ # built-in site database (TOML)
│ │ ├── maunakea.toml
│ │ ├── cerro_armazones.toml
│ │ └── l2_halo.toml
│ ├── targets/ # exoplanet catalogs, target-list helpers
│ ├── outputs/
│ │ ├── plots/ # registered plot functions
│ │ ├── tables/ # registered table writers
│ │ └── styles.py # publication matplotlib styling
│ ├── data/ # shipped reference curves and snapshots
│ │ ├── atran/ # atmospheric transmission per site
│ │ ├── fibers/ # SMF-28e, PM980-XP attenuation and birefringence
│ │ ├── photometry/ # Vega zero-points, filter curves
│ │ ├── kilo_reference/ # KILO paperv3 cache snapshots for plot overlays
│ │ └── loader.py # path resolution, override via TOML
│ ├── study/ # cross-config study layer
│ │ └── runner.py
│ └── cli.py # nullsim run|validate|list-stages|inspect|cache
├── examples/
│ ├── kilo_maunakea.toml
│ ├── nayra_eelt.toml
│ ├── hwo_space.toml
│ └── chip_only_lab.toml
└── tests/
    ├── physics/
    ├── stages/
    ├── outputs/
    └── examples/                  # end-to-end config and smoke tests
A Stage is the type (the implementation). A StageInstance is a named occurrence in the pipeline — one TOML entry with an id. The same Stage type can be instantiated multiple times (two transport.smf28 instances, one for the sub-aperture-to-chip run and one for the chip-to-detector run).
from typing import ClassVar, Protocol

# StateKey, StageParams, RunContext, SimulationState, and ExternalDep are nullsim types.
class Stage(Protocol):
    type_name: ClassVar[str]                 # registry key, e.g. "transport.smf28"
    family: ClassVar[str]                    # "atmosphere" | "transport" | ...
    consumes: ClassVar[frozenset[StateKey]]  # references resolved to instance IDs
    produces: ClassVar[frozenset[StateKey]]

    def __init__(self, params: StageParams, context: RunContext) -> None: ...
    def apply(self, state: SimulationState) -> SimulationState: ...
    def diagnostics(self) -> dict: ...
    def external_dependencies(self) -> list[ExternalDep]: ...

A StateKey is a dotted path like "field", "opd_budget.atmos_piston", "geometry.baselines", or "diagnostics.<instance_id>.<key>". The runner validates the pipeline before execution by checking that every consumed key is produced by some earlier instance. Mistakes such as listing chip before injection fail at validation time.
diagnostics() returns optional per-instance data keyed by instance ID so multiple instances of the same stage type stay distinguishable. The atmosphere instance exposes its r₀, AO Strehl per wavelength, and OPD variance components; the chip instance exposes its phase matrix and optimization history.
external_dependencies() declares external resources the stage reads (shipped data files, fetched catalog snapshots, external optimizer configurations). The runner hashes each declared dependency and folds it into the cache key. Stages do not define their own cache-key logic; the cache layer derives the key from a fixed recipe (see §6.4). This is the conservative-by-default choice — stages cannot accidentally under-include inputs.
SimulationState is a typed payload that flows through the pipeline. It is not one monolithic blob; it is a handful of named sub-budgets that stage instances opt into reading and writing:
| Sub-budget | Type | Producer families |
|---|---|---|
| `scene` | `Scene`. Astrophysical inputs: star SED + angular diameter + distance, planet(s) contrast spectrum + (separation, PA) + optional phase, exozodi model, background sources, sky background spectrum | scene (source factory stages) |
| `geometry` | `Geometry`. Array geometry: aperture positions in ENU and on the pupil, sub-aperture mapping, baselines as a function of time/hour angle, parallactic angle, UV coverage trajectory | telescope, array_geometry stages |
| `field` | `Field`. Complex amplitudes of shape `[n_modes, n_wavelength_bins, n_pol]` with metadata for the wavelength grid and the mode-to-aperture mapping | injection, chip |
| `opd_budget` | `dict[str, OPDComponent]`. Named RMS contributions (`atmos_piston`, `fiber_dispersion`, `thermal_drift`, ...) | atmosphere, transport, delay_lines, postprocess |
| `throughput` | `Throughput`. Multiplicative transmission spectrum on the wavelength grid, with a per-component log | atmosphere, transport, chip |
| `wavefront` | `Wavefront`. Strehl and amplitude jitter per wavelength, per aperture | atmosphere, ao |
| `photon_rates` | `PhotonRates`. Per-port, per-wavelength rates with named contributors (star, sky, dark, planet, zodi) | sky_background, detector |
| `results` | `Results`. Final products: null depth, contrast curve, SNR, calibration time | postprocess |
scene and geometry are populated by dedicated stages at the front of the pipeline (scene and telescope family stages). They are load-bearing inputs to injection, chip, detector, and to outputs such as detection_space, uv_coverage, and contrast_curve. Stellar-diameter leakage, exozodi photon backgrounds, planet position-dependent throughput, and hour-angle-dependent UV coverage all read from these two sub-budgets.
Each sub-budget is a frozen dataclass with an add_component(name, value) constructor so contributions are traceable by name. A throughput stacked-area plot decomposed by component requires no extra bookkeeping inside individual stages.
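One way such a sub-budget could look is sketched below. The field layout (`n_bins`, a tuple of named components) and the `total()` helper are assumptions for illustration; only the frozen-dataclass shape and the `add_component(name, value)` idea come from the text above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Throughput:
    """Toy throughput sub-budget: named per-bin transmission factors."""
    n_bins: int
    components: tuple[tuple[str, tuple[float, ...]], ...] = ()

    def add_component(self, name: str, values: list[float]) -> "Throughput":
        if len(values) != self.n_bins:
            raise ValueError(f"{name}: expected {self.n_bins} bins, got {len(values)}")
        # Frozen dataclass: return a new object instead of mutating in place.
        return Throughput(self.n_bins, self.components + ((name, tuple(values)),))

    def total(self) -> list[float]:
        """Product of all named components, per wavelength bin."""
        return [
            math.prod(values[i] for _, values in self.components)
            for i in range(self.n_bins)
        ]


empty = Throughput(n_bins=2)
after_atmosphere = empty.add_component("atmosphere", [0.9, 0.8])
after_fiber = after_atmosphere.add_component("fiber_to_chip", [0.5, 0.5])

component_names = [name for name, _ in after_fiber.components]
total = after_fiber.total()
```

Since every contribution keeps its name, a stacked-area plot can iterate over `components` directly, and earlier states such as `after_atmosphere` remain valid snapshots for caching.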
def run(config: ResolvedConfig) -> RunResult:
    # `context` is the RunContext for this run; its construction is elided here.
    state = SimulationState.empty(config.grid)
    instances = [registry.build_instance(spec, context) for spec in config.pipeline.stages]
    validate_dependencies(instances)  # consumes/produces check, not a topology pass
    diagnostics = {}
    for inst in instances:
        key = cache.key_for(inst, state, context)
        if cache.has(key):
            state, diag = cache.load(key)
        else:
            with telemetry(inst.id):
                state = inst.apply(state)
            diag = inst.diagnostics()
            cache.store(key, state, diag)
        diagnostics[inst.id] = diag  # cached diagnostics are kept too, keyed by instance ID
    return RunResult(state=state, diagnostics=diagnostics)

Stage execution is an explicit ordered sequence: the TOML lists stage instances in execution order, the runner respects it, and dependency validation only checks that every consumed key has been produced by an earlier instance. There is no topological reordering. Explicit beats clever for a configuration file that researchers will diff against published versions.
Stage-level caching is on by default and content-addressed with a conservative default key recipe. Stages do not define their own cache-key function; the cache layer computes the key from a fixed formula. This is the load-bearing safety property: a stage author cannot accidentally produce a stale cache hit by forgetting to include an input.
The cache key for a stage instance is the hash of:
- The instance's resolved params (frozen, sorted).
- A digest of every sub-budget the instance declares it `consumes`. The state object hashes its own sub-budgets by content.
- A hash of the stage type's source file (the `.py` containing the class).
- The package version.
- The hashes of all `external_dependencies()` the stage declares (e.g. an ATRAN transmission curve, a NASA Exoplanet Archive snapshot, a fiber attenuation table). Stages opt into declaring additional externals; they do not opt out of the consumed-state digests.
- The user-bumpable `cache.version` field in the config (escape hatch for forcing a rebuild).
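The recipe above can be sketched as a single function. This is an illustration under stated assumptions: the function name and argument names are hypothetical, and sorted-key JSON stands in for the type-tagged canonical encoding that `_canonical.py` is described as providing.

```python
import hashlib
import json


def cache_key(
    params: dict,
    consumed_digests: dict[str, str],
    source_hash: str,
    package_version: str,
    external_hashes: list[str],
    cache_version: int,
) -> str:
    """Combine the six ingredients of the recipe into one hex digest."""
    payload = {
        "params": params,
        "consumed": consumed_digests,
        "source": source_hash,
        "package_version": package_version,
        "externals": sorted(external_hashes),
        "cache_version": cache_version,
    }
    # sort_keys makes the encoding independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = dict(
    params={"n_modes": 4, "n_dark_ports": 1},
    consumed_digests={"field": "f00d"},       # placeholder digests
    source_hash="abc123",
    package_version="0.1.0",
    external_hashes=["atran-hash"],
    cache_version=1,
)

key_original = cache_key(**base)
key_reordered = cache_key(**{**base, "params": {"n_dark_ports": 1, "n_modes": 4}})
key_bumped = cache_key(**{**base, "cache_version": 2})
```

Reordering the params leaves the key unchanged, while bumping `cache_version` changes it, which is the behavior the escape hatch relies on.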
Two related hashes the cache layer also computes for the run as a whole:
- `config_hash` = hash of `config.resolved.toml` only. Identifies "what the user wrote."
- `run_hash` = hash of `config_hash` + package version + every `external_dependencies()` hash across all instances + the runtime Python/numpy/scipy/torch versions. Identifies "what actually got produced."
run_hash, not config_hash, is the hash to use in output.dir when output directories are named by content rather than by the default timestamp. Two runs with the same config_hash but different run_hash (because the exoplanet catalog snapshot changed, or because torch was upgraded) then write to different directories. The manifest records both hashes.
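The difference between the two hashes can be shown with a small sketch. The helper names, the length-prefixed hashing, and the placeholder version strings are all assumptions for illustration, not nullsim's recipe.

```python
import hashlib


def digest(*parts: str) -> str:
    """Hash an ordered list of strings; length prefixes keep the parts unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        raw = part.encode("utf-8")
        h.update(len(raw).to_bytes(8, "little"))
        h.update(raw)
    return h.hexdigest()


def compute_run_hash(
    config_hash: str,
    package_version: str,
    external_hashes: list[str],
    runtime_versions: dict[str, str],
) -> str:
    runtime = [f"{name}={version}" for name, version in sorted(runtime_versions.items())]
    return digest(config_hash, package_version, *sorted(external_hashes), *runtime)


# The same resolved config ...
config_hash = digest('[run]\nname = "demo"\n')
runtime = {"python": "3.12.0", "numpy": "2.0.0"}  # placeholder versions

# ... combined with two different catalog snapshot hashes.
run_a = compute_run_hash(config_hash, "0.1.0", ["catalog-snapshot-A"], runtime)
run_b = compute_run_hash(config_hash, "0.1.0", ["catalog-snapshot-B"], runtime)
```

Here `config_hash` is shared by both runs, while `run_a` and `run_b` differ because one external dependency changed.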
Cache hits restore the post-instance SimulationState and diagnostics. Misses execute normally and write to disk. The cache lives in .nullsim_cache/ next to the config by default. The --no-cache flag and the nullsim cache clear subcommand provide the escape hatches.
Sweeps benefit directly: a sweep over chip.n_modes caches each unique value once. Re-running the same sweep, or a different sweep touching the same chip values, hits the cache.
Most stages today are analytic noise-budget calculators that return deterministic outputs. Some physics — fringe-tracking residuals, thermal drift over an observation, telescope vibrations — naturally wants Monte Carlo. A stage opts in by setting mode = StageMode.MONTE_CARLO and reading context.rng.
The RNG handed to each stage instance is derived deterministically so that the result of a stochastic run is independent of how the runner is parallelized. numpy.random.SeedSequence only accepts integer entropy, but the inputs we want to mix in (config_hash is a hex string, sweep_coord_tuple is tuple[tuple[str, Any], ...], instance_id is a string) are not integers. The recipe is therefore a two-step canonicalization:
import hashlib
import struct

import numpy as np

# 1. Type-tagged canonical encoding of the non-int inputs (prevents collisions
#    between e.g. the string "1" and the integer 1), then SHA-256, then split
#    the digest into eight little-endian uint32 entropy words.
canon = canonicalize((config_hash, sweep_coord_tuple, instance_id))
digest = hashlib.sha256(canon).digest()
entropy_words = struct.unpack("<8I", digest)

# 2. Pass (root_seed, *entropy_words) to numpy SeedSequence.
ss = np.random.SeedSequence((int(root_seed), *entropy_words))
context.rng = np.random.default_rng(ss)

root_seed comes from [run] seed. config_hash and sweep_coord_tuple are known at config-load time. instance_id is the stage's TOML id. The resulting RNG stream is reproducible across runs and unchanged whether the run executes with --workers 1 or --workers 32. The same trick gives every stage its own independent stream without manual seed plumbing.
The canonicalization helper, the SHA-256 step, and the LE-uint32 split are implementation details; the load-bearing property is that derive_rng(root_seed, config_hash, sweep_coord_tuple, instance_id) is a pure function of its arguments and that those four arguments uniquely identify a stage instance within a sweep cell. See nullsim/pipeline/rng.py for the source of truth.
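For concreteness, here is one possible shape for the canonicalization step, using only the standard library. The tag bytes and length-prefix layout are assumptions chosen for the example, not the encoding nullsim uses; the sketch stops at the entropy words, which would then be handed to `SeedSequence` as in the recipe above.

```python
import hashlib
import struct
from typing import Any


def canonicalize(obj: Any) -> bytes:
    """Type-tagged, length-prefixed encoding (hypothetical layout)."""
    if isinstance(obj, bool):  # test bool before int: bool is an int subclass
        tag, body = b"b", (b"1" if obj else b"0")
    elif isinstance(obj, int):
        tag, body = b"i", str(obj).encode("ascii")
    elif isinstance(obj, float):
        tag, body = b"f", struct.pack("<d", obj)
    elif isinstance(obj, str):
        tag, body = b"s", obj.encode("utf-8")
    elif isinstance(obj, (tuple, list)):
        tag, body = b"t", b"".join(canonicalize(item) for item in obj)
    else:
        raise TypeError(f"cannot canonicalize {type(obj).__name__}")
    # tag + 8-byte length + body keeps nested encodings unambiguous.
    return tag + struct.pack("<Q", len(body)) + body


def entropy_words(config_hash: str, sweep_coords: tuple, instance_id: str) -> tuple[int, ...]:
    """SHA-256 of the canonical bytes, split into eight little-endian uint32 words."""
    canon = canonicalize((config_hash, sweep_coords, instance_id))
    return struct.unpack("<8I", hashlib.sha256(canon).digest())


coords = (("scene.star.jmag", 4.0),)
words_chip = entropy_words("ab12", coords, "chip")
words_ao = entropy_words("ab12", coords, "ao")
```

The type tag is what keeps the string `"1"` and the integer `1` from encoding to the same bytes, and the differing `instance_id` is what gives `chip` and `ao` different entropy words in the same sweep cell.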
The chip optimizer is its own postprocess stage rather than a hidden side effect of the chip stage. The TOML lists chip followed by chip_optimization explicitly. Two reasons:
- A/B comparison of optimizers becomes a one-line config change.
- A user studying a perfectly-tuned chip versus one calibrated on photon counts can swap or omit the optimization stage without touching the chip stage.
The optimizer dispatches across backends (scipy, torch) through separate stage classes (ChipOptimizationScipy, ChipOptimizationTorch) sharing a common ChipOptimizationParams base, consolidating the chip optimization code that was previously spread across multiple files in KILO and NAYRA.
A complete configuration file:
# ─── Run metadata ─────────────────────────────────────────────
[run]
name = "kilo_sensitivity_paper_fig3"
description = "Contrast vs J-mag, 4 Maunakea telescopes, H-band"
seed = 12345
extensions = ["my_lab_package.stages"]
# ─── Spectral / spatial grid ──────────────────────────────────
[grid]
wavelength_center_um = 1.65
wavelength_bandwidth_um = 0.30
n_wavelength_bins = 32
# ─── Site & telescope (preset + override pattern) ─────────────
[site]
preset = "maunakea"
[telescope]
preset = "keck_pair_plus_subaru_gemini"
# ─── Scene (source models, populated into state.scene) ────────
[scene.star]
target = "tau_Ceti" # resolves via astropy/SIMBAD, can override below
jmag = 4.5
angular_diameter_mas = 2.08
distance_pc = 3.65
[scene.planet]
contrast = 1.0e-7
separation_mas = 100.0
position_angle_deg = 45.0
[scene.exozodi]
level_zodi = 3.0 # in units of solar-system zodi
# ─── Observation geometry ─────────────────────────────────────
[observation]
hour_angle_h = 0.0
integration_time_s = 3600
# ─── Pipeline (ordered instance list, each with id + type) ────
[[pipeline.stages]]
id = "scene"
type = "scene.standard_star_planet"
[[pipeline.stages]]
id = "array"
type = "telescope.array_geometry"
[[pipeline.stages]]
id = "atmosphere"
type = "atmosphere.kolmogorov"
[[pipeline.stages]]
id = "ao"
type = "ao.pyramid_wfs"
n_actuators = 3500
[[pipeline.stages]]
id = "injection"
type = "injection.single_mode_fiber"
[[pipeline.stages]]
id = "fiber_to_chip"
type = "transport.smf28"
length_m = 30.0
[[pipeline.stages]]
id = "delay_lines"
type = "delay_lines.geometric"
length_m = 50.0
[[pipeline.stages]]
id = "chip"
type = "chip.kernel_mzi_mesh"
n_modes = 4
n_bright_ports = 3
n_dark_ports = 1
[[pipeline.stages]]
id = "chip_opt"
type = "postprocess.chip_optimization"
algorithm = "broadband_bfgs"
backend = "torch"
max_iter = 500
[[pipeline.stages]]
id = "fiber_to_detector"
type = "transport.smf28"
length_m = 5.0
[[pipeline.stages]]
id = "fringe_tracking"
type = "postprocess.fringe_tracking"
[[pipeline.stages]]
id = "detector_bright"
type = "detector.snspd"
ports = "bright"
[[pipeline.stages]]
id = "detector_dark"
type = "detector.mkid"
ports = "dark"
max_count_rate_hz = 50_000
[[pipeline.stages]]
id = "performance"
type = "postprocess.performance_metrics"
# ─── Sweeps (first-class) ─────────────────────────────────────
[[sweep]]
param = "scene.star.jmag"
linspace = { start = 4, stop = 12, num = 9 }
mode = "grid"
[[sweep]]
param = "pipeline.stages.chip.n_modes"
values = [3, 4, 5, 6, 8]
mode = "grid"
# ─── Outputs ──────────────────────────────────────────────────
[output]
dir = "results/{run.name}/{timestamp}" # human-readable; {run_hash}/{config_hash} also supported
plots = ["detection_space", "contrast_curve", "snr_vs_jmag",
"throughput_breakdown", "null_depth_vs_nmodes", "uv_coverage"]
tables = ["target_detectability", "throughput_budget", "performance_summary"]
formats.plots = ["pdf", "png"]
formats.tables = ["csv", "json", "parquet"]
sweep_table = "parquet" # tidy long-format sweep results
# ─── Catalog snapshot (versioned) ─────────────────────────────
[catalog.exoplanets]
source = "nasa_exoplanet_archive"
query = "default"
snapshot = "2026-05-01" # resolves to data/catalogs/<snapshot>.parquet
classification = "bins_v2" # named binning rule for Rocky/SE/Neptune/GG
# ─── Cache ────────────────────────────────────────────────────
[cache]
enabled = true
dir = ".nullsim_cache"
version = 1

Notes on the schema:
- Stage instances are addressed by `id` (`chip`, `fiber_to_chip`, `detector_bright`). The `type` field selects the implementation. Multiple instances of the same type are allowed (two `transport.smf28` instances, two detector instances on different port groups).
- Sweep params reference instance IDs (`pipeline.stages.chip.n_modes`) or scene/observation paths.
- The example shows the multi-instance shape that KILO actually needs (separate fibers before and after the chip, separate detectors on bright vs. dark ports), which the previous single-stage-per-family schema couldn't express.
[site], [telescope], [scene], and [observation] accept preset = "name" which loads from the built-in TOML database, then deep-merges any inline fields the user provides. The user can override individual fields without copy-pasting a full preset. Pipeline stage lists do not use presets — they are explicit by design — but a stage's params table can reference a preset for stage-internal defaults.
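The preset-then-override behavior amounts to a recursive dictionary merge. The sketch below assumes that tables merge key by key and that any non-table value in the user's config replaces the preset value outright; the preset contents are made-up numbers for illustration only.

```python
from copy import deepcopy


def deep_merge(preset: dict, overrides: dict) -> dict:
    """Return a new dict: preset values with overrides applied recursively."""
    merged = deepcopy(preset)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested tables
        else:
            merged[key] = deepcopy(value)                 # scalars and lists replace
    return merged


# Hypothetical preset contents, not values shipped with nullsim.
example_preset = {"elevation_m": 4000, "seeing": {"r0_cm": 15.0, "tau0_ms": 3.0}}

# What the user wrote under [site]: a preset name plus one inline override.
user_site = {"preset": "example", "seeing": {"r0_cm": 20.0}}

inline_fields = {k: v for k, v in user_site.items() if k != "preset"}
site = deep_merge(example_preset, inline_fields)
```

Only `seeing.r0_cm` changes in `site`; `seeing.tau0_ms` and `elevation_m` are inherited, and `example_preset` itself is left untouched.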
Each stage class ships a pydantic v2 Params model. The loader:
- Parses the TOML.
- Merges presets for `[site]`, `[telescope]`, `[scene]`, `[observation]`.
- Validates each `[[pipeline.stages]]` entry: the `type` resolves through the registry, and the remaining fields are validated against that type's `Params` model. Unknown keys are errors. Type mismatches surface as path-aware errors (`pipeline.stages[chip].n_modes: expected int, got "four"`).
- Validates that `[catalog.*]` snapshots resolve to on-disk files (or are fetchable).
- Expands sweeps into a cross product of resolved configs.
- Calls `validate_dependencies()` on the resolved pipeline (every `consumes` key matches some earlier instance's `produces`).
nullsim validate config.toml runs the full validation pipeline without executing anything.
Each [[sweep]] table declares one parameter axis. Values can be given as values = [...], range = {start, stop, step}, linspace = {start, stop, num}, or logspace = {start, stop, num}. Mode is grid (default) or zip.
- All `grid`-mode sweeps form a Cartesian product.
- All `zip`-mode sweeps iterate in parallel; their lengths must agree, and together they contribute one axis (the zipped tuple) to the cross product.
Output products are indexed by a tuple of sweep coordinates. Plot functions decide which axes to render against. Sweep cells are independent; the runner parallelizes via concurrent.futures with --workers N.
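The grid/zip rules can be sketched with `itertools.product`. The function below is illustrative only: it assumes each sweep has already been reduced to an explicit `values` list (so `range`, `linspace`, and `logspace` are resolved beforehand), and the zip-mode parameter paths in the example are hypothetical.

```python
from itertools import product


def expand_sweeps(sweeps: list[dict]) -> list[dict]:
    """Expand sweep axes into a list of {param: value} sweep cells."""
    axes = []    # each axis is a list of tuples of (param, value) pairs
    zipped = []
    for sweep in sweeps:
        mode = sweep.get("mode", "grid")
        if mode == "grid":
            axes.append([((sweep["param"], v),) for v in sweep["values"]])
        elif mode == "zip":
            zipped.append(sweep)
        else:
            raise ValueError(f"unknown sweep mode {mode!r}")
    if zipped:
        lengths = {len(s["values"]) for s in zipped}
        if len(lengths) != 1:
            raise ValueError(f"zip-mode sweeps disagree on length: {sorted(lengths)}")
        n = lengths.pop()
        # All zip sweeps advance together and form ONE extra axis.
        axes.append([tuple((s["param"], s["values"][i]) for s in zipped) for i in range(n)])
    return [
        {param: value for axis_item in combo for param, value in axis_item}
        for combo in product(*axes)
    ]


grid_cells = expand_sweeps([
    {"param": "scene.star.jmag", "values": [4, 8, 12], "mode": "grid"},
    {"param": "pipeline.stages.chip.n_modes", "values": [3, 4]},  # mode defaults to grid
])

mixed_cells = expand_sweeps([
    {"param": "scene.star.jmag", "values": [4, 8, 12]},
    {"param": "pipeline.stages.fiber_to_chip.length_m", "values": [10.0, 30.0], "mode": "zip"},
    {"param": "pipeline.stages.delay_lines.length_m", "values": [20.0, 50.0], "mode": "zip"},
])
```

Both calls yield six cells: 3 × 2 for the pure grid, and 3 × 2 again for the mixed case because the two zip sweeps share a single axis of length two.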
Every run writes alongside its outputs:
- `config.toml`: the original.
- `config.resolved.toml`: presets expanded, defaults filled in.
- `manifest.json`: `config_hash` and `run_hash`, package version, git commit (when in a repo), Python and dependency versions, RNG seeds, host info, per-instance timings, cache hit/miss summary, and per-instance `external_dependencies` (source + retrieval date + content hash for each).
When an output directory is named by hash, the output.dir template should use {run_hash}, not {config_hash}. That guarantees re-running with the same TOML but a different package version, dependency stack, or catalog snapshot writes to a different directory and does not silently overwrite earlier results. The default timestamped template avoids overwrites in the same way. Both hashes are recorded in the manifest so you can group runs by either dimension.
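The `output.dir` template variables can be expanded with plain `str.format`, which resolves `{run.name}` through attribute access. The function name and the example hash strings below are hypothetical; the timestamp layout follows the `YYYYMMDD-HHMMSS` form shown in the architecture sketch.

```python
from datetime import datetime
from types import SimpleNamespace


def render_output_dir(
    template: str, run_name: str, run_hash: str, config_hash: str, now: datetime
) -> str:
    """Expand an output.dir template such as 'results/{run.name}/{timestamp}'."""
    return template.format(
        run=SimpleNamespace(name=run_name),       # enables the {run.name} lookup
        timestamp=now.strftime("%Y%m%d-%H%M%S"),
        run_hash=run_hash,
        config_hash=config_hash,
    )


now = datetime(2026, 5, 15, 9, 30, 0)

by_time = render_output_dir(
    "results/{run.name}/{timestamp}", "kilo_demo", "ab12cd", "ef34", now
)
# by_time == "results/kilo_demo/20260515-093000"

by_hash = render_output_dir(
    "results/{run.name}/{run_hash}", "kilo_demo", "ab12cd", "ef34", now
)
# by_hash == "results/kilo_demo/ab12cd"
```

Unused variables are simply ignored by `str.format`, so the same call serves either template.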
The catalog is the single source of named outputs. Each entry is a registered function (RunCollection, RunContext) -> Figure or ... -> DataFrame with declared input requirements that are validated at config-load time. A request for uv_coverage when no stage produces baselines fails before the run starts, not in the plotting step.
Plots (tier 1):
| Name | What it shows | Sweep-aware |
|---|---|---|
| `detection_space` | Headline figure. Planet/star contrast vs. angular separation, one detection-limit curve per (instrument config × J-mag × integration time), filled region above each curve = detectable. Overlays the known exoplanet population: color = planet type (Rocky, Super-Earth, Neptune-like, Gas Giant), marker shape = discovery method (RV, Transit, Imaging, Other), optional text labels for notable systems. Stellar diameter assumption in the caption. Consumes contrast curves from one or more sweep cells, plus a versioned exoplanet catalog snapshot declared in `[catalog.exoplanets]` (source, query, retrieval date, snapshot hash, classification rule). The catalog snapshot hash enters `run_hash` so a catalog refresh produces a new output directory and a new manifest entry rather than silently shifting the points. | yes |
| `contrast_curve` | Achievable contrast vs. angular separation, one curve or a small set. | optional |
| `contrast_vs_<param>` | Contrast at fixed separation as a function of any swept parameter. | yes |
| `snr_vs_jmag` | SNR (or integration time to 5σ) vs. stellar J-magnitude. | yes |
| `null_depth_spectrum` | Instantaneous and calibrated null depth vs. wavelength. | no |
| `throughput_breakdown` | Stacked-area transmission vs. wavelength, decomposed by named component. | no |
| `opd_budget_bar` | Bar chart of OPD RMS contributions, named, stacked by reduction stage (open-loop → AO → FT). | no |
| `photon_rate_budget` | Per-port photon rates broken down by source (star, sky, dark). | no |
| `uv_coverage` | (u, v) coverage for the array and observation. | no |
| `pupil_layout` | Sub-aperture geometry on the pupil. | no |
| `chip_phase_matrix` | Optimized MZI phase settings (heatmap). | no |
| `chip_optimization_history` | Optimizer loss vs. iteration. | no |
| `calibration_time_vs_null` | Calibration time required to reach a target null depth. | yes |
| `target_detectability_map` | Sky scatter of exoplanet targets colored by achievable SNR. | no |
Tables (tier 1):
| Name | Contents |
|---|---|
| `throughput_budget` | Per-component transmission contributions, per wavelength bin. |
| `opd_budget` | Named OPD RMS contributions and their reduction stages. |
| `photon_rate_budget` | Per-port photon rates, star / sky / dark breakdown. |
| `target_detectability` | Per-target: RA, Dec, magnitude, planet contrast, achieved SNR, integration time to 5σ. |
| `chip_parameters` | Final MZI phase and amplitude settings for each cell. |
| `performance_summary` | Top-level scalar metrics: best null depth, contrast at 1λ/D, calibration time. |
| `run_manifest` | Tabular form of the JSON manifest, for paper inclusion. |
Plots default to PDF and PNG; tables default to CSV and JSON.
Sweeps are first-class, so sweep-aggregated output is a first-class concern, not a v2 extension:
- `sweep_results.parquet`: a tidy long-format table with one row per (sweep cell × output metric × wavelength bin, where applicable). Columns: every sweep coordinate, the metric name, the value, plus a `cell_run_hash` linking back to the per-cell directory. This is the file figures load from; it makes ad-hoc analysis in pandas trivial.
- `sweep_manifest.json`: collection-level metadata, including the sweep axes, mode (`grid`/`zip`), cell count, cell `run_hash` list, total wall time, and cache hit rate.
- Per-cell directories still exist under `cells/<cell_run_hash>/` for runs where you need per-cell plots or diagnostics, but the standard outputs never need to walk them.
Tables also write to CSV/JSON by default; Parquet is enabled via formats.tables = ["csv", "parquet"] or by setting sweep_table = "parquet". For studies with 10³+ cells this is the difference between a usable artifact and a directory tree the OS struggles to list.
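The tidy long-format layout is easiest to see on a toy example. The sketch below keeps to the standard library, so CSV stands in for Parquet; the input structure (`coords`, `metrics`, `run_hash` per cell) is an assumption, and the numbers are placeholders rather than simulation output.

```python
import csv
import io


def tidy_rows(cells: list[dict]) -> list[dict]:
    """Flatten per-cell metrics into one row per (sweep cell, metric)."""
    rows = []
    for cell in cells:
        for metric, value in cell["metrics"].items():
            rows.append({
                **cell["coords"],                   # one column per sweep coordinate
                "metric": metric,
                "value": value,
                "cell_run_hash": cell["run_hash"],  # link back to cells/<hash>/
            })
    return rows


# Placeholder cells: two J-mag values, two metrics each.
cells = [
    {"coords": {"scene.star.jmag": 4.0}, "run_hash": "aaa111",
     "metrics": {"null_depth": 1e-4, "snr": 10.0}},
    {"coords": {"scene.star.jmag": 8.0}, "run_hash": "bbb222",
     "metrics": {"null_depth": 1e-4, "snr": 2.0}},
]

rows = tidy_rows(cells)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
header = buffer.getvalue().splitlines()[0]

# Long format makes "one metric across the sweep" a simple filter.
snr_by_jmag = {r["scene.star.jmag"]: r["value"] for r in rows if r["metric"] == "snr"}
```

Adding a new metric adds rows, not columns, so the schema stays fixed as the catalog of outputs grows.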
The detection-space plot is the headline output and drives two design choices:
- Plot functions take a `RunCollection`, not a single `RunResult`. Detection space consumes contrast curves from many sweep cells (J-mag axis) at once.
- Cross-config comparison needs a thin study layer. Within one TOML, sweeps cover one config × many param values. Comparing across configs (KILO vs. HWO vs. NAYRA) is the next level up.
A study.toml references multiple run configs by path and tells the output system which runs to overlay:
[study]
name = "kilo_vs_hwo_paper_fig"
runs = [
{ config = "kilo_maunakea.toml", label = "KILO Maunakea" },
{ config = "hwo_space.toml", label = "HWO baseline" },
]
[output]
plots = ["detection_space", "contrast_curve_overlay"]
formats.plots = ["pdf", "png"]

nullsim study run study.toml executes each referenced config (cache-aware), then dispatches outputs against the combined RunCollection. No new pipeline machinery is required; the study layer is a thin multiplexer.
Two paths, intentionally redundant:
In-tree (one-off custom code). Write a Python module that exposes register(registry). List it under [run] extensions = [...]. The loader imports it and calls its register() function. Lowest barrier; right for a paper-specific custom stage.
# my_thermal_drift.py
from nullsim.pipeline import Stage, register_stage

@register_stage
class ThermalDrift(Stage):
    type_name = "postprocess.thermal_drift"  # registry key, family-prefixed like "transport.smf28"
    family = "postprocess"
    consumes = frozenset({"opd_budget", "chip.diagnostics"})
    produces = frozenset({"opd_budget.thermal"})
    ...

Out-of-tree (installable plugin). Declare a nullsim.stages (or .plots or .tables) entry point in your package's pyproject.toml. The registry discovers it at import time. Right for code that more than one project depends on.
# external pyproject.toml
[project.entry-points."nullsim.stages"]
my_custom_chip = "my_pkg.stages:MyChip"

Both paths produce stages, plots, and tables that are indistinguishable from built-ins to the runner: same schema validation, same caching, same diagnostics. The CLI nullsim list-stages and nullsim list-outputs show built-ins and user-registered entries in one list.
nullsim run config.toml [--out DIR] [--workers N] [--no-cache] [--dry-run]
nullsim study run study.toml [--workers N] [--no-cache]
nullsim validate config.toml # schema + dependency validation, no execution
nullsim list-stages [--family chip] # show registered stages
nullsim list-outputs # show registered plots & tables
nullsim inspect config.toml # resolved config + dependency visualization
nullsim cache info|clear [--config config.toml]
nullsim run is the main entry point. nullsim inspect is the discoverability tool: it prints the resolved config, the stage instance list with consumes/produces annotations as ASCII art (or a Graphviz file with --graphviz), and a table of which outputs each instance feeds. New users learn the system through inspect.
Three layers:
- Physics primitives in nullsim/physics/ get pure unit tests: Kolmogorov spectrum normalization, Clements decomposition invertibility, Ruilier–Cassaing analytical limits.
- Each stage gets a contract test: build a minimal SimulationState containing only what the stage consumes, run the stage, assert that every key in produces appears and that conservation properties hold (throughput ≤ 1, photon rates non-negative, OPD components RMS-stack correctly).
- Example and integration tests run shipped configs far enough to catch wiring mistakes, missing components, and broken output contracts. KILO/NAYRA-derived formulas still get local unit or stage tests when their physical assumptions are load-bearing, but CI does not compare nullsim against frozen external scalar outputs.
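A stage contract test of the kind described in the second layer might look like the sketch below. ToyState and ToyFiberLoss are hypothetical stand-ins (the real SimulationState is typed and richer); the point is the shape of the check: feed only the consumed keys, then assert the produced keys and the conservation properties.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical minimal state: a flat dict of named sub-budgets."""
    data: dict = field(default_factory=dict)


class ToyFiberLoss:
    """Illustrative stage: applies a fixed transmission and adds an OPD term."""
    consumes = frozenset({"throughput", "opd_nm"})
    produces = frozenset({"throughput", "opd_nm", "photon_rate_hz"})

    def apply(self, state: ToyState) -> ToyState:
        out = dict(state.data)
        out["throughput"] = state.data["throughput"] * 0.9
        out["opd_nm"] = state.data["opd_nm"] + [30.0]  # append a new OPD component
        out["photon_rate_hz"] = 1.0e6 * out["throughput"]
        return ToyState(out)


def rms_stack(components: list[float]) -> float:
    """Quadrature (root-sum-square) combination of independent OPD terms."""
    return math.sqrt(sum(c * c for c in components))


def check_contract(stage, state: ToyState) -> ToyState:
    # Feed the stage only the keys it declares in `consumes`.
    minimal = ToyState({k: state.data[k] for k in stage.consumes})
    result = stage.apply(minimal)
    assert stage.produces <= set(result.data)       # every produced key appears
    assert 0.0 <= result.data["throughput"] <= 1.0  # throughput never exceeds 1
    assert result.data["photon_rate_hz"] >= 0.0     # rates are non-negative
    return result


result = check_contract(
    ToyFiberLoss(), ToyState({"throughput": 1.0, "opd_nm": [40.0], "extra": 1})
)
print(rms_stack(result.data["opd_nm"]))  # 40 nm and 30 nm stack to 50.0 nm
```

Stripping the input down to the consumed keys is what makes the test a contract check: a stage that silently reads an undeclared key fails with a KeyError instead of passing by accident.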
A future nullsim/validation/ package could wrap HCIPy for ground-truth checks of injection coupling and atmospheric OPD stages. None ship today; HCIPy is intentionally absent from the dependency surface until a real cross-check is written.
Four example configs ship with the package and anchor the design. They double as executable examples for end-to-end config checks.
| File | What it reproduces |
|---|---|
| kilo_maunakea.toml | Four-telescope Maunakea KILO-style J-mag sweep. |
| nayra_eelt.toml | Single-pupil E-ELT NAYRA-style study. |
| hwo_space.toml | HWO-class space nuller. No atmosphere stage, formation-flying delay budget instead of AO + atmosphere. MKID detector. |
| chip_only_lab.toml | Bare chip + detector. No telescope, no atmosphere. Lab characterization mode. |
A fifth file, examples/kilo_vs_hwo_study.toml, is a study-layer config that overlays the KILO and HWO configs in one detection-space figure.
Python target: requires-python = ">=3.12". tomllib is stdlib from 3.11+; no third-party TOML parser needed. Dev environment for this project is conda py313; the package itself does not require 3.13.
Core dependencies (required for pip install nullsim):
- numpy, scipy, matplotlib, astropy, pandas, pyarrow (Parquet), pydantic >= 2
- tqdm (sweep progress)
Optional dependencies (extras):
- torch: chip optimizer fast path on CPU/CUDA
- pyvo: live NASA Exoplanet Archive TAP queries (ships with cached CSV fallback)
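One common way to keep extras truly optional is to import them lazily and fail with a message that names the extra. The sketch below illustrates that pattern; optional_import is a hypothetical helper, and the extra names in the message are assumed to match the package's pyproject.toml.

```python
import importlib
import importlib.util


def optional_import(module_name: str, extra: str):
    """Import an optional dependency, or raise an error naming the extra.

    Hypothetical helper; `extra` is assumed to be a nullsim extras name.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install 'nullsim[{extra}]'"
        )
    return importlib.import_module(module_name)
```

A backend would call optional_import("torch", "torch") inside the function that needs it, so that a plain import of the package never touches the heavy dependency.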
kilo and nayra are not runtime dependencies: nullsim does not import kilo at runtime, and pip install nullsim does not pull them in. They are, however, useful source references for physical models, kept in separate repositories outside this tree. New nullsim/physics/ and stage code may be a clean rewrite to fit pipeline conventions, but every hard-won physical lesson should land in nullsim as a local test, not as folklore.
These are not blocking the design but want answers before the implementation plan is final.
- Site and telescope preset ownership. The built-in presets ship with the package. Should new preset additions (e.g. a new VLT configuration) live in the user's project or upstream in nullsim/sites/? My recommendation: presets that correspond to real, published instruments ship upstream; experimental presets live in user projects.
- Polarization Jones tracking. The field sub-budget has an n_pol axis. NAYRA's polarization.py is currently stub-level. Should Jones-matrix propagation be a first-class stage family (polarization), or does it live inside transport/chip stages? My recommendation: a polarization family that defaults to a scalar no-op stage; dual-pol instruments swap in a jones_chain stage.
- Time-domain output mode. Stochastic stages return distributions today. If multiple stages are stochastic and the user wants a true time series of detector counts, that is a separate output type the catalog does not yet describe. Defer to a v2 design pass.
- Catalog snapshot distribution. Exoplanet catalog snapshots are versioned and hashed, but the distribution channel is open: ship a frozen snapshot with the package and refresh on release, or fetch on first use and cache locally? My recommendation: ship a frozen snapshot (last 30 days before release), allow nullsim catalog refresh to fetch a fresher one explicitly. Avoids first-run network dependency for reproducibility.
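Whichever distribution channel is chosen, the hashing side can stay simple. The sketch below shows one way to hash a snapshot file and check it against a pinned value; snapshot_sha256 and verify_snapshot are hypothetical names, not the functions that ship in nullsim/catalogs/.

```python
import hashlib
from pathlib import Path


def snapshot_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(path: Path, pinned_hash: str) -> None:
    """Raise ValueError if the file bytes do not match the pinned hash."""
    actual = snapshot_sha256(path)
    if actual != pinned_hash.lower():
        raise ValueError(
            f"catalog snapshot hash mismatch: expected {pinned_hash}, got {actual}"
        )
```

Verifying the pinned hash against the file bytes on every load, rather than trusting the pin, is what keeps an edited snapshot from silently shifting points on a figure.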
Out of scope for this design doc, but to anchor expectations. The first milestone is a vertical slice that exercises the architecture end-to-end before any real physics gets ported — that way the interfaces get pressure-tested while they are still cheap to change.
- Vertical slice (architecture validation). Build the minimum pieces of every layer at once:
  - nullsim/pipeline/ skeleton: stage protocol, SimulationState with scene/geometry/field/throughput/photon_rates/results, runner, conservative cache, registry, RNG derivation.
  - nullsim/config/ skeleton: TOML loader, pydantic schemas, sweep expansion, preset merge.
  - 4–5 trivial stages: scene.point_source, telescope.fixed_aperture, injection.ideal, chip.identity, detector.ideal_counter.
  - One plot (contrast_curve) and one table (performance_summary).
  - Manifest with config_hash and run_hash.
  - End-to-end test: a 5-line vertical_slice.toml runs, writes outputs, cache hits on re-run, sweep over scene.star.jmag produces a sweep_results.parquet.
- Build nullsim/physics/ and nullsim/optimization/ guided by the KILO and NAYRA source trees. The math may be rewritten cleanly to fit nullsim conventions (typed signatures, no global state, explicit units, dependency injection at boundaries), but the hard-won lessons must come along. For each target module (kolmogorov, ruilier_cassaing, clements, jones, photometry, fresnel, chip optimizer backends), the workflow is: (a) read the KILO/NAYRA implementation in full, including comments and any # fixme / # note markers, and write a short notes file capturing the non-obvious invariants (sign conventions, dtype choices, edge cases, ordering constraints); (b) lift the corresponding KILO/NAYRA tests into tests/physics/ first so those invariants are pinned before any rewrite; (c) write the new nullsim implementation, refactoring naming and signatures, and verify every lifted test still passes; (d) add new tests for any behavior the original lacked coverage for. A clean rewrite that drops a debugged behavior because "the paper doesn't say so" is the failure mode this guards against.

  Step 2 first scoped pass (2026-05-15): three modules shipped:
  - nullsim/physics/kolmogorov.py: Kolmogorov phase PSD + structure function + r0 wavelength scaling, cross-checked between KILO and NAYRA atmosphere implementations.
  - nullsim/physics/photometry.py: J-band Vega zero-point + magnitude→photon-rate per Cohen+ 2003.
  - nullsim/physics/ruilier_cassaing.py: single-mode-fiber coupling efficiency vs Strehl with optional central obscuration, Ruilier 1999 analytical limit pinned at η ≈ 0.8145 for matched circular aperture.

  Each module ships with a notes file in tests/physics/notes/<module>.md capturing the cross-check decisions and adopted constants. Remaining step-2 modules (clements, jones, fresnel, chip optimizer backends) ship in subsequent passes.
-
Port the real stages: atmosphere (Kolmogorov), AO (pyramid WFS), injection (Ruilier–Cassaing), transport (SMF28, PM980), delay lines, chip (kernel MZI mesh), chip_optimization, detectors (MKID, SNSPD), fringe tracking, calibration, performance. Each stage wraps the already-built nullsim/physics/ primitives. Where KILO's structure crosses what is now a stage boundary (e.g. a single KILO function that does atmosphere + AO), split along the boundary but verify behavior end-to-end against the corresponding KILO run.

  Step 3 first scoped pass (2026-05-16): two stages shipped:
  - nullsim/stages/atmosphere/kolmogorov_screens.py: KolmogorovScreens wraps the Kolmogorov physics primitive; writes OPDBudget["atmos_piston"] and a typed Wavefront.r0_at_500nm_m channel.
  - nullsim/stages/ao/pyramid_wfs.py: PyramidWFS sums KILO's three residual terms (fitting, temporal, WFS-noise) in quadrature; marks the atmosphere contribution as superseded via OPDComponent.superseded.

  KILO numeric references pinned at rel=0.01 in the stage tests. Remaining step-3 stages (injection real, transport SMF28/PM980, delay lines, chip kernel MZI mesh, chip_optimization, detectors MKID/SNSPD, fringe tracking, calibration, performance) ship in subsequent passes.

  Step 3 second scoped pass (2026-05-16): three stages shipped:
  - nullsim/stages/injection/single_mode_fiber.py: SingleModeFiber wraps the Ruilier–Cassaing physics primitive; multiplies the field amplitude by √coupling and writes a per-wavelength injection.smf_coupling throughput component.
  - nullsim/stages/transport/smf28.py: SMF28 applies fiber loss + chromatic dispersion phase + per-telescope common-mode phase noise; writes a transport.smf28 throughput component and a transport.smf28_phase OPD component.
  - nullsim/stages/delay_lines/geometric.py: GeometricDelayLines applies pointing-dependent path-equalization delays; writes a delay_lines.geometric throughput component and a delay_lines.geometric_residual OPD component.

  All three components adopt the canonical Throughput.components payload shape {wavelength_grid_um, transmission_fraction, +provenance}, and Scene gains typed altitude_deg / azimuth_deg fields replacing the prior string-keyed components lookup. Polarization-aware variants (PM980 + Jones-fiber) deferred to step 3 third pass alongside chip work.

  Step 3 third scoped pass (2026-05-16): chip stage + optimizer backends shipped:
  - nullsim/stages/chip/kernel_mzi_mesh.py replaces the chip.identity placeholder: KernelMZIMesh wraps the Clements physics primitive; applies a per-wavelength M×M unitary to the field tensor on the n_modes axis with theta unscaled and phi/alpha rescaled by wl_design/wl_target; DAC-quantization applied BEFORE wavelength rescaling; writes a unit-transmission chip.kernel_mzi_mesh_loss throughput component since unitaries preserve power.
  - nullsim/stages/chip_optimization/{scipy,torch,mlx}.py add three optimizer-backend stages: scipy uses L-BFGS-B + optional SLSQP polish; torch uses lazy-import + real/imag-split autograd with TF32 disabled per KILO's hard-won basin-flip lesson; mlx uses Apple Metal float32 + same real/imag trick.

  Optimizers write optimal_chip_params into state.results.components rather than threading chip params through SimulationState (Pass 3 design decision per Simplifier YAGNI verdict: state.py is locked, optimal params are read out by the user and baked into a follow-up TOML). Optional dependency extras [torch] and [mlx] added to pyproject.toml. KILO pinning tests TestOptimizeNull::test_deep_null_small_system and TestClementsBackendsAgree lifted as xfail-strict tripwires; nullsim-native cross-backend agreement tests run at 1% relative tolerance. Polarization-aware variants (DualPolField + Jones-fiber + PM980) deferred to step 3 fourth pass alongside polarization-aware injection/transport.

  Step 3 fourth scoped pass (2026-05-16): detector + post-detection stages shipped:
  - nullsim/stages/detector/mkid.py: MKIDDetector wraps nullsim/physics/photometry.py to convert chip-output |field|² × throughput into per-port photon counts; per-pixel saturation cap default 50 kHz KILO / 100 kHz NAYRA; no MKID dark counts.
  - nullsim/stages/detector/snspd.py: SNSPDDetector adds 10 Hz dark rate and 1e8 Hz fixed cap.
  - nullsim/stages/fringe_tracking/closed_loop.py: ClosedLoopFringeTracker quadrature-sums Shao-Colavita 1992 shot-noise + Conan 1995 servo-lag piston RMS; emits a fringe_tracking.closed_loop_residual OPDComponent and marks the upstream atmos_piston as superseded via the Pass 1 INS-8-001 pattern; KILO predictive_lag_reduction=1.0 default vs NAYRA Kalman 0.05 captured as a Param.
  - nullsim/stages/calibration/floor_lookup.py: CalibrationFloorLookup applies a default ~1e-6 systematic floor or interpolates a pre-computed JSON lookup table; clamps null_depth_calibrated = max(raw - floor, 0) since physical nulls cannot go negative.
  - nullsim/stages/performance/standard.py: StandardPerformance terminal stage; SNR = S / √(S + B + D + Sky + systematic) with the systematic term eps²B²·t/(2·f_servo) when fringe tracking is in the pipeline or (eps·B)² without (KILO signal_to_noise lines 204-292); reads port rates once from the detector component to honour KILO's detected_stellar_rate_at_telescope consolidation lesson.

  446 passed + 3 xfailed (was 367 + 3; +79 tests). Polarization-aware variants and per-arm fringe-tracking diagnostics deferred to a later pass.
-
Build out the standardized output catalog including detection_space and the catalog-versioning machinery.

  Step 4 first scoped sub-pass (2026-05-16): headline detection_space plot + catalog-versioning machinery shipped:
  - nullsim/outputs/plots/detection_space.py: contrast-vs-separation curves with detectable-region fill, per-(config×J-mag×integration-time) cell, type/method-coded exoplanet scatter overlay, footer caption with stellar diameter + catalog snapshot SHA-256 + retrieval date.
  - nullsim/catalogs/exoplanets.py: ExoplanetCatalogConfig, Exoplanet pydantic models, load_catalog for IPAC/NASA-Exoplanet-Archive CSV, classify_planet versioned rule with Rocky/Super-Earth/Neptune-like/Gas-Giant thresholds, compute_snapshot_hash for SHA-256 reproducibility.
  - nullsim/config/catalog.py: TOML schema for [catalog.exoplanets].

  nullsim/cli.py _run_hash_with_externals now folds the catalog content hash into run_hash, so a catalog refresh produces a NEW output directory rather than silently shifting points on the headline figure. 479 passed + 3 xfailed (was 446 + 3; +33 tests, including 5-planet fixture catalog covering all four planet types and discovery methods). Remaining sub-passes covered supporting plots, supporting tables, and the study-layer multi-config multiplexer.
-
Ship kilo_maunakea.toml and functional tests for the KILO-style sensitivity example.

  Step 5 first scoped pass (2026-05-16): examples/kilo_maunakea.toml shipped as a 4-telescope KILO-style Maunakea config with a J-band sweep. Early development used external scalar references; those reference-comparison tests were later removed once nullsim's own physics/stage contracts became the active test surface.
-
Ship nayra_eelt.toml and functional tests for the NAYRA-style E-ELT example.

  Step 6 first scoped pass (2026-05-16): examples/nayra_eelt.toml shipped as a single-aperture 39 m E-ELT config with no delay-line stage, MKID cap 100 kHz, predictive_lag_reduction=0.05 for the Kalman/LQG fringe tracker, and a J-mag sweep.
-
Ship
hwo_space.tomlas the first config that did not exist before the new package.Step 7 first scoped pass (2026-05-16): HWO-class space nuller config shipped —
examples/hwo_space.toml(6 m monolithic primary at Y-band λ=1.0 µm, 1-day integration, M-dwarf-like host at 10 pc, planet contrast=1e-10 at 100 mas, exozodi 3× solar, pipeline strips atmosphere/AO/fringe-tracking/delay-lines, calibration floor 10⁻¹⁰),tests/examples/test_hwo_space_sanity.py(sanity asserts include end-to-end pipeline run, design-floor pin at 10⁻¹⁰, no atmosphere/AO/FT components in the OPD budget, contrast-sweep monotonicity, detection_space PNG smoke), andexamples/kilo_vs_hwo_study.toml(cross-config study-layer placeholder per SPEC §8.3 — runner deferred per task #36 sub-pass 36d, but the TOML schema is in place).HWO photon-baseline and finite-bandwidth warm-start pass (2026-05-21):
examples/hwo_space.tomlnow models an 8 m unobstructed HWO-class primary at 0.8 µm with a 10% band, 24 pupil modes, six bright/dump ports, 18 dark ports, all-sky observability, dual-pol PBS/PM-fiber transport, and an ideal order-3 kernel projector for the photon-noise upper bound.nullsim/stages/chip_optimization/ideal_kernel.pyadds thechip_optimization.ideal_kernelstage, which routes the stellar Taylor subspace into bright ports and can either publish an achromaticoptimal_u_stackor decompose the center-wavelength unitary into a physical Clements seed.chip_optimization.torchcan now warm-start from an upstreamchip_optimizationcomponent viawarm_start_source, enabling physical Clements refinement from the ideal HWO kernel instead of cold-starting the 576-parameter mesh.Shared chip-chromaticity pass (2026-05-21):
chip.kernel_mzi_mesh,chip_optimization.scipy, andchip_optimization.torchgained opt-in parametric chromaticity fields. Defaultchip_chromaticity_model = "legacy"preserves KILO/NAYRA behavior: theta is wavelength-flat and phi/alpha scale aswl_design / wl. Settingchip_chromaticity_model = "parametric"enables additive directional-coupler theta drift (theta_chromaticity_*) and a selectable phase model, includingphase_chromaticity_model = "fixed_opd_neff"with first-order effective-index dispersion fromphase_neff_refandphase_group_index.HWO fixed-OPD deployment pass (2026-05-21):
examples/hwo_space.tomlnow deploys the ideal order-3 kernel seed as a signed-real physical Clements decomposition (deployment = "physical_real_clements") and propagates it throughchip.kernel_mzi_meshwithchip_chromaticity_model = "parametric"andphase_chromaticity_model = "fixed_opd_neff". This keeps the default HWO example on a physical mesh while avoiding internal pi phase flips that would otherwise dominate finite-bandwidth fixed-OPD leakage for the real-valued HWO kernel projector.HWO physics-path cleanup pass (2026-05-21): the HWO pupil surrogate now preserves the full 8 m unobstructed collecting area and rescales its 24 mode-center coordinates so the maximum surrogate baseline is 8 m. The photon-limited baseline retains the PBS split and per-arm chip deployment but removes the nonideal PM-fiber Jones/PER perturbation; PM780 propagation loss remains.
performance.standardnow prevents a sensitivity-stage ideal null from hiding detector/calibration residual leakage and uses the sensitivity-stage throughput interpolated atscene.planet.separation_masfor headline SNR and integration-time calculations.HWO optical photometry pass (2026-05-21):
nullsim.physics.photometrygained a Cousins I-band Vega zero point and compact I-J color conversion so HWO's 0.8 µm photon budget no longer uses the near-IR Y/J placeholder. Detector stages now expose an opt-inderive_band_mag_from_scene_jmagswitch; KILO/NAYRA defaults remain unchanged, whileexamples/hwo_space.tomlsetsband = "I",scene.star.spectral_type = "M3V", and derives the I-band magnitude from the catalog J magnitude. -
Documentation, CI, packaging.
Step 8 first scoped pass (2026-05-16): closing polish —
README.md(elevator pitch, install table covering base/[torch]/[mlx]/[validation]extras, quick-start table over all six example configs, CLI cheat sheet, architecture summary linking back to nullsim.md and SPEC.md, v0.1 ship list, MIT license note),CONTRIBUTING.md(~140 lines covering branch-PR workflow, build pipeline pointer, pre-commit guards including spec-discipline + librarian backlog ≤5, pytest marker reference, and fragment-changelog → Librarian discipline).pyproject.tomlextended withreadme = "README.md", MIT license, authors/keywords/classifiers,[project.urls]placeholder, and[validation] = ["hcipy>=0.7"]extra for the optional ground-truth atmospheric/injection cross-check..github/workflows/ci.ymlships the test matrix on Ubuntu × Python {3.12, 3.13}; pip-cached viaactions/cache@v4keyed on the pyproject hash. With this pass the SPEC §15 roadmap is complete: vertical slice + physics primitives + four scoped stage passes + detection_space + KILO/NAYRA-derived examples + HWO + docs/CI/packaging.Chip-optimizer KILO tricks pass (2026-05-16):
nullsim/stages/chip_optimization/scipy.pyrestart loop refactored from a fixed 16-restart sequential bank into KILO's leaner pattern (kilo/chip.py L1100-1160).ChipOptimizationParamsgains four new fields:n_restarts(default 5, was hardcoded 16 — matches KILO empirical "deepest basin found in first 1-5 restarts at M=24"),restart_patience(default 3, KILOKILO_BB_PATIENCEanalogue — bail when K consecutive restarts don't improve over running best byrestart_patience_reltol),restart_patience_reltol(default 0.01 = 1%/restart), andn_workers(default 1 — chip-restart ProcessPoolExecutor over BFGS restarts with spawn context, top-level worker_run_one_restart_workerfor pickle safety). Pool mode dispatches all restarts in one shot (no mid-pool early exit, matching KILO's process-pool semantics); serial mode keeps both target-null and patience early exits. SLSQP polish step gets its own local_polish_objclosure now that the restart loop's_objlives inside the worker.examples/kilo_keck.tomlupdated:n_restarts=5,restart_patience=3,n_workers=2→ 5 outer sweep cells × 2 inner chip-restart workers = 10 cores in flight. 605→611 tests passing.Debug-report pass (2026-05-16): new
nullsim/outputs/plots/debug_report.pywrites a multi-pagedebug_report.pdfthat walks the executed pipeline stage-by-stage. Each page covers one stage's contribution to throughput / OPD / wavefront / results sub-budgets (read from thecomponentsdicts on the final state — no runner-level snapshots needed for v1). Cumulative pages at the end show running throughput, OPD quadrature sum, and per-port detector rates. Intent: a single-click sanity check that surfaces every load-bearing quantity the pipeline produces, so silent regressions like the post-figure-port bugs become visible without a deep dive. Added toexamples/kilo_keck.toml[output.plots]so any run automatically renders it. 613→615 tests.Figure-port pass (2026-05-16): sanity-check figure ports from
kilo/studies/paperv3_plots.pyso the headlinedetection_spaceplot isn't the only visual output.nullsim/outputs/styles.pyPUBLICATION_RC replaced with KILO's verbatim_setup_style()+_configure_matplotlib()(cmr10 serif, font sizes 12/14/11, lines.linewidth=1.5, ticks-in, figure 8x6 @ 150 dpi) and auto-applied at import. Six figure ports shipped:nullsim/outputs/plots/subaperture_layout.py(KILO paperv3 Fig 3 — per-telescope panel showing pupil boundary, central obstruction, spider vanes, optimized sub-aperture circles),nullsim/outputs/plots/uv_coverage_grid.py(KILO paperv3 Fig 5 — 4-dec x 3-integration grid of (u,v) tracks with intra-/inter-telescope coloring),nullsim/outputs/plots/chip_diagnostics.py(KILO paperv3 Fig 7 — 4-panel chip diagnostic: per-port stellar power distribution, |U| + arg(U) heatmaps, null vs DAC bit depth, broadband null spectrum; chip-input field recovered via U^H @ output to avoid new diagnostic state), andnullsim/outputs/plots/contrast_curves_multiband.py(KILO paperv3 Fig 10 —1 × n_integration_timespanels of detectable-contrast vs angular-separation, with one curve per band; bands derived from each record'sgrid.wavelength_center_um, times fromobservation.integration_time_s, so the user controls the sweep axes in TOML),nullsim/outputs/plots/loss_budget.py(KILO paperv3 Fig 4 — per-band stacked-dB pre-chip loss bars; scalar adaptation since nullsim has no dual-polarization arms yet — drops KILO's scalar-vs-dual comparison axis, keeps the per-stage decomposition read out ofstate.throughput.components), andnullsim/outputs/plots/calibration_time.py(KILO paperv3 Fig 9, scoped down — chip optimizer convergence curve: null depth vs objective evaluations. Single panel only since KILO's right panel needs photon-noise-aware optimization + wall-clock mapping that nullsim doesn't have yet. Optimizer convergence is captured via a newcapture_convergence_historyflag onChipOptimizationParamsplumbed through the scipy backend). 
New physics primitivenullsim/physics/uv_coverage.pyports KILO'sBaseline+baselines_to_uv+compute_uv_tracks+enumerate_subaperture_baselines. The Fig 5 port reproduceskilo_keck.toml's 276 baselines (132 intra + 144 inter) and the 80-96 m inter-Keck range. Same pass added[run].n_workers+[run].threads_per_worker+nullsim run --workers Nfor ProcessPoolExecutor-based sweep-cell parallelism (KILO chip.py L1239 pattern: spawn context, top-level worker, as_completed streaming). Defaults tomin(n_cells, cpu_count). Five-cellkilo_keck.tomlruns 5x faster end-to-end. 605 passed (was 570; +35 tests). Remaining figure ports — Fig 8 char_mode (deferred until characterization-mode chip lands) — queued. Fig 9 right panel (null vs calibration budget at multiple J-mags) also deferred until photon-noise-aware chip optimization is wired in.Post-§15 hardening pass (2026-05-16): independent review surfaced six correctness bugs that all landed fixes in one pass: (1)
injection.single_mode_fiber+transport.smf28double-counted optical loss (scaled both field amplitude ANDthroughput.transmission_fraction, then the detector multiplied them — squared every detector rate by the loss factor); (2)nullsim/cli.py_handle_runused the pre-materialized resolved config forrun_hashbut the materialized cells[0] hash in the manifest, so output directory and manifest disagreed about which config ran; (3)chip_optimization/torch.py+chip_optimization/mlx.pyproduced Clements params in KILO's column-alternating layout whilechip.kernel_mzi_meshreconstructed via the lexicographicphysics.clementslayout — same params → different unitary → broken optimizer-to-kernel handoff; (4)calibration.floor_lookupread its lookup table at runtime but did not declare it as an external dependency, so an edit to the table reused the cache and the samerun_hash; (5)nullsim/catalogs/exoplanets.py_autocompute_snapshot_hashshort-circuited on pinned hashes without verifying them against file bytes; (6)ChipOptimizationParams.n_dark_portsaccepted 0 and silently returned NaN loss. Same pass addednullsim/physics/pupil_geometry.py(line-for-line port of KILO'skilo.array.optimize_subaperture_positions+kilo.pupilconstraint helpers) soPyramidWFScan derivepiston_subaperture_diameter_mfrom the pupil geometry (n_subapertures_per_aperture+pupil_template) instead of pinning a constant in TOML.Post-figure-port bugfix pass (2026-05-16, commits e4d84f0 + ba5095d): two correctness bugs surfaced after the figure-port pass. (1)
injection.idealconstructed a freshWavefront()without copying upstreamwavefront.components, silently wiping any OPD or amplitude contributions written by atmosphere/AO stages — fixed by passingcomponents=dict(state.wavefront.components)to the constructor. (2)detector.mkidanddetector.snspdcomputednull_depth_rawfrom per-port rates AFTER the saturation cap, QE scaling, and dark-count addition — when both bright and dark MKID ports saturated, the ratio collapsed to the pixel-count ratio (~1e-3) regardless of the actual chip null; fixed by computingnull_depth_rawfromport_rate_hz_per_wavelengthBEFORE saturation/QE/darks, matching KILO'snull_depth()convention (kilo/performance.pyL164). Chip-optimizer KILO port pass (2026-05-16, commits 22fb83a, 06dda42, 8d25657, c28aac8, 64e7d05): four changes landing together complete the chip optimizer's structural alignment with KILO. (1) Thedebug_report.pdfplot (nullsim/outputs/plots/debug_report.py) is now wired intokilo_keck.toml[output.plots]and a follow-up fix (06dda42) corrects the injection-stage throughput payload to the canonical dict shape{wavelength_grid_um, transmission_fraction, +provenance}sodebug_reportandloss_budgetread it without a key error. (2)ChipOptimizationParamsgains anobjectivefield (default"absolute_leakage"); all three backends now minimizesum |U[dark_ports,:] @ field_stack|²— the absolute leakage summed over dark ports and wavelengths — exactly matching KILO's chip optimizer objective (c28aac8). The prior relative-leakage formulation is retained as"relative_leakage"for backward compatibility. (3) The torch backend (8d25657) gains KILO's restart-loop pattern:n_restartsouter restarts driven byrestart_patience/restart_patience_reltolearly-exit, and atorch.compilegate that engages only when CUDA is available or the user opts in — avoids JIT overhead on CPU-only developer runs. 
(4)ChipOptimizationParamsgainskernel_order(int 0/2/3/4, default 0); when non-zero the optimizer targets a kernel-null field stack instead of the on-axis stellar field (_kernel_null_field_stack: direct port of KILOinjection.py:kernel_null_field_stackL226-330, kernel order 2 → G(r)~r⁴, order 4 → G(r)~r⁸).ao.pyramid_wfsnow writesstate.geometry.aperture_positions_pupil_m(M×2, pupil-frame ENU positions derived fromoptimize_subaperture_positions);chip_optimizationreads this whenkernel_order > 0and raises a clear error if absent (64e7d05).Sensitivity correctness pass (2026-05-16, commits 118c5dc, 1fda201): two correctness fixes to
nullsim/stages/sensitivity/detection_curve.pythat together close a ~10000x contrast error at 300 mas. (1)_find_ao_piston_rms_per_sub_mnow prefers the raw AO-piston residual instate.wavefront.components['ao_residual'].piston_opd_nm.total(KILOtransport.realized_null_depth ao_piston_opd_rms_per_subconvention); the fringe tracker corrects only the common inter-telescope piston, so the within-telescope differential piston that drives the chip null floor is the AO-stage output, not the FT residual. Falls back to the closed-loop FT residual instate.opd_budget.components(key prefixfringe_tracking.closed_loop_residual) for space pipelines where no AO stage runs;DetectionCurve.consumesextended to includeopd_budget. (2)_split_apertures_to_modesnow uses the pupil-packed sub-aperture diameter instead of the naive area-equivalent value.Chip-optimizer wavelength subsampling (2026-05-16, commit 910f2de):
ChipOptimizationParamsgainsn_wl_optimize(int, default 0 = full grid, back-compat). KILO's broadband null factory uses 5 wavelengths (natural_design.pyL611,n_wl_broadband=5) while nullsim's detector grid typically has 20+ bins; running the optimizer on the full grid imposes 4× more per-iteration cost and the broadband basin converges to a shallower minimum because the optimizer must satisfy more constraints simultaneously. Whenn_wl_optimize > 0, the torch backend subsamples the input field array at that many uniformly-spaced wavelength indices for the BFGS/restart loop only;chip.kernel_mzi_meshthen applies the optimized unitary at every bin of the original grid.examples/kilo_keck.tomlsetsn_wl_optimize=5to match KILO'skeck_only_config, and also bumpsmax_iter=15000/n_restarts=10/restart_patience=4to allow deeper convergence.Chip-optimizer broadband_refine fast-path + scene wiring pass (2026-05-16, commits 70b5068, a91c6c2, c181979, bba89f8): three correctness fixes that together push kilo_keck chip_ideal from ~3–9e-4 to 1.7–5e-6, below KILO's reference 7.3e-5. (1)
ChipOptimizationParamsgainsbroadband_refine(bool, defaultTruefor back-compat). The torch backend deploys its canonicalized params monochromatically (kernel_rescales_per_wavelength=False), so the broadband restart loop was minimizing the wrong objective — per-wavelength rescaled-unitary leakages that the kernel never applies — and empirically drifted away from the bootstrap's deep design-wavelength basin. WhenFalse, the bootstrap warm-start is used directly as the final params; raisesValueErrorif the bootstrap was not run or produced no warm-start. (2)scene.point_source.Paramsgainsstar_angular_diameter_mas(float | None, defaultNone). Previously the field was parsed by the TOML schema but silently dropped before reachingstate.scene, sosensitivity.detection_curve's stellar-disk-leakage integral always sawdisk_floor=0even when[scene.star].angular_diameter_maswas set in TOML.stage_materialize.pynow bindsscene.star.angular_diameter_mas→scene.point_source.star_angular_diameter_mas. (3)ao.pyramid_wfs.Paramsgainsderive_r_mag_from_scene_jmag(bool, defaultFalse) andr_minus_j_offset(float, default1.0). When the flag is set, the WFS noise term usesr_mag = state.scene.star_jmag + r_minus_j_offsetinstead of the staticguide_star_R_mag, enabling natural-guide-star configs (host star = WFS reference) to sweep AO performance with J-magnitude.PyramidWFS.consumesgains"scene". The effective R-mag is recorded in the wavefront component payload.kilo_keck.tomlsetsderive_r_mag_from_scene_jmag=true,bootstrap_n_restarts=30, andbroadband_refine=false.Throughput-penalty + AO-piston-scale tuning pass (2026-05-17, commits 1764ae0, 7063adf, 1407e01, dd16ee05, 6f485003): Three additions to tune chip-optimizer basin selection and realized-null estimation to KILO's regime. (1)
ChipOptimizationParamsgainsthroughput_penalty_alpha(float, default 0.0 = disabled; KILO_BB_THROUGHPUT_PENALTY_ALPHA convention,kilo/studies/natural_design.pyL590-625),throughput_penalty_seps_mas(tuple of floats, default(3.0,)), andthroughput_penalty_n_pa(int, default 8). Whenthroughput_penalty_alpha > 0, the bootstrap objective adds a soft penalty−alpha × <planet_throughput at rep seps>to bias the optimizer toward basins that preserve the kernel-null's r^6 throughput rolloff; without the penalty the optimizer tends to find ultra-deep-null basins that collapse planet throughput at small separations (chip_ideal 4e-6 but throughput at 30 mas ~0.006 vs KILO's balanced chip 7e-5 / throughput 0.35).kilo_keck.tomlsetsthroughput_penalty_alpha=5e-2andthroughput_penalty_seps_mas=[3.0, 10.0, 30.0]. (2)DetectionCurve.Paramsgainsao_piston_scale(float, default 1.0 = no scaling). The raw pyramid-WFS piston (Conan 1995 temporal-error formula) can overestimate KILO's tuned-servo value by ~2-3x at Keck J=4;ao_piston_scalemultiplies_find_ao_piston_rms_per_sub_m's return before it is passed torealized_null_depth_mc.kilo_keck.tomlsetsao_piston_scale=0.4. (3)DetectionCurve.Paramsgainsscintillation_index_per_sub(float, default 0.0 = disabled) andrealized_null_depth_mcgains a matchingscintillation_index_per_subparameter. Per-sub multiplicative amplitude jitter is drawn log-normal with E[I]=1, Var[I]=sigma_I^2 (Roddier 1981); this is the residual after the amplitude servo, NOT open-loop scintillation (KILO transport L1639).kilo_keck.tomlsetsscintillation_index_per_sub=0.02.chip_optimization.torch RNG pin (2026-05-17, commit 0e52d8f2):
ChipOptimizationTorch.apply now calls torch.manual_seed(seed_base) immediately after constructing the numpy master RNG. Without this, PyTorch's internal RNG was seeded from system entropy each run, making restart-init non-reproducible even when params.seed was fixed. The pin is cheap (CPU-only) and does not affect multi-thread MKL non-determinism, which is small enough that BFGS basins are stable from identical inits.

eps_calibration noise-model port (2026-05-17, commits c9feec43, d968b86e, fbcf59de): Port of KILO
studies/natural_design.py L1240 _effective_systematic_bandwidth_hz into nullsim's two-stage sensitivity + performance pipeline. (1) DetectionCurve.Params gains effective_systematic_bandwidth_hz_override (float | None, default None, gt=0). When set, it is forwarded to the performance stage as effective_systematic_bandwidth_hz in the sensitivity component payload, causing _systematic_variance to apply eps² × leak² × t / (2 f_eff) with f_eff = override instead of the performance stage's own default_servo_bw_hz. (2) StandardPerformance.apply reads sens_payload["effective_systematic_bandwidth_hz"] and, when present and non-None, overrides servo_bw_hz before calling _compute_snr and _contrast_curve. _resolve_servo_bw is unchanged; its output is now superseded by the sensitivity stage's f_eff when provided.

Tier 2a: Fringe-tracking formula fixes + scintillation/f_eff physics ports (2026-05-17): Three closed_loop.py bugs and two missing KILO physics modules that block Keck-only reproduction. (1)
_conan_servo_lag_variance_rad2 no longer multiplies by the upstream atmospheric piston variance: KILO fringe_tracking_phase_noise (kilo/fringe_tracking.py:180-182) uses the standalone (f_piston / f_servo)^(5/3) form, since the bandwidth-ratio factor already carries the absolute lag variance; the pre-fix × σ²_atmos multiplier inflated lag by 2-3 orders of magnitude at J-band Keck. (2) _shao_colavita_shot_variance_rad2 now matches KILO shot_noise_phase_variance_per_frame (kilo/fringe_tracking.py:67-95) exactly: σ²_per_channel = (N_s + N_d) / (V² × N_s²), with cross-channel averaging by n_spectral_channels applied in apply(), replacing the pre-fix 1 / (V² × n_spectral_eff × N_s × 2π²) (off by ~158× from KILO). New Params.dark_rate_hz_per_channel carries the FT-readout dark term. (3) New nullsim/physics/scintillation.py and nullsim/physics/systematic_bandwidth.py cover the scintillation and effective-bandwidth pieces. Three existing FT scaling-test tolerances loosened modestly to absorb the small residual shot contribution now that lag is no longer artificially amplified.

Tier 2b: Sensitivity-stage wiring for σ_I + f_eff + AO/FT quadrature (2026-05-17): Pipeline-side wiring for the Tier 2a physics ports. (1)
DetectionCurve.Params gains auto_amplitude_servo_residual: bool = False; when True σ_I per sub-aperture is derived from nullsim/physics/scintillation.py:amplitude_servo_residual_sigma_i (KILO atmosphere.py:311-481 port) instead of the manual scintillation_index_per_sub scalar. Companion KILO Maunakea-default params (auto_scintillation_anchor_h5m_zenith=0.041, auto_scintillation_height_km=10.0, auto_high_altitude_wind_ms=35.0, auto_amp_servo_bandwidth_hz=5000.0, auto_amp_predictive_gain_variance=20.0, auto_amp_servo_photon_rate_per_sub_hz required when auto-mode is on). The payload's new amplitude_servo_residual dict carries the full AmplitudeServoResidual breakdown (open-loop, temporal-lag, photon-floor). (2) DetectionCurve.Params gains auto_f_piston_hz: float = 100.0 (AO piston servo bandwidth used by the variance-weighted f_eff blend). apply() resolves f_eff with precedence: explicit effective_systematic_bandwidth_hz_override → variance-weighted blend (nullsim/physics/systematic_bandwidth.py:effective_systematic_bandwidth_hz, a port of KILO studies/natural_design.py:180-238) when auto-mode is on → None (performance stage default). (3) _find_ao_piston_rms_per_sub_m now combines the AO piston_opd_nm.total and the fringe_tracking.closed_loop_residual OPD in quadrature (KILO model: FT corrects only common-mode inter-telescope piston; AO-differential ⊕ FT-residual feeds the chip null). Falls back to AO-only or FT-only when one stage is absent. Full per-telescope (M,) array support (KILO compute_servo_corrected_piston) deferred until pyramid_wfs emits per-tel piston OPD. kilo_keck.toml retains the manual override stack for now (auto_amplitude_servo_residual=False); flipping it to the new auto-mode is a Tier 4 config decision once the per-sub photon rate plumbing is in.
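The f_eff precedence in (2) and the AO/FT quadrature fallback in (3) can be sketched as follows. This is a minimal illustration, not the nullsim implementation: the function names resolve_f_eff_hz and combine_piston_rms_nm, their signatures, and the choice to return None when both AO and FT are absent are assumptions made for the example.

```python
import math
from typing import Callable, Optional


def resolve_f_eff_hz(
    override_hz: Optional[float],
    auto_mode: bool,
    blend_hz: Callable[[], float],
) -> Optional[float]:
    """Hypothetical resolver: explicit override, then auto-mode blend, then None."""
    if override_hz is not None:
        if override_hz <= 0:
            raise ValueError("override must be > 0")
        return override_hz
    if auto_mode:
        # The blend is passed as a callable so it is only evaluated when needed.
        return blend_hz()
    # None tells the downstream stage to use its own default bandwidth.
    return None


def combine_piston_rms_nm(
    ao_nm: Optional[float], ft_nm: Optional[float]
) -> Optional[float]:
    """Quadrature sum of AO and FT residual OPD, with single-stage fallback."""
    if ao_nm is None and ft_nm is None:
        return None  # assumption: nothing to combine
    if ao_nm is None:
        return ft_nm
    if ft_nm is None:
        return ao_nm
    return math.hypot(ao_nm, ft_nm)
```

Passing the blend as a callable keeps the precedence explicit: the variance-weighted value is never computed when an override is supplied.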
Two new regression tests pin the auto-mode payload against direct physics-function calls and the clear error when auto-mode is requested without a photon rate.

Tier 3: Chip fidelity (BFGS + throughput penalty + Fresnel loss; 2026-05-17): Three of four chip-side fixes. (1)
nullsim/stages/chip_optimization/torch.py:432,472: both bootstrap and broadband BFGS calls switched from scipy L-BFGS-B (low-memory) to full-Hessian BFGS to match KILO chip.py:1180,1212. On a 576-dim multi-modal kernel-null landscape the two land in different basins from identical inits. (2) The throughput penalty −α × <planet_throughput>_{sep, PA, wl} now runs inside the broadband objective at every wavelength bin (KILO KILO_BB_THROUGHPUT_PENALTY_ALPHA convention in kilo/studies/natural_design.py:606); pre-fix it lived only in the single-λ bootstrap, so broadband_refine drifted toward narrow-band null-only basins. (3) nullsim/physics/ruilier_cassaing.py gains a FRESNEL_REFLECTION_LOSS = 0.035 constant and an apply_fresnel_loss: bool = False kwarg on coupling_efficiency; KILO injection.py:25,72 applies a 3.5% uncoated-tip Fresnel loss per coupled field. The exact annular-overlap geometric_coupling_limit is kept because it is the more accurate physical formula.

Tier 4: Config mismatches + realized-null MC physics (2026-05-17): Two config slices and two MC-physics ports. (1)
examples/kilo_keck.toml: [observation].integration_time_s and [detector.mkid].integration_time_s bumped 3600.0 → 14400.0 to match KILO paperv3 headline 4 h on-source. (2) mkid_max_count_rate_hz 5.0e15 → 5.0e4 (KILO physical 50 kHz; previous value was a unit-test override that disabled the cap); n_pixels_dark 100 → 360 = n_dark_ports × n_wavelength_bins so each (port, bin) MKID is its own pixel (KILO per-bin readout convention). (3) dac_bits 20 → 16 to match KILO's paperv3 chip-diagnostics figure; 20 bits gave ~100× tighter phase quantization than KILO. (4) nullsim/physics/realized_null.py:realized_null_depth_mc gains fiber_phase_noise_rms_rad: float = 0.0, an IID per-mode-per-realization fiber phase noise. (5) The MC now draws independent AO-piston OPD perturbations per wavelength bin. Three new realized-null regression tests pin the fiber-noise lift, input validation, and per-λ-then-averaged CV.

Tier 3.1: Chromatic chip pass-through, option 2 (2026-05-17): Closes the chromaticity gap deferred from Tier 3. The torch and MLX backends now publish the chromatic
U_stack they actually evaluated (KILO column-alternating layout + per-wavelength phase rescaling, post-DAC-quantization) in a new optimal_u_stack field on the chip_optimization payload. chip.kernel_mzi_mesh checks for optimal_u_stack first; when present, the chip stage uses it verbatim. The chip is now genuinely broadband (its deployed unitary varies across wavelength the way a real physical chip's does) and matches the optimizer at every detector bin.

CLI results/latest symlink (2026-05-17, commit 8278022):
_handle_run now calls _update_latest_symlink(output_dir) after writing all outputs. _update_latest_symlink creates a relative symlink results/<name>/latest → <run-dir> so users have a stable alias regardless of the output directory naming scheme (timestamp, run_hash, or custom template). The symlink is created as a relative path so the parent results/ directory can be moved without breaking it. If a non-symlink file or directory already occupies latest, it is left untouched. All errors are swallowed with except OSError: pass; the run's artifacts are already on disk and the alias is cosmetic.

Six-bug correctness pass (2026-05-17, commit 0e54bf7): External code-review surfaced six bugs across four subsystems, all fixed. [HIGH]
chip_optimization._quantize_phases did not wrap the rounded result to [0, 2π), causing off-design-wavelength unitary mismatches between the optimizer and the deployed chip; fixed with np.mod(quantized, 2π) shared by all backends. [MEDIUM] detector.snspd._resolve_collecting_area_m2 skipped state.geometry.effective_collecting_area_m2, overcounting photon rates by ~2.24× in sub-aperture pipelines. [MEDIUM] physics.realized_null.realized_null_depth_mc silently accepted negative n_realizations_per_azimuth; now raises ValueError. [MEDIUM] sensitivity.detection_curve._chip_input_stellar_rate_hz used deltas=[wl[0] * 0.0] for single-wavelength grids, zeroing the chip-input rate. [LOW] ao.pyramid_wfs._resolve_pupil_geometry returned None for pupil_template="circular" when a central obstruction was present. [LOW] cli._resolve_output_dir surfaced format-string errors as raw AttributeError. Parity test extractors corrected for strehl_jband, mean_coupling_jband, resolution_mas_jband, and null_depth_calibrated.

Chip-optimizer bakeoff script + kilo_keck torch.compile (2026-05-17, commit c1a8780):
scripts/bench_chip_optimizer.py ships as a two-tier benchmark comparing available chip-optimizer backends on the same problem with the same RNG seed. Tier 1 times a single objective+gradient evaluation; Tier 2 times the full ChipOptimization*.apply(). On M-series Apple Silicon at M=24/n_wl=5: numpy/scipy FD 2.96 s/grad, torch eager CPU 26 ms/grad, torch.compile CPU 1.6 ms/grad. examples/kilo_keck.toml flips compile_objective to true: end-to-end time drops from ~15 min to ~1 min per cell. Default in ChipOptimizationParams stays false so small-problem configs are unaffected.

Torch-path wrap fix + KILO-convention null scalars (2026-05-17/18, commits d31d64b, 167d5e3, 3fb003b, c137256): Four targeted fixes. (1)
_quantize_phases now only rounds to the DAC grid; the [0, 2π) wrap moves to _build_unitary_stack_numpy alongside the per-wavelength rescaling, keeping the torch path's operating point consistent with its autograd loop. (2) _require_field collapses state.field.amplitudes to the wavelength-mean per-mode magnitude and broadcasts as real-valued, matching KILO injection.py:stellar_input_field. (3) Two new fields added to RealizedNullResult, ideal_null_depth_kilo (chip-only, mean(dark)/mean(bright)) and mean_null_depth_kilo (AO-realized MC mean), surfaced in sensitivity payloads as ideal_chip_null_depth_kilo and realized_null_depth_kilo. The existing dark-power-fraction fields stay as the contrast pipeline input. (4) ClosedLoopFringeTracker.Params.predictive_lag_reduction default changed from 1.0 to 0.05, matching KILO production convention.

Timestamp output dirs + chip_diagnostics fix + kilo_keck fiber noise correction (2026-05-18, commits ec8d4f3, ce1d4a2, 0c5a1d1): Three targeted fixes. (1)
OutputConfig.dir default changed to "results/{run.name}/{timestamp}"; ResolvedConfig gains a timestamp field (wall-clock, YYYYMMDD-HHMMSS, computed once at resolve_config so CLI and runner resolve to the same path). {run_hash} remains supported. (2) chip_diagnostics._plot_broadband_null constructs a clean nominal stellar field (design-wavelength magnitudes, phases stripped) matching chip_optimization._require_field, replacing the prior post-chip-output field read that showed AO/transport-phase-contaminated realized null. Panel title updated to "Broadband null spectrum (chip only, clean input)". (3) examples/kilo_keck.toml sets fiber_phase_noise_rms_rad = 0.0; the prior 0.01 rad was double-counting the fringe-tracker servo residual. Effect: realized null at J=4 drops from 1.80e-4 to 1.32e-4.

Spectro-port disjoint partitions, physical planet fields, basin knobs + MLX retirement (2026-05-18, commits 3e77b10, 639085c): Five basin-aligning changes against KILO's
natural_design.py reference plus retirement of the MLX backend. (1) chip_optimization schema updated to KILO's disjoint [bright | dark | spectro] layout; _null_indices = dark + spectro (null objective), _dark_indices = dark only (throughput-penalty regularizer). (2) _require_field picks per-mode magnitude near design wavelength × sqrt(area_per_mode); throughput-penalty term builds full planet_input_field(...) samples per optimized wavelength. (3) Basin knobs in kilo_keck.toml: seed = 42, throughput_penalty_seps_mas = [3.0], 5/18/1 chip_optimization port counts, 5/19 downstream, Keck ENU geodetic coordinates. (4) nullsim/stages/chip_optimization/mlx.py deleted; nullsim[mlx] pyproject extra and all MLX-aware test/bench code removed.

chip_diagnostics: deployed u_stack + disjoint null_idx + comment audit (2026-05-18, commit c6e71fb):
_plot_broadband_null now prefers optimal_u_stack from the torch backend over rebuilding from optimal_chip_params. null_idx updated to arange(n_bright_ports, n_modes) (19 indices, dark + spectro) from arange(n_modes - n_dark_ports, n_modes) (18 indices). Comment audit: _require_field docstring, sensitivity/detection_curve.py KILO null reference, and performance/standard.py contrast-curve ratio updated.

Pipeline contracts: per-instance required_consumes + stage fixes (2026-05-18, commit 52c6d1e): (1)
Stage gains required_consumes(self) -> set[str]; validate_dependencies and ContentCache.key_for route through consumed_state_keys(). chip.kernel_mzi_mesh overrides required_consumes to include results only when params.params is None. (2) --no-cache skips cache-key computation entirely. (3) detector.ideal_counter preserves state.results, publishes a detector.ideal_counter component via unique_component_name, and emits bright_0 + dark_0 ports with proper rates and integrated counts. (4) fringe_tracking.closed_loop._bright_port_photon_rate_hz matches ports starting with "bright" instead of requiring an exact label, fixing ~5× SNR underestimate with the new ideal-counter port layout.

CI test selection cleanup (2026-05-21): frozen external reference-comparison tests were removed; pytest and CI now run the active nullsim test suite without marker filtering.
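The per-instance required_consumes contract and the prefix-based bright-port match from the Pipeline contracts entry can be illustrated with a small stand-alone sketch. Everything here is hypothetical scaffolding: the Stage base class, the KernelMeshStage stand-in, the chip_params attribute, the state-key names, and the validate_dependencies signature are assumptions for the example and do not reproduce nullsim's actual classes (consumed_state_keys() and ContentCache are not modeled).

```python
class Stage:
    """Hypothetical base class with static keys and a per-instance hook."""

    name = "stage"
    consumes = frozenset()
    produces = frozenset()

    def required_consumes(self) -> set:
        # Default: every declared key is required.
        return set(self.consumes)


class KernelMeshStage(Stage):
    """Stand-in for a chip stage that needs `results` only without explicit params."""

    name = "chip.kernel_mzi_mesh"
    consumes = frozenset({"field"})
    produces = frozenset({"chip_output"})

    def __init__(self, chip_params=None):
        self.chip_params = chip_params

    def required_consumes(self) -> set:
        required = set(self.consumes)
        if self.chip_params is None:
            # Without explicit params the stage must read upstream results.
            required.add("results")
        return required


def validate_dependencies(stages, initial_keys=()):
    """Walk stages in order and fail before execution if a required key is missing."""
    available = set(initial_keys)
    for stage in stages:
        missing = stage.required_consumes() - available
        if missing:
            raise ValueError(f"{stage.name} is missing state keys: {sorted(missing)}")
        available |= set(stage.produces)


def bright_port_rate_hz(port_rates_hz: dict) -> float:
    """Sum rates over every port whose label starts with 'bright'."""
    return sum(
        rate for label, rate in port_rates_hz.items() if label.startswith("bright")
    )
```

Making the requirement a method rather than a class attribute lets the same stage type validate differently depending on how a given instance is configured.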
Packaging + correctness sweep (2026-05-18, commits 4827b30–abeb943): Ten targeted fixes across packaging, sweep correctness, stage contracts, numerical bugs, and CLI robustness. (1) Package data:
pyproject.toml extended to bundle data/**/*.csv and config/presets/*.toml; wheel installs now work without falling back to source tree. [mlx] extra replaced by [dev] (pytest>=8) in pyproject.toml and README.md. (2) chip.kernel_mzi_mesh._quantize_phases wraps to [0, 2π) even when dac_bits is None, matching the scipy optimizer's wrap-then-scale convention. (3) Sweep mirror-overwrite fixed: materialize_pipeline_stages accepts skip_stage_params so sweep-written params are not clobbered by re-materialization; _run_hash_with_externals now folds all per-cell externals into run identity, not just cell-0. (4) SNR docstrings in characterization.spectro, performance.standard, and characterization.detection_curve clarified, with no numerical changes. (5) performance.standard._resolve_servo_bw reads live FT bandwidth from results; fringe_tracking.closed_loop double-eta removed; bright-port mask capitalization normalized. (6) Stage contracts corrected: performance.standard.consumes removes photon_rates; sensitivity.detection_curve.consumes adds throughput; injection.ideal raises explicitly on ambiguous Strehl fallback. (7) Numerical fixes: realized_null pooled median/std; stellar_disk monotonic-sort validation; transport.smf28 deterministic zero phase and dispersion-seed raise; sensitivity.detection_curve._per_azimuth_stellar_fields mismatch guards. (8) detection_space plot loads user-configured catalog snapshot (not always bundled CSV); _handle_run/_handle_validate raise on zero-cell sweep; RunCollection uses materialized cell-0 config. (9) Lazy outputs autoload: plot/table registration deferred to _autoload_outputs() on the run path; validate no longer imports matplotlib. Cache digest invalidates on file mtime changes. (10) detection_space legend lower-left; x-axis upper bound from curve data.
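Item (2) of the packaging sweep, wrapping phases to [0, 2π) whether or not DAC quantization is enabled, can be sketched as below. This is an illustrative stand-in rather than the nullsim function: the name quantize_phases, the uniform grid of 2^dac_bits levels over [0, 2π), and round-to-nearest are assumptions, and the per-wavelength rescaling that the torch path applies elsewhere is not modeled.

```python
import numpy as np

TWO_PI = 2.0 * np.pi


def quantize_phases(phases, dac_bits=None):
    """Wrap phases to [0, 2*pi), then optionally snap them to a DAC grid."""
    wrapped = np.mod(np.asarray(phases, dtype=float), TWO_PI)
    if dac_bits is None:
        # No DAC model: still return wrapped phases so both paths agree.
        return wrapped
    step = TWO_PI / (2 ** dac_bits)  # assumed uniform grid spacing
    quantized = np.round(wrapped / step) * step
    # Rounding can land exactly on 2*pi; wrap again so that maps back to 0.
    return np.mod(quantized, TWO_PI)
```

The second np.mod matters near the top of the range: a phase just below 2π rounds up to the grid point at 2π, which is the same physical setting as 0.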
The implementation plan is a separate document.
This design is ready for implementation planning once the open questions in section 14 are decided. Implementation will be specified in a follow-on plan document.