Rift o4d junior calmarg in loop: AFTER main 'distance' merge #139
Open
oshaughn wants to merge 115 commits into
Add RIFT.precision and route all extended-precision dtype use through it.
* RIFT.precision (new): RiftFloat resolves to numpy.longdouble whenever
the platform's long double has itemsize > 8 (e.g. Linux x86_64), and
otherwise falls back to numpy.float64. Also exports
RIFT_FLOAT_HIGH_PRECISION and RIFT_FLOAT_NAME. Eliminates the
import-time AttributeError when numpy.float128 is absent (macOS arm64,
Windows MSVC, non-x86 Linux, future numpy 2.x platforms).
* Integrator package: replace every numpy.float128 / np.float128 in
mcsampler.py, mcsamplerEnsemble.py, mcsamplerGPU.py,
mcsamplerAdaptiveVolume.py, mcsamplerNFlow.py, mcsamplerPortfolio.py,
statutils.py
with RiftFloat. The dtype-equality guards in mcsamplerGPU and
mcsamplerNFlow ("if weights_alt.dtype == numpy.float128: cast to
float64") degrade gracefully when RiftFloat == float64 (the
conditional astype becomes a no-op).
* likelihood/factored_likelihood.py and
interpolators/BayesianLeastSquares.py: same RiftFloat swap, so they
also import cleanly on platforms without np.float128.
* CI (.github/workflows/ci.yml): add rift_O4d_gmm_gpu to the trigger
branches; expand the install matrix to 3.9-3.13; convert
import-check and test-run into a two-lane matrix:
- legacy : python 3.9 + numpy==1.24.4 (historical green build)
- modern : python 3.12 + numpy>=2.0,<3.0 (forward-looking gate)
Numpy is pinned after requirements.txt, so the unpinned 'numpy' line
in requirements.txt is preserved. Test-log artifacts are named
per-lane so failures from each can be uploaded independently.
No behavioral change on the existing legacy CI lane: RiftFloat is
numpy.longdouble there, which is the exact 16-byte type previously
spelled numpy.float128. The modern CI lane is the new gate.
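The resolution rule described above can be sketched as follows. This is a minimal illustration, not the contents of RIFT.precision: the helper name resolve_rift_float and the toy weights array are assumptions; only RiftFloat, RIFT_FLOAT_HIGH_PRECISION and RIFT_FLOAT_NAME are names taken from the commit message.

```python
import numpy as np


def resolve_rift_float():
    """Pick numpy.longdouble only when it is wider than 8 bytes (hypothetical helper)."""
    if np.dtype(np.longdouble).itemsize > 8:
        return np.longdouble
    return np.float64


RiftFloat = resolve_rift_float()
RIFT_FLOAT_HIGH_PRECISION = np.dtype(RiftFloat).itemsize > 8
RIFT_FLOAT_NAME = np.dtype(RiftFloat).name

# The dtype-equality guard: when RiftFloat is already float64 the cast
# below changes nothing, so the same code runs on every platform.
weights = np.ones(4, dtype=RiftFloat)
if weights.dtype == RiftFloat:
    weights = weights.astype(np.float64)
```

Because the check uses itemsize rather than the attribute name numpy.float128, importing the module never touches an attribute that may be missing.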
Implement a reusable distance-grid export helper for ILE, thread the export flag through pseudo_pipe, and add focused reconstruction tests. Add a zero-spin fake-data demo that builds a DAG with distance-grid export enabled, plus a small lalsimutils XML compatibility fix for current LAL bindings.
… 4.4.0 causes breaking changes)
Provide a root pixi workspace that defaults local development to SWIG <4.4.0 while also defining a SWIG >=4.4.0 comparison environment. Add GitLab CI jobs for both pixi environments so deployment stability can be checked across the hidden SWIG binding change.
* RIFT/misc/distance_grid.py: build_distance_grid now divides out the distance
sampling prior so the exported lnL is
L_pure(d) = integral L(d,Omega) pi_Omega dOmega. New column
ln_prior_d_sampling carries the per-bin sampling-prior factor so default
reconstruction reproduces log_res exactly, while
reconstruct_marginal_lnL(grid, ln_prior_d=...) re-marginalizes against any
prior of choice.
* integrate_likelihood_extrinsic_batchmode: handle mcsamplerEnsemble/GMM _rvs
columns (raw integrand/joint_prior/joint_s_prior, not log_*); drop zero-weight
samples cleanly instead of raising "missing type". Pass
sampler.prior_pdf["distance"] values per-sample.
* test/test_distance_grid.py: cover the pure-likelihood property (different
priors yield correctly different marginals) and the round trip.
* demo/rift/add_distance_grids/validate_distance_grid.py: new stress harness
quantifies n_eff vs integral/shape error.
* pixi.toml: pin lalsuite==7.25, lalmetaio<=4.0.5 to dodge the SWIG-4.4
cross-module SwigPyObject/LIGOTimeGPS regression (issue oshaughn#136).
Verified end-to-end ILE run produces .dgrid whose reconstruction matches
log_res to machine precision.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
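To illustrate why exporting the prior-free L_pure(d) is useful, here is a small sketch of re-marginalizing one grid against two different distance priors. It assumes the per-bin prior factor is the log of the prior mass in each bin; the function names, bin edges and lnL values are all hypothetical and are not the RIFT implementation.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def ln_bin_mass_power_law(edges, power):
    """Log prior mass per bin for p(d) proportional to d**power (power != -1)."""
    k = power + 1
    norm = edges[-1] ** k - edges[0] ** k
    return [math.log((hi ** k - lo ** k) / norm)
            for lo, hi in zip(edges[:-1], edges[1:])]


def reconstruct_marginal_lnL_sketch(lnL_pure, ln_prior_mass):
    """ln of sum_k L_pure(d_k) * (prior mass in bin k)."""
    return logsumexp([l + p for l, p in zip(lnL_pure, ln_prior_mass)])


edges = [100.0, 200.0, 300.0, 400.0, 500.0]   # illustrative bin edges in Mpc
lnL_pure = [3.0, 5.0, 4.0, 1.0]               # illustrative prior-free lnL per bin

lnL_volumetric = reconstruct_marginal_lnL_sketch(lnL_pure, ln_bin_mass_power_law(edges, 2))
lnL_flat = reconstruct_marginal_lnL_sketch(lnL_pure, ln_bin_mass_power_law(edges, 0))
```

The two priors give different marginals from the same exported column, which is the "pure-likelihood property" the test file is described as covering.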
ILE batchmode now optionally emits a .dslice file per intrinsic point
containing K independent extrinsic-marginalized likelihoods at K distance
slice centers (quantile centers of the posterior in d). The estimator is
importance-reweighting of the main run's Omega samples at each slice distance,
re-using the cached likelihood machinery -- no waveform or PSD regeneration,
no extra worker spin-up. With K~=10 the artifact stays within the user's
<~10x .composite size budget.
* RIFT/misc/distance_slices.py: importance_reweight_slices,
quantile_slice_centers, table builder/loader, and reconstruct_marginal_lnL
that takes an optional custom distance prior. Schema (DISTANCE_SLICE_FIELDS)
deliberately mirrors .composite for downstream CIP integration.
* integrate_likelihood_extrinsic_batchmode: new --export-distance-slices K and
--distance-slice-method flags; threaded into analyze_event after the main
integration. Reuses sampler._rvs and like_to_integrate. Emits a runtime
warning when GMM + low main n_eff, since B2-reweight silently biases in that
regime.
* demo/rift/add_distance_grids/validate_distance_slices.py: synthetic stress
harness with a known closed-form marginal and an adjustable d-Omega coupling.
Confirms B2-reweight matches truth to <0.1 nat over a wide coupling range when
main n_eff is healthy.
* demo/rift/add_distance_grids/PLAN_B_DESIGN.md: design notes covering the
math, the GMM-vs-AV finding, the (recommended) non-destructive workflow
integration plan, and the deferred B2-fresh cross-check path.
End-to-end check on the fake-data demo (AV sampler, main n_eff ~6): B2
marginal reconstructs the main log_res within sigmaL; per-slice n_eff 7-28;
.dslice 10 rows x 19 columns per event.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
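The reweighting estimator can be sketched on a toy problem with a known answer. Everything below is an assumption made for illustration: a one-dimensional Omega, a uniform prior and proposal on [0, 1], a Gaussian toy likelihood whose width scales with distance, and the Kish definition of effective sample size (RIFT's own n_eff definition may differ).

```python
import math
import random


def slice_lnL_by_reweighting(omega_samples, ln_q, ln_prior, ln_like, d_slice):
    """Estimate ln of the integral of L(d_slice, Omega) * prior(Omega) over Omega
    from Omega samples drawn from the proposal q. Returns (lnL, n_eff)."""
    ln_w = [ln_like(d_slice, om) + ln_prior(om) - ln_q(om) for om in omega_samples]
    peak = max(ln_w)
    w = [math.exp(v - peak) for v in ln_w]
    total = sum(w)
    lnL = peak + math.log(total / len(w))
    n_eff = total * total / sum(x * x for x in w)  # Kish effective sample size
    return lnL, n_eff


def toy_ln_like(d, omega):
    """Toy likelihood: Gaussian in Omega with width 0.1 * d / 100."""
    width = 0.1 * d / 100.0
    return -0.5 * ((omega - 0.5) / width) ** 2


random.seed(0)
omega_samples = [random.random() for _ in range(20000)]  # drawn once, reused per slice
flat = lambda omega: 0.0                                 # ln of a uniform density on [0, 1]

slices = {d: slice_lnL_by_reweighting(omega_samples, flat, flat, toy_ln_like, d)
          for d in (50.0, 100.0, 200.0)}
```

The same Omega samples serve every slice; only the likelihood is re-evaluated at the pinned distance, which is why no new sampling pass is needed.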
Reweight alone breaks in the tails: Omega samples drawn during the main
run have no support at distances far from the posterior peak, so the
slice estimator silently biases or returns garbage there. Switch to a
hybrid scheme where core slices stay reweight (cheap, accurate inside
the posterior) and wing slices are fresh Omega-only AdaptiveVolume
integrations at the pinned distance (correct, expensive only on the few
points we need them).
* RIFT/misc/distance_slices.py:
- fresh_sample_slices builds a fresh AV sampler over Omega only, clones
the main sampler's per-param (pdf, prior, llim, rlim) config, wraps
like_to_integrate to pin distance and defensively clip Omega values
inside [llim, rlim] (avoids arccos NaN at boundary).
- pick_wing_centers places K_wing centers log-uniformly in
[d_min, d_core_lo] union [d_core_hi, d_max], evenly split.
- is_uninformative detects a flat-in-d core so wings are skipped on
events where the distance posterior carries no information.
- sigma_lnL conversion for fresh slices: AV returns log(rel_var) +
2*log_int; we report sqrt(rel_var) so the column is on the same
scale as the reweight branch and the main run's sigmaL_main.
* integrate_likelihood_extrinsic_batchmode: split --export-distance-slices
K into --n-distance-slice-core (reweight) + --n-distance-slice-wing
(fresh). Default 60/40 split. New flags --distance-slice-wing-nmax,
--distance-slice-wing-neff, --distance-slice-skip-threshold.
Per-row method column (reweight=0, fresh=1) marks which estimator
produced each slice.
* PLAN_B_DESIGN.md: documents the architecture and the empirical wing
reach on the demo event (~30 nats below peak with sigmaL ~0.1-0.2,
well past the ~7-nat-target for 10^{-3} prior weight outside).
End-to-end on the fake-data demo (AV, --n-distance-slice-core 6 --n-distance-slice-wing 4):
main log_res 59.09 +/- 0.30, n_eff 4.4
core slices: lnL 60.8-62.3 (peak-lnL 0-1.4 nat), sigmaL 0.19-0.55
wing slices at 23/376/613 Mpc: lnL 59/51/34 (peak-lnL 3/11/29 nat),
sigmaL 0.10-0.22, neff 9-24
far wing at 5 Mpc: lnL -100 (signal off), correctly flagged low neff
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
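A minimal sketch of the log-uniform wing placement described in this commit, assuming interior points only (so a wing center never coincides with a core edge) and giving the extra center to the far wing when K_wing is odd. Both choices, the function names and the numbers are assumptions; the actual pick_wing_centers may differ.

```python
import math


def log_uniform_interior(lo, hi, n):
    """n points evenly spaced in log(d) strictly inside (lo, hi)."""
    span = math.log(hi) - math.log(lo)
    return [math.exp(math.log(lo) + span * (i + 1) / (n + 1)) for i in range(n)]


def pick_wing_centers_sketch(d_min, d_core_lo, d_core_hi, d_max, k_wing):
    """Split k_wing centers between [d_min, d_core_lo] and [d_core_hi, d_max]."""
    n_near = k_wing // 2
    n_far = k_wing - n_near
    near = log_uniform_interior(d_min, d_core_lo, n_near)
    far = log_uniform_interior(d_core_hi, d_max, n_far)
    return near + far


wing_centers = pick_wing_centers_sketch(1.0, 100.0, 400.0, 1000.0, 4)
```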
Two follow-ups left for the next session, called out as breadcrumbs
rather than implemented now:
1. Skip threshold should be an absolute lnL scale.
lnL is already a likelihood ratio with absolute meaning; the
current relative spread test will misfire on high-SNR events
whose distance posterior happens to be flat.
2. Wing centers from a parabolic-in-1/dist fit of the core, solved
for the 1/dist values where lnL drops by ~7 nats from peak
(probability outside ~10^{-3}). Marginalized-lnL caveat (the
inclination-distance ridge can extend further toward small d
than a simple parabola predicts) documented inline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements the two PLAN_B_DESIGN breadcrumbs:
1. is_uninformative now applies an absolute lnL detectability cut (peak core
lnL < threshold) instead of a relative max-min spread test. lnL is a
likelihood ratio vs noise, so this correctly skips undetected low-SNR events
while keeping high-SNR events with a flat distance profile. ILE skip message
and --distance-slice-skip-threshold help updated.
2. pick_wing_centers fits the core (lnL, 1/d) points to a parabola in 1/d
(fit_lnL_parabola_in_inv_d) and spans each wing from the core edge out to
where the model drops --distance-slice-wing-delta-lnL nats below peak
(default 7), via _parabolic_wing_bounds. Bounds are clamped to the sampler's
distance support; degenerate fits fall back to the original log-uniform
full-range placement. New ILE flag --distance-slice-wing-delta-lnL threads
the target.
Adds a regression test (test_wing_placement_and_skip) to
validate_distance_slices.py; verified end-to-end on the fake-data demo (AV
sampler): wings concentrate near the core instead of the prior edges, and
reconstruct_marginal_lnL matches log_res within sigmaL.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
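The parabola-in-1/d step can be sketched with numpy.polyfit. With lnL(x) = a x^2 + b x + c and x = 1/d, the peak sits at x* = -b/(2a) and the model drops by delta nats at x* +/- sqrt(delta / (-a)). The function name, the default support limits, and the choice to return None for a degenerate fit are assumptions made for this example.

```python
import numpy as np


def parabolic_wing_bounds_sketch(d_core, lnL_core, delta_lnL=7.0, d_min=1.0, d_max=1.0e4):
    """Distances where a parabola in 1/d drops delta_lnL below its peak.

    Returns (d_near, d_far) clamped to [d_min, d_max], or None when the fit is
    not a downward-opening parabola (the caller would then fall back to
    log-uniform placement).
    """
    x = 1.0 / np.asarray(d_core, dtype=float)
    a, b, _c = np.polyfit(x, np.asarray(lnL_core, dtype=float), 2)
    if not np.isfinite(a) or a >= 0.0:
        return None
    x_peak = -b / (2.0 * a)
    half_width = np.sqrt(delta_lnL / (-a))
    x_hi = x_peak + half_width          # larger 1/d means smaller distance
    x_lo = x_peak - half_width
    if x_hi <= 0.0:
        return None
    d_near = 1.0 / x_hi
    d_far = 1.0 / x_lo if x_lo > 0.0 else np.inf
    return float(max(d_near, d_min)), float(min(d_far, d_max))


# Synthetic core: exact parabola peaking at 1/d = 0.005 (d = 200 Mpc).
d_core = np.array([150.0, 175.0, 200.0, 225.0, 250.0])
lnL_core = 60.0 - 0.5 * ((1.0 / d_core - 0.005) / 0.001) ** 2
bounds = parabolic_wing_bounds_sketch(d_core, lnL_core)
```

Note that the far bound is clamped to d_max whenever the parabola never drops by delta_lnL at positive 1/d, which is how a flat-toward-large-distance profile is handled in this sketch.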
…ine builder
Threads per-distance likelihood export from util_RIFT_pseudo_pipe.py through
create_event_parameter_pipeline_* onto the ILE extrinsic stage (ILE_extr.sub),
with an end-to-end pipeline-build test/demo.
CEPP (Basic/Alternate/BasicMultiApprox):
- New flags --last-iteration-export-distance-slices K plus
-n-core/-n-wing/-wing-delta-lnL/-skip-threshold passthroughs. When set, the
extrinsic stage gets --export-distance-slices K (+ the tunables +
--internal-use-lnL) and --distance-marginalization is stripped, mirroring the
existing grid export.
- AlternateIteration and BasicMultiApproxIteration previously lacked the grid
flag entirely; added both the grid and slice args + the ile_args_extr handling
so the subdags/multi-approx CEPP variants accept the flags pseudo_pipe now
routes to them.
util_RIFT_pseudo_pipe.py:
- New --export-distance-slices K (+ tunables), sibling to
--export-marginal-distance-grid.
- When either export is requested: force ILE lnL mode, disable distance
marginalization (sane auto-config instead of erroring), and warn if
--add-extrinsic is absent (the export is emitted there).
- Fix: the --last-iteration-export-* flags are pipeline-builder flags, not ILE
flags. They were being appended to args_ile.txt (the ILE argument string),
where they would have been passed to the ILE executable and rejected. Move
them to the CEPP command; keep only the ILE-side hygiene (lnL mode, no
distance marginalization) in args_ile.
Make the three create_event_parameter_pipeline_* scripts executable
(100644 -> 100755), matching their sibling bin scripts: pseudo_pipe invokes
them by bare name, so editable/source/pixi installs need +x.
Validation:
- New demo MonteCarloMarginalizeCode/Code/demo/pipeline (Makefile + README):
builds baseline/grid/slices pipelines from the reference ini and asserts the
flags land in ILE_extr.sub (and not in the intrinsic ILE.sub), with no
distance marginalization. All pass.
- Expanded .travis/test-build.sh with the same grid + slice build assertions
(run in GitLab and GitHub CI).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… done in PLAN_B_DESIGN
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ic stage
Fix the per-distance export threading so distance marginalization is kept for
the intrinsic ILE iterations (a large speedup) and removed ONLY on the final
extrinsic stage that emits the per-distance output.
Previously util_RIFT_pseudo_pipe.py set
opts.internal_marginalize_distance = False and stripped
--distance-marginalization from args_ile.txt, which disabled it for every ILE
job in every iteration. Now pseudo_pipe only forces ILE lnL mode globally
(clean lnL-scaled helper args) and leaves distance marginalization in place;
create_event_parameter_pipeline_* already strips the standalone
--distance-marginalization flag from the ILE_extr argument string, so the
disable is confined to the export stage.
Make util_InitMargTable executable (100644 -> 100755): the helper invokes it
at build time to generate the distance-marginalization lookup table, which is
now needed again because the intrinsic stage keeps distance marginalization.
Validation updated to prove the last-stage-only invariant: demo/pipeline and
.travis/test-build.sh now assert the standalone --distance-marginalization
flag is present on the intrinsic ILE.sub / args_ile.txt but absent from
ILE_extr.sub (matching the standalone flag via a trailing space, so the
harmless leftover --distance-marginalization-lookup-table arg is not counted).
All three demo targets pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
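The trailing-space match used by the validation can be shown in a few lines of Python. The helper name and the two argument strings are invented for the example; only the flag names come from the commit message.

```python
def has_standalone_flag(arg_string, flag="--distance-marginalization"):
    """True only if `flag` appears followed by a space (or at end of string).

    Appending one space to the argument string lets a flag in last position
    match, while --distance-marginalization-lookup-table never does because
    the character after the prefix is '-' rather than ' '.
    """
    return (flag + " ") in (arg_string + " ")


# Hypothetical argument strings for the two stages.
intrinsic_args = ("--n-max 4000000 --distance-marginalization "
                  "--distance-marginalization-lookup-table marg_table.npz")
extrinsic_args = ("--n-max 4000000 "
                  "--distance-marginalization-lookup-table marg_table.npz "
                  "--export-distance-slices 10")

intrinsic_has_flag = has_standalone_flag(intrinsic_args)
extrinsic_has_flag = has_standalone_flag(extrinsic_args)
```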
…n is disabled only at the extrinsic stage
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…validation demo
Adds the consolidation step the previous threading work was missing, plus a
self-contained zero-spin IMRPhenomD demo that runs the whole chain (pipeline
build -> ILE_extr -> consolidate -> posterior) end-to-end without condor.
Pipeline:
- New util_ConsolidateDistanceGrids.py: concatenates per-event .dgrid/.dslice
files (header-checked) into a single net intrinsic+distance table.
- New write_consolidate_distance_grids_sub in RIFT.misc.dag_utils_generic:
mirrors write_cat_sub (extrinsic posterior samples) so the consolidation
plugs into the same post-extrinsic part of the DAG.
- create_event_parameter_pipeline_BasicIteration: when the last-iteration
per-distance export is on, emit consolidate_dgrid.sub /
consolidate_dslice.sub gated on the corresponding flag, build a DAG node,
and chain it as a child of every ILE_extr job (.dgrid / .dslice come
directly off ILE with no convert/resample step, so the consolidation node
parents the ILE_extr nodes, not the cat_node downstream).
Output: all_dgrid.dat / all_dslice.dat at the run root.
Demo + validation:
- demo/pipeline/Makefile: updated grid/slices assertions to require the
consolidation sub-file and DAG references.
- demo/pipeline/zero_spin_phenomD/: new end-to-end test. Uses the
.travis/ILE-GPU-Paper zero-noise BBH fake data, IMRPhenomD with
--assume-nospin, AV sampler. Steps:
build -> util_RIFT_pseudo_pipe.py constructs the pipeline and
asserts the extrinsic stage carries grid export + AV +
IMRPhenomD + lnL mode, distance marg only off at the
extrinsic stage, consolidate_dgrid in the DAG.
run-extr -> bypass condor; invoke ILE_extr directly on N_EVENTS
grid rows -> per-event .dgrid files.
consolidate -> util_ConsolidateDistanceGrids.py -> all_dgrid.dat.
posterior -> util_ConstructEOSPosterior.py with --parameter m1 -m2 -dist
reconstructs the joint (intrinsic+distance) posterior.
Whole chain in ~45 s on a laptop core. Ships a minimal zero_spin_phenomD.ini
whose [rift-pseudo-pipe] section deliberately omits approx /
ile-sampler-method so the CLI overrides win (the ini section parser
otherwise overrides the command line).
Drive-by fix needed for the demo:
- util_ConstructEOSPosterior.py had CRLF line endings, breaking its
/usr/bin/env shebang ("python\r" not found). Converted to LF.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
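A sketch of header-checked concatenation, in the spirit of the consolidation step described in this commit. It assumes each per-event file is a text table whose first non-blank line is the header; the function name, column names and file contents are made up, and util_ConsolidateDistanceGrids.py may work differently.

```python
import os
import tempfile


def consolidate_tables(paths, out_path):
    """Concatenate text tables that share one header line; return the row count."""
    header = None
    n_rows = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                lines = [line.rstrip("\n") for line in handle if line.strip()]
            if not lines:
                continue                      # skip empty inputs
            if header is None:
                header = lines[0]
                out.write(header + "\n")      # write the header exactly once
            elif lines[0] != header:
                raise ValueError("header mismatch in %s" % path)
            for row in lines[1:]:
                out.write(row + "\n")
                n_rows += 1
    return n_rows


# Small demonstration with two hypothetical per-event files.
workdir = tempfile.mkdtemp()
header = "# m1 m2 dist lnL"
inputs = []
for i, rows in enumerate([["30 25 100 5.0", "30 25 200 6.1"], ["31 24 100 4.2"]]):
    path = os.path.join(workdir, "event_%d.dgrid" % i)
    with open(path, "w") as handle:
        handle.write(header + "\n" + "\n".join(rows) + "\n")
    inputs.append(path)

out_path = os.path.join(workdir, "all_dgrid.dat")
n_rows = consolidate_tables(inputs, out_path)
```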
… in PLAN_B_DESIGN
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…--pipeline-builder hot-swap
The --use-subdags path (create_event_parameter_pipeline_AlternateIteration)
was broken for normal runs by a chain of issues, each masking the next:
- cip_args_list parsing crashed on the 'Z'/'G' prefixes emitted by
util_RIFT_pseudo_pipe.py (ValueError: invalid literal for int() ... 'Z').
Ported BasicIteration's tolerant prefix parsing.
- argparse rejected 8 options the helper passes (extrinsic samples-per-ile,
time-resampling, batched-convert, ile-request-disk, cip-explode-jobs-dag/-last,
n-iterations-subdag-max). Added them with BasicIteration's signatures; wired
the ones with a clean home, documented the rest as accepted-but-not-acted-on.
- completed the half-built extrinsic batched/time-resampling convert path
(batchConvertExtr_job was referenced but never defined) by porting
BasicIteration's 3-branch convert setup + node construction.
- fixed undefined unify_node_list used to attach SCRIPT POST composite checks.
AlternateIteration now builds a complete DAG end-to-end from a standard
pseudo_pipe invocation.
Apply the same int('Z') tolerant-parsing fix to BasicMultiApproxIteration,
which had the identical crash.
Thread AlternateIteration into util_RIFT_pseudo_pipe.py as a first-class
drop-in via a new --pipeline-builder {BasicIteration,AlternateIteration}
selector that overrides the implicit --use-subdags routing, enabling
side-by-side A/B testing of the two builders from an otherwise identical
command line. Warns if an explicit choice contradicts an AMR/subdag
requirement.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The mcsamplerEnsemble GPU port (and MonteCarloEnsemble / gaussian_mixture_model)
had never been exercised with cupy installed and crashed in GPU mode. Validated
end-to-end with .travis/test-integrate.sh on a Kepler/sm_30 card using a
cupy 10.6 + cudatoolkit 10.2 environment (last CUDA supporting sm_30).
mcsamplerEnsemble:
* evaluate()/calc_pdf(): bridge the host integrand/prior -- convert samples to
CPU, call the user function, push the result back to the active backend.
* replace cupy-incompatible rot90([list]) with an order-preserving reshape.
* build dim-group / bounds dict keys with host ints (range/np.arange); the
self.xpy.arange variant produced unhashable 0-d cupy arrays on GPU.
* return scalars and store _rvs on the host so downstream numpy code works.
gaussian_mixture_model / MonteCarloEnsemble:
* portable _xpy_logsumexp (cupyx.scipy.special.logsumexp is absent in the cupy
CUDA 10.2 build needed for sm_30).
* _near_psd: use Hermitian eigh/eigvalsh on GPU (cupy.linalg has no
eig/eigvals); the matrices are symmetric. numpy path unchanged.
* gpu_logpdf: cupy.linalg has no LinAlgError and cholesky returns NaN rather
than raising; catch numpy's error type and treat a NaN factor as failure.
All integrators: import cupyx.scipy.special explicitly (not auto-loaded by
import cupyx in older cupy).
mcsamplerGPU / mcsamplerAdaptiveVolume (so the full test passes in GPU mode):
* use instance converters / self.xpy instead of module-level GPU converters,
which were contaminating these otherwise-CPU samplers with cupy arrays;
* bridge their CPU prior/integrand; ones_like to follow the data backend.
No behavioral change without cupy: all GPU branches are guarded by cupy_ok.
Note: the AC sampler's --as-test check is statistically flaky (no seed) on both
CPU and GPU and can randomly fail; this is pre-existing and unrelated.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
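One way to write a backend-portable logsumexp is to pass the array module in explicitly, so the same body runs under numpy or cupy without relying on cupyx.scipy.special. The sketch below is an assumption about how such a helper could look, not the _xpy_logsumexp in the PR, and it is exercised here with numpy only.

```python
import numpy as np


def xpy_logsumexp(a, axis=None, xp=np):
    """log(sum(exp(a))) along `axis`, using only calls numpy and cupy share.

    `xp` is the array module (numpy by default; cupy could be passed instead).
    """
    a = xp.asarray(a)
    a_max = xp.max(a, axis=axis, keepdims=True)
    # Replace a non-finite maximum by 0 so that a - a_max never evaluates inf - inf.
    a_max = xp.where(xp.isfinite(a_max), a_max, 0.0)
    total = xp.sum(xp.exp(a - a_max), axis=axis, keepdims=True)
    out = xp.log(total) + a_max
    return xp.squeeze(out, axis=axis)


big = np.array([1000.0, 1000.0])            # naive exp() would overflow here
table = np.log(np.array([[1.0, 3.0], [2.0, 2.0]]))
lse_big = float(xpy_logsumexp(big))
lse_rows = xpy_logsumexp(table, axis=1)
```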
… dep canary
Three related pieces for container CI and flexible multi-architecture
deployment:
Feature C (core): container family manifest. New
RIFT/misc/container_manifest.py parses a YAML manifest (advertising a family of
images + GPU capability ranges) and builds HTCondor expressions. Wired into
write_ILE_sub_simple and write_CIP_sub (dag_utils_generic.py, import-guarded):
when SINGULARITY_RIFT_IMAGE points at a .yaml/.yml manifest,
MY.SingularityImage becomes an expression-valued ifThenElse over the matched
machine's GPU capability (default GPUs_Capability), selecting the right image
per machine. Only the matched image is fetched (CVMFS images referenced in
place / lazy-fetched; osdf images selectively transferred via a comma-free
$$() ternary token), never the whole family. A require_gpus capability floor
is &&-composed with any user RIFT_REQUIRE_GPUS. Plain .sif / osdf:// values
keep byte-identical legacy behavior. Vanilla universe throughout.
Feature B: multi-target build. New containers/ dir -- rift_container.def.in
template + build_family.sh (build matrix; first entry keeps the current
production base for broad compatibility), shared requirements-container.txt
(single source of truth), example rift_container_family.yaml, and README.
Feature A: CI dependency-resolution canary. New non-blocking
container-dep-canary and container-swig-canary jobs in ci.yml, plus a weekly
schedule, to catch upstream breakage (e.g. swig>=4.4.0, issue oshaughn#136)
before a container rebuild.
Tests: MonteCarloMarginalizeCode/Code/test/test_container_manifest.py
(13 tests: parser, expression builders, integration via write_ILE_sub_simple
condor_cmds, all-cvmfs, backward-compat).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
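As a rough illustration of turning an image family into a nested ClassAd ifThenElse expression, consider the sketch below. The manifest schema (a list of entries with image and min_capability keys), the image paths and the function name are all hypothetical; the real container_manifest.py schema and the exact attribute scoping it emits are not shown in this PR text.

```python
def build_image_expression(images, attr="GPUs_Capability"):
    """Nest ifThenElse() so the highest satisfied capability floor wins.

    `images` is a list of dicts with hypothetical keys "image" and
    "min_capability"; the entry with the lowest floor is the final fallback.
    """
    ordered = sorted(images, key=lambda entry: entry["min_capability"])
    expr = '"%s"' % ordered[0]["image"]
    for entry in ordered[1:]:
        expr = 'ifThenElse(%s >= %s, "%s", %s)' % (
            attr, entry["min_capability"], entry["image"], expr)
    return expr


family = [
    {"image": "/cvmfs/example/rift_sm70.sif", "min_capability": 7.0},
    {"image": "/cvmfs/example/rift_sm30.sif", "min_capability": 3.0},
]
image_expr = build_image_expression(family)
```

Building from the lowest floor outward keeps the most demanding test outermost, so a machine is offered the newest image it can run.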
Validated on a real HTCondor pool + GPU (cap-3.0 machine): GPUs_Capability /
Capability attribute names, require_gpus floor matching+exclusion, $$()
match-time image selection, and tolerance of the empty-result ("") case for a
mixed CVMFS/osdf manifest. Mixed manifests are safe; uniform retrieval is not
required. Only the OSG/GWMS pilot evaluation of the expression-valued
MY.SingularityImage remains to be smoke-tested on a real glidein.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…pipe (CI data)
Adds PP_PILOT toggle + `make pp-run-pilot` (and pp-run-pilot-build): pp-run with
--calmarg-pilot, so the full top-level pilot DAG (harvest->dump->fit->consolidate->seed
wide_{N+1}) runs on the CI fake data. build-validate asserts CALPILOT.sub runs
util_CalPilotStage.py, the CALPILOT job is in the DAG, and the wide ILE args carry the
--calibration-proposal-breadcrumb seed. Honours OSG/CIT like pp-run. Verified the build.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(task oshaughn#23)
The CALPILOT job runs ILE internally, so on OSG it needs the same container +
input set as a wide ILE job. write_calpilot_sub gains
use_osg/use_singularity/frames_dir/transfer_files (mirrors
write_ILE_sub_simple):
- runs in the singularity image (exe at SINGULARITY_BASE_EXE_DIR;
transfer_executable False; MY.SingularityImage/BindCVMFS/flock_local;
HAS_SINGULARITY requirement);
- a calpilot_pre.sh prescript rebuilds local.cache (relative paths) from the
transferred frames, then execs the stage;
- transfer_input_files = transfer_files (PSD + cal envelopes, sans the wide
grid) + frames_dir + composite + args_ile.txt; transfer_output_files = the
consolidated breadcrumb; stage args reference BASENAMES (no shared FS),
workdir '.'.
- refinement (--prev-breadcrumb) is skipped on OSG (the prev breadcrumb is
produced at runtime, can't be reliably listed for transfer at iteration 0) ->
each OSG pilot is an independent cold start, which the prior-shrinkage fit
makes safe.
create_event_parameter_pipeline_BasicIteration passes the OSG params +
transfer_file_names to the calpilot job.
ILE robustness: the wide-ILE breadcrumb seed load is now wrapped in try/except
-> a missing/partial/invalid breadcrumb (esp. under OSG file transfer) falls
back to PRIOR cal draws with a warning instead of killing the job.
DONE: the CALPILOT jobs RUN on OSG and produce cal_consolidated_N.npz
(transferred back).
REMAINING (task oshaughn#23): consuming the seed on OSG -- transferring
cal_consolidated_{N-1}.npz to the wide_{N+1} ILE jobs (pseudo_pipe basename ref
+ the iteration-start-absent edge), so the wide jobs use the learned proposal
rather than always falling back to prior.
UNTESTED off-CIT: validate the container + transfer on a real OSG run.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
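The try/except fallback for the breadcrumb load can be sketched as below. The function name, the file names and the single "mean" array are hypothetical; the only point is that a missing or unreadable file yields None with a warning, so the caller can fall back to prior draws.

```python
import os
import tempfile
import warnings

import numpy as np


def load_breadcrumb_or_none(path):
    """Return the arrays stored in an .npz breadcrumb, or None if it is unusable."""
    try:
        with np.load(path, allow_pickle=False) as data:
            return {key: data[key] for key in data.files}
    except Exception as exc:  # missing, truncated or invalid file
        warnings.warn("breadcrumb %r unusable (%s); using prior draws" % (path, exc))
        return None


workdir = tempfile.mkdtemp()
partial = os.path.join(workdir, "partial.npz")
open(partial, "w").close()                       # zero-byte file stands in for a bad transfer
good = os.path.join(workdir, "good.npz")
np.savez(good, mean=np.zeros(3))

seed_missing = load_breadcrumb_or_none(os.path.join(workdir, "absent.npz"))
seed_partial = load_breadcrumb_or_none(partial)
seed_good = load_breadcrumb_or_none(good)
```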
…ist (task oshaughn#23 complete)
Wide-ILE seed consumption on OSG (util_RIFT_pseudo_pipe.py): when
--use-osg-file-transfer, reference the proposal breadcrumb by BASENAME
(cal_consolidated_$(macroiterationprev).npz), add it to the ILE transfer list,
and create a placeholder cal_consolidated_-1.npz so condor's transfer for the
first iteration (prev=-1, never produced) succeeds -- ILE's breadcrumb load is
try/except and falls back to the prior for the placeholder. So wide_{N+1} now
actually consumes the learned proposal on OSG, not just falls back to prior.
Clean CALPILOT transfer list: write_ILE_sub_simple mutates transfer_file_names
in place (appends frames_dir, ile_pre.sh, the grid), and CALPILOT is built
after the wide ILE, so it had inherited that pollution (frames_dir x3, the wide
grid, the ILE prescript). Snapshot a clean PSD+cal-envelope transfer list
BEFORE those mutations and pass it to the pilot. Verified: the CALPILOT
transfer_input_files now lists each file exactly once (PSD, cal envelopes,
composite, args_ile.txt, frames_dir, calpilot_pre.sh, prev breadcrumb).
This completes the OSG pilot file transfer (CALPILOT runs in-container +
transfers I/O; wide_{N+1} gets the seed). UNTESTED off-CIT -- validate on a
real OSG run.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
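The transfer-list pollution is an ordinary Python aliasing effect: a function that appends to the list it was handed changes the caller's list too. A minimal sketch, with a stand-in writer function and invented file names:

```python
def write_ile_sub_sketch(transfer_file_names):
    """Stand-in for a writer that appends its own inputs to the caller's list."""
    transfer_file_names.append("frames_dir")
    transfer_file_names.append("ile_pre.sh")
    transfer_file_names.append("wide_grid.xml.gz")
    return "ILE.sub"


transfer_file_names = ["H1-psd.xml.gz", "L1-psd.xml.gz", "cal_env_H1.txt"]

# Snapshot BEFORE the writer mutates the shared list, and hand the copy to the pilot.
calpilot_transfer = list(transfer_file_names)

write_ile_sub_sketch(transfer_file_names)
```

After the call, transfer_file_names carries the three ILE-only entries while calpilot_transfer still holds only the PSD and calibration-envelope files.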
… rundir_pp_run
pp-run-build starts with `rm -rf $(PP_RUN_REAL)`, and pp-run-pilot reused
rundir_pp_run -- so launching the pilot demo DESTROYED an in-progress vanilla
pp-run. pp-run-pilot[-build] now overrides PP_RUN_REAL=rundir_pp_pilot so the
two run directories are independent and neither clobbers the other. clean
removes both.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n the OSG prescript too
On OSG the CALPILOT.sub executable is calpilot_pre.sh (the prescript that
rebuilds local.cache then runs the container's util_CalPilotStage.py), so the
stage name is in the prescript, not CALPILOT.sub. The build-validate grep now
checks BOTH CALPILOT.sub and calpilot_pre.sh (grep -qs), fixing a spurious
"CALPILOT.sub does not run util_CalPilotStage.py" on OSG.
Pipeline-writer/demo-level only -- no container rebuild.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…+ PoC
The decade-old "save the extrinsic distribution to inform the next iteration" goal,
generalized from the cal pilot's breadcrumb. GMM-first (mcsamplerEnsemble is already
seedable via gmm_dict).
- breadcrumbs.py (schema v2): the extrinsic slot now carries a per-param-group Gaussian
mixture -- means/covariances/weights/bounds + the param NAMES (so dim-group indices
reconstruct against the next run's params_ordered). cal + extrinsic coexist in one
breadcrumb. save/load round-trip test (cal + extrinsic) PASSES.
- RIFT/calmarg/extrinsic_handoff.py:
fit_extrinsic_proposal(samples, log_weights, groups, bounds, n_comp) -- per group, fits
with RIFT's OWN gaussian_mixture_model.gmm (the exact fitter the sampler uses in
update_sampling_prior), so stored means/covs are in the model's internal frame and
restore byte-identical -- no coordinate guesswork, no sklearn.
gmm_dict_from_breadcrumb(extrinsic, params_ordered) -- reconstructs gmm objects keyed by
dim-group indices (looked up by name), ready to seed mcsamplerEnsemble's gmm_dict.
Standard groups (ra,dec),(distance,incl),(phi_orb,psi). Handles the GMM running on cupy.
- PoC (__main__): synthetic BIMODAL sky posterior -> fit -> breadcrumb -> load -> seed ->
the seeded sky GMM recovers BOTH modes. PASS.
Worktree branch rift_O4d_junior_extrinsic_handoff (off the calmarg branch); does not touch
the running pipeline checkout.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
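Storing parameter names rather than indices means the dim-group keys can be rebuilt against whatever ordering the next run uses. A minimal sketch of that lookup, with a hypothetical function name and an illustrative params_ordered (the real ordering depends on the run):

```python
def dim_groups_from_names(group_names, params_ordered):
    """Map each named parameter group to a tuple of indices into params_ordered.

    Groups naming a parameter the run does not sample are skipped.
    """
    index = {name: i for i, name in enumerate(params_ordered)}
    groups = {}
    for names in group_names:
        if all(name in index for name in names):
            groups[tuple(index[name] for name in names)] = tuple(names)
    return groups


standard_groups = [("ra", "dec"), ("distance", "incl"), ("phi_orb", "psi")]
params_ordered = ["psi", "phi_orb", "incl", "distance", "ra", "dec"]  # illustrative order
dim_groups = dim_groups_from_names(standard_groups, params_ordered)
```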
…cept)
Complete the GMM extrinsic-handoff loop:
- ILE (integrate_likelihood_extrinsic_batchmode, EXECUTE-POINT -- needs
container rebuild): --extrinsic-proposal-output harvests the run's extrinsic
posterior samples + importance weights from sampler._rvs after integrate (same
weight recipe as the distance-grid export, incl. the GMM sampler's
raw-integrand storage), fits per-group GMMs via RIFT.calmarg.extrinsic_handoff,
and writes a breadcrumb. --extrinsic-proposal-breadcrumb seed side (pre-fill
gmm_dict) was added prior. Both wrapped in try/except so the handoff can never
break a production integration.
- DESIGN_extrinsic_handoff.md: the decade-old "carry the extrinsic posterior
between iterations" goal, GMM-first rationale (mcsamplerEnsemble.gmm_dict is
trivially seedable), module/ILE pieces, PoC result, pilot-DAG plug-in plan, and
the AV partial-reset limitation (task oshaughn#30: AV resets every
integrate(), can only contract).
PoC (python -m RIFT.calmarg.extrinsic_handoff) and breadcrumb round-trip
(python -m RIFT.calmarg.breadcrumbs) both PASS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make the extrinsic handoff usable end-to-end (standalone; does NOT require the cal
pilot), gated by --extrinsic-handoff and requiring the GMM sampler:
- util_RIFT_pseudo_pipe.py --extrinsic-handoff: thread per-event
--extrinsic-proposal-output extr_proposal_$(macroiteration)_$(macroevent).npz and the
seed --extrinsic-proposal-breadcrumb .../extr_consolidated_$(macroiterationprev).npz
into args_ile.txt (OSG: basename + transfer-list + iteration-0 placeholder; shared FS:
absolute path), mirroring the cal breadcrumb. Warns if --ile-sampler-method != GMM.
Passes --extrinsic-handoff[-select] through to the pipeline builder.
- util_ExtrinsicConsolidate.py (NEW): pick the single most representative per-event
proposal (default by lnL = nearest the peak; neff/n_samples also available) ->
extr_consolidated_<it>.npz. Skips unreadable/placeholder inputs; ALWAYS writes output
(empty if nothing valid) so the next iteration's seed/transfer never fails.
- dag_utils_generic.write_extrconsolidate_sub (NEW): the consolidation job, LOCAL universe
on the submit node (pure-python file selection, no GPU/ILE/container/frames). On OSG the
per-event ILE outputs are transferred back to <wd>/iteration_<it>_ile, so it reads them
from the shared FS -- no per-event input transfer (which condor cannot glob).
- create_event_parameter_pipeline_BasicIteration: one consolidation node per iteration,
gated behind that iteration's unify (ILE barrier), and the next iteration's wide ILE jobs
depend on it: unify_{it} -> EXTRCONSOLIDATE_{it} -> wide ILE_{it+1}.
- ILE save side: record true lnL + neff in the proposal breadcrumb meta so consolidation
can pick the most representative point.
- demo/rift/calmarg: `make extr-build` builds + offline-validates the whole thread
(args_ile.txt flags, EXTRCONSOLIDATE.sub, unify->consolidate->next-ILE DAG edges);
separate rundir_pp_extr so it never touches other run dirs.
Verified: `make extr-build` passes; util_ExtrinsicConsolidate standalone tests pass
(picks highest-lnL, skips placeholders, writes empty on no-input).
NOTE: the ILE binary change (--extrinsic-proposal-output/-breadcrumb, save+seed) is
EXECUTE-POINT -- rebuild the container before an OSG/CIT run. The convergence subdag
(--first-iteration-jumpstart) does not yet carry --extrinsic-handoff (same as --calmarg-pilot).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
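The selection rule in the consolidation step (pick one per-event proposal, skip unusable inputs, always produce an output) can be sketched in a few lines. The in-memory dict layout with a "meta" entry is an assumption made for the example; the real tool reads .npz files.

```python
def pick_representative(proposals, select="lnL"):
    """Return the proposal with the largest meta[select].

    Entries that are None or lack the requested metadata are skipped. An empty
    dict is returned when nothing is valid, so a consolidated output can
    always be written.
    """
    valid = []
    for proposal in proposals:
        meta = (proposal or {}).get("meta") or {}
        if select in meta:
            valid.append(proposal)
    if not valid:
        return {}
    return max(valid, key=lambda proposal: proposal["meta"][select])


proposals = [
    {"event": 0, "meta": {"lnL": 41.2, "neff": 12.0}},
    None,                                   # unreadable input
    {"event": 2, "meta": {}},               # placeholder with no metadata
    {"event": 3, "meta": {"lnL": 58.7, "neff": 5.0}},
]
best_by_lnL = pick_representative(proposals)
best_by_neff = pick_representative(proposals, select="neff")
```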
…rget
Found by running the GMM extrinsic handoff on a real GPU (cardassia, NVS 510);
the full loop now works end-to-end (iteration-0 writes proposal -> consolidate
-> iteration-1 prints "Extrinsic GMM SEEDED ... [(4,5),(3,2),(0,1)]" for all
three groups -> integrates -> writes the next proposal):
- reconstruct_gmm: move self.bounds onto the GPU (identity_convert_togpu). The
sampler's score()/_normalize write into a cupy array, so a leftover numpy
bounds raised "non-scalar numpy.ndarray cannot be used for fill".
- gmm_dict_from_breadcrumb(existing_keys=...): match each breadcrumb group to
the sampler's actual gmm_dict key by dim-SET and permute the stored
means/covariances/bounds columns into that key's order. Fixes the phase/pol
group being silently dropped because the sampler pairs (psi,phi_orb)=(0,1)
while the breadcrumb stored (phi_orb,psi)=(1,0).
- reconstruct_gmm(cov_inflate=2.0): broaden the seed (a warm start should be
conservative; the ensemble sampler can contract but starves if seeded too
tight). Mitigates -- does not rescue -- a degenerate source: on a bad batch the
sampler _reset()s gmm_dict[k]=None, i.e. discards the seed and continues cold
(correct safety net). So seed quality tracks the SOURCE iteration's
convergence; a useful (accelerating) seed needs n_eff in the hundreds, i.e. a
real --n-max / larger GPU, not the tiny smoke (n_eff~1 -> seed safely
discarded).
- demo/rift/calmarg: `make extr-run[-build]` -- tiny GMM extrinsic-handoff
pipeline on the CI data (300 initial / 200 per-gen intrinsic, 50 evals/ILE job,
n-chunk 4000, n-max bounded to 40000 vs the 4,000,000 production default, >=2
iterations). Derives a run-specific ini (sed) because [rift-pseudo-pipe] ini
values override the CLI. Separate rundir_pp_extr_run.
DESIGN_extrinsic_handoff.md: documents the GPU validation, the two bugs, and
the seed-quality-vs-source-convergence finding. ILE binary change is
EXECUTE-POINT (container rebuild for OSG/CIT).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
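The dim-SET matching and column permutation described for gmm_dict_from_breadcrumb can be illustrated with a small NumPy sketch. This is not the RIFT implementation: the helper name align_group, its arguments, and the array layouts (means of shape (K, d), covariances of shape (K, d, d)) are assumptions made for illustration, and the bounds array mentioned in the commit would be permuted the same way.

```python
import numpy as np

def align_group(stored_dims, means, covs, existing_keys):
    """Hypothetical helper: reorder a stored GMM group to match a sampler key.

    stored_dims   -- dimension indices in the order the breadcrumb stored them
    means         -- array (K, d), one row per mixture component
    covs          -- array (K, d, d)
    existing_keys -- the sampler's own group keys (tuples of dimension indices)
    Returns (key, means, covs) in the key's order, or None if no key matches.
    """
    stored_dims = tuple(stored_dims)
    for key in existing_keys:
        if set(key) == set(stored_dims):
            # perm[i] = column of the stored arrays holding dimension key[i]
            perm = [stored_dims.index(d) for d in key]
            means_p = means[:, perm]
            covs_p = covs[:, perm][:, :, perm]  # permute rows, then columns
            return tuple(key), means_p, covs_p
    return None

# Breadcrumb stored (phi_orb, psi) = (1, 0); the sampler pairs (0, 1).
means = np.array([[10.0, 20.0]])
covs = np.array([[[1.0, 0.5], [0.5, 4.0]]])
key, m, c = align_group((1, 0), means, covs, [(4, 5), (3, 2), (0, 1)])
```

Matching on the set of dimensions rather than the ordered tuple is what prevents a group from being dropped when the two sides list the same dimensions in a different order.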
Background
----------
util_ConstructIntrinsicPosterior_GenericCoordinates.py has long used
three CLI flags for declaring how a parameter is treated:
--parameter X both fit dim AND MC sampling dim
--parameter-implied X fit dim only (the converter produces X from the
data file's columns; the MC integrator never
sees it)
--parameter-nofit X MC sampling dim only (the integrator integrates
over it; the fit never sees it)
util_ConstructEOSPosterior.py declared the same three flags but never
honoured them: the integrator at line 487 hardcoded
`low_level_coord_names=dat_orig_names` in its convert_coords closure,
which only worked when the sampling basis equalled the data-file basis;
sampler.add_parameter iterated over coord_names (the fit basis); the
arity dispatch for likelihood_function keyed on len(coord_names); and
sampler.integrate was passed *coord_names rather than
*low_level_coord_names. The net effect: any user who tried to fit in a
transformed basis (e.g. via the new --supplementary-coordinate-code
plugin) silently got a wrong likelihood evaluation -- the rotation was
applied an extra time inside convert_coords every Monte Carlo step.
What this commit changes
------------------------
bin/util_ConstructEOSPosterior.py
* Parameter-resolution block rewritten to mirror IntrinsicPosterior's
semantics, plus a clean fallback to dat_orig_names when none of the
three flags are supplied (legacy bare-invocation unchanged). Seven
CLI permutations now map to documented (coord_names,
low_level_coord_names) pairs.
* The convert_coords closure used by the integrator captures
low_level_coord_names as its input basis (was dat_orig_names). The
initial dat->X conversion still uses dat_orig_names, since that's
the basis of the file columns.
* Sampler add_parameter loop now iterates over low_level_coord_names
(the MC basis), and sampler.integrate is passed *low_level_coord_names.
* The arity-dispatched likelihood_function definitions key on
len(low_level_coord_names) and route every input -- including the
scalar branches -- through convert_coords so a non-trivial converter
is never silently bypassed.
* Output-writer iterates samples by low_level_coord_names (the keys
sampler._rvs actually carries) and applies the "constant fill"
check in the sampling basis, not the fit basis. Implied (fit-only)
coords correctly skip the output file.
* Added a guard: if low_level_coord_names != coord_names but no
coordinate plugin is supplied, raise a clear error instead of
silently feeding samples through an identity convert_coords into a
fit built in a different basis.
* Help text for --parameter / --parameter-implied / --parameter-nofit
rewritten to describe what each flag actually does now.
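The resolution rule described above can be sketched as follows. This is a hypothetical helper, not the code in bin/util_ConstructEOSPosterior.py: the name resolve_bases, the have_plugin argument, and the concatenation order within each basis are assumptions. Only the flag semantics, the dat_orig_names fallback, and the mismatch guard are taken from the commit message.

```python
def resolve_bases(parameter, parameter_implied, parameter_nofit,
                  dat_orig_names, have_plugin=False):
    """Hypothetical sketch: map the three CLI lists to
    (coord_names, low_level_coord_names)."""
    parameter = list(parameter or [])
    implied = list(parameter_implied or [])
    nofit = list(parameter_nofit or [])

    if not (parameter or implied or nofit):
        # Legacy bare invocation: both bases are the data-file columns.
        return list(dat_orig_names), list(dat_orig_names)

    coord_names = parameter + implied           # fit basis
    low_level_coord_names = parameter + nofit   # MC sampling basis

    if low_level_coord_names != coord_names and not have_plugin:
        # Without a coordinate plugin, convert_coords would be the identity
        # and samples would reach a fit built in a different basis.
        raise ValueError(
            "fit basis %s differs from sampling basis %s but no coordinate "
            "plugin was supplied" % (coord_names, low_level_coord_names))
    return coord_names, low_level_coord_names
```

For example, resolve_bases([], ["u", "v", "w"], ["x", "y", "z"], ["x", "y", "z"], have_plugin=True) gives a fit basis of (u, v, w) and a sampling basis of (x, y, z), which is the arrangement the linear_uvw demo exercises.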
RIFT/hyperpipe/coords.py
* HyperCoordSpec.from_strings accepts integration ranges for names in
coords-nofit (the MC sampling basis is coords-fit + coords-nofit);
unknown range names are still rejected.
* HyperCoordSpec.validate accepts empty coords-fit so long as
coords-implied covers the fit basis and coords-nofit covers the
sampling basis; emits distinct errors for empty-fit vs empty-sample.
* to_parameter_args emits --integration-parameter-range for the
sampling basis (parameters + nofit), not just parameters.
* to_puff_args and to_test_args emit --parameter for the sampling
basis -- the puff lane and convergence-test driver operate on the
data-file columns, which is the sampling basis after decoupling.
RIFT/hyperpipe/config.py
* validate_config accepts empty coords-fit when coords-implied
(fit-side) and coords-nofit (sample-side) are non-empty.
demo/hyperpipe/hyperpipe_conf_linear_uvw.yaml
* Rewritten to actually exercise the decoupled path: coords-implied
"u v w" (fit), coords-nofit "x y z" (sample), coords-sample ranges
in (x, y, z), coord-module pointing at the linear plugin with the
uvw_rotated chart. Iteration / puff / marg stay in (x, y, z); the
EOS posterior fits in (u, v, w) and writes its posterior in
(x, y, z).
Verified
--------
* Parameter-resolution unit test (in this commit's worktree) covers 7
CLI permutations -- legacy no-flags, legacy --parameter, IntrinsicPosterior
--parameter+implied and --parameter+nofit, the new --implied-only,
--implied+nofit, and full --parameter+implied+nofit -- all map to
the documented (coord_names, low_level_coord_names) pairs.
* HyperCoordSpec unit test covers the new decoupled emit (post sees
implied/nofit and ranges; puff/test see the sampling basis only),
a legacy-regression case (unchanged output), the two new validation
errors (empty fit, empty sample), the new "range for nofit name"
permission, and the still-rejected "unknown range name" case.
* AST + yaml parses on every edited file.
* validate_config passes on hyperpipe_conf_linear_uvw.yaml plus the
demo's baseline and tracer yamls.
… GPU Attempting the seed-acceleration demo on the CI point (SNR~17.5, lnLmax~90-115) showed the ensemble (GMM) sampler does not converge there: n_eff pinned at ~1 through ~200k samples, with OR without calmarg (vanilla GMM: 1.00007 at 196k / 50 iterations). GMM collapses onto the dominant sample at a sharp high-SNR peak; AV (the production sampler) handles these but is not seedable. So the GMM->GMM handoff is correct+safe but cannot bootstrap a useful seed on real high-SNR data -- its payoff is gated on seedable/partial-reset AV (task oshaughn#30/oshaughn#25) or a cross-sampler AV->GMM seed (fit_extrinsic_proposal already accepts any sampler's samples). Recorded in DESIGN_extrinsic_handoff.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sal-adapt) + cross-sampler findings
- --extrinsic-proposal-adapt (default OFF = freeze): the seeded GMM groups are
  no longer re-fit each iteration. Re-fitting a seed on a bad first batch dies
  in the GMM init (random.choice "probabilities are not non-negative") and
  triggers _reset, discarding the seed. _train already skips groups with
  gmm_adapt=False, so freezing preserves the seed. Freezing is also the right
  semantics for a handed-off / cross-sampler proposal. With freeze the seeded
  run completes with 0 resets and n_eff rises from cold ~1 to ~5-10.
- DESIGN_extrinsic_handoff.md: document the cross-sampler AV->GMM result. AV
  converges as a source (n_eff~7 at 400k, lnLmax~143); the frozen seed lands
  cleanly and lifts n_eff, but the seeded GMM INTEGRAL is wrong
  (sqrt(2 lnLmax)=nan, Z~1e-4 vs cold ~1e43) -- the proposal is
  importance-sampling a displaced region. Two suspects to audit (no more blind
  GPU): AV-vs-GMM _rvs coordinate convention (angle vs cosine for incl/dec),
  and cov_inflate pushing distance out of [1,1000] into NaN likelihood.
  Same-sampler GMM->GMM round-trips cleanly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d weights, ESS n_comp, distmarg)
Debugging the wrong-integral the GPU run showed (the seeded GMM integrated as
if lnL~44 vs the true ~140), found+fixed four real issues; the cross-sampler
seed is now numerically correct (finite lnLmax, valid Z) end-to-end:
1. SAVE side used the sampler's stored log_weights, but mcsamplerGPU/AV stores
   log_weights = tempering_exp*lnL + ln(prior) - ln(s_prior)
   (adapt-weight-exponent baked in). Fitting the GMM to those flattened
   weights displaces the proposal. Now build the TRUE untempered weight from
   log_integrand + log_joint_prior - log_joint_s_prior and prefer the raw
   components over 'log_weights'. (GMM's own _rvs is untempered -> GMM->GMM
   unaffected.) This took the seeded n_eff from ~5 to ~26.
2. cov_inflate default 2.0 -> 1.0: a frozen seed should match the source, not
   be widened (inflation pushes samples past hard bounds -> NaN likelihood).
3. fit_extrinsic_proposal: cap mixture components by the weight ESS
   (k <= ESS/(d+2)) and DROP any non-finite component (renormalize; skip group
   if none survive). A starved source collapses a component to a singular/NaN
   covariance that poisons the whole seed.
4. The persistent nan lnLmax was distance sampled against [1,1000]: a seeded
   distance Gaussian spills past the bound -> NaN. The calmarg path is meant
   to run with --distance-marginalization (the fused kernel IS a distmarg
   kernel); with distmarg on, the seeded integral is finite and valid. (Gap:
   pseudo_pipe/extr-run don't add --distance-marginalization yet -- noted in
   doc.)
Result (distmarg on, CI point SNR~17.5): seeded GMM has 0 resets, finite
lnLmax, valid Z, but n_eff ~1 == cold n_eff ~1. The handoff is correct+safe
but does NOT accelerate here because GMM does not converge on this peak (cold
or seeded) and the AV source (n_eff~5) is too under-converged to inform a
strong seed. Hard evidence the payoff needs seedable AV (task oshaughn#30) or
a converged source. Full analysis in DESIGN_extrinsic_handoff.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
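Items 1 and 3 above can be sketched numerically. The function names here are hypothetical, and the ESS estimator is an assumption: the commit says only "weight ESS", so this sketch uses the Kish form (sum w)^2 / sum w^2, which may differ from what fit_extrinsic_proposal computes. The untempered weight combination and the k <= ESS/(d+2) cap are taken from the message.

```python
import numpy as np

def untempered_log_weights(log_integrand, log_joint_prior, log_joint_s_prior):
    """True importance weight: likelihood * prior / sampling prior,
    with no tempering exponent applied to the likelihood."""
    return (np.asarray(log_integrand)
            + np.asarray(log_joint_prior)
            - np.asarray(log_joint_s_prior))

def kish_ess(log_w):
    """Kish effective sample size (an assumed estimator), computed stably."""
    w = np.exp(log_w - np.max(log_w))
    return w.sum() ** 2 / np.sum(w ** 2)

def cap_components(n_comp_requested, log_w, d):
    """Limit mixture components to k <= ESS/(d+2), keeping at least one."""
    k_max = max(1, int(kish_ess(log_w) // (d + 2)))
    return min(n_comp_requested, k_max)

# 100 equally weighted samples in d=2: ESS = 100, so at most 25 components.
flat = untempered_log_weights(np.zeros(100), np.zeros(100), np.zeros(100))
k_flat = cap_components(40, flat, d=2)

# One sample dominates: ESS ~ 1, so the fit falls back to a single component.
spiky = flat.copy()
spiky[0] = 1000.0
k_spiky = cap_components(40, spiky, d=2)
```

The cap reflects the point made in item 3: a starved source cannot support many components without one of them collapsing to a singular covariance.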
…nstance (XML compat) copy_lsctables_sim_inspiral iterated lsctables.SimInspiralTable.validcolumns.keys() (the full schema) and did bare getattr(row, simattr) for string columns (waveform/source/numrel_data/ taper) and numeric columns. On the current igwn_ligolw + lalsuite stack, ILE-written sim_inspiral tables contain only the columns actually set, so the schema view and the written columns drift apart -> reading a saved ILE output_*.xml.gz raised "AttributeError: 'SimInspiral' object has no attribute 'waveform'" (and would equally fail on any absent numeric column via the else branch). Fix: skip columns not present on the row instance (hasattr guard), after the process_id/simulation_id default-setting branch (which doesn't read the row). RIFT's own grids (written via lsctables.New with all columns) are unaffected; column-subset tables now round-trip. Verified both a 300-row RIFT grid and a waveform-less ILE-style table read with no AttributeError. Per /home/oshaughn/BREADCRUMB_rift_xml_compat.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
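The hasattr guard can be shown on plain Python objects. This sketch does not use lsctables; Row, copy_present_columns, and the column list are stand-ins invented for illustration, and the real function also handles process_id/simulation_id defaults before the guard.

```python
class Row:
    """Stand-in for a table row that carries only the columns actually set."""
    pass

def copy_present_columns(src_row, dst_row, schema_columns):
    """Copy each schema column that exists on src_row; skip the rest.

    Iterating the full schema and calling getattr() unguarded raises
    AttributeError as soon as the schema lists a column the row lacks.
    """
    copied = []
    for name in schema_columns:
        if not hasattr(src_row, name):
            continue  # column was never written for this row
        setattr(dst_row, name, getattr(src_row, name))
        copied.append(name)
    return copied

src = Row()
src.mass1 = 30.0
src.mass2 = 25.0          # note: no 'waveform' attribute, as in an ILE-style table
dst = Row()
copied = copy_present_columns(src, dst, ["mass1", "mass2", "waveform"])
```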
… for the calmarg pipeline
There was no pipeline-code gap: --internal-marginalize-distance already
composes cleanly with --calmarg-fused-kernel (verified -- args_ile gets
--distance-marginalization + a util_InitMargTable lookup table AND
--calibration-fused-kernel), and the fused kernel does NOT require distmarg
(it has both Q_fused_calmarg_cupy and Q_fused_calmarg_distmarg_cupy; the ILE
binary wires whichever applies). The only gap was that the demo targets didn't
expose distmarg.
- demo Makefile: PP_DMARG toggle (default 0, optional) ->
  --internal-marginalize-distance --internal-distance-max PP_DMAX, threaded
  into extr-build, extr-run-build, pp-run-build. (Distinct from the direct-ILE
  dag-build DMARG knob, which uses a pre-built lookup table.) extr-validate
  checks --distance-marginalization + lookup table when PP_DMARG=1. Verified
  `make extr-build PP_DMARG=1` passes.
- DESIGN_extrinsic_handoff.md: corrected -- distmarg is OPTIONAL with the
  fused kernel, not required; RECOMMENDED with --extrinsic-handoff (removes
  distance + its hard bound from the seeded GMM proposal, which was the source
  of the boundary-NaN).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…fairdraw cupy crash)
BREADCRUMB_export_cal_posterior.md: the final export with extrinsics did not carry the
recovered calibration posterior. Now --calibration-export-posterior (ILE) /
--calmarg-export-posterior (pseudo_pipe): at the fairdraw export, for each fair-draw sample
draw ONE cal realization in proportion to its posterior weight (per-realization likelihood
components from return_cal_components, times the importance weight cal_log_weights) and write
a SELF-CONTAINED sibling <output>_<event>_cal.dat with the FULL draw -- intrinsic + extrinsic
+ the drawn realization's spline nodes as labeled cal_<IFO>_amp_<k>/cal_<IFO>_phase_<k>
columns. (The fairdraw LIGOLW/.dat schema can't carry arbitrary columns, so per the user the
cal posterior rides a row-aligned sibling .dat with the whole draw, plottable as-is.)
- node retention: the production prior path now keeps the cal node vectors
(draw_prior_realizations_with_nodes) when the flag is set; the seed path already returns them.
- verified on GPU: writes 1 sample x 90 cols incl 60 cal cols (amp_0..9 + phase_0..9 over H1,L1,V1).
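The per-sample draw and the column labeling can be sketched as below. Both helpers are hypothetical: the commit states that one realization is drawn in proportion to its posterior weight (per-realization likelihood times importance weight) and that columns are labeled cal_<IFO>_amp_<k> / cal_<IFO>_phase_<k>, but the function names, argument layout, and column ordering here are assumptions.

```python
import numpy as np

def draw_cal_index(lnL_per_realization, cal_log_weights, rng):
    """Draw one realization index with probability proportional to
    exp(lnL_j + cal_log_weight_j), normalized in a numerically stable way."""
    log_p = np.asarray(lnL_per_realization) + np.asarray(cal_log_weights)
    p = np.exp(log_p - np.max(log_p))
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

def cal_column_names(dets, n_nodes):
    """Labeled spline-node columns; the ordering is an assumption."""
    names = []
    for det in dets:
        names += ["cal_%s_amp_%d" % (det, k) for k in range(n_nodes)]
        names += ["cal_%s_phase_%d" % (det, k) for k in range(n_nodes)]
    return names

rng = np.random.default_rng(0)
# The middle realization carries essentially all of the posterior weight.
idx = draw_cal_index([0.0, 500.0, 0.0], [0.0, 0.0, 0.0], rng)
cols = cal_column_names(["H1", "L1", "V1"], 10)
```

With three detectors and ten nodes each for amplitude and phase, the label list has 60 entries, matching the 60 cal columns reported in the GPU check.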
Also fix a PRE-EXISTING crash this surfaced: mcsamplerEnsemble (GMM sampler) fairdraw on GPU
did `self.xpy.min([n_extr, 1.5*eff_samp, 1.5*neff])` -- cupy.min has no Python-list overload
("'list' object has no attribute 'min'"), so ANY GMM-sampler fairdraw export on GPU crashed
(independent of calmarg). Use Python min() of floats.
ILE binary is EXECUTE-POINT (container rebuild to run on OSG/CIT).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
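The fairdraw fix amounts to taking the minimum with the Python builtin over host-side floats rather than passing a Python list to the array module. A minimal sketch follows; the function name is hypothetical and the variable names follow the expression quoted in the commit.

```python
def fairdraw_cap(n_extr, eff_samp, neff):
    """Smallest of n_extr, 1.5*eff_samp and 1.5*neff, as a Python float.

    float() accepts Python numbers and 0-d array scalars alike, so the
    builtin min() never needs a list overload from the array library.
    """
    return min(float(n_extr), 1.5 * float(eff_samp), 1.5 * float(neff))

cap = fairdraw_cap(1000, 10.0, 8.0)
```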
…+ thread cal-export into demo PILOT OSG bug: iteration 0 seeds from cal_consolidated_$(macroiterationprev).npz with macroiterationprev=-1 -> cal_consolidated_-1.npz, the 0-byte placeholder pseudo_pipe creates so condor's transfer_input_files of that path does not fail. Locally the file simply does not exist (the "missing -> fall back to PRIOR" check fires); on OSG it IS transferred in, so it EXISTS but is empty, and np.load raised "EOFError: No data left in file", crashing the first-iteration ILE. Fix: treat a missing OR EMPTY breadcrumb as "not present yet" (size guard before any load), for BOTH the calibration and extrinsic seed paths. EXECUTE-POINT -- rebuild the container. demo/rift/calmarg: PP_CALPOST toggle (default 1) threads --calmarg-export-posterior into pp-run-build and extr-run-build, so the recovered cal posterior is written in the runnable demos. (Does NOT touch a running rundir_pp_run -- pp-run-build starts with its own rm -rf.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
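The "missing OR EMPTY" check can be expressed as a size guard that runs before any load. This is a sketch under assumptions: the helper name breadcrumb_available is invented here, and the actual ILE code may structure the check differently.

```python
import os

def breadcrumb_available(path):
    """True only if the breadcrumb exists AND has content.

    A 0-byte placeholder (as transferred on OSG) is treated the same as a
    missing file, so the caller falls back to the prior instead of handing
    an empty file to np.load.
    """
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

A caller would then load the breadcrumb only when breadcrumb_available(path) is true, and otherwise take the existing fall-back-to-prior branch.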
…nders; works on old container)
Complement to the ILE empty-breadcrumb size-guard: make the iteration-0
placeholder a VALID breadcrumb that LOADS cleanly, so the pipeline-writer fix
ALONE (no container rebuild) keeps an older ILE binary from crashing on it.
- generate_realizations.prior_cal_breadcrumb_dict(env_dir, dets, fmin, fmax,
  n_spline_points): build the 'cal' breadcrumb for the broad PRIOR with
  proposal == prior. Seeding from it draws cal realizations from the prior
  with ZERO importance weights -- exactly equivalent to the cold prior draws.
  Layout matches seed_realizations_from_breadcrumb (per-det [amp,phase]
  blocks; dim = 2N*len(dets)).
- util_RIFT_pseudo_pipe.py: on OSG file-transfer, write
  cal_consolidated_-1.npz as that valid prior breadcrumb (was a 0-byte file)
  and extr_consolidated_-1.npz as a valid EMPTY breadcrumb (extrinsic=None ->
  cold). Falls back to a 0-byte file only if the build fails (then the ILE
  size-guard catches it). PIPELINE-WRITER change -- no container rebuild
  needed.
- util_CalMakePriorBreadcrumb.py (NEW): (re)generate the prior placeholder for
  an ALREADY-built run dir IN PLACE (overwrite the 0-byte
  cal_consolidated_-1.npz), so an in-flight pilot run can be patched without
  re-running pseudo_pipe or rebuilding the container.
Verified: the placeholder loads + seeds with max|cal_log_weights| ~ 1e-14
(== prior draws).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…lmarg flags The demo grew from the single-ILE correctness check into a ladder up to a runnable condor pipeline. Document all targets grouped by what they exercise (A: numerical correctness + single-ILE; B: direct-ILE DAG + tuning; C: offline pipeline build-validate incl. extrinsic handoff; D: runnable pipeline on CI data + pilots + extrinsic-handoff GPU run), the runnable toggles (OSG/PP_PILOT/PP_DMARG/PP_CALPOST/PP_NIT), the helper utils, and the advanced pipeline flags (--calmarg-export-posterior, --internal-marginalize-distance, --calmarg-pilot, --extrinsic-handoff). Add the recovered-cal-posterior section, the iteration-0 prior placeholder note (+ util_CalMakePriorBreadcrumb.py), and the execute-point vs pipeline-writer rule. Points to DESIGN_adaptive_driver.md / DESIGN_extrinsic_handoff.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds --supplementary-coordinate-{code,function,ini,chart} plus the two
input/output parameter list flags to plot_posterior_corner.py, mirroring
the surface already in util_ConstructEOSPosterior.py. When the plugin
flag is set, _materialize_plugin_columns runs once per loaded posterior
and once per loaded composite file *after* the existing RIFT
postprocessing -- it computes the plugin's output columns from existing
record-array fields and splices them in via add_field.
Critically, the hook is strictly ADDITIVE. Any output name already
present in samples.dtype.names is skipped, so the hardcoded
extract_combination_from_LI and the per-file postprocess loops
(mc / eta / chi_eff / LambdaTilde / chi1_perp / ...) always win. Legacy
invocations with no --supplementary-coordinate-code flag are byte-
identical to the pre-plugin tool -- the helper returns input unchanged
when the converter is None.
CLI surface
-----------
--supplementary-coordinate-code SPEC
'rift_default' | filesystem path to a .py | dotted module name.
--supplementary-coordinate-function NAME
Entry-point callable. Defaults to 'convert_coordinates'.
--supplementary-coordinate-ini PATH
Optional; parsed and handed to prepare().
--supplementary-coordinate-chart NAME
Required only when the plugin defines multiple charts.
--supplementary-coordinate-input-parameter NAME (action='append')
Override the plugin-declared INPUT_PARAMETERS / chart's
input_parameters list.
--supplementary-coordinate-output-parameter NAME (action='append')
Override the plugin-declared OUTPUT_PARAMETERS / chart's
parameters list.
When the input / output name lists aren't given on the CLI they're
resolved from CHARTS[chart] (input_parameters, parameters) and then from
the module-level INPUT_PARAMETERS / OUTPUT_PARAMETERS attributes.
Verified
--------
Five synthetic cases on an (m1, x, y, z) record array with the linear
plugin requesting (u, v, w):
* happy path -- (x, y, z) -> (u, v, w) values match
u=(x+y)/sqrt(2), v=(y-x)/sqrt(2), w=z on every row; (m1, x, y, z)
untouched.
* output-reorder -- works regardless of the order u/v/w are listed.
* name-collision -- samples pre-seeded with u=99; the plugin leaves u
alone (RIFT path wins) and still adds v, w.
* missing-input -- samples without an x column; helper logs a skip,
returns input unchanged, no crash.
* no-plugin -- helper is identity (out is samples).
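The additive behaviour exercised by these cases can be reproduced with a short NumPy sketch. It is not _materialize_plugin_columns: the function below hardcodes the linear (x, y, z) -> (u, v, w) chart from the happy-path case and uses numpy.lib.recfunctions.append_fields in place of the tool's add_field helper, both of which are assumptions for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

def materialize_uvw(samples):
    """Add u, v, w computed from x, y, z, never overwriting existing names."""
    names = samples.dtype.names
    if not all(n in names for n in ("x", "y", "z")):
        return samples  # missing input: skip, return input unchanged
    x, y, z = samples["x"], samples["y"], samples["z"]
    outputs = {"u": (x + y) / np.sqrt(2), "v": (y - x) / np.sqrt(2), "w": z}
    # Strictly additive: drop any output whose name is already present.
    new = {k: v for k, v in outputs.items() if k not in names}
    if not new:
        return samples
    return rfn.append_fields(samples, list(new), list(new.values()),
                             usemask=False)

base_dtype = [("m1", float), ("x", float), ("y", float), ("z", float)]
samples = np.array([(1.4, 1.0, 3.0, 5.0)], dtype=base_dtype)
out = materialize_uvw(samples)

# Name collision: u is pre-seeded, so only v and w are added.
seeded = np.array([(1.4, 1.0, 3.0, 5.0, 99.0)],
                  dtype=base_dtype + [("u", float)])
out_seeded = materialize_uvw(seeded)
```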
RePrimAnd's Python API changed: the NS-accuracy factory tov_acc_simple was renamed star_acc_simple (now taking two leading bool flags need_deform, need_bulk, then acc_tov, acc_deform, minsteps), and make_tov_branch_stable replaced num_samp/mgrav_min with mg_cut_low_rel/mg_cut_low_abs/gm1_step. The old calls raised TypeErrors against current installs. Add version-robust shims (_pyr_star_acc, _pyr_tov_branch, _pyr_interval) that target the modern API and fall back to the legacy one, so EOSReprimand/make_mr_lambda_reprimand work with either pyreprimand. make_eos_barotr_spline and the star_branch accessors are unchanged. Also fix the read_tov_sequence path (load_star_branch takes a filename, not the eos object). Verified against the RePrimAnd 1.7 docs. Co-Authored-By: Claude <noreply@anthropic.com>
… plot tool
Brings in three commits from the HyperpipeCoordinates worktree:
8aaba0c util_ConstructEOSPosterior: decouple fit basis from MC sampling basis
        (--parameter-implied / --parameter-nofit semantics ported from
        util_ConstructIntrinsicPosterior_GenericCoordinates.py, plus
        integrator, sampler, arity dispatch and output-writer rewires;
        hyperpipe yaml schema relaxed so coords-fit can be empty when
        coords-implied / coords-nofit carry the bases.)
d1995aa demo/hyperpipe: README tour of the four yaml configs
5643386 plot_posterior_corner: additive coordinate-plugin hook
        (--supplementary-coordinate-* flags, never overrides a name the
        hardcoded RIFT path already produced.)
Safety check before merging:
* merge-base with calmarg_in_loop = 26f8d83 (the calmarg breadcrumb).
* Files touched by eospost_coords: util_ConstructEOSPosterior.py,
  plot_posterior_corner.py, RIFT/hyperpipe/{coords,config}.py,
  demo/hyperpipe/{README.md, hyperpipe_conf_linear_uvw.yaml}.
* Files touched by calmarg_in_loop since the merge-base: all under
  demo/rift/calmarg/ and the calmarg-pilot-lane drivers.
* Path overlap between the two = empty.
* git merge-tree produces zero conflict markers.
* Same path-disjoint result against
  origin/rift_O4d_junior_extrinsic_handoff (a sibling junior branch
  cross-checked at merge time).
…oshaughnessy-junior/research-projects-RIT into rift_O4d_junior_calmarg_in_loop
…nm_backend' into rift_O4d_junior_calmarg_in_loop
…yreprimand 1.7) pyreprimand's star_acc_simple is (*, need_deform, need_bulk, acc_tov, acc_deform, minsteps) -- all keyword-only -- so the positional call in _pyr_star_acc raised TypeError. Pass by keyword. Also add a defaults fallback for make_tov_branch_stable. Co-Authored-By: Claude <noreply@anthropic.com>
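A version-robust shim of the kind described can be sketched without importing pyreprimand. Everything below is a stand-in: new_api and old_api imitate the two module versions, the legacy positional signature (acc_tov, acc_deform, minsteps) is inferred from the commit text rather than checked against the library, and the default values for need_deform and need_bulk are assumptions.

```python
from types import SimpleNamespace

def star_accuracy(pyr, acc_tov, acc_deform, minsteps,
                  need_deform=True, need_bulk=False):
    """Prefer the modern keyword-only factory; fall back to the legacy name."""
    modern = getattr(pyr, "star_acc_simple", None)
    if modern is not None:
        # All arguments are keyword-only in the modern API, so a positional
        # call would raise TypeError.
        return modern(need_deform=need_deform, need_bulk=need_bulk,
                      acc_tov=acc_tov, acc_deform=acc_deform,
                      minsteps=minsteps)
    return pyr.tov_acc_simple(acc_tov, acc_deform, minsteps)

# Stand-ins for the two library versions (hypothetical, for illustration).
def _modern(*, need_deform, need_bulk, acc_tov, acc_deform, minsteps):
    return ("modern", need_deform, need_bulk, acc_tov, acc_deform, minsteps)

def _legacy(acc_tov, acc_deform, minsteps):
    return ("legacy", acc_tov, acc_deform, minsteps)

new_api = SimpleNamespace(star_acc_simple=_modern)
old_api = SimpleNamespace(tov_acc_simple=_legacy)

res_new = star_accuracy(new_api, 1e-6, 1e-3, 500)
res_old = star_accuracy(old_api, 1e-6, 1e-3, 500)
```

Selecting the factory by attribute lookup, rather than catching TypeError from a failed call, keeps genuine argument errors visible.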
Brings the puff lane into the same coordinate-plugin framework already
used by util_ConstructEOSPosterior and plot_posterior_corner. When
--supplementary-coordinate-code is supplied, both
util_HyperparameterPuffball.py and util_HyperparameterTracerUpdate.py
operate in the PLUGIN basis: forward-transform the file's input-basis
columns into the basis named by --parameter, do the covariance estimation
/ SMC / birth-death / puff-displacement step in that basis, then
INVERSE-transform back to the file basis to write the output .dat in the
same column structure the rest of the pipeline expects.
The legacy code path is byte-identical when no plugin is supplied --
--parameter names are file columns, _extract_X reads them directly, the
write-back uses opts.parameter -> cols.index(name). The plugin-or-not
branch is the same `if plugin_active` predicate in every site.
Required plugin contract addition: inverse_convert_coordinates(y_in,
coord_names, low_level_coord_names, **kwargs) -> (N, len(low_level_coord_names)).
The puff lane MUST round-trip through the plugin basis, so we bail out
loudly if the plugin doesn't define an inverse rather than silently
using a pseudo-inverse (which would give subtly-wrong placements).
CLI surface (both executables)
------------------------------
--supplementary-coordinate-code SPEC
'rift_default' | filesystem path to a .py | dotted module name.
--supplementary-coordinate-function NAME
Entry-point callable. Defaults to 'convert_coordinates'.
--supplementary-coordinate-ini PATH
Optional; parsed and handed to prepare().
--supplementary-coordinate-chart NAME
Required only when the plugin defines multiple charts.
--supplementary-coordinate-input-parameter NAME (action='append')
File-column name to feed the plugin as an input dimension. If
omitted, CHARTS[chart].input_parameters / INPUT_PARAMETERS is used.
linear_coordinate_convert.py: inverse_convert_coordinates
---------------------------------------------------------
Closed-form x = A^{-1} (y - b) with cached A^{-1}. Requires a square,
non-singular A; raises if A is non-square (pseudo-inverse is ambiguous
for the puff use case) or if the input doesn't span every output
dimension declared in OUTPUT_PARAMETERS. Honors permuted coord_names /
low_level_coord_names orders.
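The closed-form inverse can be illustrated for the uvw rotation used in the demos. This sketch is not linear_coordinate_convert.py: the function names are hypothetical, samples are assumed to be stored as rows of an (N, 3) array, and the coord_names / low_level_coord_names reordering that the real function honors is omitted.

```python
import numpy as np

S = 1.0 / np.sqrt(2.0)
# u = (x + y)/sqrt(2), v = (y - x)/sqrt(2), w = z
A = np.array([[S, S, 0.0],
              [-S, S, 0.0],
              [0.0, 0.0, 1.0]])
b = np.zeros(3)

def forward(x, A=A, b=b):
    """y = A x + b applied to each row of x."""
    return np.asarray(x) @ A.T + b

def make_inverse(A, b):
    """Return x = A^{-1} (y - b) with A^{-1} computed once and cached."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        raise ValueError("A must be square; a pseudo-inverse is ambiguous here")
    A_inv = np.linalg.inv(A)  # raises LinAlgError if A is singular

    def inverse(y):
        return (np.asarray(y) - b) @ A_inv.T
    return inverse

inverse = make_inverse(A, b)
x = np.random.default_rng(1).normal(size=(5, 3))
round_trip_error = np.max(np.abs(inverse(forward(x)) - x))
```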
Verified
--------
Synthetic test on a 2000-point (u,v,w)-diagonal Gaussian rotated into
(x,y,z), driven through the tracer's puffball-mode regression path:
* plugin puff_factor=0.5 yields per-axis (u,v,w) variance ratios of
[1.24, 1.25, 1.25] -- the textbook (1 + puff_factor^2) growth.
* uvw off-diagonal correlation stays < 0.04 (diag-cov data, diag-cov
delta).
* output .dat header is (lnL, sigma_lnL, x, y, z) -- the file's
original basis is preserved across the round-trip.
* puff_factor=0 leaves the grid unchanged modulo the tracer's existing
near-singular-cov regularization (~1e-4 residual on xyz of stdev ~0.8).
* Legacy --parameter x y z path runs and displaces the grid unchanged.
* Missing input column -> clean error.
* Plugin without inverse_convert_coordinates -> clean error at load.
Also verified linear_coordinate_convert.inverse round-trips to 4.4e-16
against the forward transform.
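The (1 + puff_factor^2) figure quoted above follows from a one-line variance calculation. The assumption here, which the commit message does not spell out, is that each point is displaced by puff_factor f times an independent zero-mean draw \delta_u with the same per-axis variance \sigma_u^2 as the data:

```latex
\mathrm{Var}(u') = \mathrm{Var}(u + f\,\delta_u)
                 = \sigma_u^2 + f^2\,\sigma_u^2
                 = (1 + f^2)\,\sigma_u^2,
\qquad
f = 0.5 \;\Rightarrow\; \frac{\mathrm{Var}(u')}{\mathrm{Var}(u)} = 1.25 .
```

The measured per-axis ratios of [1.24, 1.25, 1.25] are consistent with this value.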
…oshaughnessy-junior/research-projects-RIT into rift_O4d_junior_calmarg_in_loop
…ILE too Follow-up to the puff-barrier fix (c4c1455): the puffball ILE jobs run with the SAME args_ile as the normal wide ILE, so they read the same iteration-(it-1) seed breadcrumb (--calibration-proposal-breadcrumb / --extrinsic-proposal-breadcrumb). The normal-ILE seed barriers are applied via ile_node_list_per_iteration BEFORE the puffball ILE jobs are created, so the puffball jobs were missing them and would race the it-1 consolidation that produces the seed file (silently falling back to the prior). Now that c4c1455 added extra_parent_nodes, thread last_puff_node AND (when pilots/handoff are active) calpilot_{it-1}/extrconsolidate_{it-1} into the puffball ILE's extra_parent_nodes, so every wide ILE job of iteration `it` (normal + puffball) waits for the same seed barrier. Verified by building a puff+pilot DAG: the puff node's children are 200/200 ILE_puff jobs and 0 normal ILE jobs (no halt), and a puffball ILE job depends on both ParameterPuffball and CalPilotStage. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
calmarg done in the ILE loop, including
as well as fancy tools to