
Make ensemble generator sample in memory, stop relying on samples.Rdata side effect #4042

Open
omkarrr2533 wants to merge 8 commits into PecanProject:develop from omkarrr2533:refactor-generators-return-samples


@omkarrr2533

Member

Week 5 part 2 of the modularity work. Makes generate_joint_ensemble_design sample in memory and return its samples instead of relying on samples.Rdata being written as a side effect, and moves the samples.Rdata write up to runModule.

what changed

generate_joint_ensemble_design now samples in memory via the shared loader and pure get_parameter_samples, returns list(X, samples), and takes an optional samples= pass-in to reuse an existing bundle instead of resampling.
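The pass-in behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PEcAn implementation: `sketch_generate_design`, `toy_sampler`, the `ensemble_size` argument and the layout of `X` are all hypothetical; only the `samples` argument and the `list(X, samples)` return shape come from the description.

```r
# Hypothetical stand-in for the generator. Only the `samples` argument and the
# list(X, samples) return shape are taken from the PR description.
sketch_generate_design <- function(ensemble_size, sampler, samples = NULL) {
  # Sample in memory only when the caller did not pass a bundle in.
  if (is.null(samples)) {
    samples <- sampler(ensemble_size)
  }
  # Placeholder design: one row per ensemble member (assumed layout).
  X <- data.frame(param = seq_len(ensemble_size))
  list(X = X, samples = samples)
}

# Toy sampler that counts how often it is called.
n_calls <- 0
toy_sampler <- function(n) {
  n_calls <<- n_calls + 1
  list(ensemble.samples = list(pft1 = data.frame(sla = seq_len(n))))
}

first  <- sketch_generate_design(3, toy_sampler)                        # samples
second <- sketch_generate_design(3, toy_sampler, samples = first$samples)  # reuses
```

In this sketch the second call never touches the sampler, which is the property the real `samples=` argument is meant to provide.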

the posterior loading block is pulled out of the dot wrapper into a new helper load_pft_posteriors, so the wrapper and the generator share one path.

runModule.run.write.configs samples once for the full five-object bundle, writes samples.Rdata as before, and passes that same bundle into the generator and both run.write.configs calls so nothing resamples. run.write.configs gets a samples= arg that falls back to loading samples.Rdata from disk when it is NULL, so existing callers are unaffected.
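A rough sketch of the NULL fallback, under the assumption that the bundle is a named list and that samples.Rdata sits directly in the output directory. `resolve_samples` is a hypothetical helper for illustration, not a function in this PR.

```r
# Hypothetical helper: use the in-memory bundle when given, else read from disk.
resolve_samples <- function(outdir, samples = NULL) {
  if (!is.null(samples)) {
    return(samples)  # in-memory path: no disk read at all
  }
  env <- new.env()
  load(file.path(outdir, "samples.Rdata"), envir = env)
  as.list(env)       # objects stored in the file, keyed by their names
}

# Set up a throwaway output directory with a samples.Rdata in it.
outdir <- tempfile("outdir")
dir.create(outdir)
trait.samples <- list(pft1 = list(sla = c(10, 12)))
save(trait.samples, file = file.path(outdir, "samples.Rdata"))

# Existing-caller behaviour: nothing passed, so the file is loaded.
from_disk <- resolve_samples(outdir)

# New behaviour: a bundle is passed, so the (nonexistent) path is never read.
in_memory <- resolve_samples(
  file.path(outdir, "does-not-exist"),
  samples = list(trait.samples = "passed in")
)
```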

why samples.Rdata is still written

once the generator stops writing it, run.sensitivity.analysis (trait.samples), run.ensemble.analysis and get.results (ensemble.samples fallback) still read it. so runModule writes it once at the top level outdir, same objects and location as before. sda reads it too but through its own path, so sda stays a separate migration.

not calling the dot get.parameter.samples for the write because it's .Deprecated and would warn on every run, so the load/sample/save is done directly in .prepare_input_designs.
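As a sketch of writing the file directly rather than through the deprecated wrapper, the snippet below saves a named list so that each element lands in samples.Rdata under its own object name. `write_samples_bundle` is a hypothetical helper, and only the three object names mentioned in this PR are shown; the real bundle has five objects.

```r
# Hypothetical helper: write each element of a named list as its own object,
# so downstream code that load()s the file sees the usual object names.
write_samples_bundle <- function(bundle, outdir) {
  path <- file.path(outdir, "samples.Rdata")
  save(list = names(bundle), envir = list2env(bundle), file = path)
  invisible(path)
}

# Toy bundle with placeholder contents (values are made up for illustration).
bundle <- list(
  trait.samples    = list(pft1 = list(sla = c(10, 12))),
  sa.samples       = list(),
  ensemble.samples = list(pft1 = data.frame(sla = c(10, 12)))
)

outdir <- tempfile("outdir")
dir.create(outdir)
path <- write_samples_bundle(bundle, outdir)

# Read it back the way a downstream analysis step would.
restored <- new.env()
load(path, envir = restored)
```

Because the same `bundle` object is both written and kept in memory, whatever is handed to later steps matches what readers of the file will see.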

scope

covers the ensemble generator and the config writing chain. OAT/SA generator and sda are separate follow-ups. confirmed that the SA-only path still writes samples.Rdata through the untouched OAT generator.

testing

new tests cover the loader, the generator's in-memory sampling and pass-in, the runModule write and threading, and run.write.configs skipping the disk read when passed a bundle. PEcAn.uncertainty 152 pass, PEcAn.workflow 53 pass, both 0 fail.

Move the disk/BETY posterior-loading block out of get.parameter.samples into a new load_pft_posteriors() so the dash sampling functions and design generators can share one loader rather than each carrying a copy. The wrapper now calls the loader and behaves identically; existing wrapper tests updated to stub the loader and pass unchanged.
The joint generator now loads posteriors via load_pft_posteriors and samples through get_parameter_samples in memory, returning list(X, samples) instead of relying on the samples.Rdata side effect. An optional samples argument lets callers pass an established parameter set in. Tests updated to stub the new loader and sampler and to verify the returned samples and the pass-in path.
Now that generate_joint_ensemble_design samples in memory, .prepare_input_designs samples once through load_pft_posteriors + get_parameter_samples for the full five-object bundle, writes samples.Rdata for the downstream analysis steps that still read it (run.sensitivity.analysis, run.ensemble.analysis, get.results), and passes that same bundle into the generator so nothing resamples. Adds tests for the file contract, the pass-in, and the normalized-design passthrough.
run.write.configs now accepts a pre-computed samples bundle and uses it directly, falling back to loading samples.Rdata from settings\ only when samples is NULL. The trait/sa/ensemble.samples resolution is hoisted to run once on whichever source. Adds a test verifying the in-memory path skips the disk read.
The samples bundle produced in .prepare_input_designs is now stored on the designs list and passed into both run.write.configs calls, so they use it directly instead of re-reading samples.Rdata. When no ensemble runs (SA-only), the bundle is NULL and run.write.configs falls back to disk. Extends the .prepare_input_designs test to assert the bundle is threaded onto designs.
Roxygen regeneration for the signature and doc changes: run.write.configs and generate_joint_ensemble_design gain the samples argument, load_pft_posteriors gets a new man page and NAMESPACE export, and .prepare_input_designs documents the samples entry on its return value.
…e.method fallback

In .prepare_input_designs, write posterior.files as rep(NA, length of settings pfts) instead of a map_chr over a field pfts do not carry, and restore the uniform default on ens.sample.method to match the previous get.parameter.samples behavior.
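To illustrate why mapping over a field the pfts do not carry is a problem, here is a small base R sketch. `vapply` is used as a stand-in for `map_chr` so the example needs no packages; the failure mode is assumed to be analogous. The PFT names are made up.

```r
# Toy pfts list: entries carry a name but no posterior.files field.
pfts <- list(
  list(name = "temperate.deciduous"),
  list(name = "boreal.conifer")
)

# Extracting a missing field yields NULL, which cannot be coerced into a
# length-1 character value, so the typed map fails.
lookup <- tryCatch(
  vapply(pfts, function(pft) pft$posterior.files, character(1)),
  error = function(e) "error"
)

# The replacement: one NA per pft, independent of what fields exist.
posterior.files <- rep(NA, length(pfts))
```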
Add missing @param samples to generate_joint_ensemble_design (R CMD check flagged it as undocumented) and correct its stale @return. Replace the corrupted em-dash bytes in get.parameter.samples roxygen with plain hyphens so document() produces a stable Rd and the tree-is-clean check passes.
@omkarrr2533 omkarrr2533 marked this pull request as ready for review July 3, 2026 12:24