Cpaniaguam/fix predictive idata to dataframe datatree#974
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends HSSM’s inference-data handling to support both arviz.InferenceData and xarray.DataTree, aiming to keep predictive sampling and downstream utilities working across ArviZ container variants/versions.
Changes:
- Added helper methods in
HSSMto abstract inference-data group enumeration, dataset access, and dataset replacement. - Refactored predictive sampling paths (
sample_prior_predictive,sample_do, and aposterior_predictivegroup check) to use the new helpers. - Updated
predictive_idata_to_dataframeto acceptaz.InferenceData | xr.DataTree, and marked an RL likelihood-builder test asxfaildue to upstream PyTensor changes.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/hssm/base.py |
Adds group-access helpers and refactors predictive code paths to use them for InferenceData/DataTree compatibility. |
src/hssm/utils.py |
Extends predictive_idata_to_dataframe to handle predictive groups stored as xr.DataTree nodes. |
tests/rl/test_rl_likelihood_builder.py |
Marks test_make_rl_logp_op as xfail due to upstream API breakage. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @pytest.mark.xfail( | ||
| reason="make_rl_logp_op is currently broken due to changes in PyTensor API and needs to be updated." | ||
| ) |
| self, idata: az.InferenceData | None | ||
| ) -> az.InferenceData: | ||
| """Drop the parent_str variable from an InferenceData object. | ||
| def _get_idata_groups( |
There was a problem hiding this comment.
I understand that you are trying to be backward-compatible here. However, there is no need. Arviz is no longer backward-compatible, so we are moving forward from InferenceData altogether. az.InferenceData doesn't really exist any more. I am surprised that we didn't get an error. Let's drop this layer of backward compatibility. This new version of HSSM should not support InferenceData
|
@cpaniaguam no need for this to be closed. We still need to make these functions compatible with DataTree. Maybe you'll do this in another PR? I actually don't really like these functions because they break wrappers and tend to break when breaking changes are introduced. Let's also think of ways to remove these functions. We might lose some functionalities but gain maintainability |
Add handling of inference data containers (
az.InferenceDataandxr.DataTree) in the codebase, increasing compatibility with different versions of ArviZ and supporting both data structures throughout the workflow. The changes include new utility methods for accessing and modifying inference data groups, refactoring of predictive sampling functions to use these utilities, and updates to utility functions and tests for consistency.Inference Data Handling Improvements:
_get_idata_groups,_get_idata_group_dataset, and_set_idata_group_datasetto abstract group access and modification for bothaz.InferenceDataandxr.DataTree, ensuring compatibility across ArviZ versions._drop_parent_str_from_idatato use the new group access utilities, and updated its type hints to support bothaz.InferenceDataandxr.DataTree.Predictive Sampling Refactor:
sample_prior_predictiveandsample_domethods to use the new group access utilities when renaming dimensions and variables, improving code clarity and compatibility. [1] [2]sample_posterior_predictiveto use the new_get_idata_groupsutility for checking group existence.Utility Function Updates:
predictive_idata_to_dataframeto accept bothaz.InferenceDataandxr.DataTree, and handle group data extraction accordingly. [1] [2] [3]Testing:
test_make_rl_logp_optest as expected to fail (xfail) due to upstream changes in the PyTensor API. See Cpaniaguam/fix rl likelihood builder pymc6 #975 with a fix.