Object-based, multiprocess, and caching pdet and weight estimator #1
Open
kcroker wants to merge 4 commits into
added 4 commits
July 24, 2023 22:00
instead of the bounds being set by the grid.
BROKE caching. Will need to save the LVKWeighter object itself,
so that we have access to the bounds. (and other things, like
the key and filename)
set member variables from that pickle upon load from
cache
REMOVED some vestigial output
Implements the core functionality in `pdets_from_grid()` as a persistent-state object. It caches the underlying trained regression model, so the model only needs to be trained on the first call. Subsequent evaluation of the model can also be distributed across a `multiprocessing.Pool`.

Multiprocessing can be memory intensive, because every child needs its own copy of the trained regressor, and that's ~350MB. Because the regressor is complicated, getting the memory consumption down will probably require something like `map_coordinates()` with a `shared_memory` `numpy` array. If this approximation scheme delivers the necessary accuracy, it would likely be 10-100x faster, in addition to using roughly a factor of `cores` less memory.

Usage example: