Add fit_predict / fit_predict_proba to TabularCloudPredictor #253
Draft
shchur wants to merge 1 commit into
Fuse fit + batch predict into a single SageMaker training job: fit on train_data and predict on a separate test_data inside the training container (predict_after_fit=True), mirroring the existing TimeSeries fit_predict path. Avoids a second cold start, data upload, and the predictor-tarball round-trip of a separate batch-transform job.

- train.py: branch the predict_after_fit block on predictor_type; tabular reads a new test_data channel and writes the full [pred, proba...] frame (regression: just [pred]). Raises NotImplementedError on image columns.
- TabularSagemakerBackend: fit override that validates test_data covers the train feature columns client-side before launch, then injects it as a data channel.
- TabularCloudPredictor: fit_predict -> pd.Series and fit_predict_proba(include_predict) -> (pred, proba) | proba, matching the predict / predict_proba split. wait=False returns None; fetch later via get_fit_predict_results / get_fit_predict_proba_results.
- Tests: pure-unit wiring/validation tests plus a parametrized classification/regression integration test.
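The client-side check described for TabularSagemakerBackend can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the helper names `missing_feature_columns` and `validate_test_covers_train` are hypothetical, and it assumes that the label column is not required in test_data and that extra test columns are tolerated.

```python
from typing import Iterable, List


def missing_feature_columns(
    train_columns: Iterable[str],
    test_columns: Iterable[str],
    label: str,
) -> List[str]:
    """Return train feature columns (all columns except the label) absent from the test set.

    Hypothetical helper: the order of the result follows train_columns.
    """
    available = set(test_columns)
    return [col for col in train_columns if col != label and col not in available]


def validate_test_covers_train(
    train_columns: Iterable[str],
    test_columns: Iterable[str],
    label: str,
) -> None:
    """Fail fast on the client, before any training job would be launched."""
    missing = missing_feature_columns(train_columns, test_columns, label)
    if missing:
        raise ValueError(
            f"test_data is missing feature columns present in train_data: {missing}"
        )


# Assumed usage with plain column-name lists; with pandas these would be
# list(train_data.columns) and list(test_data.columns).
validate_test_covers_train(["age", "income", "class"], ["income", "age"], label="class")
```

Running the check before launch means a schema mismatch surfaces as an immediate local error instead of a failed training job.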
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.