Adds image_features parameter to predict for pre-computed embeddings#169
Adds image_features parameter to predict for pre-computed embeddings#169NetZissou wants to merge 3 commits into
image_features parameter to predict for pre-computed embeddings#169Conversation
hlapp
left a comment
There was a problem hiding this comment.
Thanks @NetZissou. The part of the implementation approach that I don't like here is that now creating probabilities is taking place redundantly in two different functions. This also creates more code noise than I think should be needed in the predict() method.
Instead, shouldn't the clean way to handle this in the predict() method be to see whether image_features are already provided. If they are, apply basic checks like correct dimensions etc. If they are not, create them (like the are being created now from images). Then proceed with creating probabilities from image_features.
|
@NetZissou just FYI, it might be advisable to rebase on main to bring in the changes from #179. It's well possible your changes so far are not in conflict at all, but some stuff did get moved around. |
Allows passing pre-computed image embeddings directly to predict() on TreeOfLifeClassifier, CustomLabelsClassifier, and CustomLabelsBinningClassifier, avoiding redundant image encoding when embeddings are already available. Validates input: checks tensor is 2D, embedding_dim matches the model's expected dimension (model.visual.output_dim), and normalizes via L2 norm only if not already normalized to avoid floating point drift. Closes Imageomics#167 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces `create_probabilities_from_features` with two smaller helpers on `BaseClassifier`: `_validate_image_features` and `_resolve_image_features`. Now `TreeOfLifeClassifier.predict` and `CustomLabelsClassifier.predict` can share a single input-resolution path instead of the duplication. The commit also restores `rank` as required on `TreeOfLifeClassifier.predict` via a runtime TypeError, preserving the pre-PR behavior of the API. Adds tests for 1) length-mismatch on `CustomLabelsClassifier` 2) images+image_features input, 3) rank-omitted error Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com
508064a to
a36724c
Compare
Per @egrace479 comment, when image_features is passed, images must also be passed, which will be used to generate identifiers for the output keys. The length of image_features must match images. Updated test cases to adopt the new API design. Updated predict method docstring to reflect the new changes, and make the docstring easier to read Co-Authored-By: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com> Co-Authored-By: Hilmar Lapp <51458+hlapp@users.noreply.github.com>
1146fe5 to
1ae72a4
Compare
Re-implemented based on the comment above. Ready for re-review. |
hlapp
left a comment
There was a problem hiding this comment.
Thanks @NetZissou. Some thoughts and comments:
-
Since this is trying to optimize performance for the case that someone has the image embeddings already, I think we can expect at least by default that they are already normalized, rather than re-normalizing them (for testing whether they are normalized) each time. So I think that should be conditional to a parameter that by default is off?
-
The
_resolve_image_features()function doesn't really resolve image features, given that it returns probabilities, not resolved image features. So instead it should probably better be named something like_create_probabilities_for_imgs_or_img_feats()(or maybe even just_create_probabilities()?). And given the number of parameters andNonebeing allowed in several places, all invocations should arguably use named parameter syntax. -
In this implementation (as per
_resolve_image_features()), there's actually really nothing directly in common between creating probabilities from image features and creating them from images: there is no code before or after the conditional that would run either way. I guess the shared part is indirect: creating probabilities from images invokescreate_batched_probabilities_for_images(), whereas creating them from image features invokescreate_probabilities(). Of course, indirectly the former invokescreate_probabilities_for_images()which then invokescreate_probabilities(), too. It seems what this really means is the following:- If I have images and no image features, encode the images to obtain image features. This should be done batched for performance optimization.
- Once I have image features, create probabilities. This need not be batched.
In your current implementation, creating probabilities from image features will be batched if starting from images, and the batching will be bypassed if image embeddings are already in hand. Is this reasonable?
Disclaimer: This PR was developed with assistance from
Claude Opus 4.6 (1M context). The author has reviewed all code changes and test additions. CI has been executed successfully in the forked repo. Opening this PR to request review from the package maintainers for further feedback and iteration.Summary
This PR adds an optional
image_featuresparameter topredict()onTreeOfLifeClassifierandCustomLabelsClassifier(CustomLabelsBinningClassifierinherits this throughCustomLabelsClassifier). When provided, the method skips image encoding and computes classification directly from pre-computed embeddings.Embedding validation
The method validates input embeddings before classification:
(N, embedding_dim)embedding_dimmatches the model's expected dimension (model.visual.output_dim)Test plan
New tests in
TestPredictFromEmbeddings:file_name)CustomLabelsClassifierandCustomLabelsBinningClassifierCloses #167