Skip to content

Adds image_features parameter to predict for pre-computed embeddings#169

Open
NetZissou wants to merge 3 commits into
Imageomics:mainfrom
NetZissou:feature/predict-from-embeddings
Open

Adds image_features parameter to predict for pre-computed embeddings#169
NetZissou wants to merge 3 commits into
Imageomics:mainfrom
NetZissou:feature/predict-from-embeddings

Conversation

@NetZissou

Copy link
Copy Markdown

Disclaimer: This PR was developed with assistance from Claude Opus 4.6 (1M context). The author has reviewed all code changes and test additions. CI has been executed successfully in the forked repo. Opening this PR to request review from the package maintainers for further feedback and iteration.

Summary

This PR adds an optional image_features parameter to predict() on TreeOfLifeClassifier and CustomLabelsClassifier (CustomLabelsBinningClassifier inherits this through CustomLabelsClassifier). When provided, the method skips image encoding and computes classification directly from pre-computed embeddings.

Embedding validation

The method validates input embeddings before classification:

  • Verifies tensor is 2D (N, embedding_dim)
  • Checks that embedding_dim matches the model's expected dimension (model.visual.output_dim)
  • Normalizes the embedding vector via L2 norm only if not already normalized

Test plan

New tests in TestPredictFromEmbeddings:

  • Results from embeddings match results from images exactly (all fields except file_name)
  • Species, family, and multi-image predictions
  • Unnormalized features auto-normalized with correct classifications
  • CustomLabelsClassifier and CustomLabelsBinningClassifier
  • Error cases: no inputs, wrong tensor dim, wrong embedding dim, image-embedding length mismatch

Closes #167

@hlapp hlapp left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @NetZissou. The part of the implementation approach that I don't like here is that now creating probabilities is taking place redundantly in two different functions. This also creates more code noise than I think should be needed in the predict() method.

Instead, shouldn't the clean way to handle this in the predict() method be to see whether image_features are already provided. If they are, apply basic checks like correct dimensions etc. If they are not, create them (like the are being created now from images). Then proceed with creating probabilities from image_features.

@hlapp

hlapp commented Apr 17, 2026

Copy link
Copy Markdown
Member

@NetZissou just FYI, it might be advisable to rebase on main to bring in the changes from #179. It's well possible your changes so far are not in conflict at all, but some stuff did get moved around.

NetZissou and others added 2 commits April 28, 2026 12:09
Allows passing pre-computed image embeddings directly to predict() on
TreeOfLifeClassifier, CustomLabelsClassifier, and CustomLabelsBinningClassifier,
avoiding redundant image encoding when embeddings are already available.

Validates input: checks tensor is 2D, embedding_dim matches the model's
expected dimension (model.visual.output_dim), and normalizes via L2 norm
only if not already normalized to avoid floating point drift.

Closes Imageomics#167

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces `create_probabilities_from_features` with two smaller helpers
on `BaseClassifier`: `_validate_image_features` and
`_resolve_image_features`. Now `TreeOfLifeClassifier.predict` and
`CustomLabelsClassifier.predict` can share a single input-resolution
path instead of the duplication.

The commit also restores `rank` as required on
`TreeOfLifeClassifier.predict` via a runtime TypeError, preserving the
pre-PR behavior of the API.

Adds tests for 1) length-mismatch on `CustomLabelsClassifier` 2)
images+image_features input, 3) rank-omitted error

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com
@NetZissou NetZissou force-pushed the feature/predict-from-embeddings branch from 508064a to a36724c Compare May 4, 2026 18:49
Comment thread src/bioclip/predict.py Outdated
Comment thread src/bioclip/predict.py Outdated
@NetZissou NetZissou requested review from egrace479 and hlapp May 8, 2026 20:15
Per @egrace479 comment, when image_features is passed, images must also be
passed, which will be used to generate identifiers for the output keys.

The length of image_features must match images.

Updated test cases to adopt the new API design.

Updated predict method docstring to reflect the new changes, and make
the docstring easier to read

Co-Authored-By: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
Co-Authored-By: Hilmar Lapp <51458+hlapp@users.noreply.github.com>
@NetZissou NetZissou force-pushed the feature/predict-from-embeddings branch from 1146fe5 to 1ae72a4 Compare May 8, 2026 20:19
@NetZissou

Copy link
Copy Markdown
Author

@egrace479 @hlapp

@NetZissou, @hlapp and I discussed the implementation this morning. It would make the most sense to expect (require) the list of image identifiers (e.g., list of filename strings) that correspond to the embeddings to be passed as well. These can then be used for the image keys. The embeddings would go through the size checks previously discussed and you have the sanity check that the list lengths match. We just want to make sure it's well-documented then.

Re-implemented based on the comment above. Ready for re-review.

@hlapp hlapp left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @NetZissou. Some thoughts and comments:

  • Since this is trying to optimize performance for the case that someone has the image embeddings already, I think we can expect at least by default that they are already normalized, rather than re-normalizing them (for testing whether they are normalized) each time. So I think that should be conditional to a parameter that by default is off?

  • The _resolve_image_features() function doesn't really resolve image features, given that it returns probabilities, not resolved image features. So instead it should probably better be named something like _create_probabilities_for_imgs_or_img_feats() (or maybe even just _create_probabilities()?). And given the number of parameters and None being allowed in several places, all invocations should arguably use named parameter syntax.

  • In this implementation (as per _resolve_image_features()), there's actually really nothing directly in common between creating probabilities from image features and creating them from images: there is no code before or after the conditional that would run either way. I guess the shared part is indirect: creating probabilities from images invokes create_batched_probabilities_for_images(), whereas creating them from image features invokes create_probabilities(). Of course, indirectly the former invokes create_probabilities_for_images() which then invokes create_probabilities(), too. It seems what this really means is the following:

    • If I have images and no image features, encode the images to obtain image features. This should be done batched for performance optimization.
    • Once I have image features, create probabilities. This need not be batched.

    In your current implementation, creating probabilities from image features will be batched if starting from images, and the batching will be bypassed if image embeddings are already in hand. Is this reasonable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support prediction from pre-computed image embeddings

3 participants