Hello, nice work!
I am reading the paper (https://arxiv.org/html/2601.11262v1) and trying to benchmark the model.
Could you confirm whether all reported retrieval metrics were computed using the Ranker class in src/livi/apps/retrieval_eval/ranker.py, specifically the logic at line 77?
I am asking because some of the models you compare against (e.g., CLEWS) do not L2-normalize their embeddings by design.
Computing cosine similarity in their original embedding space would distort the learned geometry and would likely understate their performance in the reported metrics.
Were these embeddings normalized before ranking, or was a different similarity function used for them?
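To make the concern concrete, here is a minimal sketch (toy data, not the repo's Ranker class) of how ranking the same unnormalized embeddings by raw inner product versus cosine similarity can produce different orderings:

```python
# Toy illustration: for unnormalized embeddings, dot-product ranking
# and cosine-similarity ranking are not equivalent, since cosine
# implicitly L2-normalizes both sides first.
import numpy as np

rng = np.random.default_rng(0)

# One query and a few candidate embeddings with varying norms,
# standing in for a model that does not L2-normalize by design.
query = rng.normal(size=8)
candidates = rng.normal(size=(5, 8)) * rng.uniform(0.5, 3.0, size=(5, 1))

# Raw inner-product scores (the space such a model was trained in).
dot_scores = candidates @ query

# Cosine similarity: same as normalizing both sides before the dot product.
cos_scores = dot_scores / (np.linalg.norm(candidates, axis=1) * np.linalg.norm(query))

print("ranking by dot product:", np.argsort(-dot_scores))
print("ranking by cosine:     ", np.argsort(-cos_scores))
```

If the evaluation applied cosine similarity uniformly, the comparison would effectively rescale those baselines' embeddings, which is what I would like to rule out.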