Transformers v5 Gemma scaling adjustment #1315

Merged
jlarson4 merged 2 commits into dev from bug/gemma-embedding-scalar-fix
May 20, 2026
Conversation

@jlarson4
Collaborator

Description

  • Bump the pinned transformers version to 5.4.0.
  • Gemma models use Gemma3TextScaledWordEmbedding, which scales embeddings by sqrt(d_model). Before transformers 4.54.0, this was not done by default. In newer versions, we were applying the scaling twice. This removes our additional scaling and defers to HuggingFace.
  • Reverifying affected models: this change actually improved some of the models' scores.
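
To illustrate the double-scaling issue described above, here is a minimal pure-Python sketch. It is not the code changed in this PR: `scale_embedding`, the toy `d_model`, and the sample vector are hypothetical stand-ins for the embedding layer's behavior. It shows that applying the sqrt(d_model) factor twice inflates the embedding by a full factor of d_model.

```python
import math


def scale_embedding(vec, d_model):
    """Multiply every component of an embedding vector by sqrt(d_model)."""
    factor = math.sqrt(d_model)
    return [x * factor for x in vec]


# Toy values, chosen so sqrt(d_model) is exactly 4.0 (assumption for illustration).
d_model = 16
raw = [0.5, -1.0, 0.25, 2.0]  # hypothetical raw embedding weights for one token

# What a scaled-embedding layer already returns: raw * sqrt(d_model).
upstream = scale_embedding(raw, d_model)

# Previous behavior sketched: scaling the already-scaled output again.
double_scaled = scale_embedding(upstream, d_model)

# Behavior after the fix sketched: use the upstream output unchanged.
deferred = upstream

print(upstream)       # scaled once
print(double_scaled)  # scaled twice, i.e. raw * d_model
```

Because sqrt(d_model) * sqrt(d_model) = d_model, the doubly scaled vector is the raw vector multiplied by d_model, which is why removing the extra scaling changes model outputs.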
Note

This is not technically a breaking change, but transformers is a significant dependency that I want to make sure we call out as being updated

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@jlarson4 jlarson4 merged commit 8e8d9d4 into dev May 20, 2026
42 of 48 checks passed