feat: add MLX builds for LFM2.5 and privacy-filter models #1266
Open
NorbertKlockiewicz wants to merge 2 commits into
Wire HF-hosted MLX variants for LFM2.5 (350M, 1.2B), LFM2.5-VL (450M, 1.6B) and the privacy filters (openai, nemotron), defaulting to MLX on iOS alongside the existing XNNPACK builds.

Runner support for the new builds:
- vision_encoder reads its declared input dtype from method metadata and converts the fp32 pixels accordingly (fp32 passthrough / bf16 / fp16), instead of hardcoding Float.
- multimodal prefiller splice handles fp32<->bf16 vision/text-embed dtype pairs (hybrid fp32 vision + bf16 decoder).
- convert_from_float helper in util.h.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
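For readers unfamiliar with the fp32 to bf16 step described above, here is a minimal sketch of one common way to do it (round-to-nearest-even on the raw bits). The function names `float_to_bf16_bits`, `bf16_bits_to_float` and `pixels_to_bf16` are hypothetical and are not taken from this PR; the actual helpers in util.h may round or store values differently.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the PR): convert one fp32 value to the
// 16-bit bf16 bit pattern using round-to-nearest-even.
inline uint16_t float_to_bf16_bits(float value) {
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: force a mantissa bit so truncation cannot turn it into infinity.
    return static_cast<uint16_t>((bits >> 16) | 0x0040u);
  }
  // Add 0x7FFF plus the lowest kept bit so exact ties round to even.
  const uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<uint16_t>((bits + rounding_bias) >> 16);
}

// Hypothetical helper: widen a bf16 bit pattern back to fp32 (exact).
inline float bf16_bits_to_float(uint16_t half) {
  const uint32_t bits = static_cast<uint32_t>(half) << 16;
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Hypothetical helper: convert a buffer of preprocessed fp32 pixels.
inline std::vector<uint16_t> pixels_to_bf16(const std::vector<float>& pixels) {
  std::vector<uint16_t> out;
  out.reserve(pixels.size());
  for (float p : pixels) {
    out.push_back(float_to_bf16_bits(p));
  }
  return out;
}
```

Because bf16 keeps the fp32 exponent and only drops mantissa bits, widening back to fp32 is a plain shift; only the narrowing direction needs a rounding rule.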
msluszniak
reviewed
Jun 18, 2026
Member
- Extract convert_to_float16 (symmetric to convert_to_bfloat16); rewrite convert_from_float as a switch dispatching to the bf16/fp16 helpers with a Float passthrough, rejecting other targets.
- Dedupe the image-embed splice's per-pair conversion loops into a single templated castCopy<Src, Dst> in multimodal_prefiller.
- Read the vision input dtype once in getInputShape (ImageShape.dtype) instead of a second method_meta call in encode().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
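The shape of this refactor (a switch that dispatches on the target dtype, plus one templated cast-and-copy loop) can be sketched roughly as follows. Everything here is an assumption for illustration: `DType`, `BF16`, `castCopySketch` and `convertFromFloatSketch` are hypothetical stand-ins, the fp16 branch is omitted for brevity, the bf16 conversion simply truncates, and the Float case copies rather than passing the buffer through. The real signatures in util.h and multimodal_prefiller may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the runtime's scalar type enum.
enum class DType { Float, BFloat16, Int32 };

// Hypothetical minimal bf16 wrapper; truncates instead of rounding.
struct BF16 {
  uint16_t bits = 0;
  BF16() = default;
  explicit BF16(float v) {
    uint32_t u;
    std::memcpy(&u, &v, sizeof(u));
    bits = static_cast<uint16_t>(u >> 16);
  }
  explicit operator float() const {
    const uint32_t u = static_cast<uint32_t>(bits) << 16;
    float v;
    std::memcpy(&v, &u, sizeof(v));
    return v;
  }
};

// One templated loop replaces a hand-written loop per (Src, Dst) pair.
template <typename Src, typename Dst>
void castCopySketch(const Src* src, Dst* dst, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    dst[i] = static_cast<Dst>(src[i]);
  }
}

// Dispatch on the target dtype and reject anything unsupported.
inline void convertFromFloatSketch(const float* src, void* dst,
                                   std::size_t count, DType target) {
  switch (target) {
    case DType::Float:
      castCopySketch<float, float>(src, static_cast<float*>(dst), count);
      return;
    case DType::BFloat16:
      castCopySketch<float, BF16>(src, static_cast<BF16*>(dst), count);
      return;
    default:
      throw std::invalid_argument("unsupported target dtype");
  }
}
```

Keeping the per-element conversion inside the element type (here via explicit conversions on `BF16`) is what lets a single template cover every source/destination pair.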

Description
Adds MLX (Apple GPU) builds for the LFM2.5 family and the privacy-filter
models, wiring the HF-hosted MLX variants into the model registry so they
default to MLX on iOS (with the existing XNNPACK builds as the
simulator / Android fallback):
- LFM2.5 (350M, 1.2B)
- LFM2.5-VL (450M, 1.6B)
- privacy filters (openai, nemotron)
Runner changes needed to drive the new builds:
- vision_encoder now reads its declared input dtype from method metadata and converts the preprocessed fp32 pixels to match (fp32 passthrough, bf16, or fp16) instead of hardcoding Float. Required for bf16 MLX vision encoders; the fp32 path stays zero-copy.
- The multimodal prefiller splice handles fp32 <-> bf16 vision/text-embed dtype pairs (a hybrid where the vision encoder is fp32 and the decoder embeds are bf16), in addition to the existing fp32 <-> fp16 cases.
- convert_from_float helper in runner/util.h.

Review order: modelUrls.ts + modelRegistry.ts (the wiring) first, then the three common/runner files (the dtype handling).

Introduces a breaking change?
Type of change
Tested on
Testing instructions
- default backend (resolves to MLX on iOS), e.g. models.llm.lfm2_5_vl_450m(), models.privacy_filter.openai().
- models generate correctly (the VL models with an image).
- back to XNNPACK (or pass { backend: 'xnnpack' } explicitly).

Checklist