feat: add MLX builds for LFM2.5 and privacy-filter models #1266

Open
NorbertKlockiewicz wants to merge 2 commits into main from @nk/mlx-lfm-privacy-filter-models

@NorbertKlockiewicz NorbertKlockiewicz commented Jun 18, 2026

Contributor

Description

Adds MLX (Apple GPU) builds for the LFM2.5 family and the privacy-filter
models, wiring the HF-hosted MLX variants into the model registry so they
default to MLX on iOS (with the existing XNNPACK builds as the
simulator / Android fallback):

  • LFM2.5 text — 350M, 1.2B
  • LFM2.5-VL — 450M, 1.6B
  • Privacy filter — openai, nemotron

Runner changes needed to drive the new builds:

  • vision_encoder now reads its declared input dtype from method metadata
    and converts the preprocessed fp32 pixels to match (fp32 passthrough,
    bf16, or fp16) instead of hardcoding Float. Required for bf16 MLX
    vision encoders; the fp32 path stays zero-copy.
  • The multimodal prefiller's image-embed splice now handles fp32 <-> bf16
    vision/text-embed dtype pairs (a hybrid where the vision encoder outputs
    fp32 and the decoder embeddings are bf16), in addition to the existing
    fp32 <-> fp16 cases.
  • New convert_from_float helper in runner/util.h.
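
As a rough illustration of the dtype handling described above, here is a
minimal sketch of an fp32-to-target conversion with a Float passthrough and
a bf16 path. It is not the code from this PR: DType, float_to_bf16_bits and
convert_from_float_sketch are hypothetical names, the fp16 path is left out
to keep the sketch short, and the fp32 path copies here, whereas the PR
keeps it zero-copy.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the runtime's scalar-type enum.
enum class DType { Float, BFloat16, Half };

// fp32 -> bf16 bit pattern: keep the top 16 bits, round to nearest even.
inline std::uint16_t float_to_bf16_bits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: truncate and set a mantissa bit so the result stays NaN.
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  }
  const std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>((bits + rounding_bias) >> 16);
}

// Convert fp32 values into a raw byte buffer laid out as `target`.
// Assumption: this sketch copies on the Float path for simplicity.
inline std::vector<std::uint8_t> convert_from_float_sketch(
    const std::vector<float>& src, DType target) {
  switch (target) {
    case DType::Float: {
      std::vector<std::uint8_t> out(src.size() * sizeof(float));
      if (!src.empty()) {
        std::memcpy(out.data(), src.data(), out.size());
      }
      return out;
    }
    case DType::BFloat16: {
      std::vector<std::uint8_t> out(src.size() * sizeof(std::uint16_t));
      for (std::size_t i = 0; i < src.size(); ++i) {
        const std::uint16_t half_bits = float_to_bf16_bits(src[i]);
        std::memcpy(out.data() + i * sizeof(half_bits), &half_bits,
                    sizeof(half_bits));
      }
      return out;
    }
    default:
      // Half is omitted in this sketch; other targets are rejected.
      throw std::invalid_argument("unsupported target dtype in this sketch");
  }
}
```

In this sketch a caller would pick `target` from whatever dtype the encoder
method declares for its input, then hand the returned bytes to the tensor.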

Review order: modelUrls.ts + modelRegistry.ts (the wiring) first, then
the three common/runner files (the dtype handling).

Introduces a breaking change?

  • Yes
  • No

Type of change

  • New feature (change which adds functionality)

Tested on

  • iOS

Testing instructions

  1. On a physical iOS device, load each model via its accessor with the
    default backend (resolves to MLX on iOS), e.g.
    models.llm.lfm2_5_vl_450m(), models.privacy_filter.openai().
  2. Verify the privacy filters classify PII and the LFM2.5 / LFM2.5-VL
    models generate correctly (the VL models with an image).
  3. MLX requires a physical device; on the simulator the accessors fall
    back to XNNPACK (or pass { backend: 'xnnpack' } explicitly).

Checklist

  • I have performed a self-review of my code

Wire HF-hosted MLX variants for LFM2.5 (350M, 1.2B), LFM2.5-VL (450M,
1.6B) and the privacy filters (openai, nemotron), defaulting to MLX on
iOS alongside the existing XNNPACK builds.

Runner support for the new builds:
- vision_encoder reads its declared input dtype from method metadata and
  converts the fp32 pixels accordingly (fp32 passthrough / bf16 / fp16),
  instead of hardcoding Float.
- multimodal prefiller splice handles fp32<->bf16 vision/text-embed dtype
  pairs (hybrid fp32 vision + bf16 decoder).
- convert_from_float helper in util.h.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread packages/react-native-executorch/common/runner/util.h Outdated
@msluszniak msluszniak added the feature PRs that implement a new feature label Jun 18, 2026
@msluszniak

Member

When running privacy filter on iOS, I got this error:
[image: error screenshot]

- Extract convert_to_float16 (symmetric to convert_to_bfloat16); rewrite
  convert_from_float as a switch dispatching to the bf16/fp16 helpers with
  a Float passthrough, rejecting other targets.
- Dedupe the image-embed splice's per-pair conversion loops into a single
  templated castCopy<Src, Dst> in multimodal_prefiller.
- Read the vision input dtype once in getInputShape (ImageShape.dtype)
  instead of a second method_meta call in encode().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
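
The deduplication described in this commit can be pictured with a small
sketch: one templated copy loop that converts through float replaces a
separate loop per dtype pair. This is an assumption about the general shape,
not the PR's code. BF16 and cast_copy are hypothetical names, the BF16 type
truncates instead of rounding for brevity, and a real runner would use the
runtime's own bf16/fp16 types.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical minimal bf16 type; truncating conversion, for brevity.
struct BF16 {
  std::uint16_t bits = 0;
  BF16() = default;
  explicit BF16(float value) {
    std::uint32_t raw;
    std::memcpy(&raw, &value, sizeof(raw));
    bits = static_cast<std::uint16_t>(raw >> 16);
  }
  explicit operator float() const {
    const std::uint32_t raw = static_cast<std::uint32_t>(bits) << 16;
    float value;
    std::memcpy(&value, &raw, sizeof(value));
    return value;
  }
};

// Copy `count` elements from src to dst, converting each through float.
// One template covers every (Src, Dst) pair that converts to/from float.
template <typename Src, typename Dst>
void cast_copy(const Src* src, Dst* dst, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    dst[i] = static_cast<Dst>(static_cast<float>(src[i]));
  }
}
```

A splice would then call something like
`cast_copy<float, BF16>(vision_embeds, text_embeds + offset, count)` to write
fp32 vision embeddings into a bf16 text-embedding buffer at the image
position; the names in that call are illustrative only.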
