release: v0.9.2 by chmjkb · Pull Request #1259 · software-mansion/react-native-executorch

chmjkb · 2026-06-17T14:25:22Z

Description

Adds a patch release with RF-DETR keypoint preview support.

Introduces a breaking change?

Yes
No

Type of change

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
Documentation update (improves or adds clarity to existing documentation)
Other (chores, tests, code style improvements etc.)

Tested on

iOS
Android

Testing instructions

Screenshots

Related issues

Checklist

I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have updated the documentation accordingly
My changes generate no new warnings

Additional notes

Register the RF-DETR keypoint preview pose model with xnnpack, coreml and mlx backends (all fp32). This is a beta preview export and may be re-exported under a different constant once a stable version ships.

- modelUrls/modelRegistry: add the three backend URLs and variant map
- PoseEstimationModule/types: register the model config (single-`forward` export, no inputSize axis) and extend PoseEstimationModelSources
- demo: load it via usePoseEstimation in the pose estimation screen
- docs: list it in the model registry and usePoseEstimation supported models

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

msluszniak · 2026-06-17T14:37:26Z

We should definitely include the fix for the vision encoder in this patch. Please check whether there are other applicable additions.

## Description

In any multimodal conversation with more than one image, the model starts describing earlier images as the most recently sent one on later turns.

`VisionEncoder::encode` caches the `EValue` returned by `vision_encoder.execute()` per image path. That tensor aliases the method's reusable output buffer, so the next `execute()` (the second image, or any later encode) overwrites the bytes behind every cached entry. On re-prefilled turns the prefiller then splices the latest image's embeddings into every image slot. The audio path already snapshots its encoder output for exactly this reason (see the `AudioSlot` comment in `multimodal_prefiller.cpp`); vision never got the same treatment.

The fix copies the encoder output into bytes owned by the cache entry immediately after `execute()` and serves cache hits from a tensor wrapping those owned bytes (`unordered_map` nodes are pointer-stable, so the blob stays valid).

The bug is backend-independent (the cache sits above the delegate), so XNNPACK/Vulkan multimodal models are affected the same way.

### Introduces a breaking change?

- [ ] Yes
- [x] No

### Type of change

- [x] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

### Tested on

- [x] iOS
- [ ] Android

### Testing instructions

1. Run the example LLM app with a multimodal model (e.g. Gemma 4 E2B multimodal) on the Multimodal LLM screen.
2. Send image A with "What's in this picture?"; the answer is correct.
3. Send image B (different content) with the same question; the answer is correct.
4. Ask "What was in the FIRST picture I sent?".

Before this fix, step 4 describes image B's content (both image slots receive B's embeddings on the re-prefilled turn). After the fix, the model correctly recalls image A.

### Screenshots

N/A

### Related issues

N/A

### Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [x] My changes generate no new warnings

### Additional notes

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
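The aliasing hazard and the owned-copy fix described in the vision encoder change can be illustrated with a minimal, self-contained sketch. It does not use the ExecuTorch API: `FakeEncoder`, `OwnedEmbedding` and `EmbeddingCache` are hypothetical stand-ins, and the real `VisionEncoder::encode` wraps the owned bytes in a tensor rather than returning a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an encoder whose execute() returns a view into a
// single reusable output buffer. Every call overwrites the previous result.
struct FakeEncoder {
  std::vector<float> out = std::vector<float>(4);

  const float* execute(float seed) {
    for (std::size_t i = 0; i < out.size(); ++i) {
      out[i] = seed + static_cast<float>(i);
    }
    return out.data();
  }

  std::size_t size() const { return out.size(); }
};

// Cache entry that owns its bytes instead of pointing into the encoder.
struct OwnedEmbedding {
  std::vector<float> data;
};

class EmbeddingCache {
 public:
  explicit EmbeddingCache(FakeEncoder& enc) : enc_(enc) {}

  // Returns a pointer to bytes owned by the cache entry for `path`.
  // `seed` stands in for the decoded image and is only used on a cache miss.
  const float* encode(const std::string& path, float seed) {
    auto it = cache_.find(path);
    if (it != cache_.end()) {
      return it->second.data.data();  // cache hit: serve the owned copy
    }
    const float* view = enc_.execute(seed);
    OwnedEmbedding entry;
    // Snapshot immediately, before another execute() can overwrite the buffer.
    entry.data.assign(view, view + enc_.size());
    // unordered_map elements keep their address across rehashes, so the
    // returned pointer stays valid while the entry is in the cache.
    auto pos = cache_.emplace(path, std::move(entry)).first;
    return pos->second.data.data();
  }

 private:
  FakeEncoder& enc_;
  std::unordered_map<std::string, OwnedEmbedding> cache_;
};
```

Caching the raw `view` pointer instead of the copy would make every entry read back the most recent image, which matches the symptom in the testing instructions.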

## Description

Optimizes token sampling for large-vocabulary models (e.g. Gemma 4 E2B, 262k vocab), where the previous full-vocabulary sort in top-p dominated per-token latency.

Two changes in `sampler.cpp`:

- **`mask_topp`**: replaces the `O(n log n)` sort over all logits with a logit-space histogram (`kBins=2048`) that locates the nucleus threshold in two `O(n)` passes: no sort, no per-token vocab-sized allocation. Binning in logit space (rather than probability space) keeps uniform resolution for both peaked and flat distributions.
- **`softmax`**: skips `exp()` on logits already masked to `lowest()` by top-k/top-p. The result underflows to zero anyway, and the call is slow on device.

On an iPhone 17 Pro with Gemma 4 E2B (int4), per-token sampling drops from ~45 ms to ~10 ms.

The histogram approximates the exact sort-based nucleus; the resulting sampled distribution is statistically equivalent (verified that the kept-mass fraction stays within 1% of the exact nucleus across peaked, flat, and sharp distributions).

### Introduces a breaking change?

- [ ] Yes
- [x] No

### Type of change

- [ ] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [x] Other (chores, tests, code style improvements etc.)

### Tested on

- [x] iOS
- [ ] Android

### Testing instructions

1. Run an LLM with a large vocabulary and a non-zero temperature with `topP` set (e.g. Gemma 4 E2B with `temperature: 0.8`, `topP: 0.9`).
2. Generate a long response and observe tokens/sec.
3. Confirm output remains coherent and sampling is unchanged in character (still stochastic, not greedy).

Greedy decoding (`temperature: 0`) is unaffected; it bypasses this path entirely.

### Screenshots

### Related issues

### Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [x] My changes generate no new warnings

### Additional notes

The histogram is an approximation bounded by bin granularity (`kBins=2048` over a `kRange=40` logit span). This is intentional: exact top-p over a 262k vocab where the nucleus can exceed 100k tokens is inherently expensive, and the sampling outcome is statistically indistinguishable from the exact version.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
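The two sampler changes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `sampler.cpp`: only `kBins=2048`, `kRange=40` and the `lowest()` mask value come from the description, while the function names, the bin layout (bin 0 nearest the maximum logit), the separate pass that finds the maximum, and the choice to keep the whole threshold bin are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

constexpr int kBins = 2048;      // bin count quoted in the PR description
constexpr float kRange = 40.0f;  // logit span quoted in the PR description
constexpr float kMasked = std::numeric_limits<float>::lowest();

// Maps a logit to a histogram bin. Bin 0 holds logits closest to the maximum;
// anything more than kRange below the maximum is clamped into the last bin.
inline int bin_of(float logit, float max_logit) {
  float offset = (max_logit - logit) * (kBins / kRange);
  offset = std::min(offset, static_cast<float>(kBins - 1));  // clamp before cast
  return static_cast<int>(offset);
}

// Approximate top-p: accumulate softmax mass per logit-space bin, walk the
// bins from the top until top_p of the mass is covered, then mask the rest.
void mask_topp_sketch(std::vector<float>& logits, float top_p) {
  if (logits.empty()) return;
  const float max_logit = *std::max_element(logits.begin(), logits.end());

  std::array<double, kBins> mass{};  // fixed size, independent of vocab size
  double total = 0.0;
  for (float l : logits) {
    if (l == kMasked) continue;  // already removed by an earlier filter
    const double w = std::exp(static_cast<double>(l - max_logit));
    mass[bin_of(l, max_logit)] += w;
    total += w;
  }

  // Find the first bin at which the cumulative mass reaches top_p.
  int cutoff = 0;
  double kept = 0.0;
  for (; cutoff < kBins - 1; ++cutoff) {
    kept += mass[cutoff];
    if (kept >= top_p * total) break;
  }

  // Mask every logit that falls in a bin below the threshold bin.
  for (float& l : logits) {
    if (l != kMasked && bin_of(l, max_logit) > cutoff) l = kMasked;
  }
}

// Softmax that skips exp() for masked logits and writes 0 directly.
void softmax_sketch(std::vector<float>& logits) {
  if (logits.empty()) return;
  const float max_logit = *std::max_element(logits.begin(), logits.end());
  double sum = 0.0;
  for (float& l : logits) {
    if (l == kMasked) {
      l = 0.0f;  // exp() would underflow to zero anyway
      continue;
    }
    l = std::exp(l - max_logit);
    sum += l;
  }
  if (sum <= 0.0) return;  // everything was masked; nothing to normalize
  for (float& l : logits) l = static_cast<float>(l / sum);
}
```

Because the whole threshold bin is kept, this sketch retains at least the requested mass and can overshoot by at most the mass of one bin, which is the granularity bound mentioned in the additional notes.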

chmjkb marked this pull request as ready for review June 17, 2026 14:25

chmjkb changed the title ~~feat: add RF-DETR keypoint preview model (#1257)~~ release: v0.9.0 Jun 17, 2026

chmjkb changed the title ~~release: v0.9.0~~ release: v0.9.2 Jun 17, 2026

chore: bump version in package.json

04b2161

barhanc self-requested a review June 17, 2026 14:36

barhanc approved these changes Jun 17, 2026

chmjkb and others added 3 commits June 17, 2026 16:43

chore: replace model url to point to 0.9

6959fe0

benITo47 approved these changes Jun 17, 2026

chmjkb merged commit 77d176d into release/0.9 Jun 17, 2026
2 checks passed

chmjkb deleted the @chmjkb/patch-0.9 branch June 17, 2026 15:03

chmjkb merged 5 commits into release/0.9 from @chmjkb/patch-0.9

5 participants
