Embeddings generation support #201

chuckbeasley · 2025-07-14T17:40:05Z

Currently, without using Foundry Local, I can generate embeddings using the Phi series ONNX models. It would be nice for Foundry Local to add support for that so that I don't need external dependencies, such as Ollama or a Docker container running the same Phi model.

Answered by alxxdev

chuckbeasley · 2025-07-14T17:41:20Z

Author

Tagging @MaanavD for visibility.

0 replies

behroozbc · 2026-04-06T12:50:29Z

I think this would be a good thing to add to Foundry Local.

0 replies

alxxdev · 2026-06-17T18:37:29Z

Hi @chuckbeasley,

Embeddings are actually already supported in Foundry Local via the OpenAI-compatible /v1/embeddings endpoint. The catalog includes embedding models:

$ foundry model list --type embedding -v
+----------------------+-----------+--------+--------+-------+--------+------------+----------------------------------+------------------------------------+
| Model Name           | Type      | Size   | Device | Tools | Cached | License    | Display Name                     | Model ID                           |
+----------------------+-----------+--------+--------+-------+--------+------------+----------------------------------+------------------------------------+
| qwen3-embedding-0.6b | Embedding | 495 MB | CPU    | ○     | ●      | apache-2.0 | qwen3-embedding-0.6b-generic-cpu | qwen3-embedding-0.6b-generic-cpu:1 |
| qwen3-embedding-8b   | Embedding | 5.6 GB | CPU    | ○     | ○      | apache-2.0 | qwen3-embedding-8b-generic-cpu   | qwen3-embedding-8b-generic-cpu:1   |
+----------------------+-----------+--------+--------+-------+--------+------------+----------------------------------+------------------------------------+

To use them:

foundry model download qwen3-embedding-0.6b
foundry model load qwen3-embedding-0.6b

Then call the API:

curl http://127.0.0.1:<port>/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3-embedding-0.6b","input":"hello world"}'

Or from Python with the OpenAI SDK:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:<port>/v1", api_key="none")
resp = client.embeddings.create(model="qwen3-embedding-0.6b", input="hello world")
print(resp.data[0].embedding)
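
As a rough sketch of what to do with the returned vector, the snippet below compares two embeddings with cosine similarity. It assumes each vector is a plain list of floats, as in `resp.data[0].embedding` above. The `cosine_similarity` helper and the toy vectors are illustrative only and are not part of Foundry Local or the OpenAI SDK.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two embedding lists returned by the API.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 0.0]
print(round(cosine_similarity(v1, v2), 4))
```

With a running server, the two vectors would come from two embedding responses. The OpenAI API also accepts a list of strings as `input`; whether Foundry Local handles batched input the same way is an assumption worth checking.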

There is no CLI command for embeddings yet (only the HTTP API), but the functionality is there. Hope this helps!
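
Until a CLI command exists, a small wrapper script can stand in. The following is a minimal sketch using only the Python standard library. The function names (`build_request`, `embed`, `main`) and the `--port` and `--model` flags are hypothetical and are not part of the Foundry Local CLI. It assumes the response follows the OpenAI shape used above, with the vector at `data[0].embedding`.

```python
import argparse
import json
import urllib.request


def build_request(port, model, text):
    """Build a POST request for the /v1/embeddings endpoint."""
    url = f"http://127.0.0.1:{port}/v1/embeddings"
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(port, model, text):
    """Send the request and return the first embedding as a list of floats."""
    with urllib.request.urlopen(build_request(port, model, text)) as resp:
        payload = json.load(resp)
    # Assumed OpenAI-style response: {"data": [{"embedding": [...]}], ...}
    return payload["data"][0]["embedding"]


def main(argv=None):
    """Hypothetical command-line entry point: print one embedding as JSON."""
    parser = argparse.ArgumentParser(description="Print an embedding as JSON.")
    parser.add_argument("text", help="text to embed")
    parser.add_argument("--port", type=int, required=True,
                        help="port the local service is listening on")
    parser.add_argument("--model", default="qwen3-embedding-0.6b")
    args = parser.parse_args(argv)
    print(json.dumps(embed(args.port, args.model, args.text)))
```

Saved as, say, `embed.py` with a call to `main()` at the bottom, this could be run as `python embed.py "hello world" --port <port>` once the model is loaded.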

0 replies
