microsoft/olive-recipes

Olive Recipes For AI Model Optimization Toolkit

This repository complements Olive, the AI model optimization toolkit, and includes recipes demonstrating its extensive features and use cases. Olive users can use these recipes as a reference for optimizing either publicly available AI models or their own proprietary models.

Supported models, architectures, devices and execution providers

Below are lists of the available recipes, grouped by different criteria. Click a link to expand it.

Models grouped by model architecture
bert clip deepseek gemma hiera llama llama3 mistral mobilenet phi3 phi4 qwen2 resnet sam sd vit whisper
google-bert-bert-base-multilingual-cased OFA-Sys-chinese-clip-vit-base-patch16 deepseek-ai-DeepSeek-R1-Distill-Llama-8B google-gemma-3-1b-it sam2.1-hiera-small deepseek-ai-DeepSeek-R1-Distill-Llama-8B meta-llama-Llama-3.1-8B-Instruct mistralai-Mistral-7B-Instruct-v0.2 timm-mobilenetv3_small_100.lamb_in1k microsoft-Phi-3-mini-128k-instruct microsoft-Phi-4-mini-instruct Qwen-Qwen2.5-0.5B-Instruct microsoft-resnet-50 sam-vit-base sd-legacy-stable-diffusion-v1-5 google-vit-base-patch16-224 openai-whisper-large-v3-turbo
google-bert-bert-base-multilingual-cased laion-CLIP-ViT-B-32-laion2B-s34B-b79K deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B google-gemma-4-E2B-it meta-llama-Llama-3.1-8B-Instruct meta-llama-Llama-3.2-1B-Instruct mistralai-Mistral-7B-Instruct-v0.2 microsoft-Phi-3-mini-128k-instruct microsoft-Phi-4-mini-instruct Qwen-Qwen2.5-0.5B-Instruct sam2.1-hiera-small sd2-community-stable-diffusion-2-1 google-vit-base-patch16-224 openai-whisper-large-v3-turbo
intel-bert-base-uncased-mrpc laion-CLIP-ViT-B-32-laion2B-s34B-b79K deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.2-1B-Instruct meta-llama-Llama-3.2-1B-Instruct mistralai-Mistral-7B-Instruct-v0.3 microsoft-Phi-3-mini-128k-instruct microsoft-Phi-4-mini-instruct Qwen-Qwen2.5-0.5B sam2.1-hiera-small google-vit-base-patch16-224 openai-whisper-large-v3-turbo
intel-bert-base-uncased-mrpc openai-clip-vit-base-patch16 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.2-1B-Instruct microsoft-Phi-3-mini-4k-instruct microsoft-Phi-4-mini-reasoning Qwen-Qwen2.5-1.5B-Instruct sam-vit-base
openai-clip-vit-base-patch16 deepseek-ai-DeepSeek-R1-Distill-Qwen-14B meta-llama-Meta-Llama-3-8B microsoft-Phi-3-mini-4k-instruct microsoft-Phi-4-reasoning-plus Qwen-Qwen2.5-1.5B-Instruct
openai-clip-vit-base-patch32 deepseek-ai-DeepSeek-R1-Distill-Qwen-7B microsoft-Phi-3-mini-4k-instruct microsoft-Phi-4-reasoning Qwen-Qwen2.5-1.5B-Instruct
openai-clip-vit-base-patch32 microsoft-Phi-3.5-mini-instruct microsoft-Phi-4 Qwen-Qwen2.5-1.5B-Instruct
openai-clip-vit-large-patch14 microsoft-Phi-3.5-mini-instruct microsoft-Phi-4 Qwen-Qwen2.5-14B-Instruct
microsoft-Phi-3.5-mini-instruct Qwen-Qwen2.5-14B-Instruct
microsoft-Phi-3.5-mini-instruct Qwen-Qwen2.5-3B-Instruct
microsoft-Phi-4 Qwen-Qwen2.5-7B-Instruct
Qwen-Qwen2.5-7B-Instruct
Qwen-Qwen2.5-Coder-0.5B-Instruct
Qwen-Qwen2.5-Coder-0.5B-Instruct
Qwen-Qwen2.5-Coder-1.5B-Instruct
Qwen-Qwen2.5-Coder-1.5B-Instruct
Qwen-Qwen2.5-Coder-14B-Instruct
Qwen-Qwen2.5-Coder-14B-Instruct
Qwen-Qwen2.5-Coder-3B-Instruct
Qwen-Qwen2.5-Coder-7B-Instruct
Qwen-Qwen2.5-Coder-7B-Instruct
deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B
deepseek-ai-DeepSeek-R1-Distill-Qwen-14B
deepseek-ai-DeepSeek-R1-Distill-Qwen-7B
Models grouped by device
cpu gpu npu
OFA-Sys-chinese-clip-vit-base-patch16 DeepSeek-R1-Distill-Llama-8B_Model_Builder_INT4 OFA-Sys-chinese-clip-vit-base-patch16
Qwen-Qwen2.5-0.5B-Instruct DeepSeek-R1-Distill-Qwen-1.5B_Model_Builder_FP16 Qwen-Qwen2.5-0.5B-Instruct
Qwen-Qwen2.5-0.5B DeepSeek-R1-Distill-Qwen-14B_NVMO_INT4_AWQ Qwen-Qwen2.5-0.5B-Instruct
Qwen-Qwen2.5-1.5B-Instruct DeepSeek-R1-Distill-Qwen-7B_NVMO_INT4_RTN Qwen-Qwen2.5-1.5B-Instruct
Qwen-Qwen2.5-14B-Instruct Llama-3.2-1B-Instruct_Model_Builder_FP16 Qwen-Qwen2.5-1.5B-Instruct
Qwen-Qwen2.5-3B-Instruct Llama3.1-8B-Instruct_Model_Builder_INT4 Qwen-Qwen2.5-1.5B-Instruct
Qwen-Qwen2.5-7B-Instruct Mistral-7B-Instruct-v0.2_Model_Builder_INT4 Qwen-Qwen2.5-1.5B-Instruct
Qwen-Qwen2.5-Coder-0.5B-Instruct OFA-Sys-chinese-clip-vit-base-patch16 Qwen-Qwen2.5-14B-Instruct
Qwen-Qwen2.5-Coder-1.5B-Instruct Phi-3-mini-128k-instruct_NVMO_INT4_RTN Qwen-Qwen2.5-3B-Instruct
Qwen-Qwen2.5-Coder-14B-Instruct Phi-3-mini-4k-instruct_Model_Builder_INT4 Qwen-Qwen2.5-7B-Instruct
Qwen-Qwen2.5-Coder-3B-Instruct Phi3.5_Mini_Instruct_Model_Builder_INT4 Qwen-Qwen2.5-7B-Instruct
Qwen-Qwen2.5-Coder-7B-Instruct Qwen-Qwen2.5-0.5B-Instruct Qwen-Qwen2.5-7B-Instruct
alibaba-nlp-gte-large-en-v1.5 Qwen-Qwen2.5-0.5B-Instruct Qwen-Qwen2.5-Coder-0.5B-Instruct
deepseek-ai-DeepSeek-R1-Distill-Llama-8B Qwen-Qwen2.5-0.5B Qwen-Qwen2.5-Coder-0.5B-Instruct
deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B Qwen-Qwen2.5-1.5B-Instruct-mixed Qwen-Qwen2.5-Coder-1.5B-Instruct
deepseek-ai-DeepSeek-R1-Distill-Qwen-14B Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-1.5B-Instruct
deepseek-ai-DeepSeek-R1-Distill-Qwen-7B Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-14B-Instruct
facebook-opt-125m-splicegpt Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-3B-Instruct
gemma-3-1b-it_model_builder_cpu_FP32 Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-7B-Instruct
gemma4-e2b-fp32-cpu Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-7B-Instruct
gemma4-e2b-int4-kquant-cpu Qwen-Qwen2.5-1.5B-Instruct deepseek-ai-DeepSeek-R1-Distill-Llama-8B
google-bert-bert-base-multilingual-cased Qwen-Qwen2.5-14B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B
google-gemma Qwen-Qwen2.5-14B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B
google-vit-base-patch16-224 Qwen-Qwen2.5-3B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B
gpt-oss-20b Qwen-Qwen2.5-7B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-14B
intel-bert-base-uncased-mrpc (ov) Qwen-Qwen2.5-7B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-7B
intel-bert-base-uncased-mrpc-inc-smooth-quant Qwen-Qwen2.5-Coder-0.5B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-7B
intel-bert-base-uncased-mrpc-ptq Qwen-Qwen2.5-Coder-0.5B-Instruct google-bert-bert-base-multilingual-cased
laion-CLIP-ViT-B-32-laion2B-s34B-b79K Qwen-Qwen2.5-Coder-1.5B-Instruct google-bert-bert-base-multilingual-cased
meta-llama-Llama-3.1-8B-Instruct Qwen-Qwen2.5-Coder-1.5B-Instruct google-bert-bert-base-multilingual-cased
meta-llama-Llama-3.2-1B-Instruct-dora Qwen-Qwen2.5-Coder-14B-Instruct google-gemma-3-1b-it
meta-llama-Llama-3.2-1B-Instruct-hqq Qwen-Qwen2.5-Coder-14B-Instruct google-vit-base-patch16-224
meta-llama-Llama-3.2-1B-Instruct-lmeval-onnx Qwen-Qwen2.5-Coder-3B-Instruct google-vit-base-patch16-224
meta-llama-Llama-3.2-1B-Instruct-lmeval Qwen-Qwen2.5-Coder-7B-Instruct google-vit-base-patch16-224
meta-llama-Llama-3.2-1B-Instruct-loha Qwen-Qwen2.5-Coder-7B-Instruct google-vit-base-patch16-224
meta-llama-Llama-3.2-1B-Instruct-lokr Qwen2.5-0.5B-Instruct_Model_Builder_FP16 google-vit-base-patch16-224
meta-llama-Llama-3.2-1B-Instruct-mixed Qwen2.5-14B-Instruct_Model_Builder_INT4 intel-bert-base-uncased-mrpc (AMD)
meta-llama-Llama-3.2-1B-Instruct-qlora Qwen2.5-7B-Instruct_Model_Builder_INT4 intel-bert-base-uncased-mrpc (ov)
meta-llama-Llama-3.2-1B-Instruct Qwen2.5-Coder-0.5B-Instruct_Model_Builder_FP16 intel-bert-base-uncased-mrpc
meta-llama-Meta-Llama-3-8B Qwen2.5-Coder-1.5B-Instruct_Model_Builder_FP16 laion-CLIP-ViT-B-32-laion2B-s34B-b79K
microsoft-Phi-3-mini-128k-instruct Qwen2.5-Coder-14B-Instruct_Model_Builder_INT4 laion-CLIP-ViT-B-32-laion2B-s34B-b79K
microsoft-Phi-3-mini-4k-instruct Qwen2.5-Coder-7B-Instruct_Model_Builder_INT4 laion-CLIP-ViT-B-32-laion2B-s34B-b79K
microsoft-Phi-3.5-mini-instruct Qwen2.5_1.5B_Instruct_Model_Builder_FP16 llama3.1-8b-instruct-x-elite
microsoft-Phi-4-mini-instruct deepseek-ai-DeepSeek-R1-Distill-Llama-8B llama3.1-8b-instruct-x2-elite
microsoft-Phi-4-mini-reasoning deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.1-8B-Instruct
microsoft-Phi-4 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.1-8B-Instruct
microsoft-deberta-base-mnli deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.1-8B-Instruct
microsoft-resnet-50 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.2-1B-Instruct
ministral_3_3b deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.2-1B-Instruct
mistralai-Mistral-7B-Instruct-v0.2 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.2-1B-Instruct
mistralai-Mistral-7B-Instruct-v0.3 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B microsoft-Phi-3-mini-128k-instruct
moonshine-tiny deepseek-ai-DeepSeek-R1-Distill-Qwen-14B microsoft-Phi-3-mini-128k-instruct
openai-clip-vit-base-patch16 deepseek-ai-DeepSeek-R1-Distill-Qwen-14B microsoft-Phi-3-mini-128k-instruct
openai-clip-vit-base-patch32 deepseek-ai-DeepSeek-R1-Distill-Qwen-7B microsoft-Phi-3-mini-128k-instruct
openai-clip-vit-large-patch14 deepseek-ai-DeepSeek-R1-Distill-Qwen-7B microsoft-Phi-3-mini-4k-instruct
openai-whisper-base-cpu-int8 facebook-opt-125m-splicegpt microsoft-Phi-3-mini-4k-instruct
openai-whisper-base.en-cpu-int8 gemma4-e2b-fp16-cuda microsoft-Phi-3-mini-4k-instruct
openai-whisper-large-cpu-int8 gemma4-e2b-int4-kquant-cuda microsoft-Phi-3-mini-4k-instruct
openai-whisper-large-v2-cpu-int8 google-bert-bert-base-multilingual-cased microsoft-Phi-3.5-mini-instruct
openai-whisper-large-v3-cpu-int8 google-bert-bert-base-multilingual-cased microsoft-Phi-3.5-mini-instruct
openai-whisper-large-v3-turbo-cpu-int8 google-bert-bert-base-multilingual-cased microsoft-Phi-3.5-mini-instruct
openai-whisper-large-v3-turbo google-bert-bert-base-multilingual-cased microsoft-Phi-3.5-mini-instruct
openai-whisper-large-v3-turbo google-bert-bert-base-multilingual-cased microsoft-Phi-3.5-mini-instruct
openai-whisper-large-v3-turbo google-bert-bert-base-multilingual-cased microsoft-Phi-4-mini-instruct
openai-whisper-large-v3-turbo google-gemma-3-1b-it microsoft-Phi-4-mini-instruct
openai-whisper-large-v3-turbo google-gemma microsoft-Phi-4-mini-instruct
openai-whisper-medium-cpu-int8 google-vit-base-patch16-224 microsoft-Phi-4-mini-instruct
openai-whisper-medium.en-cpu-int8 google-vit-base-patch16-224 microsoft-Phi-4-mini-reasoning
openai-whisper-small-cpu-int8 google-vit-base-patch16-224 microsoft-Phi-4-mini-reasoning
openai-whisper-small.en-cpu-int8 google-vit-base-patch16-224 microsoft-Phi-4-reasoning-plus
openai-whisper-tiny-cpu-int8 google-vit-base-patch16-224 microsoft-Phi-4-reasoning
openai-whisper-tiny.en-cpu-int8 google-vit-base-patch16-224 microsoft-Phi-4-reasoning
qwen2.5-vl-3B-Instruct gpt-oss-20b microsoft-Phi-4-reasoning
qwen3.5-0.8B-Instruct intel-bert-base-uncased-mrpc (ov) microsoft-Phi-4-reasoning
qwen3.5-2B intel-bert-base-uncased-mrpc-inc-quant microsoft-resnet-50
qwen3.5-35B-A3B-MoE intel-bert-base-uncased-mrpc-ptq microsoft-resnet-50
qwen3.5-4B intel-bert-base-uncased-mrpc microsoft-resnet-50
qwen3.5-9B intel-bert-base-uncased-mrpc microsoft-table-transformer-detection
qwen3vl-2B-Instruct intel-bert-base-uncased-mrpc mistralai-Mistral-7B-Instruct-v0.2
qwen3vl-4B-Instruct intel-bert-base-uncased-mrpc mistralai-Mistral-7B-Instruct-v0.2
qwen3vl-8B-Instruct intel-bert-base-uncased-mrpc openai-clip-vit-base-patch16
sam2.1-hiera-small laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch16
sam2.1-hiera-small laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch16
sam2.1-hiera-small laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch32
sd-legacy-stable-diffusion-v1-5 laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch32
sd2-community-stable-diffusion-2-1 laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch32
sshleifer-tiny-gpt2-sparsegpt laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-large-patch14
stable-diffusion-v1-4-safety-checker meta-llama-Llama-3.1-8B-Instruct openai-clip-vit-large-patch14
stable-diffusion-v1-4-text-encoder meta-llama-Llama-3.1-8B-Instruct openai-clip-vit-large-patch14
stable-diffusion-v1-4-unet meta-llama-Llama-3.1-8B-Instruct openai-whisper-large-v3-turbo
stable-diffusion-v1-4-vae-decoder meta-llama-Llama-3.1-8B-Instruct openai-whisper-large-v3-turbo
stable-diffusion-v1-4-vae-encoder meta-llama-Llama-3.2-1B-Instruct-dora openai-whisper-large-v3-turbo
stable-diffusion-v1-5-safety-checker meta-llama-Llama-3.2-1B-Instruct-hqq openai-whisper-large-v3-turbo
stable-diffusion-v1-5-text-encoder meta-llama-Llama-3.2-1B-Instruct-lmeval-onnx openai-whisper-large-v3-turbo
stable-diffusion-v1-5-unet meta-llama-Llama-3.2-1B-Instruct-lmeval openai-whisper-large-v3-turbo
stable-diffusion-v1-5-vae-decoder meta-llama-Llama-3.2-1B-Instruct-loha openai-whisper-large-v3-turbo
stable-diffusion-v1-5-vae-encoder meta-llama-Llama-3.2-1B-Instruct-lokr openai-whisper-large-v3-turbo
stable-diffusion-v1-5 meta-llama-Llama-3.2-1B-Instruct-mixed openai-whisper-large-v3-turbo
stable-diffusion-xl-base-1.0 meta-llama-Llama-3.2-1B-Instruct-qlora qwen2.5-7b-instruct
timm-mobilenetv3_small_100.lamb_in1k meta-llama-Llama-3.2-1B-Instruct sam-vit-base
translategemma-4b-it meta-llama-Llama-3.2-1B-Instruct sam-vit-base
videochat-flash-qwen2_5-7b-internvideo2-1b meta-llama-Llama-3.2-1B-Instruct sam-vit-base
meta-llama-Llama-3.2-1B-Instruct sam-vit-base
meta-llama-Llama-3.2-1B-Instruct sam-vit-base
meta-llama-Llama-3.2-1B-Instruct sam2.1-hiera-small
microsoft-Phi-3-mini-128k-instruct sam2.1-hiera-small
microsoft-Phi-3-mini-128k-instruct sam2.1-hiera-small
microsoft-Phi-3-mini-4k-instruct sam2.1-hiera-small
microsoft-Phi-3-mini-4k-instruct sam2.1-hiera-small
microsoft-Phi-3.5-mini-instruct sam2.1-hiera-small
microsoft-Phi-3.5-mini-instruct sd-legacy-stable-diffusion-v1-5
microsoft-Phi-3.5-mini-instruct sd-legacy-stable-diffusion-v1-5
microsoft-Phi-3.5-mini-instruct sd2-community-stable-diffusion-2-1
microsoft-Phi-3.5-mini-instruct sd2-community-stable-diffusion-2-1
microsoft-Phi-3.5-mini-instruct stable-diffusion-v1-4-safety-checker
microsoft-Phi-3.5-mini-instruct stable-diffusion-v1-4-text-encoder
microsoft-Phi-4-mini-instruct-mixed-tied stable-diffusion-v1-4-unet
microsoft-Phi-4-mini-instruct-mixed stable-diffusion-v1-4-vae-decoder
microsoft-Phi-4-mini-instruct stable-diffusion-v1-4-vae-encoder
microsoft-Phi-4-mini-instruct stable-diffusion-v1-5-safety-checker
microsoft-Phi-4-mini-instruct stable-diffusion-v1-5-text-encoder
microsoft-Phi-4-mini-instruct stable-diffusion-v1-5-unet
microsoft-Phi-4-mini-instruct_nvmo_ptq_mixed_precision_awq_lite stable-diffusion-v1-5-vae-decoder
microsoft-Phi-4-mini-reasoning stable-diffusion-v1-5-vae-encoder
microsoft-Phi-4-mini-reasoning stable-diffusion-v1-5
microsoft-Phi-4-mini-reasoning stable-diffusion-xl-base-1.0
microsoft-Phi-4-reasoning-plus timm-mobilenetv3_small_100.lamb_in1k
microsoft-Phi-4-reasoning timm-mobilenetv3_small_100.lamb_in1k
microsoft-Phi-4
microsoft-Phi-4
microsoft-Phi-4
microsoft-resnet-50
microsoft-resnet-50
microsoft-resnet-50
microsoft-resnet-50
microsoft-resnet-50
ministral_3_3b
mistral-7b
mistral-7b
mistral-7b
mistralai-Mistral-7B-Instruct-v0.2
mistralai-Mistral-7B-Instruct-v0.2
mistralai-Mistral-7B-Instruct-v0.3
moonshine-tiny
openai-clip-vit-base-patch16
openai-clip-vit-base-patch16
openai-clip-vit-base-patch16
openai-clip-vit-base-patch16
openai-clip-vit-base-patch16
openai-clip-vit-base-patch16
openai-clip-vit-base-patch32
openai-clip-vit-base-patch32
openai-clip-vit-base-patch32
openai-clip-vit-base-patch32
openai-clip-vit-base-patch32
openai-clip-vit-base-patch32
openai-clip-vit-large-patch14
openai-clip-vit-large-patch14
openai-clip-vit-large-patch14
openai-clip-vit-large-patch14
openai-whisper-base-cuda-int8
openai-whisper-base-webgpu-int8
openai-whisper-base.en-cuda-int8
openai-whisper-base.en-webgpu-int8
openai-whisper-large-cuda-int8
openai-whisper-large-v2-cuda-int8
openai-whisper-large-v2-webgpu-int8
openai-whisper-large-v3-cuda-int8
openai-whisper-large-v3-turbo-cuda-int8
openai-whisper-large-v3-turbo-webgpu-int8
openai-whisper-large-v3-turbo
openai-whisper-large-v3-webgpu-int8
openai-whisper-large-webgpu-int8
openai-whisper-medium-cuda-int8
openai-whisper-medium-webgpu-int8
openai-whisper-medium.en-cuda-int8
openai-whisper-medium.en-webgpu-int8
openai-whisper-small-cuda-int8
openai-whisper-small-webgpu-int8
openai-whisper-small.en-cuda-int8
openai-whisper-small.en-webgpu-int8
openai-whisper-tiny-cuda-int8
openai-whisper-tiny-webgpu-int8
openai-whisper-tiny.en-cuda-int8
openai-whisper-tiny.en-webgpu-int8
phi-4_Model_Builder_INT4
qwen2.5-vl-3B-Instruct
qwen3.5-0.8B-Instruct
qwen3.5-27B
qwen3.5-2B
qwen3.5-4B
qwen3.5-9B
qwen3vl-2B-Instruct
qwen3vl-4B-Instruct
qwen3vl-8B-Instruct
sam2.1-hiera-small
sam2.1-hiera-small
sam2.1-hiera-small
sd-legacy-stable-diffusion-v1-5
sd2-community-stable-diffusion-2-1
sshleifer-tiny-gpt2-sparsegpt
stable-diffusion-v1-4-safety-checker
stable-diffusion-v1-4-text-encoder
stable-diffusion-v1-4-unet
stable-diffusion-v1-4-vae-decoder
stable-diffusion-v1-4-vae-encoder
stable-diffusion-v1-5
stable-diffusion-xl-base-1.0
Models grouped by EP
CPU CUDA Dml MIGraphX NvTensorRTRTX OpenVINO QNN VitisAI WebGpu
alibaba-nlp-gte-large-en-v1.5 Qwen-Qwen2.5-1.5B-Instruct-mixed Qwen-Qwen2.5-1.5B-Instruct google-bert-bert-base-multilingual-cased DeepSeek-R1-Distill-Llama-8B_Model_Builder_INT4 OFA-Sys-chinese-clip-vit-base-patch16 Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-0.5B-Instruct ministral_3_3b
facebook-opt-125m-splicegpt deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B google-vit-base-patch16-224 DeepSeek-R1-Distill-Qwen-1.5B_Model_Builder_FP16 Qwen-Qwen2.5-0.5B-Instruct Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-1.5B-Instruct openai-whisper-base-webgpu-int8
gemma-3-1b-it_model_builder_cpu_FP32 facebook-opt-125m-splicegpt google-bert-bert-base-multilingual-cased intel-bert-base-uncased-mrpc DeepSeek-R1-Distill-Qwen-14B_NVMO_INT4_AWQ Qwen-Qwen2.5-0.5B-Instruct Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-7B-Instruct openai-whisper-base.en-webgpu-int8
gemma4-e2b-fp32-cpu gemma4-e2b-fp16-cuda google-vit-base-patch16-224 laion-CLIP-ViT-B-32-laion2B-s34B-b79K DeepSeek-R1-Distill-Qwen-7B_NVMO_INT4_RTN Qwen-Qwen2.5-0.5B Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-0.5B-Instruct openai-whisper-large-v2-webgpu-int8
gemma4-e2b-int4-kquant-cpu gemma4-e2b-int4-kquant-cuda intel-bert-base-uncased-mrpc microsoft-resnet-50 Llama-3.2-1B-Instruct_Model_Builder_FP16 Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-Coder-1.5B-Instruct openai-whisper-large-v3-turbo-webgpu-int8
google-gemma google-gemma laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch16 Llama3.1-8B-Instruct_Model_Builder_INT4 Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-7B-Instruct Qwen-Qwen2.5-Coder-7B-Instruct openai-whisper-large-v3-webgpu-int8
gpt-oss-20b gpt-oss-20b meta-llama-Llama-3.1-8B-Instruct openai-clip-vit-base-patch32 Mistral-7B-Instruct-v0.2_Model_Builder_INT4 Qwen-Qwen2.5-14B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B openai-whisper-large-webgpu-int8
intel-bert-base-uncased-mrpc-inc-smooth-quant intel-bert-base-uncased-mrpc-inc-quant meta-llama-Llama-3.2-1B-Instruct openai-clip-vit-large-patch14 Phi-3-mini-128k-instruct_NVMO_INT4_RTN Qwen-Qwen2.5-14B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B deepseek-ai-DeepSeek-R1-Distill-Qwen-7B openai-whisper-medium-webgpu-int8
intel-bert-base-uncased-mrpc-ptq intel-bert-base-uncased-mrpc-ptq microsoft-Phi-3.5-mini-instruct Phi-3-mini-4k-instruct_Model_Builder_INT4 Qwen-Qwen2.5-3B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B google-bert-bert-base-multilingual-cased openai-whisper-medium.en-webgpu-int8
meta-llama-Llama-3.2-1B-Instruct-dora meta-llama-Llama-3.2-1B-Instruct-dora microsoft-resnet-50 Phi3.5_Mini_Instruct_Model_Builder_INT4 Qwen-Qwen2.5-3B-Instruct deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B google-vit-base-patch16-224 openai-whisper-small-webgpu-int8
meta-llama-Llama-3.2-1B-Instruct-hqq meta-llama-Llama-3.2-1B-Instruct-hqq openai-clip-vit-base-patch16 Qwen-Qwen2.5-0.5B-Instruct Qwen-Qwen2.5-7B-Instruct google-bert-bert-base-multilingual-cased intel-bert-base-uncased-mrpc (AMD) openai-whisper-small.en-webgpu-int8
meta-llama-Llama-3.2-1B-Instruct-lmeval-onnx meta-llama-Llama-3.2-1B-Instruct-lmeval-onnx openai-clip-vit-base-patch32 Qwen-Qwen2.5-1.5B-Instruct Qwen-Qwen2.5-7B-Instruct google-bert-bert-base-multilingual-cased laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-whisper-tiny-webgpu-int8
meta-llama-Llama-3.2-1B-Instruct-lmeval meta-llama-Llama-3.2-1B-Instruct-lmeval openai-clip-vit-large-patch14 Qwen-Qwen2.5-14B-Instruct Qwen-Qwen2.5-Coder-0.5B-Instruct google-bert-bert-base-multilingual-cased meta-llama-Llama-3.1-8B-Instruct openai-whisper-tiny.en-webgpu-int8
meta-llama-Llama-3.2-1B-Instruct-loha meta-llama-Llama-3.2-1B-Instruct-loha Qwen-Qwen2.5-7B-Instruct Qwen-Qwen2.5-Coder-0.5B-Instruct google-vit-base-patch16-224 meta-llama-Llama-3.2-1B-Instruct
meta-llama-Llama-3.2-1B-Instruct-lokr meta-llama-Llama-3.2-1B-Instruct-lokr Qwen-Qwen2.5-Coder-0.5B-Instruct Qwen-Qwen2.5-Coder-1.5B-Instruct google-vit-base-patch16-224 microsoft-Phi-3-mini-128k-instruct
meta-llama-Llama-3.2-1B-Instruct-mixed meta-llama-Llama-3.2-1B-Instruct-mixed Qwen-Qwen2.5-Coder-1.5B-Instruct Qwen-Qwen2.5-Coder-1.5B-Instruct google-vit-base-patch16-224 microsoft-Phi-3-mini-4k-instruct
meta-llama-Llama-3.2-1B-Instruct-qlora meta-llama-Llama-3.2-1B-Instruct-qlora Qwen-Qwen2.5-Coder-14B-Instruct Qwen-Qwen2.5-Coder-14B-Instruct google-vit-base-patch16-224 microsoft-Phi-3.5-mini-instruct
meta-llama-Meta-Llama-3-8B microsoft-Phi-3.5-mini-instruct Qwen-Qwen2.5-Coder-7B-Instruct Qwen-Qwen2.5-Coder-14B-Instruct intel-bert-base-uncased-mrpc microsoft-Phi-4-mini-instruct
microsoft-deberta-base-mnli microsoft-Phi-4-mini-instruct-mixed-tied Qwen2.5-0.5B-Instruct_Model_Builder_FP16 Qwen-Qwen2.5-Coder-3B-Instruct intel-bert-base-uncased-mrpc microsoft-Phi-4-mini-reasoning
ministral_3_3b microsoft-Phi-4-mini-instruct-mixed Qwen2.5-14B-Instruct_Model_Builder_INT4 Qwen-Qwen2.5-Coder-3B-Instruct intel-bert-base-uncased-mrpc microsoft-resnet-50
moonshine-tiny ministral_3_3b Qwen2.5-7B-Instruct_Model_Builder_INT4 Qwen-Qwen2.5-Coder-7B-Instruct laion-CLIP-ViT-B-32-laion2B-s34B-b79K mistralai-Mistral-7B-Instruct-v0.2
openai-whisper-base-cpu-int8 mistral-7b Qwen2.5-Coder-0.5B-Instruct_Model_Builder_FP16 Qwen-Qwen2.5-Coder-7B-Instruct laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch16
openai-whisper-base.en-cpu-int8 mistral-7b Qwen2.5-Coder-1.5B-Instruct_Model_Builder_FP16 deepseek-ai-DeepSeek-R1-Distill-Llama-8B laion-CLIP-ViT-B-32-laion2B-s34B-b79K openai-clip-vit-base-patch32
openai-whisper-large-cpu-int8 mistral-7b Qwen2.5-Coder-14B-Instruct_Model_Builder_INT4 deepseek-ai-DeepSeek-R1-Distill-Llama-8B llama3.1-8b-instruct-x-elite openai-clip-vit-large-patch14
openai-whisper-large-v2-cpu-int8 moonshine-tiny Qwen2.5-Coder-7B-Instruct_Model_Builder_INT4 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B llama3.1-8b-instruct-x2-elite stable-diffusion-v1-5-safety-checker
openai-whisper-large-v3-cpu-int8 openai-whisper-base-cuda-int8 Qwen2.5_1.5B_Instruct_Model_Builder_FP16 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B meta-llama-Llama-3.1-8B-Instruct stable-diffusion-v1-5-text-encoder
openai-whisper-large-v3-turbo-cpu-int8 openai-whisper-base.en-cuda-int8 deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B deepseek-ai-DeepSeek-R1-Distill-Qwen-14B meta-llama-Llama-3.1-8B-Instruct stable-diffusion-v1-5-unet
openai-whisper-large-v3-turbo openai-whisper-large-cuda-int8 deepseek-ai-DeepSeek-R1-Distill-Qwen-14B deepseek-ai-DeepSeek-R1-Distill-Qwen-14B meta-llama-Llama-3.2-1B-Instruct stable-diffusion-v1-5-vae-decoder
openai-whisper-large-v3-turbo openai-whisper-large-v2-cuda-int8 deepseek-ai-DeepSeek-R1-Distill-Qwen-7B deepseek-ai-DeepSeek-R1-Distill-Qwen-7B meta-llama-Llama-3.2-1B-Instruct stable-diffusion-v1-5-vae-encoder
openai-whisper-large-v3-turbo openai-whisper-large-v3-cuda-int8 google-bert-bert-base-multilingual-cased deepseek-ai-DeepSeek-R1-Distill-Qwen-7B meta-llama-Llama-3.2-1B-Instruct timm-mobilenetv3_small_100.lamb_in1k
openai-whisper-large-v3-turbo openai-whisper-large-v3-turbo-cuda-int8 google-vit-base-patch16-224 google-bert-bert-base-multilingual-cased meta-llama-Llama-3.2-1B-Instruct
openai-whisper-medium-cpu-int8 openai-whisper-medium-cuda-int8 intel-bert-base-uncased-mrpc google-gemma-3-1b-it microsoft-Phi-3-mini-128k-instruct
openai-whisper-medium.en-cpu-int8 openai-whisper-medium.en-cuda-int8 laion-CLIP-ViT-B-32-laion2B-s34B-b79K google-gemma-3-1b-it microsoft-Phi-3-mini-128k-instruct
openai-whisper-small-cpu-int8 openai-whisper-small-cuda-int8 meta-llama-Llama-3.1-8B-Instruct google-vit-base-patch16-224 microsoft-Phi-3-mini-4k-instruct
openai-whisper-small.en-cpu-int8 openai-whisper-small.en-cuda-int8 meta-llama-Llama-3.2-1B-Instruct google-vit-base-patch16-224 microsoft-Phi-3-mini-4k-instruct
openai-whisper-tiny-cpu-int8 openai-whisper-tiny-cuda-int8 microsoft-Phi-3-mini-128k-instruct intel-bert-base-uncased-mrpc (ov) microsoft-Phi-3.5-mini-instruct
openai-whisper-tiny.en-cpu-int8 openai-whisper-tiny.en-cuda-int8 microsoft-Phi-3-mini-4k-instruct laion-CLIP-ViT-B-32-laion2B-s34B-b79K microsoft-Phi-3.5-mini-instruct
qwen2.5-vl-3B-Instruct qwen2.5-vl-3B-Instruct microsoft-Phi-3.5-mini-instruct meta-llama-Llama-3.1-8B-Instruct microsoft-Phi-3.5-mini-instruct
qwen3.5-0.8B-Instruct qwen3.5-0.8B-Instruct microsoft-Phi-4-mini-instruct meta-llama-Llama-3.1-8B-Instruct microsoft-Phi-3.5-mini-instruct
qwen3.5-2B qwen3.5-27B microsoft-Phi-4-mini-instruct_nvmo_ptq_mixed_precision_awq_lite meta-llama-Llama-3.2-1B-Instruct microsoft-Phi-3.5-mini-instruct
qwen3.5-35B-A3B-MoE qwen3.5-2B microsoft-Phi-4 meta-llama-Llama-3.2-1B-Instruct microsoft-Phi-3.5-mini-instruct
qwen3.5-4B qwen3.5-4B microsoft-resnet-50 microsoft-Phi-3-mini-128k-instruct microsoft-Phi-4-mini-instruct
qwen3.5-9B qwen3.5-9B mistralai-Mistral-7B-Instruct-v0.2 microsoft-Phi-3-mini-128k-instruct microsoft-Phi-4-mini-instruct
qwen3vl-2B-Instruct qwen3vl-2B-Instruct openai-clip-vit-base-patch16 microsoft-Phi-3-mini-4k-instruct microsoft-Phi-4-reasoning
qwen3vl-4B-Instruct qwen3vl-4B-Instruct openai-clip-vit-base-patch32 microsoft-Phi-3-mini-4k-instruct microsoft-Phi-4-reasoning
qwen3vl-8B-Instruct qwen3vl-8B-Instruct openai-clip-vit-large-patch14 microsoft-Phi-3.5-mini-instruct microsoft-Phi-4-reasoning
sshleifer-tiny-gpt2-sparsegpt sshleifer-tiny-gpt2-sparsegpt phi-4_Model_Builder_INT4 microsoft-Phi-3.5-mini-instruct microsoft-resnet-50
stable-diffusion-v1-4-safety-checker stable-diffusion-v1-4-safety-checker microsoft-Phi-4-mini-instruct microsoft-resnet-50
stable-diffusion-v1-4-text-encoder stable-diffusion-v1-4-text-encoder microsoft-Phi-4-mini-instruct microsoft-table-transformer-detection
stable-diffusion-v1-4-unet stable-diffusion-v1-4-unet microsoft-Phi-4-mini-instruct openai-clip-vit-base-patch16
stable-diffusion-v1-4-vae-decoder stable-diffusion-v1-4-vae-decoder microsoft-Phi-4-mini-instruct openai-clip-vit-base-patch16
stable-diffusion-v1-4-vae-encoder stable-diffusion-v1-4-vae-encoder microsoft-Phi-4-mini-reasoning openai-clip-vit-base-patch16
stable-diffusion-v1-5-safety-checker stable-diffusion-v1-5 microsoft-Phi-4-mini-reasoning openai-clip-vit-base-patch32
stable-diffusion-v1-5-text-encoder stable-diffusion-xl-base-1.0 microsoft-Phi-4-mini-reasoning openai-clip-vit-base-patch32
stable-diffusion-v1-5-unet microsoft-Phi-4-mini-reasoning openai-clip-vit-base-patch32
stable-diffusion-v1-5-vae-decoder microsoft-Phi-4-reasoning-plus openai-clip-vit-large-patch14
stable-diffusion-v1-5-vae-encoder microsoft-Phi-4-reasoning-plus openai-whisper-large-v3-turbo
stable-diffusion-v1-5 microsoft-Phi-4-reasoning openai-whisper-large-v3-turbo
stable-diffusion-xl-base-1.0 microsoft-Phi-4-reasoning openai-whisper-large-v3-turbo
timm-mobilenetv3_small_100.lamb_in1k microsoft-Phi-4 openai-whisper-large-v3-turbo
translategemma-4b-it microsoft-Phi-4 openai-whisper-large-v3-turbo
videochat-flash-qwen2_5-7b-internvideo2-1b microsoft-resnet-50 openai-whisper-large-v3-turbo
mistralai-Mistral-7B-Instruct-v0.2 openai-whisper-large-v3-turbo
mistralai-Mistral-7B-Instruct-v0.2 qwen2.5-7b-instruct
mistralai-Mistral-7B-Instruct-v0.3 sam-vit-base
openai-clip-vit-base-patch16 sam-vit-base
openai-clip-vit-base-patch32 sam-vit-base
openai-clip-vit-large-patch14 sam-vit-base
openai-whisper-large-v3-turbo sam-vit-base
openai-whisper-large-v3-turbo sam2.1-hiera-small
openai-whisper-large-v3-turbo sam2.1-hiera-small
sam2.1-hiera-small sam2.1-hiera-small
sam2.1-hiera-small sd-legacy-stable-diffusion-v1-5
sam2.1-hiera-small sd2-community-stable-diffusion-2-1
sd-legacy-stable-diffusion-v1-5 stable-diffusion-v1-4-safety-checker
sd-legacy-stable-diffusion-v1-5 stable-diffusion-v1-4-text-encoder
sd2-community-stable-diffusion-2-1 stable-diffusion-v1-4-unet
sd2-community-stable-diffusion-2-1 stable-diffusion-v1-4-vae-decoder
stable-diffusion-v1-4-safety-checker stable-diffusion-v1-4-vae-encoder
stable-diffusion-v1-4-text-encoder stable-diffusion-v1-5
stable-diffusion-v1-4-unet stable-diffusion-xl-base-1.0
stable-diffusion-v1-4-vae-decoder timm-mobilenetv3_small_100.lamb_in1k
stable-diffusion-v1-4-vae-encoder
stable-diffusion-v1-5

Learn more

🤝 Contributions and Feedback

⚖️ License

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.
