| 1 | +.. _models_llm_deepseek-r1-0528-qwen3: |
| 2 | + |
| 3 | +======================================== |
| 4 | +deepseek-r1-0528-qwen3 |
| 5 | +======================================== |
| 6 | + |
| 7 | +- **Context Length:** 131072 |
| 8 | +- **Model Name:** deepseek-r1-0528-qwen3 |
| 9 | +- **Languages:** en, zh |
| 10 | +- **Abilities:** chat, reasoning |
| 11 | +- **Description:** DeepSeek R1 has received a minor version upgrade; the current version is DeepSeek-R1-0528. In this update, DeepSeek R1 significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model performs strongly across benchmark evaluations covering mathematics, programming, and general logic, and its overall performance now approaches that of leading models such as O3 and Gemini 2.5 Pro. |
| 12 | + |
| 13 | +Specifications |
| 14 | +^^^^^^^^^^^^^^ |
| 15 | + |
| 16 | + |
| 17 | +Model Spec 1 (pytorch, 8 Billion) |
| 18 | +++++++++++++++++++++++++++++++++++++++++ |
| 19 | + |
| 20 | +- **Model Format:** pytorch |
| 21 | +- **Model Size (in billions):** 8 |
| 22 | +- **Quantizations:** none |
| 23 | +- **Engines**: vLLM, Transformers, SGLang |
| 24 | +- **Model ID:** deepseek-ai/DeepSeek-R1-0528-Qwen3-8B |
| 25 | +- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B>`__ |
| 26 | + |
| 27 | +Execute the following command to launch the model. Replace ``${engine}`` with one of the engines listed above and ``${quantization}`` with your |
| 28 | +chosen quantization method from the options listed above (``none`` is the only option for this spec):: |
| 29 | + |
| 30 | + xinference launch --model-engine ${engine} --model-name deepseek-r1-0528-qwen3 --size-in-billions 8 --model-format pytorch --quantization ${quantization} |
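The placeholder substitution above can be sketched in Python. This is a minimal illustration, not part of Xinference: the helper name ``build_launch_command`` is hypothetical, and the flag names and values are copied from the command and spec shown above. It only assembles the command string; it does not start a model.

```python
import shlex


def build_launch_command(engine, quantization, model_format="pytorch",
                         size_in_billions=8,
                         model_name="deepseek-r1-0528-qwen3"):
    """Assemble the `xinference launch` command line as a single string.

    Hypothetical helper: it only fills in the ${engine} and ${quantization}
    placeholders from the documented command; nothing is executed.
    """
    args = [
        "xinference", "launch",
        "--model-engine", engine,
        "--model-name", model_name,
        "--size-in-billions", str(size_in_billions),
        "--model-format", model_format,
        "--quantization", quantization,
    ]
    # shlex.join quotes any argument that would need it in a POSIX shell.
    return shlex.join(args)


# Spec 1 lists "none" as its only quantization; "vLLM" is one of its engines.
print(build_launch_command("vLLM", "none"))
```

The printed string can be pasted into a shell on a machine where Xinference is installed and a server is running.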
| 31 | + |
| 32 | + |
| 33 | +Model Spec 2 (gptq, 8 Billion) |
| 34 | +++++++++++++++++++++++++++++++++++++++++ |
| 35 | + |
| 36 | +- **Model Format:** gptq |
| 37 | +- **Model Size (in billions):** 8 |
| 38 | +- **Quantizations:** Int4-W4A16, Int8-W8A16 |
| 39 | +- **Engines**: vLLM, Transformers, SGLang |
| 40 | +- **Model ID:** QuantTrio/DeepSeek-R1-0528-Qwen3-8B-{quantization} |
| 41 | +- **Model Hubs**: `Hugging Face <https://huggingface.co/QuantTrio/DeepSeek-R1-0528-Qwen3-8B-{quantization}>`__, `ModelScope <https://modelscope.cn/models/tclf90/DeepSeek-R1-0528-Qwen3-8B-GPTQ-Int4-Int8Mix>`__ |
| 42 | + |
| 43 | +Execute the following command to launch the model. Replace ``${engine}`` with one of the engines listed above and ``${quantization}`` with your |
| 44 | +chosen quantization method from the options listed above:: |
| 45 | + |
| 46 | + xinference launch --model-engine ${engine} --model-name deepseek-r1-0528-qwen3 --size-in-billions 8 --model-format gptq --quantization ${quantization} |
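The Model ID for this spec contains a ``{quantization}`` placeholder. As a minimal sketch of how that template expands for the two quantizations listed above, assuming plain string substitution (the function name ``resolve_model_id`` is hypothetical, and the resulting IDs are derived from the template rather than checked against the model hubs):

```python
# Template and options copied from the spec above.
MODEL_ID_TEMPLATE = "QuantTrio/DeepSeek-R1-0528-Qwen3-8B-{quantization}"
QUANTIZATIONS = ("Int4-W4A16", "Int8-W8A16")


def resolve_model_id(quantization):
    """Expand the Model ID template for one of the listed quantizations.

    Hypothetical helper for illustration; it rejects values that are not
    listed for this spec instead of producing an unknown repository name.
    """
    if quantization not in QUANTIZATIONS:
        raise ValueError(
            f"unsupported quantization {quantization!r}; "
            f"expected one of {QUANTIZATIONS}"
        )
    return MODEL_ID_TEMPLATE.format(quantization=quantization)


for q in QUANTIZATIONS:
    print(q, "->", resolve_model_id(q))
```

The same value passed to ``--quantization`` in the launch command is the one substituted into the Model ID.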
| 47 | + |
| 48 | + |
| 49 | +Model Spec 3 (gptq, 8 Billion) |
| 50 | +++++++++++++++++++++++++++++++++++++++++ |
| 51 | + |
| 52 | +- **Model Format:** gptq |
| 53 | +- **Model Size (in billions):** 8 |
| 54 | +- **Quantizations:** Int4-Int8Mix |
| 55 | +- **Engines**: vLLM, Transformers, SGLang |
| 56 | +- **Model ID:** QuantTrio/DeepSeek-R1-0528-Qwen3-8B-GPTQ-Int4-Int8Mix |
| 57 | +- **Model Hubs**: `Hugging Face <https://huggingface.co/QuantTrio/DeepSeek-R1-0528-Qwen3-8B-GPTQ-Int4-Int8Mix>`__, `ModelScope <https://modelscope.cn/models/tclf90/DeepSeek-R1-0528-Qwen3-8B-GPTQ-Int4-Int8Mix>`__ |
| 58 | + |
| 59 | +Execute the following command to launch the model. Replace ``${engine}`` with one of the engines listed above and ``${quantization}`` with your |
| 60 | +chosen quantization method from the options listed above (``Int4-Int8Mix`` is the only option for this spec):: |
| 61 | + |
| 62 | + xinference launch --model-engine ${engine} --model-name deepseek-r1-0528-qwen3 --size-in-billions 8 --model-format gptq --quantization ${quantization} |
| 63 | + |