Commit 91f743a

FEAT: support CogView4 image model (#3557)
1 parent 72cc5e3 commit 91f743a

8 files changed

Lines changed: 249 additions & 81 deletions

doc/source/locale/zh_CN/LC_MESSAGES/models/model_abilities/image.po

Lines changed: 37 additions & 24 deletions
@@ -8,7 +8,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: Xinference \n"
 "Report-Msgid-Bugs-To: \n"
-"POT-Creation-Date: 2025-05-25 20:55+0800\n"
+"POT-Creation-Date: 2025-06-02 20:52+0800\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
 "Language: zh_CN\n"
@@ -104,31 +104,31 @@ msgstr ""
 
 #: ../../source/models/model_abilities/image.rst:44
 #: ../../source/models/model_abilities/image.rst:208
-#: ../../source/models/model_abilities/image.rst:237
+#: ../../source/models/model_abilities/image.rst:241
 msgid "sd3.5-medium"
 msgstr ""

 #: ../../source/models/model_abilities/image.rst:45
 #: ../../source/models/model_abilities/image.rst:210
-#: ../../source/models/model_abilities/image.rst:239
+#: ../../source/models/model_abilities/image.rst:243
 msgid "sd3.5-large"
 msgstr ""

 #: ../../source/models/model_abilities/image.rst:46
 #: ../../source/models/model_abilities/image.rst:212
-#: ../../source/models/model_abilities/image.rst:241
+#: ../../source/models/model_abilities/image.rst:245
 msgid "sd3.5-large-turbo"
 msgstr ""

 #: ../../source/models/model_abilities/image.rst:47
 #: ../../source/models/model_abilities/image.rst:204
-#: ../../source/models/model_abilities/image.rst:235
+#: ../../source/models/model_abilities/image.rst:239
 msgid "FLUX.1-schnell"
 msgstr ""

 #: ../../source/models/model_abilities/image.rst:48
 #: ../../source/models/model_abilities/image.rst:202
-#: ../../source/models/model_abilities/image.rst:233
+#: ../../source/models/model_abilities/image.rst:237
 msgid "FLUX.1-dev"
 msgstr ""

@@ -173,8 +173,9 @@ msgid ""
 "reference/images/createVariation>`_. We can try image-to-image API out "
 "either via cURL, OpenAI Client, or Xinference's python client:"
 msgstr ""
-"图生图 API 模拟了 OpenAI 的 `图像变体创建 API <https://platform.openai.com/docs/api-reference/images/createVariation>`_。"
-"我们可以通过 cURL、OpenAI 客户端,或 Xinference 的 Python 客户端来尝试使用图生图 API:"
+"图生图 API 模拟了 OpenAI 的 `图像变体创建 API <https://platform.openai."
+"com/docs/api-reference/images/createVariation>`_。我们可以通过 cURL、"
+"OpenAI 客户端,或 Xinference 的 Python 客户端来尝试使用图生图 API:"

 #: ../../source/models/model_abilities/image.rst:169
 msgid "Memory optimization for Large Image Models e.g. SD3-Medium, FLUX.1"
@@ -253,7 +254,7 @@ msgid "Below list default options that used from v0.16.1."
 msgstr "如下列出了从 v0.16.1 开始默认使用的参数。"

 #: ../../source/models/model_abilities/image.rst:200
-#: ../../source/models/model_abilities/image.rst:231
+#: ../../source/models/model_abilities/image.rst:235
 msgid "Model"
 msgstr "模型"

@@ -313,11 +314,23 @@ msgstr ""
 "设置 key ``quantize_text_encoder`` 和值 ``False``,或对于命令行,指定 ``"
 "--quantize_text_encoder False`` 来关闭 text encoder 的量化。"

-#: ../../source/models/model_abilities/image.rst:223
+#: ../../source/models/model_abilities/image.rst:222
+msgid ""
+"For :ref:`CogView4 <models_builtin_cogview4>`, we found that quantization"
+" has a significant impact on the model. Therefore, when GPU memory is "
+"limited, we recommend enabling the CPU offload option in the Web UI, and"
+" specifying ``--cpu_offload True`` when loading the model via the command"
+" line."
+msgstr ""
+"对于 :ref:`CogView4 <models_builtin_cogview4>`,我们发现量化对模型的影响较大。"
+"因此,当显存有限时,我们推荐在 Web UI 中启用 CPU offload 选项,在命令行加载模型时指定 "
+"``--cpu_offload True``。"
+
+#: ../../source/models/model_abilities/image.rst:227
 msgid "GGUF file format"
 msgstr "GGUF 文件格式"

-#: ../../source/models/model_abilities/image.rst:225
+#: ../../source/models/model_abilities/image.rst:229
 msgid ""
 "GGUF file format for transformer provides various quantization options. "
 "To use gguf file, you can specify additional option ``gguf_quantization``"
@@ -329,27 +342,27 @@ msgstr ""
 "``--gguf_quantization`` ,以为 Xinference 内建支持 GGUF 量化的模型开启。"
 "如下是内置支持的模型。"

-#: ../../source/models/model_abilities/image.rst:231
+#: ../../source/models/model_abilities/image.rst:235
 msgid "supported gguf quantization"
 msgstr "支持 GGUF 量化格式"

-#: ../../source/models/model_abilities/image.rst:233
-#: ../../source/models/model_abilities/image.rst:235
+#: ../../source/models/model_abilities/image.rst:237
+#: ../../source/models/model_abilities/image.rst:239
 msgid "F16, Q2_K, Q3_K_S, Q4_0, Q4_1, Q4_K_S, Q5_0, Q5_1, Q5_K_S, Q6_K, Q8_0"
 msgstr ""

-#: ../../source/models/model_abilities/image.rst:237
+#: ../../source/models/model_abilities/image.rst:241
 msgid ""
 "F16, Q3_K_M, Q3_K_S, Q4_0, Q4_1, Q4_K_M, Q4_K_S, Q5_0, Q5_1, Q5_K_M, "
 "Q5_K_S, Q6_K, Q8_0"
 msgstr ""

-#: ../../source/models/model_abilities/image.rst:239
-#: ../../source/models/model_abilities/image.rst:241
+#: ../../source/models/model_abilities/image.rst:243
+#: ../../source/models/model_abilities/image.rst:245
 msgid "F16, Q4_0, Q4_1, Q5_0, Q5_1, Q8_0"
 msgstr ""

-#: ../../source/models/model_abilities/image.rst:246
+#: ../../source/models/model_abilities/image.rst:250
 msgid ""
 "We stronly recommend to enable additional option ``cpu_offload`` with "
 "value ``True`` for WebUI, or specify ``--cpu_offload True`` for command "
@@ -358,17 +371,17 @@ msgstr ""
 "我们强烈推荐在 WebUI 上开启额外选项 ``cpu_offload`` 并指定为 ``True``,或"
 "对命令行,指定 ``--cpu_offload True``。"

-#: ../../source/models/model_abilities/image.rst:249
+#: ../../source/models/model_abilities/image.rst:253
 msgid "Example:"
 msgstr "例如:"

-#: ../../source/models/model_abilities/image.rst:255
+#: ../../source/models/model_abilities/image.rst:259
 msgid ""
 "With ``Q2_K`` quantization, you only need around 5 GiB GPU memory to run "
 "Flux.1-dev."
 msgstr "使用 ``Q2_K`` 量化,你只需要大约 5GB 的显存来运行 Flux.1-dev。"

-#: ../../source/models/model_abilities/image.rst:257
+#: ../../source/models/model_abilities/image.rst:261
 msgid ""
 "For those models gguf options are not supported internally, or you want "
 "to download gguf files on you own, you can specify additional option "
@@ -379,15 +392,15 @@ msgstr ""
 "Web UI 指定额外选项 ``gguf_model_path`` 或者用命令行指定 ``--gguf_model_"
 "path /path/to/model_quant.gguf`` 。"

-#: ../../source/models/model_abilities/image.rst:263
+#: ../../source/models/model_abilities/image.rst:267
 msgid "OCR"
 msgstr ""

-#: ../../source/models/model_abilities/image.rst:265
+#: ../../source/models/model_abilities/image.rst:269
 msgid "The OCR API accepts image bytes and returns the OCR text."
 msgstr "OCR API 接受图像字节并返回 OCR 文本。"

-#: ../../source/models/model_abilities/image.rst:267
+#: ../../source/models/model_abilities/image.rst:271
 msgid "We can try OCR API out either via cURL, or Xinference's python client:"
 msgstr "可以通过 cURL 或 Xinference 的 Python 客户端来尝试 OCR API。"

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+.. _models_builtin_cogview4:
+
+========
+cogview4
+========
+
+- **Model Name:** cogview4
+- **Model Family:** stable_diffusion
+- **Abilities:** text2image
+- **Available ControlNet:** None
+
+Specifications
+^^^^^^^^^^^^^^
+
+- **Model ID:** THUDM/CogView4-6B
+
+Execute the following command to launch the model::
+
+   xinference launch --model-name cogview4 --model-type image
+

doc/source/models/builtin/image/index.rst

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ The following is a list of built-in image models in Xinference:
    :maxdepth: 1


+   cogview4
+
    flux.1-dev

    flux.1-schnell

doc/source/models/model_abilities/image.rst

Lines changed: 4 additions & 0 deletions
@@ -219,6 +219,10 @@ Below list default options that used from v0.16.1.
 and for command line, specify ``--quantize_text_encoder False`` to disable quantization
 for text encoder.

+For :ref:`CogView4 <models_builtin_cogview4>`, we found that quantization has a significant impact on the model.
+Therefore, when GPU memory is limited, we recommend enabling the CPU offload option in the Web UI,
+and specifying ``--cpu_offload True`` when loading the model via the command line.
+
 GGUF file format
 ~~~~~~~~~~~~~~~~

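The CogView4 recommendation in the hunk above can be turned into a small launch helper. This is a minimal sketch, assuming the `xinference` CLI is installed; `build_launch_command` is a hypothetical helper name, and it only assembles flags quoted in this commit (`--model-name`, `--model-type`, `--cpu_offload True`).

```python
import shlex


def build_launch_command(model_name: str, cpu_offload: bool = False) -> list[str]:
    """Assemble the argv for `xinference launch` for an image model.

    Hypothetical helper: it only uses flags quoted in this commit's docs.
    """
    cmd = ["xinference", "launch", "--model-name", model_name, "--model-type", "image"]
    if cpu_offload:
        # The docs recommend CPU offload for CogView4 when GPU memory is limited.
        cmd += ["--cpu_offload", "True"]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(...) to actually launch the model.
    print(shlex.join(build_launch_command("cogview4", cpu_offload=True)))
```

Building the argument list separately from executing it keeps the flag logic easy to check without a running Xinference server.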
xinference/model/image/model_spec.json

Lines changed: 20 additions & 2 deletions
@@ -123,7 +123,7 @@
       "quantize": true,
       "quantize_text_encoder": "text_encoder_3",
       "torch_dtype": "bfloat16",
-      "transformer_nf4": true
+      "transformer_quantization": "nf4"
     },
     "gguf_model_id": "city96/stable-diffusion-3.5-large-gguf",
     "gguf_quantizations": [
@@ -150,7 +150,7 @@
       "quantize": true,
       "quantize_text_encoder": "text_encoder_3",
       "torch_dtype": "bfloat16",
-      "transformer_nf4": true
+      "transformer_quantization": "nf4"
     },
     "default_generate_config": {
       "guidance_scale": 1.0,
@@ -314,6 +314,24 @@
       ]
     }
   },
+  {
+    "model_name": "cogview4",
+    "model_family": "stable_diffusion",
+    "model_id": "THUDM/CogView4-6B",
+    "model_revision": "63a52b7f6dace7033380cd6da14d0915eab3e6b5",
+    "model_ability": [
+      "text2image"
+    ],
+    "default_model_config": {
+      "torch_dtype": "bfloat16"
+    },
+    "virtualenv": {
+      "packages": [
+        "diffusers>=0.33.0",
+        "#system_numpy#"
+      ]
+    }
+  },
   {
     "model_name": "stable-diffusion-inpainting",
     "model_family": "stable_diffusion",

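The hunks above replace the boolean `transformer_nf4` key with a string-valued `transformer_quantization` key. A minimal sketch of how a configuration written against the old key could be translated; `normalize_model_config` is a hypothetical helper and not part of Xinference, and this diff does not show whether or how Xinference itself accepts the old key.

```python
from typing import Any


def normalize_model_config(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with legacy `transformer_nf4` mapped to the new key."""
    result = dict(config)
    legacy = result.pop("transformer_nf4", None)
    # An explicit new-style value takes precedence over the legacy flag.
    if legacy and "transformer_quantization" not in result:
        result["transformer_quantization"] = "nf4"
    return result


if __name__ == "__main__":
    old_style = {"quantize": True, "torch_dtype": "bfloat16", "transformer_nf4": True}
    print(normalize_model_config(old_style))
```

A string value leaves room for quantization schemes other than NF4, which a boolean flag could not express.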
xinference/model/image/model_spec_modelscope.json

Lines changed: 21 additions & 2 deletions
@@ -128,7 +128,7 @@
       "quantize": true,
       "quantize_text_encoder": "text_encoder_3",
       "torch_dtype": "bfloat16",
-      "transformer_nf4": true
+      "transformer_quantization": "nf4"
     },
     "gguf_model_id": "Xorbits/stable-diffusion-3.5-large-gguf",
     "gguf_quantizations": [
@@ -156,7 +156,7 @@
       "quantize": true,
       "quantize_text_encoder": "text_encoder_3",
       "torch_dtype": "bfloat16",
-      "transformer_nf4": true
+      "transformer_quantization": "nf4"
     },
     "default_generate_config": {
       "guidance_scale": 1.0,
@@ -327,6 +327,25 @@
       ]
     }
   },
+  {
+    "model_name": "cogview4",
+    "model_family": "stable_diffusion",
+    "model_hub": "modelscope",
+    "model_id": "ZhipuAI/CogView4-6B",
+    "model_revision": "master",
+    "model_ability": [
+      "text2image"
+    ],
+    "default_model_config": {
+      "torch_dtype": "bfloat16"
+    },
+    "virtualenv": {
+      "packages": [
+        "diffusers>=0.33.0",
+        "#system_numpy#"
+      ]
+    }
+  },
   {
     "model_name": "GOT-OCR2_0",
     "model_family": "ocr",

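The two CogView4 entries added in this commit differ only in where the weights are fetched from. A minimal sketch that makes the difference explicit; the dictionaries are copied from the hunks above, and `diff_specs` is a hypothetical helper written for illustration.

```python
from typing import Any

# Entry added to model_spec.json (no "model_hub" key).
HF_SPEC: dict[str, Any] = {
    "model_name": "cogview4",
    "model_family": "stable_diffusion",
    "model_id": "THUDM/CogView4-6B",
    "model_revision": "63a52b7f6dace7033380cd6da14d0915eab3e6b5",
    "model_ability": ["text2image"],
    "default_model_config": {"torch_dtype": "bfloat16"},
    "virtualenv": {"packages": ["diffusers>=0.33.0", "#system_numpy#"]},
}

# Entry added to model_spec_modelscope.json.
MODELSCOPE_SPEC: dict[str, Any] = {
    **HF_SPEC,
    "model_hub": "modelscope",
    "model_id": "ZhipuAI/CogView4-6B",
    "model_revision": "master",
}


def diff_specs(a: dict[str, Any], b: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Map each key whose value differs to the pair (value in a, value in b)."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    for key, (hf_value, ms_value) in diff_specs(HF_SPEC, MODELSCOPE_SPEC).items():
        print(f"{key}: {hf_value!r} -> {ms_value!r}")
```

Note that the first entry pins a specific revision hash while the ModelScope entry tracks `master`.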