Modelfile-built version: JSON tool calls appear inside `<think>` blocks instead of after them #356

bidueiro · 2026-05-31T21:54:11Z

Hi. I'm testing MiniCPM5-1B-Q4_K_M with Ollama using a custom Modelfile, and I'm running into an issue with the tool-calling output format.

Setup:

FROM MiniCPM5-1B-Q4_K_M.gguf

TEMPLATE """{{- if .Messages }}
{{- range .Messages }}
<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{- end }}
<|im_start|>assistant
{{- end }}"""

PARAMETER temperature 0.7
PARAMETER top_p 0.95
PARAMETER stop "<|im_end|>"
PARAMETER stop "</s>"

Problem:
When prompting for JSON tool dispatch, the model places the JSON answer inside the <think> block rather than after it:

<think>
The user wants to search for AI. I should use vault.search with {"tool": "vault.search"...
</think>

Instead of the expected:

<think>
The user wants to search for AI.
</think>
{"tool": "vault.search", "args": {"query": "AI"}}

Also observed: simple chat produces <think><think>\n</think> (multiple nested opening tags) before the actual reply.

Comparison:
The official openbmb/minicpm5:q4_K_M from the Ollama registry behaves better for tool dispatch, but it uses top_p 0.8 and stop "<|endoftext|>" instead of the documented values. Is there a known correct Modelfile configuration for no-think mode with Ollama?
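
To check whether the sampling parameters or the template account for the difference, the two parameter sets can be applied per request through Ollama's /api/chat endpoint, which accepts an "options" object. This is a rough sketch under assumptions: the server runs on the default local port, "minicpm5-local" is a hypothetical name for the model created from the Modelfile above, and the prompt is only illustrative.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumed; adjust if the server runs elsewhere).
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(model: str, prompt: str, top_p: float, stop: list[str]) -> dict:
    """Build a non-streaming /api/chat payload with per-request overrides."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {"top_p": top_p, "stop": list(stop)},
    }


def send(payload: dict) -> str:
    """POST the payload and return the assistant message text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["message"]["content"]


# The two parameter sets mentioned in this post.
REGISTRY_PARAMS = {"top_p": 0.8, "stop": ["<|endoftext|>"]}
MODELFILE_PARAMS = {"top_p": 0.95, "stop": ["<|im_end|>", "</s>"]}

if __name__ == "__main__":
    prompt = "Search the vault for AI."  # illustrative prompt only
    for label, params in (("registry", REGISTRY_PARAMS), ("modelfile", MODELFILE_PARAMS)):
        # "minicpm5-local" is a hypothetical local model name.
        payload = build_chat_request("minicpm5-local", prompt, **params)
        print(label, repr(send(payload)))
```

If the local build still puts the JSON inside <think> under the registry parameters, the template is the more likely cause than the sampling settings.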

Otherwise, thanks for a very interesting model with great potential! :)

Replies: 0 comments