Name and Version
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
slot process_toke: id 0 | task 29484 | raw tool marker observed while lazy grammar is enabled; suppressing DFlash for this response without changing sampler state in_reasoning=0 n_decoded=11 reasoning_tokens=0 visible_tokens=10
Operating systems
Windows
GGML backends
CUDA
Hardware
GPU RTX 5080
CPU 14600K
RAM 2x24 GB DDR5 Teamgroup CL34 @ 7400 MHz
OS: Windows 11
Models
Qwen3.6 27B Q3_K_M
Problem description & steps to reproduce
I use pi.dev (the pi coding agent) with this model. I tell pi to analyze a big codebase and write a detailed report about it, and that is enough to reproduce the bug.
If I tell pi to write only 100 lines at a time to the report file, it works and does not crash.
First Bad Commit
No response
Relevant log output
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
slot process_toke: id 0 | task 29484 | raw tool marker observed while lazy grammar is enabled; suppressing DFlash for this response without changing sampler state in_reasoning=0 n_decoded=11 reasoning_tokens=0 visible_tokens=10