The experimental.server (v0.7.1) /v1/chat/completions streaming response
emits SSE chunks missing the object: "chat.completion.chunk" field that
the OpenAI spec requires for ChatCompletionChunk.
Source: experimental/server/api_server.py::_sse_chunk (around line 290 in v0.7.1):
    def _sse_chunk(response_id, delta, finish_reason=None):
        choice = {"delta": delta, "index": 0}
        if finish_reason:
            choice["finish_reason"] = finish_reason
        payload = {"id": response_id, "choices": [choice]}  # missing "object"
        return f"data: {json.dumps(payload)}\n\n"
The non-streaming branch (line ~188) correctly emits "object": "chat.completion",
so this is a streaming-specific omission.
Impact: Strict OpenAI-compatible clients reject every chunk. We hit this
with NVIDIA AIPerf 0.8.0 (ValueError: Unsupported OpenAI object type: None,
50/50 records lost) when benchmarking Qwen3-8B-NVFP4 on Jetson AGX Thor.
Reproduce:
    curl -N http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model":"...","messages":[{"role":"user","content":"hi"}],"stream":true}' \
      | head -3
Each data: {...} line should contain "object": "chat.completion.chunk";
in v0.7.1 it does not.
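To check a captured stream without reading it by eye, a small script along these lines may help. This is only a sketch: the function name find_chunks_missing_object and the sample payloads are hypothetical, and the [DONE] handling assumes the server may emit the usual OpenAI end-of-stream sentinel, which has not been verified here.

```python
import json

EXPECTED_OBJECT = "chat.completion.chunk"


def find_chunks_missing_object(sse_text):
    """Return parsed chunks whose "object" field is not chat.completion.chunk."""
    bad = []
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        body = line[len("data:"):].strip()
        if body == "[DONE]":
            continue  # end-of-stream sentinel (assumed), not JSON
        chunk = json.loads(body)
        if chunk.get("object") != EXPECTED_OBJECT:
            bad.append(chunk)
    return bad


# Hypothetical sample: the first chunk mimics the v0.7.1 output described above,
# the second shows the expected shape.
sample = (
    'data: {"id": "r1", "choices": [{"delta": {"content": "hi"}, "index": 0}]}\n\n'
    'data: {"id": "r1", "object": "chat.completion.chunk", "choices": []}\n\n'
    "data: [DONE]\n\n"
)
print(len(find_chunks_missing_object(sample)))  # prints 1
```

Passing the saved output of the curl command above to find_chunks_missing_object should return every chunk on v0.7.1 and an empty list once the field is present.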
Suggested fix: add one line to the payload in _sse_chunk:
    payload = {
        "id": response_id,
        "object": "chat.completion.chunk",
        "choices": [choice],
    }
Environment: Jetson AGX Thor T5000, JetPack 7.1, Edge-LLM v0.7.1,
container nvcr.io/nvidia/pytorch:25.12-py3.