Experimental OpenAI server: SSE chunks missing required object field #93

@ddonovan312

The experimental.server (v0.7.1) /v1/chat/completions streaming response
emits SSE chunks missing the object: "chat.completion.chunk" field that
the OpenAI spec requires for ChatCompletionChunk.

Source: experimental/server/api_server.py::_sse_chunk (around line 290 in v0.7.1):

def _sse_chunk(response_id, delta, finish_reason=None):
    choice = {"delta": delta, "index": 0}
    if finish_reason:
        choice["finish_reason"] = finish_reason
    payload = {"id": response_id, "choices": [choice]}   # missing "object"
    return f"data: {json.dumps(payload)}\n\n"

The non-streaming branch (around line 188) correctly emits "object": "chat.completion",
so the omission is specific to the streaming path.

Impact: Strict OpenAI-compatible clients reject every chunk. We hit this
with NVIDIA AIPerf 0.8.0 (ValueError: Unsupported OpenAI object type: None,
50/50 records lost) when benchmarking Qwen3-8B-NVFP4 on Jetson AGX Thor.
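
To illustrate where the None in that error message comes from, here is a minimal sketch of a strict parser that dispatches on the "object" field. It is a hypothetical stand-in, not AIPerf's actual code: parse_sse_data, SUPPORTED_OBJECTS, and the "resp-1" id are names invented for this example.

```python
import json

# Hypothetical strict parser; NOT AIPerf's actual implementation.
SUPPORTED_OBJECTS = {"chat.completion", "chat.completion.chunk"}


def parse_sse_data(line: str) -> dict:
    """Parse one 'data: {...}' SSE line and reject unknown object types."""
    payload = json.loads(line.removeprefix("data: "))
    object_type = payload.get("object")  # None when the key is absent
    if object_type not in SUPPORTED_OBJECTS:
        raise ValueError(f"Unsupported OpenAI object type: {object_type}")
    return payload


# A chunk shaped like the v0.7.1 output (no "object" key) is rejected.
chunk_v071 = 'data: {"id": "resp-1", "choices": [{"delta": {"content": "hi"}, "index": 0}]}'
try:
    parse_sse_data(chunk_v071)
except ValueError as err:
    print(err)
```

Because the key is absent rather than wrong, a lookup like payload.get("object") yields None, and a client written this way would fail on every chunk of the stream.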

Reproduce:

curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"...","messages":[{"role":"user","content":"hi"}],"stream":true}' \
  | head -3

Each data: {...} line should contain "object": "chat.completion.chunk";
in v0.7.1 it does not.
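
The same check can be scripted instead of read by eye. A rough sketch, assuming the stream has been captured as text lines; find_chunks_missing_object and the sample lines are made up for illustration, and the "[DONE]" handling assumes the server ends the stream with the sentinel that OpenAI-style streams use:

```python
import json


def find_chunks_missing_object(sse_lines):
    """Return payloads whose "object" is not "chat.completion.chunk"."""
    bad = []
    for line in sse_lines:
        line = line.strip()
        # Skip blank separators and anything that is not a data line.
        if not line.startswith("data:"):
            continue
        body = line[len("data:"):].strip()
        # Skip the end-of-stream sentinel, which is not JSON.
        if body == "[DONE]":
            continue
        payload = json.loads(body)
        if payload.get("object") != "chat.completion.chunk":
            bad.append(payload)
    return bad


# Illustrative sample: one chunk without "object", one with it.
sample = [
    'data: {"id": "resp-1", "choices": [{"delta": {"content": "hi"}, "index": 0}]}',
    "",
    'data: {"id": "resp-1", "object": "chat.completion.chunk", "choices": []}',
    "",
    "data: [DONE]",
]
print(len(find_chunks_missing_object(sample)))
```

The function accepts any iterable of lines, so the captured curl output (or sys.stdin) can be passed in place of the sample list.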

Suggested fix: add the missing "object" key to the payload in _sse_chunk:

payload = {
    "id": response_id,
    "object": "chat.completion.chunk",
    "choices": [choice],
}

Environment: Jetson AGX Thor T5000, JetPack 7.1, Edge-LLM v0.7.1,
container nvcr.io/nvidia/pytorch:25.12-py3.
