I'm trying to use a local model through an NPU (on a Snapdragon X Elite device) + GitHub Copilot, but requests fail with a 500 error and I get the following logs.
Microsoft Foundry Local VS Code Extension Version: 1.4.3
VS Code version: 1.125.1
The logs (from VS Code) are attached below.
If you need more information, I'm happy to provide it; please include the steps for obtaining it.
```
2026-06-22 01:12:01.413 [info] Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1401] 2026-06-22T01:12:01.4117388+03:00 Finish loading model:qwen2.5-coder-7b-instruct-qnn-npu:1 elapsed time:00:00:17.5720719
2026-06-22 01:12:01.421 [info] Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1401] 2026-06-22T01:12:01.4117388+03:00 Finish loading model:qwen2.5-coder-7b-instruct-qnn-npu:1 elapsed time:00:00:17.5720719
2026-06-22 01:12:01.421 [info] Information: Microsoft.Neutron.Telemetry.MicrosoftTelemetry [0] 2026-06-22T01:12:01.4131012+03:00 [Telemetry] AppName:Neutron UserAgent:NeutronServer Command:ModelLoad Status:Success Direct:True Time:17574ms
2026-06-22 01:12:01.421 [info] Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1401] 2026-06-22T01:12:01.4117388+03:00 Finish loading model:qwen2.5-coder-7b-instruct-qnn-npu:1 elapsed time:00:00:17.5720719
2026-06-22 01:12:01.422 [info] Information: Microsoft.Neutron.Telemetry.MicrosoftTelemetry [0] 2026-06-22T01:12:01.4131012+03:00 [Telemetry] AppName:Neutron UserAgent:NeutronServer Command:ModelLoad Status:Success Direct:True Time:17574ms
2026-06-22 01:12:01.422 [info] Information: Microsoft.Neutron.Telemetry.MicrosoftTelemetry [0] 2026-06-22T01:12:01.4131012+03:00 [Telemetry] AppName:Neutron UserAgent:NeutronServer Command:ModelLoad Status:Success Direct:True Time:17574ms
2026-06-22 01:12:01.437 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.436455+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:01.437 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.436455+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:01.437 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.436455+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:01.446 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.4446967+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:01.447 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.4446967+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:01.447 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.4446967+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:01.973 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.9728077+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:01.973 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.9728077+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:01.974 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.9728077+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:01.977 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.9771391+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:01.978 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.9771391+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:01.978 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:01.9771391+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:02.782 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:02.7821072+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:02.782 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:02.7821072+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:02.782 [info] Information: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:02.7821072+03:00 HandleChatCompletionAsStreamRequest -> model:qwen2.5-coder-7b-instruct-qnn-npu:1 MaxCompletionTokens:(null) maxTokens:(null) temperature:(null) topP:(null)
2026-06-22 01:12:02.787 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:02.7874721+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:02.788 [error] Unable to call the qwen2.5-coder-7b-instruct-qnn-npu:1 inference endpoint due to 500. Please check if the input or configuration is correct. 500 status code (no body)
2026-06-22 01:12:02.832 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:02.7874721+03:00 [json.exception.type_error.302] type must be string, but is array
2026-06-22 01:12:02.833 [info] Error: Microsoft.Neutron.OpenAI.Delegates.OpenAIApi [0] 2026-06-22T01:12:02.7874721+03:00 [json.exception.type_error.302] type must be string, but is array
```