NPU load crash on Strix Halo: Inference.Service.Agent.exe fault in onnxruntime_vitis_ai_custom_ops.dll (0xc0000005)

## Summary
On AMD Strix Halo (Ryzen AI Max+ 395), Foundry Local can run GPU models successfully, but NPU model load consistently fails and crashes `Inference.Service.Agent.exe`.

## Environment
- OS: Windows 11 host (WSL2 Ubuntu 24.04 used only as operator shell)
- Hardware: AMD Ryzen AI Max+ 395 / Radeon 8060S
- Foundry Local: `Microsoft.FoundryLocal_0.8.119.102_x64__8wekyb3d8bbwe`
- NPU EP: `MicrosoftCorporationII.WinML.AMD.NPU.EP.1.8_1.8.62.0_x64__8wekyb3d8bbwe`

## Reproduction
1. `foundry service status` (service running, VitisAI EP listed)
2. Run:
   `foundry model run qwen2.5-0.5b --device NPU --prompt "test" --ttl 120`
3. The failure occurs during the model load request.
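
The steps above can be scripted so that the NPU and GPU attempts are captured the same way. The following is a minimal Python sketch, not part of Foundry Local: the helper names `build_repro_command` and `run_repro` are hypothetical, and it assumes `foundry` resolves on `PATH` from wherever the script runs. It uses only the flags already shown in this report and makes no assumption about the CLI's exit code on failure; it simply records whatever comes back.

```python
import subprocess


def build_repro_command(alias: str, device: str, prompt: str = "test", ttl: int = 120) -> list[str]:
    """Build the `foundry model run` argument list used in this report."""
    return [
        "foundry", "model", "run", alias,
        "--device", device,
        "--prompt", prompt,
        "--ttl", str(ttl),
    ]


def run_repro(alias: str, device: str, timeout_s: int = 300) -> dict:
    """Run one attempt and capture its exit code, stdout, and stderr.

    Raises subprocess.TimeoutExpired if the CLI does not return in time
    (timeout_s is an arbitrary value chosen for this sketch).
    """
    cmd = build_repro_command(alias, device)
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    return {
        "device": device,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Calling `run_repro("qwen2.5-0.5b", "NPU")` and then `run_repro("qwen2.5-0.5b", "GPU")` gives two comparable records for the failing path and the control path.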

## Observed behavior
- CLI stderr includes:
  - `Exception: Request to local service failed. Uri:http://127.0.0.1:<port>/openai/load/qwen2.5-0.5b-instruct-vitis-npu:3?ttl=120`
  - `An error occurred while sending the request.`
- The Application event log repeatedly records the same crash:
  - Event ID: `1000`
  - Faulting app: `Inference.Service.Agent.exe` (0.8.119.102)
  - Faulting module: `onnxruntime_vitis_ai_custom_ops.dll` (1.7.0.0)
  - Exception code: `0xc0000005`
  - Fault offset: `0x00000000000246f5`
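
For comparing crash records across runs, the fields above can be pulled out of the event text programmatically. The following is a rough Python sketch under two assumptions: the exported event text uses the usual `Faulting application name:` / `Faulting module name:` / `Exception code:` / `Fault offset:` labels, and the `SAMPLE` string is an abbreviated illustration rebuilt from the values listed above, not a verbatim log excerpt. The function name `parse_crash_signature` is hypothetical. The one fact it adds is that NTSTATUS `0xC0000005` is `STATUS_ACCESS_VIOLATION`.

```python
import re

# 0xC0000005 is the NTSTATUS code for an access violation.
NTSTATUS_NAMES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}

# Field labels assumed to match the Application Error (Event ID 1000) message text.
FIELD_PATTERNS = {
    "app": re.compile(r"Faulting application name:\s*([^,\r\n]+)", re.I),
    "module": re.compile(r"Faulting module name:\s*([^,\r\n]+)", re.I),
    "code": re.compile(r"Exception code:\s*(0x[0-9a-fA-F]+)", re.I),
    "offset": re.compile(r"Fault offset:\s*(0x[0-9a-fA-F]+)", re.I),
}


def parse_crash_signature(event_text: str) -> dict:
    """Extract app, module, exception code, and fault offset from event text."""
    sig = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(event_text)
        sig[name] = match.group(1).strip() if match else None
    if sig["code"] is not None:
        sig["code_name"] = NTSTATUS_NAMES.get(int(sig["code"], 16), "unknown")
    return sig


# Illustrative text only, assembled from the values reported in this issue.
SAMPLE = """Faulting application name: Inference.Service.Agent.exe, version: 0.8.119.102
Faulting module name: onnxruntime_vitis_ai_custom_ops.dll, version: 1.7.0.0
Exception code: 0xc0000005
Fault offset: 0x00000000000246f5"""

signature = parse_crash_signature(SAMPLE)
```

Two records with the same `module`, `code`, and `offset` can be treated as the same crash signature, which is the comparison used after the repair step later in this report.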

## Control test
The GPU path works on the same machine:
`foundry model run qwen2.5-0.5b --device GPU --prompt "test" --ttl 120`
The model loads and returns text successfully.

## Mitigations tried
- Service reset/init and smoke retest
- Tested two NPU model variants (`qwen2.5-coder-0.5b`, `qwen2.5-0.5b`)
- Repaired Foundry package (initially blocked by in-use files `0x80073d02`, then repair completed after stopping processes)
- Retested after repair: same NPU crash signature remains

## Why this seems product/runtime related
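The CPU and GPU paths are healthy on the same system, while the NPU path consistently crashes inside `onnxruntime_vitis_ai_custom_ops.dll` during model load. That points at the NPU execution provider or its integration with the service rather than at the machine's general setup.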

## Artifacts available
I collected a full evidence bundle under:
`C:\Users\Ken\foundry_fix_round1\`

Key files:
- `ISSUE_BUNDLE_20260623.md`
- `r2b_01_service_status.txt`
- `r2b_02_versions.txt`
- `r2b_03_npu_smoke_stdout.txt`
- `r2b_03_npu_smoke_stderr.txt`
- `r2b_04_events.txt`
- `r3_06_gpu_smoke_stdout.txt`
- `r3_14_npu_after_repair_stderr.txt`
- `r3_15_events_after_repair.txt`

If you want, I can upload zipped artifacts from this folder.
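
If a zip is wanted, the key files could be bundled with something like the Python sketch below. The function name `bundle_artifacts` and the output file name in the usage note are hypothetical; the file names are the ones listed above. Files that are not present are skipped and reported rather than treated as an error.

```python
import zipfile
from pathlib import Path

# Key files as listed in this report.
KEY_FILES = [
    "ISSUE_BUNDLE_20260623.md",
    "r2b_01_service_status.txt",
    "r2b_02_versions.txt",
    "r2b_03_npu_smoke_stdout.txt",
    "r2b_03_npu_smoke_stderr.txt",
    "r2b_04_events.txt",
    "r3_06_gpu_smoke_stdout.txt",
    "r3_14_npu_after_repair_stderr.txt",
    "r3_15_events_after_repair.txt",
]


def bundle_artifacts(folder, zip_path, names=KEY_FILES) -> list[str]:
    """Zip the listed files found in `folder`; return the names that were missing."""
    folder = Path(folder)
    missing = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            src = folder / name
            if src.is_file():
                # Store under the bare file name so the archive has no local paths.
                zf.write(src, arcname=name)
            else:
                missing.append(name)
    return missing
```

For example, `bundle_artifacts(r"C:\Users\Ken\foundry_fix_round1", "foundry_npu_crash.zip")` writes the archive to the current directory and returns any file names it could not find.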
