Windows: flip default to Ninja, add MSVC classifier, add GPU classifiers #276

Open
bernardladenthin wants to merge 6 commits into main from claude/windows-gpu-support-xvhi5r


@bernardladenthin

Owner

Summary

  • Default Windows CPU build is now Ninja Multi-Config + sccache (was MSVC/Visual Studio). Both generators use the same MSVC toolchain (cl.exe, static /MT CRT) on the same runner, producing functionally equivalent DLLs with identical runtime dependencies. Ninja enables sccache caching over Depot WebDAV, which Visual Studio's generator ignores.

  • MSVC/Visual Studio build shipped as msvc-windows classifier for users who prefer the Visual Studio generator. Both Windows CPU builds are validated end-to-end with the full model-backed Java test suite.

  • Three new Windows GPU classifiers added (x86_64 only, all Ninja + sccache):

    • cuda13-windows-x86-64 — CUDA 13 backend
    • vulkan-windows-x86-64 — Vulkan backend (most portable)
    • opencl-windows-x86-64 — OpenCL backend

    GPU runtime libraries are not bundled (the consumer's driver or toolkit provides them). GitHub-hosted Windows runners have no GPU, so GPU jobs build and run the C++ unit tests (ctest, CPU-only) but cannot run model-backed GPU inference; end-to-end GPU validation is local or self-hosted.

Changes

.github/workflows/publish.yml:

  • Renamed build-windows-x86_64 → build-windows-x86_64-msvc and build-windows-x86 → build-windows-x86-msvc (MSVC builds, now classifiers)
  • Renamed build-windows-x86_64-ninja → build-windows-x86_64 and build-windows-x86-ninja → build-windows-x86 (Ninja builds, now default)
  • Updated artifact names: Windows-*-ninja → Windows-*-libraries (default), Windows-* → Windows-*-msvc (classifier)
  • Added three new GPU build jobs: build-windows-x86_64-cuda, build-windows-x86_64-vulkan, build-windows-x86_64-opencl
  • Updated test job names and dependencies to reflect the flip
  • Updated package job's needs: graph to include GPU jobs and reordered for clarity

pom.xml:

  • Renamed windows-ninja profile → windows-msvc (now the classifier, not the default)
  • Updated profile comments and execution IDs to reflect the new role
  • Added three new GPU profiles: cuda-windows, vulkan-windows, opencl-windows (mirrors the CUDA-Linux / OpenCL-Android classifier pattern)

CMakeLists.txt:

  • Added OS-aware backend routing: CUDA → resources_windows_cuda on Windows (else resources_linux_cuda), Vulkan → resources_windows_vulkan, OpenCL → resources_windows_opencl on Windows (else resources_android_opencl)
  • Default CPU build (both generators) still emits to canonical src/main/resources/.../Windows/{x86_64,x86}/
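
As a rough illustration of the routing described above (not the actual CMakeLists.txt logic), the mapping can be sketched as a shell function. The function name resource_tree and its backend/os arguments are hypothetical; only the resources_* directory names come from this PR. The PR does not say where a non-Windows Vulkan build would land, so the sketch rejects that combination rather than guessing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps (backend, os) to the resource tree named in this PR.
# It models the routing only; the real logic lives in CMakeLists.txt.
resource_tree() {
  local backend="$1" os="$2"
  case "$backend:$os" in
    cuda:windows)   echo "resources_windows_cuda" ;;
    cuda:*)         echo "resources_linux_cuda" ;;      # existing non-Windows CUDA tree
    vulkan:windows) echo "resources_windows_vulkan" ;;
    opencl:windows) echo "resources_windows_opencl" ;;
    opencl:*)       echo "resources_android_opencl" ;;  # existing non-Windows OpenCL tree
    *)
      # Assumption: any combination the PR does not describe is treated as an error.
      echo "unsupported backend/os: $backend/$os" >&2
      return 1
      ;;
  esac
}
```

For example, `resource_tree cuda windows` prints resources_windows_cuda, while `resource_tree cuda linux` falls through to resources_linux_cuda.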

.github/build.bat:

  • Added CUDA device-code caching: when sccache wrapping succeeded and this is a CUDA build, also front nvcc with sccache (mirrors build.sh)

.github/build_opencl_windows.bat (new file):

  • Stages Khronos OpenCL-Headers and builds OpenCL-ICD-Loader (producing OpenCL.lib) before delegating to build.bat with OpenCL paths
  • Mirrors build_opencl_android.sh pattern

CLAUDE.md:

  • Updated "Windows Ninja artifact" section → "Windows native classifiers" with comprehensive explanation of the flip, GPU classifiers, and wiring
  • Clarified that GPU runtime libraries are not bundled and GitHub runners have no GPU

README.md:

  • Updated classifier table: default JAR now lists Windows as "

https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

Add three x86_64 Windows GPU native classifiers and flip the default
Windows CPU build from the MSVC/Visual-Studio generator to Ninja
Multi-Config (so the most-pulled JAR gets the sccache/Depot cache).
Both generators use the same MSVC toolchain (static /MT CRT), so the
DLLs are functionally equivalent with identical runtime dependencies.

New Windows classifiers (x86_64 only, all Ninja):
- cuda13-windows-x86-64  (-DGGML_CUDA=ON, Jimver/cuda-toolkit)
- vulkan-windows-x86-64  (-DGGML_VULKAN=ON, humbletim/install-vulkan-sdk)
- opencl-windows-x86-64  (-DGGML_OPENCL=ON, staged ICD loader)

GPU runtime libraries are NOT bundled (driver/toolkit supplies them),
avoiding redistribution obligations. GitHub runners have no GPU, so the
GPU jobs build + run ctest only; end-to-end GPU inference is local /
self-hosted.

Default flip:
- build-windows-x86_64 / build-windows-x86 are now Ninja (default JAR,
  Windows-*-libraries artifacts).
- MSVC ships as the msvc-windows classifier (Windows-*-msvc artifacts).
- test-java-windows-x86_64 validates the default (Ninja) DLL;
  test-java-windows-x86_64-msvc validates the MSVC classifier DLL.

Wiring:
- CMakeLists.txt: OS-aware backend routing (CUDA/OpenCL -> Windows
  trees, new Vulkan branch) into resources_windows_{cuda,vulkan,opencl}.
- .github/build.bat: also wraps nvcc with sccache for CUDA builds.
- .github/build_opencl_windows.bat: new (Windows analogue of
  build_opencl_android.sh; stages Khronos headers + ICD loader).
- pom.xml: profiles windows-msvc / cuda-windows / vulkan-windows /
  opencl-windows.
- publish.yml: new build jobs + package/publish downloads + profile
  activation; all five Windows build jobs gate the package job.
- README.md: full classifier table + dependency snippets.
- CLAUDE.md / TODO.md: documented the flip and the GPU classifiers.

Note: the GPU toolchain steps (CUDA version via Jimver, Vulkan SDK
action inputs, OpenCL ICD staging) are unvalidated locally (no Windows
host / GPU) and may need adjustment after the first CI run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

First CI run on PR #276 (run 28327740376): default Ninja flip, MSVC
classifier, and the OpenCL job were green; CUDA and Vulkan failed on
toolchain setup.

- CUDA: `Jimver/cuda-toolkit@v0.2.24` errored "Version not available:
  13.0.0" (that tag predates CUDA 13.x). Bump to @v0.2.35 and request
  13.2.0 (matches the Linux pin; classifier stays cuda13-windows-x86-64).
  Also clears the Node 20 deprecation warning.
- Vulkan: humbletim/install-vulkan-sdk set VULKAN_SDK but laid out the
  SDK in a way CMake's FindVulkan could not read (missing Vulkan_LIBRARY
  / Vulkan_INCLUDE_DIR). Switch to jakoch/install-vulkan-sdk-action@v1.6.0
  (purpose-built, FindVulkan-compatible), vulkan_version 1.4.350.0.

Docs (CLAUDE.md, TODO.md) updated with the resolved toolchain versions
and the first-run results.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

Both build_opencl_android.sh and build_opencl_windows.bat pinned the
Khronos OpenCL-Headers + OpenCL-ICD-Loader at v2025.07.22; v2026.05.29
(OpenCL 3.1.1 spec sync) is current and exists for both repos. Headers
are backward compatible, so the Android OpenCL classifier is unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

Second CI run (28328987704): Jimver@v0.2.35 + 13.2.0 installed CUDA
correctly, but the build failed at cmake's CUDA compiler detection with
"Cannot open include file: 'crt/host_config.h'" — the reduced network
sub-package set ("nvcc","cudart","cublas","cublas_dev","thrust") omits
the nvcc crt headers. Drop method:network + sub-packages so Jimver runs
the default local full-toolkit installer, which ships every header.

(The Vulkan SDK switch to jakoch/install-vulkan-sdk-action from the prior
commit worked: its install step now succeeds and the build proceeds past
the previous find_package(Vulkan) configure failure.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

Run 28329190065 (d36d026): the CUDA full-toolkit fix worked; CUDA 13.2
now compiles the entire ggml-cuda backend and the build succeeds. It
then failed at ctest: gtest_discover_tests cannot enumerate the
CUDA-linked jllama_test.exe on a GPU-less GitHub runner (the binary
errors probing for a CUDA device at startup), so CMake registers the
failing jllama_test_NOT_BUILT sentinel.

Running a GPU-linked unit-test binary on a runner with no GPU is not
possible, and the C++ unit suite is CPU-only logic already fully covered
by the `C++ Tests` job and the CPU Windows jobs. So the three Windows GPU
build jobs now build the artifact only: drop -DBUILD_TESTING and the
ctest step from cuda/vulkan/opencl. The jllama.dll artifact (the real
deliverable) is unaffected. Docs (CLAUDE.md, TODO.md) updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

Run 28329638757 (788aaef): the CUDA build job went GREEN but produced an
empty artifact, so `package` failed with "Artifact not found:
Windows-x86_64-cuda". Two bugs:

1. sccache cannot wrap nvcc on Windows — it dies with
   "sccache: error: Could not parse shell line" on every .cu compile, so
   ggml-cuda never builds. My build.bat nvcc-launcher addition (mirrored
   from build.sh, which works on Linux) is the cause. Remove it: CUDA
   device code now builds with nvcc directly (uncached); the cl.exe C/C++
   TUs still cache via the C/CXX launcher.
2. The failed `cmake --build` exited 0 (sccache-as-launcher failure path),
   so build.bat reached the end and the job went green with no DLLs.
   build.bat now captures the build exit code, prints sccache stats
   regardless, then propagates a non-zero exit. As a backstop, the three
   GPU upload-artifact steps use if-no-files-found: error so an empty
   output fails the job loudly instead of surfacing later at package.
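
The exit-code handling in item 2 can be sketched as follows. This is a bash analogue for illustration only, not the contents of build.bat; run_build_with_stats is a hypothetical name, and the sketch assumes sccache may or may not be on PATH.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a build command, always report cache stats,
# then hand the build's own exit code back to the caller.
run_build_with_stats() {
  local rc=0
  # Capture the build's exit code instead of letting a later command mask it.
  "$@" || rc=$?
  # Print cache statistics whether or not the build succeeded.
  if command -v sccache >/dev/null 2>&1; then
    sccache --show-stats || true   # a stats failure must not change the result
  fi
  # Propagate the original build result so a failed build fails the job.
  return "$rc"
}
```

A caller would wrap the build step, for example `run_build_with_stats cmake --build build --config Release`, so that a failing build still yields a non-zero exit after the stats are printed.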

Docs (CLAUDE.md) corrected re: nvcc not being cached on Windows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV

@sonarqubecloud


Quality Gate failed

Failed conditions
C Security Rating on New Code (required ≥ A)
