Windows: flip default to Ninja, add MSVC classifier, add GPU classifiers #276
Open
bernardladenthin wants to merge 6 commits into
Add three x86_64 Windows GPU native classifiers and flip the default
Windows CPU build from the MSVC/Visual-Studio generator to Ninja
Multi-Config (so the most-pulled JAR gets the sccache/Depot cache).
Both generators use the same MSVC toolchain (static /MT CRT), so the
DLLs are functionally equivalent with identical runtime dependencies.
New Windows classifiers (x86_64 only, all Ninja):
- cuda13-windows-x86-64 (-DGGML_CUDA=ON, Jimver/cuda-toolkit)
- vulkan-windows-x86-64 (-DGGML_VULKAN=ON, humbletim/install-vulkan-sdk)
- opencl-windows-x86-64 (-DGGML_OPENCL=ON, staged ICD loader)
GPU runtime libraries are NOT bundled (driver/toolkit supplies them),
avoiding redistribution obligations. GitHub runners have no GPU, so the
GPU jobs build + run ctest only; end-to-end GPU inference is local /
self-hosted.
Default flip:
- build-windows-x86_64 / build-windows-x86 are now Ninja (default JAR,
Windows-*-libraries artifacts).
- MSVC ships as the msvc-windows classifier (Windows-*-msvc artifacts).
- test-java-windows-x86_64 validates the default (Ninja) DLL;
test-java-windows-x86_64-msvc validates the MSVC classifier DLL.
Wiring:
- CMakeLists.txt: OS-aware backend routing (CUDA/OpenCL -> Windows
trees, new Vulkan branch) into resources_windows_{cuda,vulkan,opencl}.
- .github/build.bat: also wraps nvcc with sccache for CUDA builds.
- .github/build_opencl_windows.bat: new (Windows analogue of
build_opencl_android.sh; stages Khronos headers + ICD loader).
- pom.xml: profiles windows-msvc / cuda-windows / vulkan-windows /
opencl-windows.
- publish.yml: new build jobs + package/publish downloads + profile
activation; all five Windows build jobs gate the package job.
- README.md: full classifier table + dependency snippets.
- CLAUDE.md / TODO.md: documented the flip and the GPU classifiers.
Note: the GPU toolchain steps (CUDA version via Jimver, Vulkan SDK
action inputs, OpenCL ICD staging) are unvalidated locally (no Windows
host / GPU) and may need adjustment after the first CI run.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
First CI run on PR #276 (run 28327740376): default Ninja flip, MSVC
classifier, and the OpenCL job were green; CUDA and Vulkan failed on
toolchain setup.
- CUDA: `Jimver/cuda-toolkit@v0.2.24` errored "Version not available:
13.0.0" (that tag predates CUDA 13.x). Bump to @v0.2.35 and request
13.2.0 (matches the Linux pin; classifier stays cuda13-windows-x86-64).
Also clears the Node 20 deprecation warning.
- Vulkan: humbletim/install-vulkan-sdk set VULKAN_SDK but laid out the
SDK in a way CMake's FindVulkan could not read (missing Vulkan_LIBRARY /
Vulkan_INCLUDE_DIR). Switch to jakoch/install-vulkan-sdk-action@v1.6.0
(purpose-built, FindVulkan-compatible), vulkan_version 1.4.350.0.
Docs (CLAUDE.md, TODO.md) updated with the resolved toolchain versions
and the first-run results.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
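For illustration, the Vulkan SDK step described above might look roughly like the following in the workflow. This is a sketch, not the PR's actual YAML: the step name is invented, and only the action reference and the `vulkan_version` value are taken from the commit message. Any other inputs the real job passes are not shown.

```yaml
# Hypothetical excerpt from the build-windows-x86_64-vulkan job.
steps:
  # Step name is illustrative; the action tag and SDK version are the
  # ones named in the commit message.
  - name: Install Vulkan SDK
    uses: jakoch/install-vulkan-sdk-action@v1.6.0
    with:
      vulkan_version: 1.4.350.0
```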
Both build_opencl_android.sh and build_opencl_windows.bat pinned the
Khronos OpenCL-Headers + OpenCL-ICD-Loader at v2025.07.22; v2026.05.29
(OpenCL 3.1.1 spec sync) is current and exists for both repos. Headers
are backward compatible, so the Android OpenCL classifier is unaffected.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
Second CI run (28328987704): Jimver@v0.2.35 + 13.2.0 installed CUDA
correctly, but the build failed at cmake's CUDA compiler detection with
"Cannot open include file: 'crt/host_config.h'" — the reduced network
sub-package set ("nvcc","cudart","cublas","cublas_dev","thrust") omits
the nvcc crt headers. Drop method:network + sub-packages so Jimver runs
the default local full-toolkit installer, which ships every header.
(The Vulkan SDK switch to jakoch/install-vulkan-sdk-action from the prior
commit worked: its install step now succeeds and the build proceeds past
the previous find_package(Vulkan) configure failure.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
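As a rough sketch of the CUDA change in this commit, the toolkit step after the fix might look like the fragment below. The step name is invented and the surrounding job is omitted; the action tag, the CUDA version, and the removed `method` / `sub-packages` values are the ones quoted in the commit messages. The removed inputs are kept as comments only to show what was dropped.

```yaml
# Hypothetical excerpt from the build-windows-x86_64-cuda job.
steps:
  - name: Install CUDA toolkit   # illustrative step name
    uses: Jimver/cuda-toolkit@v0.2.35
    with:
      cuda: '13.2.0'
      # Removed in this commit: the reduced network install did not
      # ship crt/host_config.h, so CUDA compiler detection failed.
      # method: 'network'
      # sub-packages: '["nvcc", "cudart", "cublas", "cublas_dev", "thrust"]'
```

With both inputs absent, the action falls back to its default local installer, which the commit message relies on to provide the full header set.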
Run 28329190065 (d36d026): the CUDA full-toolkit fix worked. CUDA 13.2
now compiles the entire ggml-cuda backend and the build succeeds. It
then failed at ctest: gtest_discover_tests cannot enumerate the
CUDA-linked jllama_test.exe on a GPU-less GitHub runner (the binary
errors probing for a CUDA device at startup), so CMake registers the
failing jllama_test_NOT_BUILT sentinel.
Running a GPU-linked unit-test binary on a runner with no GPU is not
possible, and the C++ unit suite is CPU-only logic already fully covered
by the `C++ Tests` job and the CPU Windows jobs. So the three Windows
GPU build jobs now build the artifact only: drop -DBUILD_TESTING and the
ctest step from cuda/vulkan/opencl. The jllama.dll artifact (the real
deliverable) is unaffected.
Docs (CLAUDE.md, TODO.md) updated.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
Run 28329638757 (788aaef): the CUDA build job went GREEN but produced an
empty artifact, so `package` failed with "Artifact not found:
Windows-x86_64-cuda". Two bugs:
1. sccache cannot wrap nvcc on Windows: it dies with "sccache: error:
Could not parse shell line" on every .cu compile, so ggml-cuda never
builds. My build.bat nvcc-launcher addition (mirrored from build.sh,
which works on Linux) is the cause. Remove it: CUDA device code now
builds with nvcc directly (uncached); the cl.exe C/C++ TUs still cache
via the C/CXX launcher.
2. The failed `cmake --build` exited 0 (sccache-as-launcher failure
path), so build.bat reached the end and the job went green with no DLLs.
build.bat now captures the build exit code, prints sccache stats
regardless, then propagates a non-zero exit. As a backstop, the three
GPU upload-artifact steps use if-no-files-found: error so an empty
output fails the job loudly instead of surfacing later at package.
Docs (CLAUDE.md) corrected re: nvcc not being cached on Windows.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
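The upload backstop mentioned above could be sketched as follows. The step name, the `actions/upload-artifact` major version, and the `path` value are assumptions made for illustration; only the artifact name and the `if-no-files-found: error` setting come from the commit message.

```yaml
# Hypothetical excerpt from the build-windows-x86_64-cuda job.
steps:
  - name: Upload CUDA libraries          # illustrative step name
    uses: actions/upload-artifact@v4     # major version is an assumption
    with:
      name: Windows-x86_64-cuda
      # Placeholder path: the real staging directory is not given here.
      path: path/to/staged/windows-cuda-libraries/
      # The action's default is "warn", which lets an empty upload pass;
      # "error" fails the step when no files match.
      if-no-files-found: error
```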




Summary
- Default Windows CPU build is now Ninja Multi-Config + sccache (was MSVC/Visual Studio). Both generators use the same MSVC toolchain (`cl.exe`, static `/MT` CRT) on the same runner, producing functionally equivalent DLLs with identical runtime dependencies. Ninja enables sccache caching over Depot WebDAV, which Visual Studio's generator ignores.
- MSVC/Visual Studio build shipped as `msvc-windows` classifier for users who prefer the Visual Studio generator. Both Windows CPU builds are validated end-to-end with the full model-backed Java test suite.
- Three new Windows GPU classifiers added (x86_64 only, all Ninja + sccache):
  - `cuda13-windows-x86-64`: CUDA 13 backend
  - `vulkan-windows-x86-64`: Vulkan backend (most portable)
  - `opencl-windows-x86-64`: OpenCL backend
- GPU runtime libraries are not bundled (consumer's driver/toolkit provides them). GitHub-hosted Windows runners have no GPU, so GPU jobs build + run C++ unit tests (`ctest`, CPU-only) but cannot run model-backed GPU inference; end-to-end GPU validation is local/self-hosted.

Changes

`.github/workflows/publish.yml`:
- `build-windows-x86_64` → `build-windows-x86_64-msvc` and `build-windows-x86` → `build-windows-x86-msvc` (MSVC builds, now classifiers)
- `build-windows-x86_64-ninja` → `build-windows-x86_64` and `build-windows-x86-ninja` → `build-windows-x86` (Ninja builds, now default)
- `Windows-*-ninja` → `Windows-*-libraries` (default), `Windows-*` → `Windows-*-msvc` (classifier)
- New jobs: `build-windows-x86_64-cuda`, `build-windows-x86_64-vulkan`, `build-windows-x86_64-opencl`
- Updated the `package` job's `needs:` graph to include GPU jobs and reordered for clarity

`pom.xml`:
- `windows-ninja` profile → `windows-msvc` (now the classifier, not the default)
- New profiles: `cuda-windows`, `vulkan-windows`, `opencl-windows` (mirrors the CUDA-Linux / OpenCL-Android classifier pattern)

`CMakeLists.txt`:
- CUDA → `resources_windows_cuda` on Windows (else `resources_linux_cuda`), Vulkan → `resources_windows_vulkan`, OpenCL → `resources_windows_opencl` on Windows (else `resources_android_opencl`)
- `src/main/resources/.../Windows/{x86_64,x86}/`

`.github/build.bat`:
- Wraps `nvcc` with sccache (mirrors `build.sh`)

`.github/build_opencl_windows.bat` (new file):
- Stages Khronos headers + ICD loader (`OpenCL.lib`) before delegating to `build.bat` with OpenCL paths
- Follows the `build_opencl_android.sh` pattern

`CLAUDE.md`:

`README.md`:

https://claude.ai/code/session_01Deqf9xS6jz9t1idytVTaPV
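To make the `needs:` change concrete, a fragment of the `package` job could look something like the sketch below. The job names are the ones listed in the description; the ordering, the omitted non-Windows jobs, and the rest of the job body are assumptions, and the real graph may list a different subset.

```yaml
# Hypothetical excerpt from .github/workflows/publish.yml.
jobs:
  package:
    needs:
      # Default (Ninja) Windows CPU builds
      - build-windows-x86_64
      - build-windows-x86
      # MSVC classifier builds
      - build-windows-x86_64-msvc
      - build-windows-x86-msvc
      # GPU classifier builds
      - build-windows-x86_64-cuda
      - build-windows-x86_64-vulkan
      - build-windows-x86_64-opencl
      # Non-Windows build jobs omitted from this sketch.
```

Listing the GPU jobs under `needs:` means `package` only starts once those artifacts exist, which is why an empty CUDA upload surfaced there as "Artifact not found".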