
build: require GCC>=13, CUDA>=12 (12 & 13), cuTENSOR>=2.0 with early actionable errors; harden CUDA finders #950

Open
pcchen wants to merge 5 commits into master from fix/compiler-cuda-version-requirements

Conversation

@pcchen pcchen commented Jun 28, 2026

Collaborator

Summary

Makes Cytnx's CUDA build requirements explicit and self-documenting, so
unsupported toolchains fail early with actionable messages instead of cryptic
downstream errors. Also fixes correctness bugs in the cuTENSOR/cuQuantum
finders and stale CUDA configure diagnostics.

This addresses the configure-/build-time portions of #946 and #949, and the
documentation half of #948. (The deeper #948 rpath change and the #946
candidate-suffix layout search are intentionally left as follow-ups -- see
"Out of scope".)

Changes

Compiler / toolkit requirements (new, fail-early)

  • GCC >= 13 enforced for GNU host compilers (C++20 support), with a message
    pointing at -DCMAKE_CXX_COMPILER. Clang/AppleClang are unaffected.
  • CUDA >= 12.0, with 12.x and 13.x explicitly supported. The toolkit is
    now resolved (find_package(CUDAToolkit)) before enable_language(CUDA)
    so the version is known early enough to gate and to drive architecture
    selection. < 12 is a FATAL_ERROR; > 13 warns (untested).
  • Default CUDA architectures adapt to the toolkit: CUDA 13.0 removed
    offline compilation for compute capability < 7.5, so Volta sm_70 is now
    emitted only for CUDA 12.x.
  • cuTENSOR >= 2.0 enforced by reading CUTENSOR_MAJOR/MINOR from the
    cuTENSOR headers; the 1.x API is rejected.
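
A minimal sketch of how such a header-based version gate might look. It assumes the version macros live in `cutensor.h` and that `CUTENSOR_INCLUDE_DIR` has already been located; the `_cutensor_*` helper variables are illustrative and not taken from the PR.

```cmake
# Sketch only: CUTENSOR_INCLUDE_DIR is assumed to be set by an earlier find_path().
set(_cutensor_header "${CUTENSOR_INCLUDE_DIR}/cutensor.h")

if(EXISTS "${_cutensor_header}")
  # Keep only the two "#define CUTENSOR_MAJOR n" / "#define CUTENSOR_MINOR n" lines.
  file(STRINGS "${_cutensor_header}" _cutensor_defines
       REGEX "^#define[ \t]+CUTENSOR_(MAJOR|MINOR)[ \t]+[0-9]+")

  foreach(_part MAJOR MINOR)
    # CMAKE_MATCH_1 holds the captured number, or is empty if nothing matched.
    string(REGEX MATCH "CUTENSOR_${_part}[ \t]+([0-9]+)" _unused "${_cutensor_defines}")
    set(_cutensor_${_part} "${CMAKE_MATCH_1}")
  endforeach()

  if(NOT _cutensor_MAJOR STREQUAL "" AND NOT _cutensor_MINOR STREQUAL "")
    set(CUTENSOR_VERSION "${_cutensor_MAJOR}.${_cutensor_MINOR}")
    if(CUTENSOR_VERSION VERSION_LESS 2.0)
      message(FATAL_ERROR
        "cuTENSOR >= 2.0 is required, but ${_cutensor_header} reports "
        "${CUTENSOR_VERSION}. Point CUTENSOR_ROOT at a 2.x install.")
    endif()
  endif()
endif()
```

Parsing the header rather than the library file name keeps the check independent of the on-disk layout, which differs between packaging methods.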

Better errors / diagnostics

  • Actionable "nvcc not found" error before enable_language(CUDA),
    explaining that it only consults CMAKE_CUDA_COMPILER / CUDACXX / PATH
    (not the located toolkit libraries) and listing the fixes. Skipped when the
    user pins a compiler explicitly.
  • #949: CUDA configure output now uses the modern CUDAToolkit_*
    variables (CUDAToolkit_VERSION, CUDAToolkit_BIN_DIR) plus
    CMAKE_CUDA_COMPILER. The legacy CUDA_VERSION_STRING /
    CUDA_TOOLKIT_ROOT_DIR are not set by find_package(CUDAToolkit) and were
    printing blank.
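
The "nvcc not found" check could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `_cytnx_nvcc` is a hypothetical variable name and the message text is invented for the example.

```cmake
# Sketch only. Skip the probe when the user pinned a compiler explicitly.
if(NOT DEFINED CMAKE_CUDA_COMPILER AND NOT DEFINED ENV{CUDACXX})
  # find_program() searches PATH and CMake's standard program locations.
  find_program(_cytnx_nvcc NAMES nvcc)
  if(NOT _cytnx_nvcc)
    message(FATAL_ERROR
      "USE_CUDA=ON but no nvcc was found.\n"
      "enable_language(CUDA) only consults CMAKE_CUDA_COMPILER, CUDACXX and PATH; "
      "it does not use the toolkit libraries located elsewhere.\n"
      "Fix one of:\n"
      "  -DCMAKE_CUDA_COMPILER=/path/to/cuda/bin/nvcc\n"
      "  export CUDACXX=/path/to/cuda/bin/nvcc\n"
      "  export PATH=/path/to/cuda/bin:$PATH")
  endif()
endif()

enable_language(CUDA)
```

Running the probe before `enable_language(CUDA)` is what allows the custom message to appear instead of CMake's generic compiler-detection failure.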

Finder correctness (#946)

  • FindCUTENSOR: removed the dead lib/10.2 and lib/11 branches; the
    library subdir now tracks the CUDA major version (lib/12, lib/13) instead
    of a hardcoded lib/12.
  • CUTENSOR_FOUND / CUQUANTUM_FOUND are now derived from the actual
    find_library results via find_package_handle_standard_args, instead of
    being set TRUE unconditionally (which let a NOTFOUND silently pass the
    caller's REQUIRED check and link empty / -NOTFOUND).
  • FindCUQUANTUM builds CUQUANTUM_LIBRARIES conditionally (parity with
    FindCUTENSOR).
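
As a rough illustration of the pattern described in this list, a finder can let `find_package_handle_standard_args` decide the `*_FOUND` value and only then assemble the library list. The library names `custatevec` and `cutensornet` are real cuQuantum components, but the `CUQUANTUM_*_LIB` variable names are assumptions made for this sketch.

```cmake
include(FindPackageHandleStandardArgs)

# Sketch only: look strictly under the user-supplied root.
find_library(CUQUANTUM_CUSTATEVEC_LIB NAMES custatevec
             HINTS "${CUQUANTUM_ROOT}" PATH_SUFFIXES lib lib64 NO_DEFAULT_PATH)
find_library(CUQUANTUM_CUTENSORNET_LIB NAMES cutensornet
             HINTS "${CUQUANTUM_ROOT}" PATH_SUFFIXES lib lib64 NO_DEFAULT_PATH)

# CUQUANTUM_FOUND becomes TRUE only if every REQUIRED_VARS entry was found;
# with find_package(CUQUANTUM REQUIRED) a NOTFOUND entry is a configure error.
find_package_handle_standard_args(CUQUANTUM
  REQUIRED_VARS CUQUANTUM_CUSTATEVEC_LIB CUQUANTUM_CUTENSORNET_LIB)

# Append only libraries that were actually located, so "-NOTFOUND" never
# reaches the link line.
set(CUQUANTUM_LIBRARIES "")
if(CUQUANTUM_CUSTATEVEC_LIB)
  list(APPEND CUQUANTUM_LIBRARIES "${CUQUANTUM_CUSTATEVEC_LIB}")
endif()
if(CUQUANTUM_CUTENSORNET_LIB)
  list(APPEND CUQUANTUM_LIBRARIES "${CUQUANTUM_CUTENSORNET_LIB}")
endif()
```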

Docs

Related issues

Testing

  • cmake -S . -B build -DUSE_CUDA=OFF configures and generates cleanly
    (AppleClang; GCC check correctly skipped).
  • cuTENSOR header version-parsing logic verified standalone (2.7 -> OK,
    1.7 -> fail).
  • ⚠️ The CUDA-on paths and the cuTENSOR/cuQuantum finder bodies were
    not executed (no CUDA/cuTENSOR/cuQuantum environment available); they are
    validated by inspection. Reviewers with a CUDA box, please sanity-check a
    -DUSE_CUDA=ON (and -DUSE_CUTENSOR=ON / -DUSE_CUQUANTUM=ON) configure.

Out of scope (follow-ups)

🤖 Generated with Claude Code

pcchen and others added 3 commits June 28, 2026 20:35
- Require GCC >= 13 for GNU host compilers (C++20), failing early with a
  message pointing to -DCMAKE_CXX_COMPILER. Clang/AppleClang unaffected.
- Resolve CUDAToolkit before enable_language() and require CUDA >= 12.0
  (device-side C++20 floor), explicitly supporting 12.x and 13.x: < 12 is a
  FATAL_ERROR, > 13 warns.
- Emit an actionable error when no usable nvcc is found, explaining that
  enable_language(CUDA) only consults CMAKE_CUDA_COMPILER / CUDACXX / PATH
  (not the located toolkit libraries). Skipped when a compiler is pinned.
- Choose the default CUDA architecture list per toolkit version: CUDA 13.0
  removed offline support for compute capability < 7.5, so Volta sm_70 is
  only emitted for CUDA 12.x.
- Report CUDA via the modern CUDAToolkit_* variables (+ nvcc path); the legacy
  CUDA_VERSION_STRING / CUDA_TOOLKIT_ROOT_DIR are unset by
  find_package(CUDAToolkit) and printed blank (#949).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- FindCUTENSOR: remove dead lib/10.2 and lib/11 branches; derive the library
  subdir from the CUDA major version (lib/12, lib/13) instead of hardcoding 12;
  require cuTENSOR >= 2.0 by reading CUTENSOR_MAJOR/MINOR from the headers.
- Derive CUTENSOR_FOUND / CUQUANTUM_FOUND from the actual find_library results
  via find_package_handle_standard_args instead of setting them TRUE
  unconditionally (which let a NOTFOUND silently pass the caller's REQUIRED
  check and link empty / "-NOTFOUND").
- FindCUQUANTUM: build CUQUANTUM_LIBRARIES conditionally (parity with
  FindCUTENSOR) so a missing lib does not appear as "...-NOTFOUND" on the link
  line.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…for tarballs

- Update the CUDA dependency list: CUDA toolkit >= 12.0 (12.x/13.x) and
  cuTENSOR >= 2.0.
- Add a prominent note that tarball installs of cuTENSOR/cuQuantum need both
  CUTENSOR_ROOT/CUQUANTUM_ROOT (build) and LD_LIBRARY_PATH (runtime), since
  tarball libraries are not registered with ldconfig (#948 workaround). Use
  $CUTENSOR_ROOT/lib (cuTENSOR 2.x layout), noting lib/<cuda-major> as legacy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the CMake configuration and documentation to enforce and support newer compiler and library requirements, specifically GCC >= 13, CUDA >= 12.0, and cuTENSOR >= 2.0. It also modernizes CUDA variable usage and improves package handling for CUQUANTUM and CUTENSOR. Feedback on these changes suggests several CMake improvements, such as avoiding variable dereferencing inside if() conditions, using CUDAToolkit_VERSION with standard version comparison operators for more robust checks, and registering the cuTENSOR version with find_package_handle_standard_args. Additionally, the reviewer noted a discrepancy in the documentation regarding the flat lib/ layout of cuTENSOR 2.x, which is not yet supported by the CMake script.


Comment thread cmake/Modules/FindCUTENSOR.cmake Outdated
Comment thread CMakeLists.txt Outdated
Comment thread CMakeLists.txt Outdated
# included for CUDA 12.x.
if(NOT CMAKE_CUDA_ARCHITECTURES AND NOT DEFINED ENV{CUDAARCHS})
set(CMAKE_CUDA_ARCHITECTURES 70-real 75-real 80-real 86-real 89-real 90-real 90-virtual)
if(CUDAToolkit_VERSION_MAJOR GREATER_EQUAL 13)


medium

For consistency and robustness, use CUDAToolkit_VERSION with VERSION_GREATER_EQUAL instead of comparing CUDAToolkit_VERSION_MAJOR directly.

    if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL 13.0)

Collaborator Author

Fixed in 10b1536f: if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL 13.0).

Posted by Claude Code on behalf of @pcchen

Comment thread cmake/Modules/FindCUTENSOR.cmake Outdated
Comment thread docs/source/adv_install.rst Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: de94f3f8cc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread CMakeLists.txt
# toolkit can actually compile. find_package(CUDAToolkit) does not require the
# CUDA language to be enabled; the later find_package in CytnxBKNDCMakeLists
# re-uses this cached result.
find_package(CUDAToolkit QUIET)


P1: Honor pinned CUDA compilers when finding the toolkit

This probes FindCUDAToolkit before the CUDA language is enabled, but cmake --help-module FindCUDAToolkit says the compiler directory is searched only after CUDA has been enabled; at this point it searches CUDAToolkit_ROOT/CUDA_PATH/PATH/defaults instead of CMAKE_CUDA_COMPILER or CUDACXX. A user with nvcc only under a non-default path who follows the new error text (-DCMAKE_CUDA_COMPILER=/opt/cuda-13/bin/nvcc or CUDACXX=...) will still hit this fatal path, and a user with another CUDA on PATH can have the version check/link libraries come from the wrong toolkit before enable_language(CUDA) honors the pinned compiler.


Collaborator Author

Confirmed — find_package(CUDAToolkit) runs before enable_language(CUDA), so it consults CUDAToolkit_ROOT/CUDA_PATH/PATH rather than CMAKE_CUDA_COMPILER/CUDACXX, and a pinned-but-not-on-PATH nvcc (or a stray CUDA on PATH) can drive the early check from the wrong toolkit. The intended fix is to derive CUDAToolkit_ROOT from a pinned CMAKE_CUDA_COMPILER/CUDACXX before the probe. Deferring this to a follow-up PR since it needs a Linux box with multiple CUDA toolkits to verify; tracked there.
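
The intended fix described above might look roughly like the following. It is an unverified sketch: it assumes the conventional `<root>/bin/nvcc` layout, and the `_pinned_nvcc` and `_nvcc_*` variables are illustrative names.

```cmake
# Sketch only: prefer the user's pinned compiler when locating the toolkit.
if(DEFINED CMAKE_CUDA_COMPILER)
  set(_pinned_nvcc "${CMAKE_CUDA_COMPILER}")
elseif(DEFINED ENV{CUDACXX})
  set(_pinned_nvcc "$ENV{CUDACXX}")
endif()

# Only an absolute path can be mapped to a toolkit root; respect an explicit
# CUDAToolkit_ROOT if the user already provided one.
if(_pinned_nvcc AND IS_ABSOLUTE "${_pinned_nvcc}" AND NOT DEFINED CUDAToolkit_ROOT)
  get_filename_component(_nvcc_real "${_pinned_nvcc}" REALPATH)      # resolve symlinks
  get_filename_component(_nvcc_bin_dir "${_nvcc_real}" DIRECTORY)    # <root>/bin
  get_filename_component(CUDAToolkit_ROOT "${_nvcc_bin_dir}" DIRECTORY)  # <root>
endif()

# With CUDAToolkit_ROOT set, the early probe looks at the pinned toolkit first.
find_package(CUDAToolkit QUIET)
```

Distribution packages that install `nvcc` outside a self-contained toolkit directory would not fit the layout assumption, which is one reason this needs testing on a machine with several toolkits.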

Posted by Claude Code on behalf of @pcchen

Member

If we defer this issue to a follow-up PR, then this PR fixes the problem in the wrong way. With this change, setting CMAKE_CUDA_COMPILER to a path other than the one found by find_package(CUDAToolkit) can make the nvcc version differ from the CUDA toolkit version, and the build may fail. The suggested pattern is to not look for the CUDA toolkit before enabling CUDA:

# Record user intent before CUDA is enabled.
set(_USER_SET_CUDA_ARCHITECTURES FALSE)
if(DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(_USER_SET_CUDA_ARCHITECTURES TRUE)
endif()

enable_language(CUDA)
find_package(CUDAToolkit REQUIRED)

if(NOT _USER_SET_CUDA_ARCHITECTURES)
  if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL 13.0)
    set(CMAKE_CUDA_ARCHITECTURES
        75-real 80-real 86-real 89-real 90-real 90-virtual)
  else()
    set(CMAKE_CUDA_ARCHITECTURES
        70-real 75-real 80-real 86-real 89-real 90-real 90-virtual)
  endif()
endif()

Member

Setting CMAKE_CUDA_ARCHITECTURES to all-major cannot be used here because it would include 50 and 60 for CUDA 12.x, and 50 and 60 are not supported by our required cuTENSOR and cuQuantum.


.. code-block:: shell

$export LD_LIBRARY_PATH=$CUTENSOR_ROOT/lib:$CUQUANTUM_ROOT/lib:$LD_LIBRARY_PATH


P2: Match cuTENSOR runtime path to the finder

The documented runtime workaround does not match the library directory that this commit's finder accepts: FindCUTENSOR.cmake now constructs ${CUTENSOR_ROOT}/lib/${CUDAToolkit_VERSION_MAJOR} for CUDA 12/13. For a tarball install that is found by the build, exporting only $CUTENSOR_ROOT/lib leaves the actual libcutensor.so directory out of LD_LIBRARY_PATH, so import cytnx can still fail with the loader error this note is meant to prevent; document the versioned subdirectory or reuse the detected library dir.


Collaborator Author

Addressed in 10b1536f, together with the finder change. Instead of documenting a versioned-subdir caveat, FindCUTENSOR now searches the flat lib/ directly and derives CUTENSOR_LIBRARY_DIRS from the located library, so for a 2.x tarball the documented $CUTENSOR_ROOT/lib matches the directory the build actually used.

Posted by Claude Code on behalf of @pcchen

@pcchen pcchen requested a review from IvanaGyro June 28, 2026 12:44
@codecov

codecov Bot commented Jun 28, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 29.99%. Comparing base (21cf05a) to head (10b1536).
⚠️ Report is 8 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #950      +/-   ##
==========================================
+ Coverage   29.87%   29.99%   +0.12%     
==========================================
  Files         240      238       -2     
  Lines       35425    35421       -4     
  Branches    14729    14729              
==========================================
+ Hits        10584    10626      +42     
+ Misses      17593    17544      -49     
- Partials     7248     7251       +3     
Flag      Coverage Δ
cpp       29.51% <ø> (+0.02%) ⬆️
python    58.60% <ø> (+6.14%) ⬆️

Components         Coverage Δ
C++ backend        31.20% <ø> (+0.02%) ⬆️
Python bindings    17.08% <ø> (+0.02%) ⬆️
Python package     58.60% <ø> (+6.14%) ⬆️

see 5 files with indirect coverage changes


…installed

Add a troubleshooting subsection covering the two things to pin when multiple
CUDA toolkits are present: the build compiler (-DCMAKE_CUDA_COMPILER / CUDACXX /
PATH, with the pip/scikit-build form) and the runtime libraries
(LD_LIBRARY_PATH vs the ldconfig/apt copy). Notes that the configure output
reports the resolved toolkit and that a stale CMakeCache.txt must be wiped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pcchen

pcchen commented Jun 28, 2026

Collaborator Author

Added a docs section (fe7b181b) on selecting the CUDA toolkit when both CUDA 12 and 13 are installed.

The new "Choosing between CUDA 12 and CUDA 13 when both are installed" subsection sits under Build troubleshooting in adv_install.rst and covers the two things to pin when multiple toolkits are present:

  1. Build time — select the compiler. enable_language(CUDA) resolves nvcc only from CMAKE_CUDA_COMPILER / CUDACXX / PATH (not from find_package(CUDAToolkit)). Includes concrete -DCMAKE_CUDA_COMPILER=.../cuda-13/bin/nvcc (and cuda-12) examples, the CUDACXX/PATH alternatives, the pip/scikit-build form, the reminder to verify the reported CUDA Version / bin dir / nvcc lines, and to wipe a stale CMakeCache.txt.
  2. Runtime — match the libraries. The compiler choice does not control runtime loading; LD_LIBRARY_PATH vs the ldconfig/apt copy decides which libcudart.so.<major> is loaded, with the fix to force the matching runtime.

It ties into the CUDA-version reporting added in this PR and the tarball LD_LIBRARY_PATH note above it.


Posted by Claude Code on behalf of @pektiong

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fe7b181bf1


Comment thread cmake/Modules/FindCUTENSOR.cmake Outdated
Comment on lines +38 to +39
if(${CUDAToolkit_VERSION_MAJOR} GREATER_EQUAL 12)
set(CUTNLIB_DIR "${CUTNLIB_DIR}${CUDAToolkit_VERSION_MAJOR}")


P1: Search the cuTENSOR 2.x flat lib directory

When CUTENSOR_ROOT points at a cuTENSOR 2.x install (the version this module now requires), the libraries live directly under $CUTENSOR_ROOT/lib as documented in docs/source/adv_install.rst, but this appends the CUDA major and the later find_library(... NO_DEFAULT_PATH) only checks $CUTENSOR_ROOT/lib/12 or lib/13. Fresh evidence beyond the prior runtime-path note is that a fake 2.7 header with lib/libcutensor.so leaves CUTENSOR_LIB as NOTFOUND, so USE_CUTENSOR=ON cannot configure for the supported 2.x layout unless users add a nonstandard versioned symlink.


Collaborator Author

Good catch — and this outweighs the "out of scope" note in the description: once the module hard-requires cuTENSOR >= 2.0, it has to locate the 2.x flat lib/. Fixed in 10b1536f by giving find_library both suffixes (lib and lib/${CUDAToolkit_VERSION_MAJOR}), so the 2.x flat layout and the legacy versioned layout both resolve for CUDA 12/13 without a nonstandard symlink. CUTENSOR_LIBRARY_DIRS is now taken from the directory of the located library. The apt multiarch path remains in #946.
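
For readers following along, a two-suffix search of this kind can be sketched as below. This is an illustration rather than the committed code; it reuses the `CUTENSOR_LIB`, `CUTENSOR_ROOT` and `CUTENSOR_LIBRARY_DIRS` names from the discussion and assumes `CUDAToolkit_VERSION_MAJOR` is already set.

```cmake
# Sketch only: try the flat 2.x layout (lib/) and the legacy versioned
# layout (lib/<cuda-major>) under the same root.
find_library(CUTENSOR_LIB
  NAMES cutensor
  HINTS "${CUTENSOR_ROOT}"
  PATH_SUFFIXES lib "lib/${CUDAToolkit_VERSION_MAJOR}"
  NO_DEFAULT_PATH)

# Report the directory that actually holds the library instead of a
# precomputed guess, so build-time and runtime paths agree.
if(CUTENSOR_LIB)
  get_filename_component(CUTENSOR_LIBRARY_DIRS "${CUTENSOR_LIB}" DIRECTORY)
endif()
```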

Posted by Claude Code on behalf of @pcchen

Comment thread CMakeLists.txt Outdated
# 12.0 floor comes from device-side C++20 (nvcc gained -std=c++20 in CUDA
# 12.0; 11.x tops out at C++17), matching CMAKE_CUDA_STANDARD 20. Fail early
# with an actionable message instead of a cryptic later C++20/arch error.
if(CUDAToolkit_VERSION_MAJOR LESS 12)


P2: Gate CUDA 12.x against the required GCC version

For GNU CUDA builds with CUDA 12.0-12.3 and GCC 13 selected by the new GNU compiler gate, this accepts the toolkit and then enable_language(CUDA) can fail because those CUDA releases do not support GCC 13 as an nvcc host compiler (CUDA 12.3 lists GCC 6.x-12.2, while CUDA 12.4 lists 6.x-13.2). Either raise the GNU/CUDA floor to 12.4 or reject the unsupported host-compiler combination so the advertised CUDA >=12.0 path does not pass this check and fail later.


Collaborator Author

Valid: GCC 13 + CUDA 12.0-12.3 passes both gates, but nvcc caps host support at GCC 12.x until CUDA 12.4. Deferring the combined gate to a follow-up PR, because it is bound up with revisiting the GCC-13 floor itself — that floor is driven solely by two std::format call sites (pybind/symmetry_py.cpp, pybind/unitensor_py.cpp); switching them to {fmt} or plain formatting would lower the floor to GCC 12 and change this calculus. Tracked for the follow-up.
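
A possible shape for the deferred combined gate, shown only as a sketch. It assumes `CUDAToolkit_VERSION` is known at this point and that nvcc uses the C++ compiler as its host compiler (no separate `CMAKE_CUDA_HOST_COMPILER`); the 12.4 threshold comes from the host-compiler ranges quoted in the review comment above.

```cmake
# Sketch only: reject GCC 13+ as the nvcc host compiler on CUDA 12.0-12.3.
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU"
   AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 13
   AND CUDAToolkit_VERSION VERSION_GREATER_EQUAL 12.0
   AND CUDAToolkit_VERSION VERSION_LESS 12.4)
  message(FATAL_ERROR
    "CUDA ${CUDAToolkit_VERSION} does not support GCC "
    "${CMAKE_CXX_COMPILER_VERSION} as a host compiler. "
    "Use CUDA >= 12.4, or an older GCC if the GCC floor is lowered.")
endif()
```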

Posted by Claude Code on behalf of @pcchen

…rsion checks

Addresses inline review feedback on PR #950:

- FindCUTENSOR: search both the cuTENSOR 2.x flat lib/ and the legacy
  lib/<cuda-major> layouts (find_library PATH_SUFFIXES), so the required
  2.x layout actually resolves; derive CUTENSOR_LIBRARY_DIRS from the
  located library; reference CUDAToolkit_VERSION_MAJOR by name (no
  in-if() dereference); register VERSION_VAR CUTENSOR_VERSION with
  find_package_handle_standard_args.
- CMakeLists: use CUDAToolkit_VERSION with VERSION_LESS/VERSION_GREATER_EQUAL
  instead of comparing the major component as an integer.
- docs(adv_install): note FindCUTENSOR searches both layouts so the
  build-time and runtime cuTENSOR paths agree.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +182 to +193
For cuTENSOR 2.x tarballs the libraries sit directly under ``lib/``, so
the path above is sufficient at both build and runtime. Older cuTENSOR
1.x tarballs instead use a per-CUDA subdirectory (``lib/<cuda-major>``,
e.g. ``lib/12``); on that legacy layout point ``LD_LIBRARY_PATH`` at the
subdirectory. Cytnx's ``FindCUTENSOR`` searches both layouts at build
time, so the runtime path matches whichever one was found. If
``LD_LIBRARY_PATH`` is not set
up, importing/running Cytnx fails with ``error while loading shared
libraries: libcutensor.so... cannot open shared object file``, or
silently binds to a mismatched system copy of the library if one is
present. Add these ``export`` lines to your shell profile (e.g.
``~/.bashrc``) to make them persistent.

Member

Not for all 2.x. cuTENSOR has shipped separate packages for different CUDA versions since 2.3. Also, we don't support cuTENSOR < 2.0 or cuQuantum < 24.0. See #447.

I would not recommend that users export LD_LIBRARY_PATH in ~/.bashrc. That pollutes the environment of every shell just to run a locally built Cytnx. Having multiple versions of the same library without an isolated environment is an easy way to get into trouble. We can explain how LD_LIBRARY_PATH is used, as CuPy and JAX do, but we should also consider pointing users or developers who currently need LD_LIBRARY_PATH to a conda environment.

Comment on lines +401 to +416
.. code-block:: shell

# Use CUDA 13:
$cmake -S . -B build -DUSE_CUDA=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13/bin/nvcc

# ...or use CUDA 12:
$cmake -S . -B build -DUSE_CUDA=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12/bin/nvcc

Equivalent alternatives are ``export CUDACXX=/usr/local/cuda-13/bin/nvcc`` or
putting the desired ``bin`` directory first on ``PATH``
(``export PATH=/usr/local/cuda-13/bin:$PATH``). For the ``pip`` build, pass it
through scikit-build-core:

.. code-block:: shell

$pip install . --config-settings=cmake.args="-DUSE_CUDA=ON;-DCMAKE_CUDA_COMPILER=/usr/local/cuda-13/bin/nvcc"

Member

We should use a preset in the build command. The CUDA presets require both cuTENSOR and cuQuantum. Per the discussion in #583 and #584, we will not test configurations that enable CUDA without cuTENSOR and cuQuantum.

Comment on lines +424 to +431
**2. Runtime -- match the shared libraries.** Selecting the compiler does not
control which CUDA runtime libraries are loaded at run time. The dynamic loader
resolves ``libcudart.so.<major>`` (and cuTENSOR/cuQuantum, etc.) via
``LD_LIBRARY_PATH``, then the ``ldconfig`` cache. If a different major version is
registered with ``ldconfig`` (commonly the ``apt`` copy under
``/usr/lib/x86_64-linux-gnu``), it can be loaded instead of the toolkit you
built against, causing version-skew crashes. To force the matching runtime, put
its library directory first:

Member

This content seems to duplicate the cuTENSOR and cuQuantum section.

Comment thread CMakeLists.txt
Comment on lines +204 to +211
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU" AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 13)
message(FATAL_ERROR
"GCC >= 13 is required (C++20 support), but found GCC "
"${CMAKE_CXX_COMPILER_VERSION} at ${CMAKE_CXX_COMPILER}.\n"
"Install GCC 13+ and point CMake at it, e.g. "
"-DCMAKE_CXX_COMPILER=g++-13 (and -DCMAKE_C_COMPILER=gcc-13), "
"or set the CXX/CC environment variables.")
endif()

Member

Do we really need this guard? We already have CMAKE_CXX_STANDARD_REQUIRED ON to guard the compiler version.
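
One way to compare the two approaches is a feature probe, sketched below. `CMAKE_CXX_STANDARD_REQUIRED` checks that the compiler accepts the C++20 flag, while the GCC 13 floor in this PR is attributed to `std::format`, a library feature. The sketch assumes the CXX language is already enabled and that try-compile checks honor `CMAKE_CXX_STANDARD` (policy CMP0067); `CYTNX_HAVE_STD_FORMAT` is a hypothetical result variable.

```cmake
include(CheckCXXSourceCompiles)

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Sketch only: compile and link a tiny program that uses <format>.
check_cxx_source_compiles("
  #include <format>
  #include <string>
  int main() {
    std::string s = std::format(\"{}\", 42);
    return s.size() == 2 ? 0 : 1;
  }" CYTNX_HAVE_STD_FORMAT)

if(NOT CYTNX_HAVE_STD_FORMAT)
  message(FATAL_ERROR
    "The selected C++ compiler accepts C++20 but its standard library "
    "lacks std::format; choose a newer toolchain via -DCMAKE_CXX_COMPILER.")
endif()
```

A probe like this tests the capability rather than a compiler version number, so it would keep working if the `std::format` call sites were later replaced and the floor lowered.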

Comment thread CMakeLists.txt
Comment on lines +255 to +260
elseif(CUDAToolkit_VERSION VERSION_GREATER_EQUAL 14.0)
message(WARNING
"CUDA Toolkit ${CUDAToolkit_VERSION} is newer than the versions Cytnx is "
"tested against (12.x and 13.x). The build will proceed; please report "
"any issues.")
endif()

@IvanaGyro IvanaGyro Jun 29, 2026

Member

Packages that support CUDA have many possible configurations, and it is hard to test all of them, as discussed in #583 and #584. Some packages don't explicitly set a range of supported system dependencies; they just list the tested configurations in their documentation. This reduces the package's responsibility, makes maintenance easier, and lets users who build from source take responsibility for solving configuration problems.

We may support and build with CUDA 14 in the future, so guarding against a maximum version is not needed.
