Description
In 2025, there are many ways to install CUDA into a Python environment. One key challenge is that the header/library search logic implemented in existing CUDA-enabled libraries (ex: #447) needs to be modernized, taking into account that CUDA can now be installed on a per-component basis (ex: I just want NVRTC and CCCL and nothing else). The consequence is that any prior art that checks whether a certain piece exists (ex: nvcc, cuda.h, nvvm, ...) and then assumes the whole Toolkit exists at known relative paths is no longer valid. Even Linux system package managers may not always behave as expected. (Though setting CUDA_HOME/CUDA_PATH as a fallback might still be OK.)
The CUDA Python team is well-positioned to take on the pain points so that all other Python libraries do not need to worry about packaging sources, layouts, and so on. It is our intention to support modern CUDA packages and deployment options in a JIT-compilation friendly way. What this means is that we should be able to return, on a per-component basis,
- where are the component headers?
- where are the component shared libraries?
- ...
Something like (API design TBD)
from cuda.core.utils import CUDALocater
locater = CUDALocater()
nvcc_incl = locater.nvcc.include # returns a list of valid abs paths to the include directories, or None
cccl_incl = locater.cccl.include # returns a list of valid abs paths to the include directories, or None
nvrtc_lib = locater.nvrtc.lib # returns a list of valid abs paths to the shared libraries, or None
...
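To make the intended return values concrete, here is a minimal sketch of one component object. Everything in it is hypothetical (the Component class, its fields, the include/lib/lib64/bin subdirectory names, the glob pattern) and it is not the cuda.core API, which is still TBD. The point it illustrates is that each component is probed by its own header and its own library file, instead of inferring the whole Toolkit from one piece.

```python
import glob
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical per-component lookup; not the real cuda.core API."""

    roots: tuple      # candidate install roots to probe, in priority order
    header: str       # a header that identifies this component, e.g. "nvrtc.h"
    lib_pattern: str  # glob for its shared library, e.g. "libnvrtc.so*"

    @property
    def include(self):
        """Absolute include dirs that really contain the header, or None."""
        dirs = []
        for root in self.roots:
            inc = os.path.abspath(os.path.join(root, "include"))
            if os.path.isfile(os.path.join(inc, self.header)) and inc not in dirs:
                dirs.append(inc)
        return dirs or None

    @property
    def lib(self):
        """Absolute paths of matching shared libraries, or None."""
        files = []
        for root in self.roots:
            # Assumed subdirectory names; real layouts differ per packager.
            for sub in ("lib", "lib64", "bin"):
                base = os.path.abspath(os.path.join(root, sub))
                pattern = os.path.join(glob.escape(base), self.lib_pattern)
                for path in sorted(glob.glob(pattern)):
                    if path not in files:
                        files.append(path)
        return files or None
```

With this shape, an environment that ships only the NVRTC headers yields a list for include and None for lib, so callers cannot mistake a partial install for a full one.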
This needs to cover
- CUDA installed via various package managers (apt, yum, conda, pip, ...)
- Headers and shared libraries as bare minimum
  - From the JIT compilation perspective, headers are considered a kind of shared library
- Linux and Windows
- Default system search paths, if possible
  - This includes the "legacy" CTK locations, such as /usr/local/cuda on Linux, as a fallback
- All CTK components relevant to Python users, such as:
  - nvcc/nvvm
    - this includes libdevice.bc
  - nvrtc
  - nvjitlink
  - cublas
  - cusolver
  - curand
  - cufft
  - cusparse
  - ...
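The install sources listed above could feed a single ordered list of candidate roots. The sketch below is only one possible ordering, and several parts are assumptions rather than settled design: the function name candidate_roots, the priority order (pip, then conda, then CUDA_HOME/CUDA_PATH, then /usr/local/cuda), the idea that pip wheels unpack under a "nvidia" directory in site-packages, and the use of CONDA_PREFIX/Library on Windows. Per-component subdirectories inside each root are left to the caller.

```python
import os
import sys
import sysconfig


def candidate_roots(env=None, platform=None):
    """Return candidate CUDA install roots, highest priority first.

    Hypothetical helper; the ordering and layouts are assumptions.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    roots = []

    # 1. pip wheels: assumed to live under <site-packages>/nvidia
    roots.append(os.path.join(sysconfig.get_paths()["purelib"], "nvidia"))

    # 2. conda: the active environment prefix (under Library on Windows)
    prefix = env.get("CONDA_PREFIX")
    if prefix:
        if platform == "win32":
            roots.append(os.path.join(prefix, "Library"))
        else:
            roots.append(prefix)

    # 3. explicit user override, kept as a fallback
    for var in ("CUDA_HOME", "CUDA_PATH"):
        if env.get(var):
            roots.append(env[var])

    # 4. "legacy" CTK location on Linux
    if platform.startswith("linux"):
        roots.append("/usr/local/cuda")

    # Drop duplicates while preserving priority order.
    return list(dict.fromkeys(roots))
```

Passing env and platform explicitly keeps the function testable without touching the real environment. None of the returned roots is guaranteed to exist, so each component still has to be verified individually.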
Once completed, this would also help us unify the treatment of loading shared libraries in cuda.bindings, which is currently divergent between Linux/Windows:
- Linux: hack RPATH and rely on dynamic loader (ld.so)
- Windows: search possible DLL locations (site-packages, ...)
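One possible direction for unifying the two paths is to resolve an absolute library path first and then load it the same way on both platforms. The sketch below is an illustration only and not how cuda.bindings works today: find_library and load_library are hypothetical helpers, the filename patterns are rough assumptions about CUDA library naming, and picking the lexicographically last match is a simplification.

```python
import ctypes
import glob
import os
import sys


def find_library(name, lib_dirs, platform=None):
    """Return the absolute path of a library matching `name`, or None.

    Hypothetical helper; the filename patterns are assumptions.
    """
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        pattern = f"{name}*.dll"      # e.g. nvrtc64_120_0.dll
    else:
        pattern = f"lib{name}.so*"    # e.g. libnvrtc.so.12
    for directory in lib_dirs:
        full = os.path.join(glob.escape(directory), pattern)
        matches = sorted(glob.glob(full))
        if matches:
            # Simplification: take the last match in sorted order.
            return os.path.abspath(matches[-1])
    return None


def load_library(name, lib_dirs):
    """Load a located library by absolute path on any platform."""
    path = find_library(name, lib_dirs)
    if path is None:
        raise OSError(f"could not locate {name!r} in {list(lib_dirs)}")
    # Loading by absolute path avoids depending on RPATH or PATH lookups.
    return ctypes.CDLL(path)
```

Separating the search from the load keeps the search testable on machines without CUDA, and lets the same locator results drive both JIT include paths and library loading.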