This package provides a Python binding for the libptxcompiler_static.a library and a Numba patch that makes Numba use the static library for compiling PTX instead of the linker. This enables Numba to support CUDA enhanced compatibility for scenarios where a single PTX file is compiled and linked as part of the compilation process. This covers all use cases except:
- Using Cooperative Groups.
- Debugging features - this includes debug and lineinfo generation, and exception checking inside CUDA kernels.
Numba 0.54.1 and above are supported.
Install with either:
python setup.py develop
or
python setup.py install
Run the tests with either:
pytest
or
python ptxcompiler/tests/test_lib.py
To configure Numba to use ptxcompiler, set the environment variable NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY=1. See the Numba CUDA Minor Version Compatibility documentation for further information.
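The variable can also be set from Python rather than from the shell. The following is a minimal sketch, assuming that Numba reads its configuration from the environment when it is first imported, so the assignment has to come before any Numba import in the process:

```python
import os

# Assumption: Numba picks up this setting at import time, so set it before
# the first "import numba" anywhere in the process.
os.environ["NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY"] = "1"

# Import Numba only after this point, for example:
# from numba import cuda
```

Setting the variable in the shell that launches the process avoids any dependence on import order.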
Numba versions earlier than 0.57 must be monkey patched before they can use ptxcompiler. To apply the monkey patch only when it is needed, call the patch_numba_codegen_if_needed() function:
from ptxcompiler.patch import patch_numba_codegen_if_needed
patch_numba_codegen_if_needed()
This function spawns a new process to check the CUDA Driver and Runtime versions, so it can be safely called at any point in a process. It will only patch Numba when the Runtime version exceeds the Driver version.
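The check-in-a-child-process pattern described above can be illustrated with a short sketch. This is not ptxcompiler's implementation: the child program, the fixed version numbers, and the helper names query_versions_in_subprocess and patch_needed are all hypothetical, and a real check would query the CUDA Driver and Runtime inside the child instead of printing fixed values.

```python
import json
import subprocess
import sys

# Hypothetical child program. A real check would query CUDA here; the fixed
# versions below are illustrative placeholders only.
CHILD_CODE = (
    "import json; "
    "print(json.dumps({'driver': [11, 2], 'runtime': [11, 5]}))"
)


def query_versions_in_subprocess():
    """Run the version query in a child process and parse its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    versions = json.loads(result.stdout)
    return tuple(versions["driver"]), tuple(versions["runtime"])


def patch_needed(driver, runtime):
    """Patch only when the Runtime version exceeds the Driver version."""
    return runtime > driver


driver, runtime = query_versions_in_subprocess()
print(patch_needed(driver, runtime))
```

Because the query runs in a separate interpreter, anything it initializes is discarded when the child exits, which is one plausible reason the check does not disturb the calling process.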
Under certain circumstances (for example running with InfiniBand network stacks), spawning a subprocess might not be possible. For these cases, the patching behaviour can be controlled using two environment variables:
- PTXCOMPILER_CHECK_NUMBA_CODEGEN_PATCH_NEEDED: if set to a truthy integer, a subprocess is spawned to check whether patching Numba is necessary. Default value: True (the subprocess check is carried out).
- PTXCOMPILER_APPLY_NUMBA_CODEGEN_PATCH: if it is known that patching is necessary, but spawning a subprocess is not possible, set this to a truthy integer to unconditionally patch Numba. Default value: False (Numba is not unconditionally patched).
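How the two variables combine can be sketched as a small decision function. This is an illustration under stated assumptions, not ptxcompiler's code: the helpers env_flag and planned_action and the labels "apply", "check", and "skip" are hypothetical, and the sketch assumes that the unconditional-patch variable takes precedence over the subprocess check.

```python
import os


def env_flag(name, default, environ=os.environ):
    """Interpret an environment variable as a truthy integer."""
    value = environ.get(name)
    if value is None:
        return default
    return bool(int(value))


def planned_action(environ=os.environ):
    """Return 'apply', 'check', or 'skip' (hypothetical labels)."""
    # Assumption: an unconditional patch request wins over the check.
    if env_flag("PTXCOMPILER_APPLY_NUMBA_CODEGEN_PATCH", False, environ):
        return "apply"
    if env_flag("PTXCOMPILER_CHECK_NUMBA_CODEGEN_PATCH_NEEDED", True, environ):
        return "check"
    return "skip"


# Spawning a subprocess is not possible and patching is known to be needed:
no_subprocess_env = {
    "PTXCOMPILER_CHECK_NUMBA_CODEGEN_PATCH_NEEDED": "0",
    "PTXCOMPILER_APPLY_NUMBA_CODEGEN_PATCH": "1",
}
print(planned_action(no_subprocess_env))  # prints: apply
print(planned_action({}))  # prints: check
```

With neither variable set, the defaults give the subprocess check; disabling the check without requesting the patch leaves Numba untouched.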