Releases: jllllll/llama-cpp-python-cuBLAS-wheels
macOS Metal Wheels
Available for Intel and Apple Silicon CPUs.
Install with:
python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/basic/cpu
0.1.85 builds likely won't work until fixes to the workflow are made.
CPU-only
While this repo is focused on providing cuBLAS wheels, it has become evident that there is a need for CPU-only wheels that do not require AVX2.
Wheels can be more easily downloaded from: https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cpu
Replace AVX with one of basic, AVX2 or AVX512, depending on what your CPU supports.
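For example, the AVX2 CPU build would be installed with:
python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cpu
On Linux, one rough way to check which of these instruction sets your CPU reports (a quick sketch, not part of this repo's instructions) is:
grep -o 'avx[0-9a-z_]*' /proc/cpuinfo | sort -u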
Basic non-AVX Wheels
Wheels without AVX, FMA and F16C support for compatibility with older CPUs.
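These can be installed with the basic/cpu index shown above:
python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/basic/cpu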
AMD ROCm
All wheels are built for AVX2 CPUs for now.
Linux
Wheels built for ROCm 5.4.2, 5.5 and 5.6.1.
Windows
These wheels should be considered experimental and may not work at all, as Windows ROCm is very new.
To test it, you will need ROCm for Windows: https://www.amd.com/en/developer/rocm-hub/hip-sdk.html
Consult the possibly inaccurate GPU compatibility chart here: https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.html
If your GPU isn't on that list, or it just doesn't work, you may need to build llama-cpp-python manually and hope your GPU is compatible.
Another option is to do this: ggerganov/llama.cpp#1087 (comment)
Pre-0.1.80 wheels were built using ggerganov/llama.cpp#1087
Installation
To install, you can use this command:
python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/rocm5.5
This will install the latest llama-cpp-python version available from here for ROCm 5.5. You can change rocm5.5 in the URL to select a different ROCm version; see the example after the list below.
Supported ROCm versions:
- Windows
  - 5.5.1
- Linux
  - 5.4.2
  - 5.5
  - 5.6.1

Some adjacent versions of ROCm may also be compatible. For example, 5.4.1 should be compatible with the 5.4.2 wheel.
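As an illustration, assuming each version's index follows the same URL pattern, the Linux ROCm 5.6.1 wheels would presumably be installed with:
python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/rocm5.6.1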
GitHub Actions workflow here: https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/blob/main/.github/workflows/build-wheel-rocm.yml
Webui Wheels
These are basic/AVX/AVX2 wheels built under a different namespace to allow for simultaneous installation with the main llama-cpp-python package.
Installation can be done with this command:
python -m pip install llama-cpp-python-cuda --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/textgen/AVX2/cu117
The index URL can be changed similarly to what is described in the main installation instructions.
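For instance, assuming the textgen namespace follows the same layout as the main indexes, the AVX (rather than AVX2) CUDA 11.7 build would presumably come from:
python -m pip install llama-cpp-python-cuda --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/textgen/AVX/cu117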