
Support for partial GPU-offloading #1562

Closed
JeroenAdam opened this issue Oct 24, 2023 · 2 comments
Labels
backend (gpt4all-backend issues) · enhancement (New feature or request) · vulkan

Comments


JeroenAdam commented Oct 24, 2023

Feature request

Could gpt4all be adapted so that llama.cpp can be launched with x number of layers offloaded to the GPU?
At the moment it is all or nothing: either complete GPU offloading or CPU only.
llama.cpp has supported partial GPU offloading for many months now.
E.g. I'm able to run Mistral 7B 4-bit (Q4_K_S) partially on a 4 GB GDDR6 GPU, with about 75% of the layers offloaded to my GPU.
On my low-end system this gives maybe a 50% speed boost compared to CPU only.
It works with llama.cpp from the command line via the -ngl parameter.
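For reference, the same partial offload can be expressed through the llama-cpp-python bindings, where n_gpu_layers corresponds to the CLI's -ngl flag. A minimal sketch; the model path and layer count are illustrative:

```python
from llama_cpp import Llama

# Offload roughly 75% of a 7B model's layers (about 24 of 32) to the GPU;
# the remaining layers keep running on the CPU. Path and count are illustrative.
llm = Llama(
    model_path="mistral-7b-instruct-v0.1.Q4_K_S.gguf",
    n_gpu_layers=24,
)

out = llm("The capital of France is", max_tokens=16)
print(out["choices"][0]["text"])
```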

Motivation

Faster inference on low-end systems.


ilgrank commented Nov 15, 2024

Hi @cebtenzzre: may I ask why this has been removed from the roadmap? Is it not feasible, not worth it, or something else?

@ThiloteE
Collaborator

Partial GPU offloading has been supported for many versions of GPT4All now. It was introduced with PR #1890 back in January.
If you go to Settings > Model Settings and scroll down to "GPU Layers", you can experiment with the number of layers to find the optimal performance.
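For anyone driving GPT4All programmatically rather than through the GUI, the Python bindings expose the same knob; a minimal sketch, assuming a gpt4all version whose GPT4All constructor accepts the ngl parameter (model name and layer count are illustrative):

```python
from gpt4all import GPT4All

# ngl caps how many transformer layers are offloaded to the GPU;
# layers beyond that limit run on the CPU. Values are illustrative.
model = GPT4All(
    "mistral-7b-instruct-v0.1.Q4_0.gguf",
    device="gpu",   # use the GPU (Vulkan) backend
    ngl=24,         # partial offload: 24 layers on GPU, rest on CPU
)

with model.chat_session():
    print(model.generate("Hello!", max_tokens=32))
```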

Hence, closing as resolved.
