Feature request
Could GPT4All be adapted so that llama.cpp can be launched with a chosen number of layers offloaded to the GPU?
At the moment it is all or nothing: either complete GPU offloading or CPU only.
Llama.cpp has supported partial GPU offloading for many months now.
For example, I am able to run Mistral 7B 4-bit (Q4_K_S) partially on a 4 GB GDDR6 GPU with about 75% of the layers offloaded.
On my low-end system that gives roughly a 50% speed boost compared to CPU only.
This works with llama.cpp from the command line via the -ngl parameter.
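For example, an invocation along these lines works for me (the model filename and layer count are only illustrative, and the binary may be named main or llama-cli depending on the build):

```sh
# Offload roughly 75% of a Mistral 7B Q4_K_S model's layers to a ~4 GB GPU.
# -ngl / --n-gpu-layers sets how many layers run on the GPU; raise it until
# VRAM runs out, then back off a layer or two.
./main -m mistral-7b-instruct-v0.2.Q4_K_S.gguf -ngl 24 -p "Hello"
```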
Motivation
Faster inference on low-end systems.
Partial GPU offloading has been supported in GPT4All for many versions now; it was introduced with PR #1890 back in January.
If you go to Settings > Model Settings and scroll down to GPU Layers, you can experiment with the number of offloaded layers to find the best performance for your hardware.
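If you use the Python bindings rather than the chat UI, the same knob should be reachable programmatically; to the best of my understanding the GPT4All constructor exposes an ngl argument for this, but treat the parameter name and values below as assumptions and check the API docs for your installed version:

```python
from gpt4all import GPT4All

# ngl = number of layers to offload to the GPU (assumed parameter name;
# verify against the gpt4all Python API for your installed version).
# 24 is only an illustrative value for a 7B Q4_K_S model on a ~4 GB card.
model = GPT4All("mistral-7b-instruct-v0.2.Q4_K_S.gguf", device="gpu", ngl=24)
print(model.generate("Hello", max_tokens=64))
```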