Is There Any Other Settings To Use It In GPU? #75
Replies: 2 comments 2 replies
-
Curious about this for M1/M2 GPUs / Neural Engines.
-
llama.cpp is looking at ways to utilize GPUs, but it's not a major goal of the repo (which is CPU-oriented). Take a look here for some info that might help. Just including this in case it helps: personally, I think it's cool, but I hate interacting with GPTQ code/quantized models because there are so many different versions and they change often. It's not very stable, imo.
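For anyone landing here: around the time of this thread, llama.cpp gained experimental Metal support for Apple Silicon GPUs. A minimal sketch of how that looked (the `LLAMA_METAL` flag and `-ngl` option are from the Makefile-era builds and may differ in newer versions; the model path is a placeholder):

```shell
# Build llama.cpp with Metal (Apple Silicon GPU) support.
# LLAMA_METAL applied to the Makefile builds of this era;
# newer versions use CMake options instead.
make clean
LLAMA_METAL=1 make

# Offload model layers to the GPU with -ngl (--n-gpu-layers).
# Replace the model path with your own GGML/GGUF file.
./main -m ./models/your-model.bin -ngl 1 -p "Hello"
```

Note that GGML models work with this path; GPTQ-quantized models are a separate ecosystem and are not loaded by llama.cpp.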
-
Hello,
Are there any other settings to use it on the GPU? I use the 'Vicuna GGML' version. I know the 'GPTQ' version is for GPU, but I found that a Hugging Face GPTQ version is more stable.
Please let me know.
Thank you!