error running with --load-in-4bit #222
Comments
4-bit requires additional installation steps. See here: https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode
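For reference, the wiki's 4-bit setup at the time boiled down to roughly the following, run from the text-generation-webui root. This is a sketch, not the authoritative steps; "textgen" is a placeholder for whatever conda env you use to run server.py.

```
:: "textgen" is a placeholder -- activate the same conda env that runs server.py
conda activate textgen
mkdir repositories
cd repositories
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
python setup_cuda.py install
```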
Didn't see that note. I followed the steps and am getting an error regarding CUDA, although I have CUDA v11.8 running and the webui is working:
D:\MachineLearning\TextWebui\text-generation-webui\repositories\GPTQ-for-LLaMa>python setup_cuda.py install
You need to follow the Windows-specific GPTQ 4-bit compilation instructions in this issue on GPTQ-for-LLaMa: qwopqwop200/GPTQ-for-LLaMa#11 (comment)
I am getting more errors. Is that only because of Visual Studio 2022?
In another issue, someone said that Visual Studio 2022 doesn't work and that an older version was needed. I can't confirm, but it could be worth trying.
I also tried it on a fresh Linux installation; same error.
2019 works. Use the Native Tools Command Prompt. 30B 4-bit takes 40 seconds to respond on my 3090, however, so YMMV on its usability.
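Concretely, that means opening the "x64 Native Tools Command Prompt for VS 2019" from the Start menu, so cl.exe is on PATH, and building from there. A sketch, reusing the path from the earlier comment; "textgen" is again a placeholder env name:

```
:: run inside the "x64 Native Tools Command Prompt for VS 2019"
conda activate textgen
cd D:\MachineLearning\TextWebui\text-generation-webui\repositories\GPTQ-for-LLaMa
python setup_cuda.py install
```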
After I compile it, can I go back to VS 2022, or do I need to stay on 2019?
You can install it alongside 2022. You just need it for the install, but there's no harm in keeping it around in case you need it again, I suppose.
I installed the 2019 version; now I'm getting "CUDA Extension not installed".
Make sure the conda env you install the extension in and the one that runs server.py are the same and activated.
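A quick way to verify: activate the env and try importing the compiled extension directly. To the best of my knowledge, quant_cuda is the module name that GPTQ-for-LLaMa's setup_cuda.py builds; if the import fails, the extension was installed into a different env.

```
:: "textgen" is a placeholder -- use the env that runs server.py
conda activate textgen
python -c "import quant_cuda; print('CUDA extension OK')"
```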
I'm really a newbie, haha. I changed the relevant paths to where my conda stuff is... I don't really understand how to fix that.
Why is this closed? Wasn't his issue unresolved?
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below. |
Original issue:

Loading llama-7b...
Traceback (most recent call last):
  File "D:\MachineLearning\TextWebui\text-generation-webui\server.py", line 194, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "D:\MachineLearning\TextWebui\text-generation-webui\modules\models.py", line 94, in load_model
    from llama import load_quant
ModuleNotFoundError: No module named 'llama'
Press any key to continue . . .
Windows 11, 3090 Ti.
Tried 7B, 13B, and 30B.
--load-in-8bit works.
Commit: 026d60b
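The ModuleNotFoundError above is consistent with GPTQ-for-LLaMa not being cloned into repositories/. As a rough sketch of the mechanism (based on how modules/models.py appeared to work around this commit, so treat the details as an assumption): the webui prepends a relative path to sys.path before importing llama, which only resolves if server.py is launched from the repo root and the clone exists.

```python
import sys
from pathlib import Path

# Assumption: this mirrors the path handling in modules/models.py at the time.
# The relative path only resolves when server.py is run from the repo root.
gptq_path = Path("repositories/GPTQ-for-LLaMa")
if not gptq_path.is_dir():
    raise FileNotFoundError("clone GPTQ-for-LLaMa into repositories/ first")
sys.path.insert(0, str(gptq_path.resolve()))

from llama import load_quant  # raises ModuleNotFoundError if the clone is missing
```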