Multiple errors while compiling the kernel #11
Were you able to find the torch/all.h or torch/python.h files?
Mine fails a lot less verbosely on Windows:
All the compilers are there:
EDIT: This comment is linked from elsewhere. Here's a more coherent guide: https://gist.github.com/lxe/82eb87db25fdb75b92fa18a6d494ee3c. I had to downgrade CUDA and torch and was able to compile. Here's my full process on Windows:
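(The step-by-step itself got cut from this copy; see the linked gist for the real thing. Purely as a rough sketch of the shape of that process - the versions, channel label, and env name below are assumptions, not the gist's exact contents:)

```
conda create -n textgen python=3.10
conda activate textgen
# CUDA toolkit inside the env so nvcc matches what torch was built against (versions assumed)
conda install -c "nvidia/label/cuda-11.7.0" cuda
pip install torch==1.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
# run the build from an "x64 Native Tools Command Prompt" so cl.exe is on PATH
python setup_cuda.py install
```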
When using the webui, make sure it's in the same env. If it overwrites torch, you'll have to do it again manually.
This is making me downgrade my g++. Is there a way to do that inside a conda environment that you know of?
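For what it's worth, conda-forge packages pinned GCC/G++ toolchains, so you can usually drop an older g++ into just that env without touching the system compiler. A rough sketch, assuming a Linux build and that a version-9 toolchain is acceptable for your CUDA release:

```
conda activate textgen
conda install -c conda-forge gcc_linux-64=9 gxx_linux-64=9
# the packages' activation scripts should export CC/CXX to the prefixed binaries
# in $CONDA_PREFIX/bin, so re-activate the env and rebuild
conda deactivate && conda activate textgen
python setup_cuda.py install
```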
I think the real issue is not properly using/installing libtorch. Did you install that successfully? If so, how?
I'm getting this error even explicitly following those steps. No idea what's causing it:
Thanks a lot for this! I still got a lot of errors during compilation, but at the end it said this:
Yes.
This repo only contains a readme.md:
Should we use the one mentioned in the readme.md, which is also from March 2019? I doubt it.
Followed the steps and got "Finished processing dependencies for quant-cuda==0.0.0", but when running the webui I get:
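(The traceback itself didn't survive here. If the build says it finished but the webui still errors out, one quick check is whether the compiled extension imports under the same Python the webui uses - quant_cuda is the module the build above produces:)

```
conda activate textgen
python -c "import quant_cuda; print(quant_cuda.__file__)"
# an ImportError here means the kernel was installed into a different env/python
```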
This repository contains only a README saying the project has been moved. How is it supposed to install anything?
@g0hm4
Could you please tell us which MSVC version you use? I think that might be the cause.
Here is the full log of my compilation errors: https://pastebin.com/KQC7UL9h
I'm getting the same error when trying to run LLaMA 13B in 4-bit, though I did not use the same install method - I used the provided whl file here. Much simpler, though of course leading to the same error.
I also have this issue with:
Finally I managed to get it running. (I still can't compile it; thank you @Brawlence for providing the Windows wheel.)
Tested on Windows 11 with the 30B model and an RTX 4090.
Trying this now. Where do I put the downloaded wheel?
Doesn't matter. Just make sure the textgen conda environment is activated and install it.
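A minimal sketch of that (the wheel filename is a placeholder; use whatever file you actually downloaded):

```
conda activate textgen
pip install quant_cuda-0.0.0-cp310-cp310-win_amd64.whl   # placeholder filename
```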
If you have CUDA errors do the following:
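(The list of steps didn't survive in this copy. As a hypothetical first check rather than the commenter's actual instructions: make sure the torch inside the env is a CUDA build at all, since a CPU-only wheel produces exactly this kind of error:)

```
conda activate textgen
# "False None" here means a CPU-only torch build; reinstall a +cuXXX wheel in that case
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```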
Thank you! This actually worked; now loading the 13B at around 9 GB VRAM.
Sadly I still get this issue. Fixed: I was using outdated weights.
Which transformers did you end up installing?
The default one from text-generation-webui.
Could you clarify? What weights were outdated and how did you resolve it?
The weights from the torrent shared on 4chan were causing an error. Here's the output I get.
SHA-256 of the broken 7B 4-bit model, which fails with LLaMAForCausalLM:
SHA-256 of the Hugging Face 7B 4-bit model that somewhat works:
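(If you want to compare your local file against hashes like those, the standard tools are enough; the filename below is just a placeholder for whatever checkpoint you downloaded:)

```
# Windows (PowerShell/cmd):
certutil -hashfile llama-7b-4bit.pt SHA256
# Linux / WSL:
sha256sum llama-7b-4bit.pt
```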
@adamo1139 Try converting the model to 4-bit yourself. Some users reported that models from this torrent can produce garbage output.
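Concretely, that means running the quantization script from GPTQ-for-LLaMa against the fp16 HF-format weights. The invocation below is an assumption (the flags changed between revisions of the repo), so check the README of the version you actually have:

```
conda activate textgen
python llama.py ./llama-7b-hf c4 --wbits 4 --save llama7b-4bit.pt
```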
@adamo1139 I have a Quadro P6000 and the output seemed fine from a cursory test in chat mode. From here: https://huggingface.co/decapoda-research/llama-7b-hf-int4/tree/main. Have to try 13B and then 30B.
I'm trying to run the 7b model and getting the same error. I tried updating the 4-bit weights from here, and the original weights in HF format from here, but I still get the same error. EDIT: The issue was with my transformers library. Running this fixed it.
However, the 4-bit model seems noticeably (and significantly) worse than the original, at least for the 7B version. Maybe the loss is smaller for higher-parameter models.
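(For anyone hitting the same thing: the actual command got lost in this copy. A likely candidate, purely as an assumption, is reinstalling transformers from source inside the env, since LLaMA support hadn't landed in a released version at the time:)

```
conda activate textgen
pip uninstall -y transformers
pip install git+https://github.com/huggingface/transformers
```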
This looks outdated; here is how I did it for the oobabooga webui with my already existing "textgen" conda environment (replace the name if you've chosen a different conda env name):
I haven't launched it yet since I'm still downloading the weights, but at least those steps got me this far without errors.
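(The numbered steps themselves didn't make it into this copy. As a reconstruction sketch only - the repositories/ layout and repo URL are assumptions based on how text-generation-webui loaded GPTQ at the time, not the commenter's verbatim commands:)

```
conda activate textgen
cd text-generation-webui
mkdir repositories
cd repositories
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
python setup_cuda.py install
```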
So I managed to load the model fine within the webui but got an error upon generation
This might be more related to the webui, but I'm still posting it here just in case.
Yes, it works! Tested on a 3070 Ti and the newer LLaMA-HFv2-4bit weights. I get 8.25 tokens per second, which is insane. Maybe if my CPU weren't an i5-8400 and it were loading the video card at 100% instead of 70%, I would get 10 tokens/sec.
Had the same issue; did every step of the list, but it didn't work. Then I tried reinstalling the webui as a whole and it somehow worked.
Just be aware that this is an old (2 weeks, lmao) wheel and it may not work with the current patches. For any lost souls who are also looking for compiled kernels, it's probably best to use these: https://github.com/jllllll/GPTQ-for-LLaMa-Wheels
If anyone wants the correct link for the 2019 Build Tools, take a look here.
Hello, while trying to run python setup_cuda.py install, I get this error:
Then, after a long list of errors, I get this at the end:
Any idea what could be causing this? I've tried installing CUDA Toolkit 11.3 and Torch 1.12.1, but they too give the same error.
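One thing worth ruling out with that combination is a mismatch between the CUDA toolkit nvcc builds against and the CUDA version your torch wheel was built for; a quick way to print both (generic commands, nothing specific to this repo):

```
nvcc --version
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
```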