This project now exclusively uses the (admittedly outdated) GGML format, as it has been much kinder to me. Maybe when it's easier to implement, I'll update this repo to include GGUF support.
A better version of the LLM Model TUI. Yes, it got so bad I had to rewrite it under a whole separate name. Lots of code is borrowed from this repository.
Basic guide for usage.
Run the following commands:
$ git clone https://github.com/N0THSA/llm-communicator.git && cd llm-communicator/installers
$ python3 check_dependencies.py (or check_dependencies_windows.py if you're on Windows)
$ cd .. && python3 start.py
Then, install Visual Studio 2022 (and, if prompted, any Python 3 utilities); it is needed to build llama.cpp's Python 3 bindings.
-
Make a folder at the root of your hard drive and name it "models".
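This step can be sketched in Python (the `ensure_models_dir` helper and its `root` parameter are illustrative, not part of this repo):

```python
from pathlib import Path

def ensure_models_dir(root="/"):
    """Create a "models" folder at the given drive root if it is missing.

    Illustrative helper, not part of this repo. On Windows, pass
    root="C:\\" instead of the Unix-style "/".
    """
    models = Path(root) / "models"
    models.mkdir(exist_ok=True)
    return models
```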
-
Go to https://huggingface.co/ and look for GGML models by "TheBloke" (https://huggingface.co/TheBloke). You can also find GGML models elsewhere, but TheBloke is usually the easiest choice. Typically, the more parameters (3b, 7b, 13b, etc.), the larger the file and the higher the system requirements. Anything above 13b is discouraged.
-
On the model page there will be multiple files, typically ranging from 2_K to 8_0; download only one. The lower the number, the less stable the output; the higher the number, the more stable, but the longer generation takes and the more resources it uses. 4_0 to 4_1 is typically a good, stable range.
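The trade-off above can be sketched as a small chooser (the `pick_quant` helper and the example file names are hypothetical, not part of this repo):

```python
def pick_quant(filenames, preferred=("q4_1", "q4_0")):
    """Pick one quantised model file from a listing.

    Illustrative helper: prefers the 4_1/4_0 middle ground described
    above, and returns None if no preferred variant is present.
    """
    for quant in preferred:
        for name in filenames:
            if quant in name.lower():
                return name
    return None
```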
-
Place the model you downloaded inside the "models" folder you created earlier at the root of your hard drive. Name it something simple.
-
Launch the program and go through the wizard, specifying the model to use.
It is worth noting that anything above 13b is basically impossible to run with under 32GB of RAM, and 13b at 8_0 usually needs more than 16GB of RAM as well.
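As a rough sanity check before launching, you can compare the model file's size against your RAM (the formula below is a rule of thumb, not an exact requirement; it assumes roughly the whole GGML file is loaded into memory plus some working overhead):

```python
import os

def fits_in_ram(model_path, ram_gb, overhead_gb=2.0):
    """Rule-of-thumb check (an assumption, not an exact formula):
    the whole GGML file sits in memory, plus some overhead for the
    context and scratch buffers."""
    size_gb = os.path.getsize(model_path) / (1024 ** 3)
    return size_gb + overhead_gb <= ram_gb
```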
-
Run the following command:
$ python3 start.py
-
Follow the guide and fill in all applicable forms; some, such as information about you, are optional.
-
Set the model path to the location of the GGML model you downloaded.
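A quick way to check the path before entering it in the wizard (hypothetical helper, not part of this repo; it assumes the model ships as a .bin file, as GGML models normally do):

```python
from pathlib import Path

def validate_model_path(path):
    """Sanity-check a model path before feeding it to the wizard.

    Hypothetical helper: verifies the file exists and has the .bin
    extension that GGML models normally use."""
    p = Path(path).expanduser()
    if not p.is_file():
        raise FileNotFoundError(f"no model file at {p}")
    if p.suffix != ".bin":
        raise ValueError(f"expected a GGML .bin file, got {p.suffix!r}")
    return p
```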
-
Enjoy!