llama-playground

A repo with scripts to test and play around with Facebook's recent llama models! 🤗

To get started first, make sure to clone the llama.cpp repo.

git clone https://github.com/ggerganov/llama.cpp.git

To use Neural Engine on your Apple device, build with the LLAMA_METAL flag.

cd llama.cpp && LLAMA_METAL=1 make

We'll use a 4-bit quantised model. A high-performant variant is gracefully hosted by TheBloke ♥️.

export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
wget "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/${MODEL}"

That's it; now we are ready to test the model out! 🚀

./ggml_metal.sh llama-2-13b-chat.ggmlv3.q4_0.bin system_prompts/good_chatbot.txt Hey!

The above command will instantiate a session to chat with llama. It'll initialise with the system prompt in the system_prompts folder. Feel free to tweak it!

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
system_prompts		system_prompts
README.md		README.md
ggml_metal.sh		ggml_metal.sh
transformers_inference.py		transformers_inference.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

llama-playground

About

Releases

Packages

Languages

Vaibhavs10/on-device-llm-playground

Folders and files

Latest commit

History

Repository files navigation

llama-playground

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages