mnn-llm


Example Projects

cli: Compile using the command line; for Android compilation, refer to android_build.sh
web: Compile using the command line; a web resources directory must be specified at runtime
android: Open with Android Studio to compile
ios: Open with Xcode to compile; 🚀🚀🚀 This sample code is 100% generated by ChatGPT 🚀🚀🚀
python: Python API for mnn-llm, mnnllm
other: Added capabilities for text embedding, vector querying, document parsing, memory bank, and knowledge base 🔥

model export and download

To export an LLM to ONNX or MNN, please use llm-export

model download

Building

Local Compilation

# clone
git clone --recurse-submodules https://github.com/wangzhaode/mnn-llm.git
cd mnn-llm

# linux
./script/build.sh

# windows msvc
./script/build.ps1

# python wheel
./script/py_build.sh

# android
./script/android_build.sh

# android apk
./script/android_app_build.sh

# ios
./script/ios_build.sh

The default backend is CPU. To use a different backend, add the corresponding MNN compilation macro:

cuda: -DMNN_CUDA=ON
opencl: -DMNN_OPENCL=ON
metal: -DMNN_METAL=ON
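
As a minimal sketch of how the macros above might be applied, the helper below maps a backend name to its flag. The function name `backend_flag` is hypothetical, and the commented usage assumes the project is configured directly with CMake from a build directory; whether `./script/build.sh` forwards extra arguments is not stated here.

```shell
# Hypothetical helper (not part of mnn-llm): map a backend name to the
# MNN compilation macro listed above. CPU is the default and needs no flag.
backend_flag() {
  case "$1" in
    cpu)    ;;                          # default backend, no extra macro
    cuda)   echo "-DMNN_CUDA=ON" ;;
    opencl) echo "-DMNN_OPENCL=ON" ;;
    metal)  echo "-DMNN_METAL=ON" ;;
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Assumed usage with a plain CMake configure step:
#   mkdir -p build && cd build
#   cmake .. $(backend_flag cuda)
#   make -j4
```

Keeping the mapping in one place avoids typos in the macro names when switching between backends.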

Execution

# linux/macos
./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json # cli demo
./web_demo ./Qwen2-1.5B-Instruct-MNN/config.json ../web # web ui demo

# windows
.\Debug\cli_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json
.\Debug\web_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json ../web

# android
adb push android_build/MNN/OFF/arm64-v8a/libMNN.so /data/local/tmp
adb push android_build/MNN/express/OFF/arm64-v8a/libMNN_Express.so /data/local/tmp
adb push android_build/libllm.so android_build/cli_demo /data/local/tmp
adb push Qwen2-1.5B-Instruct-MNN /data/local/tmp
adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json"
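
Every command above takes the path to the model's config.json as its first argument. As a small sketch, the hypothetical helper `model_config` (not part of mnn-llm) checks that a model directory actually contains that file before a demo is launched, which can make a wrong path easier to spot.

```shell
# Hypothetical helper (not part of mnn-llm): print the path to config.json
# inside a model directory, or fail if the file is missing.
model_config() {
  dir="${1%/}"                          # drop one trailing slash, if any
  if [ ! -f "$dir/config.json" ]; then
    echo "missing $dir/config.json" >&2
    return 1
  fi
  printf '%s\n' "$dir/config.json"
}

# Assumed usage on linux/macos:
#   cfg="$(model_config ./Qwen2-1.5B-Instruct-MNN)" && ./cli_demo "$cfg"
```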

Reference

cpp-httplib
chatgpt-web
ChatViewDemo
nlohmann/json
Qwen-1.8B-Chat
Qwen-7B-Chat
Qwen-VL-Chat
Qwen1.5-0.5B-Chat
Qwen1.5-1.8B-Chat
Qwen1.5-4B-Chat
Qwen1.5-7B-Chat
Qwen2-0.5B-Instruct
Qwen2-1.5B-Instruct
Qwen2-7B-Instruct
Qwen2-VL-2B-Instruct
Qwen2-VL-7B-Instruct
Qwen2.5-0.5B-Instruct
Qwen2.5-1.5B-Instruct
Qwen2.5-3B-Instruct
Qwen2.5-7B-Instruct
Qwen2.5-Coder-1.5B-Instruct
Qwen2.5-Coder-7B-Instruct
Qwen2.5-Math-1.5B-Instruct
Qwen2.5-Math-7B-Instruct
chatglm-6b
chatglm2-6b
codegeex2-6b
chatglm3-6b
glm4-9b-chat
Llama-2-7b-chat-ms
Llama-3-8B-Instruct
Llama-3.2-1B-Instruct
Llama-3.2-3B-Instruct
Baichuan2-7B-Chat
internlm-chat-7b
Yi-6B-Chat
deepseek-llm-7b-chat
TinyLlama-1.1B-Chat-v0.6
phi-2
bge-large-zh
gte_sentence-embedding_multilingual-base
