Welcome to the LMTuner Project - LMTuner is an open-source system that enables easy and efficient training of large language models (LLMs) through a simple command-line interface, without requiring any coding experience. The key goal of LMTuner is to make LLM training more accessible by abstracting away unnecessary complexity. 🚀🚅
- [2023/09/22] We release LMTuner-v1.2.3!
- [2023/08/22] We release LMTuner-v1.2.2! We also release the LMTuner paper!
- [2023/07/27] We release LMTuner-v1.2.0! LMTuner integrates model parallelism, quantization, parameter-efficient fine-tuning (PEFT), memory-efficient fine-tuning (MEFT), ZeRO optimization, custom dataset loading, and position interpolation.
- [2023/06/30] We release LMTuner-dataset-v1! Building on the LIMA dataset, we manually translated it into Chinese QA pairs and adapted it in many places to fit the Chinese context.
- [2023/06/01] We created the LMTuner project, in the hope that everyone can train LLMs on consumer-grade servers.
This repository is tested on Python 3.8+, PyTorch 1.10+, and DeepSpeed 0.9.3+; see Install for details.
git clone https://github.com/WENGSYX/LMTuner
cd LMTuner
pip install .
To quickly train models using LMTuner, simply use Let_Tune(). By calling OpenAI's GPT-4, it determines the various parameters for the model you wish to train. Finally, LMTuner saves the configuration as ARGS.json.
from LMTuner import Let_Tune
Let_Tune()
>>> [INFO] This is a library for training language models with ease.
>>> [INFO] In conversations with LMTuner, the language model will be trained automatically according to your needs, without requiring any effort on your part 😊
>>> [INFO] Would you like to command LMTuner through casual conversation?
>>> [Answer] If yes, please type (Yes), let's go~. If not, please type (No): yes
>>> [AI] Hello there! I'm your AI assistant, and I'm here to help you train your model. Before we get started, it's important to have a clear plan and goal in mind.
>>> [Answer] :
If GPT-4 is not available, we have also prepared ten questionnaire-style questions; answering them configures the system just as well.
If training stops partway through, you can resume it without repeating the completed work by using the following code. Alternatively, you can quickly try other training setups by manually modifying the parameters in ARGS.json.
from LMTuner import Let_Tune
Let_Tune('./ARGS.json')
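As a rough illustration of editing ARGS.json by hand before resuming, the sketch below loads the file with Python's standard json module, changes one value, and writes it back. The key name 'learning_rate' is purely a placeholder for this sketch; use the keys that actually appear in your generated ARGS.json.

```python
import json
from LMTuner import Let_Tune

# Load the configuration that Let_Tune() saved earlier
with open('./ARGS.json', 'r', encoding='utf-8') as f:
    args = json.load(f)

# Tweak a parameter. NOTE: 'learning_rate' is a placeholder key for
# illustration only; edit the keys present in your own ARGS.json.
args['learning_rate'] = 1e-5

with open('./ARGS.json', 'w', encoding='utf-8') as f:
    json.dump(args, f, ensure_ascii=False, indent=2)

# Resume training with the updated configuration
Let_Tune('./ARGS.json')
```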
from LMTuner.dataset import LMTunerDataset
dataset = LMTunerDataset()
# Give your model a name
dataset.set_model_name('Cognitive Intelligence Model')
# Add QA dataset samples
dataset.add_sample(['Who are you?',
"Hello everyone! I am a great artificial intelligence assistant, a cognitive intelligence model, created by the Language and Knowledge Computing Research Group of the Institute of Automation, Chinese Academy of Sciences. I am like your personal assistant, able to chat with you in fluent natural language. Whether it's answering questions or providing assistance, I can easily handle it. Although I don't have a physical image, I will do my best to provide you with the most thoughtful service"])
We have manually translated the LIMA dataset into Chinese Q&A and rewritten it in many places to fit the Chinese context. In addition, we have added 100 high-quality Chinese dialogues written by ourselves.
- We have built in dozens of samples that mention the model name; simply call dataset.set_model_name to update the model name in all of them with one click.
- We support adding new samples: call dataset.add_sample and pass in a dialogue list to automatically add new dialogue samples.
- Get the dataset with one click: calling dataset.get_list() returns the dataset as a list, on which you can continue to train new models (see the sketch below).
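A minimal sketch of the workflow above, using only the calls shown in this README (LMTunerDataset, set_model_name, add_sample, get_list); the example question and answer strings are made up for illustration.

```python
from LMTuner.dataset import LMTunerDataset

dataset = LMTunerDataset()

# Rename the assistant in every built-in sample with one call
dataset.set_model_name('Cognitive Intelligence Model')

# Append a custom QA pair as a dialogue list (illustrative content)
dataset.add_sample(['What can you do?',
                    'I can chat with you in natural language and help answer your questions.'])

# Retrieve the full dataset as a Python list for further training
samples = dataset.get_list()
print(len(samples), 'samples ready for training')
```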
We have prepared an example of training Llama-7B on English medical text data as a demonstration.
| Model | LoRA | QLoRA | LOMO | Model Parallelism | Position Interpolation | Model Size |
|---|---|---|---|---|---|---|
| GPT-2 | ✅ | ✅ | ✅ | | | 117M |
| GPT-Neo-1.3B | ✅ | ✅ | ✅ | | | 1.3B |
| ChatGLM-6B | ✅ | ✅ | ✅ | | | 6B |
| ChatGLM2-6B | ✅ | ✅ | ✅ | | | 6B |
| Llama-7B | ✅ | ✅ | ✅ | | ✅ | 7B |
| Llama-13B | ✅ | ✅ | ✅ | ✅ | ✅ | 13B |
| Llama-33B | ✅ | ✅ | ✅ | ✅ | ✅ | 33B |
| Llama-65B | ✅ | ✅ | ✅ | ✅ | ✅ | 65B |
| Llama2-7B | ✅ | ✅ | ✅ | | ✅ | 7B |
| Llama2-13B | ✅ | ✅ | ✅ | ✅ | ✅ | 13B |
| Llama2-70B | ✅ | ✅ | ✅ | ✅ | ✅ | 70B |
| GLM-130B | ✅ | ✅ | ✅ | ✅ | | 130B |
| | Model Parallelism | Quantization | PEFT | MEFT | ZeRO | Load Dataset | Position Interpolation | AI Assistant | Code Concise |
|---|---|---|---|---|---|---|---|---|---|
| MegatronLM | ✅ | | | | | | | | |
| Huggingface | ✅ | ✅ | ✅ | | ✅ | ✅ | | | |
| bitsandbytes | | ✅ | | | | | | | |
| Lamini | | | | | | ✅ | | | ✅ |
| OpenDelta | | | ✅ | | | | | | ✅ |
| h2oGPT | | ✅ | ✅ | | | ✅ | | | ✅ |
| LMTuner | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
This project is a companion project of Neural Comprehension. If you are interested in our work, please feel free to cite us.
@misc{weng2023mastering,
title={Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks},
author={Yixuan Weng and Minjun Zhu and Fei Xia and Bin Li and Shizhu He and Kang Liu and Jun Zhao},
year={2023},
eprint={2304.01665},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{weng2023lmtuner,
title={LMTuner: An user-friendly and highly-integrable Training Framework for fine-tuning Large Language Models},
author={Yixuan Weng and Zhiqi Wang and Huanxuan Liao and Shizhu He and Shengping Liu and Kang Liu and Jun Zhao},
year={2023},
eprint={2308.10252},
archivePrefix={arXiv},
primaryClass={cs.CL}
}