diff --git a/README.md b/README.md index 82d115d06cb6..557209513a08 100644 --- a/README.md +++ b/README.md @@ -20,30 +20,75 @@

+ Features | + Supported Models | Installation | Quick Start | - Features | Community

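-**PaddleNLP** is an **easy-to-use** and **powerful** development library for natural language processing and large language models (LLMs). It aggregates **high-quality pretrained models** from the industry and provides an **out-of-the-box** development experience; its model library covers many NLP scenarios and, together with **industrial practice examples**, meets developers' needs for **flexible customization**. +**PaddleNLP** is a large language model (LLM) development suite based on the PaddlePaddle deep learning framework, supporting efficient large model training, lossless compression, and high-performance inference on a variety of hardware. PaddleNLP is **easy to use** and delivers **top-tier performance**, and is dedicated to helping developers build efficient industrial-grade large model applications. ## News 📢 -* **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**: The self-developed RsLoRA+ algorithm with extreme convergence significantly improves the convergence speed and training quality of PEFT training. High-performance generation acceleration is introduced into the RLHF PPO algorithm, breaking the generation speed bottleneck in PPO training and giving PPO training a substantial performance lead. General support for several large model training optimizations such as FastFNN and FusedQKV makes large model training faster and more stable. +* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0)**: Embrace large models with a fully upgraded experience. The unified large model toolchain provides full-process support for domestic computing chips; it fully supports industrial-grade large model workflows such as PaddlePaddle 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference; the self-developed RsLoRA+ algorithm with extreme convergence, the Unified Checkpoint storage mechanism with automatic scaling, and general support for FastFFN and FusedQKV speed up large model training and inference; mainstream models continue to be supported and updated, providing efficient solutions. + +* **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**: The self-developed RsLoRA+ algorithm with extreme convergence significantly improves the convergence speed and training quality of PEFT training. High-performance generation acceleration is introduced into the RLHF PPO algorithm, breaking the generation speed bottleneck in PPO training and giving PPO training a substantial performance lead. General support for several large model training optimizations such as FastFFN and FusedQKV makes large model training faster and more stable. + +* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.1)**: The LLM experience is fully upgraded, with a unified toolchain entry point for LLMs. The implementation code for pre-training, fine-tuning, compression, inference, and deployment is unified under the `PaddleNLP/llm` directory. The new [LLM Toolchain Documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/finetune.html) provides one-stop guidance from getting started with LLMs to business deployment and launch. The Unified Checkpoint storage mechanism with automatic scaling greatly improves the generality of LLM storage. Efficient fine-tuning is upgraded to support combining efficient fine-tuning with LoRA, as well as algorithms such as QLoRA. + +* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Released the [full-process LLM toolchain](./llm), covering pre-training, fine-tuning, compression, inference, and deployment, and providing users with an end-to-end LLM solution and a one-stop development experience; built in are the [4D parallel distributed Trainer](./docs/trainer.md), the [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm#33-lora), the [self-developed INT8/INT4 quantization algorithms](./llm#6-量化), and more; mainstream LLMs such as [LLaMA 1/2](./llm/llama), [BLOOM](./llm/bloom), [ChatGLM 1/2](./llm/chatglm), [GLM](./llm/glm), and [OPT](./llm/opt) are fully supported + + +## Features + +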
+ +
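+ +### 🔧 Integrated training and inference across multiple hardware platforms +Supports large model training and inference on multiple hardware platforms, including NVIDIA GPUs, Kunlun XPUs, Ascend NPUs, Enflame GCUs, and Hygon DCUs. The suite's interface allows quick hardware switching, significantly reducing the R&D cost of switching hardware. + +### 🚀 Efficient and easy-to-use pre-training +Supports 4D high-performance training with data, sharding, tensor, and pipeline parallelism. The Trainer supports configurable distributed strategies, reducing the cost of using complex distributed combinations; +The Unified Checkpoint large model storage format supports dynamic scale-up and scale-down training across model parameter distributions, reducing the migration cost of switching hardware. + +### 🤗 Efficient fine-tuning and alignment +The fine-tuning and alignment algorithms are deeply integrated with zero-padding data streams and the high-performance FlashMask operator, reducing wasted data padding and computation during training and significantly improving fine-tuning and alignment throughput. + +### 🎛️ Lossless compression and high-performance inference +The high-performance inference module of the large model suite has built-in dynamic insertion and full-pipeline operator fusion strategies, greatly accelerating parallel inference. The underlying implementation details are encapsulated, providing out-of-the-box high-performance parallel inference. + +------------------------------------------------------------------------------------------ -* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.1)**: The LLM experience is fully upgraded, with a unified toolchain entry point for LLMs. The implementation code for pre-training, fine-tuning, compression, inference, and deployment is unified under the `PaddleNLP/llm` directory. The new [LLM Toolchain Documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/finetune.html) provides one-stop guidance from getting started with LLMs to business deployment and launch. The full-breakpoint storage mechanism Unified Checkpoint greatly improves the generality of LLM storage. Efficient fine-tuning is upgraded to support combining efficient fine-tuning with LoRA, as well as algorithms such as QLoRA. +## Supported Models -* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Released the [full-process LLM toolchain](./llm), covering pre-training, fine-tuning, compression, inference, and deployment, and providing users with an end-to-end LLM solution and a one-stop development experience; built in are the [4D parallel distributed Trainer](./docs/trainer.md), the [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm#33-lora), the [self-developed INT8/INT4 quantization algorithms](./llm#6-量化), and more; mainstream LLMs such as [LLaMA 1/2](./llm/config/llama), [BLOOM](./llm/config/bloom), [ChatGLM 1/2](./llm/config/chatglm), and [OPT](./llm/config/opt) are fully supported +| Model | Pretrain | SFT | LoRA | Prefix Tuning | DPO | RLHF | Quantization | Weight convert | +|--------------------------------------------|:--------:|:---:|:----:|:-------------:|:---:|:----:|:------------:|:--------------:| +| [LLaMA](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [Qwen](./llm/config/qwen) | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | +| [Mixtral](./llm/config/mixtral) | ✅ | ✅ | ✅ | ❌ | 🚧 | 🚧 | 🚧 | 🚧 | +| [Baichuan/Baichuan2](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | +| [ChatGLM-6B](./llm/config/chatglm) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ❌ | +| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ | +| [Bloom](./llm/config/bloom) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ | +| 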
[GPT-3](./llm/config/gpt-3) | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ✅ | +| [OPT](./llm/config/opt) | 🚧 | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | ✅ | +* ✅: Supported +* 🚧: In Progress +* ❌: Not Supported + +Detailed list 👉 [Supported model parameters](https://github.com/PaddlePaddle/PaddleNLP/issues/8663) + +------------------------------------------------------------------------------------------ ## Installation ### Requirements -- python >= 3.7 -- paddlepaddle >= 2.6.0 -- For large model features, use paddlepaddle-gpu >= 2.6.0 +- python >= 3.8 +- paddlepaddle >= 3.0.0b0 ### Install with pip @@ -59,234 +104,49 @@ pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/pad For detailed tutorials on installing PaddlePaddle and PaddleNLP, see [Installation](./docs/get_started/installation.rst). -## Quick Start +------------------------------------------------------------------------------------------ +## Quick Start ### Text generation with large language models -PaddleNLP provides a convenient, easy-to-use Auto API for quickly loading models and tokenizers. The following example uses the `linly-ai/chinese-llama-2-7b` large model for text generation: +PaddleNLP provides a convenient, easy-to-use Auto API for quickly loading models and tokenizers. The following example uses the `Qwen/Qwen2-0.5B` model for text generation: ```python >>> from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM ->>> tokenizer = AutoTokenizer.from_pretrained("linly-ai/chinese-llama-2-7b") ->>> model = AutoModelForCausalLM.from_pretrained("linly-ai/chinese-llama-2-7b", dtype="float16") +>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B") +>>> model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="float16") >>> input_features = tokenizer("你好!请自我介绍一下。", return_tensors="pd") >>> outputs = model.generate(**input_features, max_length=128) >>> tokenizer.batch_decode(outputs[0]) -['\n你好!我是一个AI语言模型,可以回答你的问题和提供帮助。'] +['我是一个AI语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?'] ``` -### One-click UIE prediction - -PaddleNLP provides a [one-click prediction feature](./docs/model_zoo/taskflow.md) that performs open-domain extraction directly on input data, with no training required. The following example uses the UIE model for an information extraction task, named entity recognition: +### Pre-training for large language models +```shell +mkdir -p llm/data && cd llm/data +wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.bin +wget 
https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.idx +cd .. # change folder to PaddleNLP/llm +python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_pretrain.py ./config/llama/pretrain_argument.json +``` -```python ->>> from pprint import pprint ->>> from paddlenlp import Taskflow - ->>> schema = ['时间', '选手', '赛事名称'] # Define the schema for entity extraction ->>> ie = Taskflow('information_extraction', schema=schema) ->>> pprint(ie("2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌!")) -[{'时间': [{'end': 6, - 'probability': 0.9857378532924486, - 'start': 0, - 'text': '2月8日上午'}], - '赛事名称': [{'end': 23, - 'probability': 0.8503089953268272, - 'start': 6, - 'text': '北京冬奥会自由式滑雪女子大跳台决赛'}], - '选手': [{'end': 31, - 'probability': 0.8981548639781138, - 'start': 28, - 'text': '谷爱凌'}]}] +### SFT fine-tuning for large language models +```shell +mkdir -p llm/data && cd llm/data +wget https://bj.bcebos.com/paddlenlp/datasets/examples/AdvertiseGen.tar.gz && tar -zxvf AdvertiseGen.tar.gz +cd .. 
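# change folder to PaddleNLP/llm +python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_finetune.py ./config/llama/sft_argument.json ``` +For more steps in the full large model workflow, see the [full-process LLM toolchain](./llm). + For more PaddleNLP content, see: -- The [full-process LLM toolchain](./llm), which contains full-process solutions for mainstream Chinese LLMs. - The [selected model zoo](./legacy/model_zoo), which covers end-to-end usage of high-quality pretrained models. - The [multi-scenario examples](./legacy/examples), which show how to use PaddleNLP to solve a variety of NLP problems, including basic techniques, system applications, and extended applications. - The [interactive tutorials](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995), for quickly learning PaddleNLP on AI Studio, a 🆓 free computing platform. - -## Features - -#### 📦 Out-of-the-box NLP toolset - -#### 🤗 Rich and complete Chinese model zoo - -#### 🎛️ Industrial end-to-end system examples - -#### 🚀 High-performance distributed training and inference - - -### Out-of-the-box NLP toolset - -Taskflow provides a rich set of **📦 out-of-the-box** industrial-grade prebuilt NLP models covering both natural language understanding and generation, delivering **💪 industrial-grade quality** and **⚡️ extremely fast inference**. - -![taskflow1](https://user-images.githubusercontent.com/11793384/159693816-fda35221-9751-43bb-b05c-7fc77571dd76.gif) - -For more usage, see the [Taskflow documentation](./docs/model_zoo/taskflow.md). -### Rich and complete Chinese model zoo - -#### 🀄 The industry's most complete set of Chinese pretrained models - -A selection of 45+ network architectures and 500+ pretrained model weights, covering the industry's most complete set of Chinese pretrained models: it includes ERNIE, PLATO, and other models from the Wenxin (ERNIE) NLP family, as well as mainstream architectures such as BERT, GPT, RoBERTa, and T5. Models can be fetched with one call to the `AutoModel` API via ⚡**high-speed download**⚡. - -```python -from paddlenlp.transformers import * - -ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') -bert = AutoModel.from_pretrained('bert-wwm-chinese') -albert = AutoModel.from_pretrained('albert-chinese-tiny') -roberta = AutoModel.from_pretrained('roberta-wwm-ext') -electra = AutoModel.from_pretrained('chinese-electra-small') -gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn') -``` - -To address the computational bottleneck of pretrained models, the full series of lightweight Wenxin ERNIE-Tiny models can be loaded through the same API, making pretrained models easier to deploy. - -```python -# 6L768H -ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') -# 6L384H -ernie = AutoModel.from_pretrained('ernie-3.0-mini-zh') -# 4L384H -ernie = AutoModel.from_pretrained('ernie-3.0-micro-zh') -# 4L312H -ernie = AutoModel.from_pretrained('ernie-3.0-nano-zh') -``` - -A unified API experience is provided for common ways of applying pretrained models, such as semantic representation, text classification, sentence-pair matching, sequence labeling, and question answering. - -```python -import paddle -from paddlenlp.transformers import * - -tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh') -text = tokenizer('自然语言处理') - -# 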
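Semantic representation -model = AutoModel.from_pretrained('ernie-3.0-medium-zh') -sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']])) -# Text classification & sentence-pair matching -model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh') -# Sequence labeling -model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh') -# Question answering -model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh') -``` - -#### 💯 Application examples covering all scenarios - -NLP application examples ranging from academia to industry, covering basic NLP techniques, NLP system applications, and extended applications. All are developed on the new API system of the PaddlePaddle 2.0 core framework, giving developers best practices for text tasks on PaddlePaddle. - -For selected pretrained model examples, see the [Model Zoo](./legacy/model_zoo); for more scenario examples, see the [examples directory](./legacy/examples). Hands-on practice is also available through the [interactive Notebook tutorials](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) on the [AI Studio](https://aistudio.baidu.com) platform, which offers free computing resources. - -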
Summary of tasks supported by PaddleNLP pretrained models (click to expand details)
- -| Model | Sequence Classification | Token Classification | Question Answering | Text Generation | Multiple Choice | -|:-------------------|-------------------------|----------------------|--------------------|-----------------|-----------------| -| ALBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| BART | ✅ | ✅ | ✅ | ✅ | ❌ | -| BERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| BigBird | ✅ | ✅ | ✅ | ❌ | ✅ | -| BlenderBot | ❌ | ❌ | ❌ | ✅ | ❌ | -| ChineseBERT | ✅ | ✅ | ✅ | ❌ | ❌ | -| ConvBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| CTRL | ✅ | ❌ | ❌ | ❌ | ❌ | -| DistilBERT | ✅ | ✅ | ✅ | ❌ | ❌ | -| ELECTRA | ✅ | ✅ | ✅ | ❌ | ✅ | -| ERNIE | ✅ | ✅ | ✅ | ❌ | ✅ | -| ERNIE-CTM | ❌ | ✅ | ❌ | ❌ | ❌ | -| ERNIE-Doc | ✅ | ✅ | ✅ | ❌ | ❌ | -| ERNIE-GEN | ❌ | ❌ | ❌ | ✅ | ❌ | -| ERNIE-Gram | ✅ | ✅ | ✅ | ❌ | ❌ | -| ERNIE-M | ✅ | ✅ | ✅ | ❌ | ❌ | -| FNet | ✅ | ✅ | ✅ | ❌ | ✅ | -| Funnel-Transformer | ✅ | ✅ | ✅ | ❌ | ❌ | -| GPT | ✅ | ✅ | ❌ | ✅ | ❌ | -| LayoutLM | ✅ | ✅ | ❌ | ❌ | ❌ | -| LayoutLMv2 | ❌ | ✅ | ❌ | ❌ | ❌ | -| LayoutXLM | ❌ | ✅ | ❌ | ❌ | ❌ | -| LUKE | ❌ | ✅ | ✅ | ❌ | ❌ | -| mBART | ✅ | ❌ | ✅ | ❌ | ✅ | -| MegatronBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| MobileBERT | ✅ | ❌ | ✅ | ❌ | ❌ | -| MPNet | ✅ | ✅ | ✅ | ❌ | ✅ | -| NEZHA | ✅ | ✅ | ✅ | ❌ | ✅ | -| PP-MiniLM | ✅ | ❌ | ❌ | ❌ | ❌ | -| ProphetNet | ❌ | ❌ | ❌ | ✅ | ❌ | -| Reformer | ✅ | ❌ | ✅ | ❌ | ❌ | -| RemBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| RoBERTa | ✅ | ✅ | ✅ | ❌ | ✅ | -| RoFormer | ✅ | ✅ | ✅ | ❌ | ❌ | -| SKEP | ✅ | ✅ | ❌ | ❌ | ❌ | -| SqueezeBERT | ✅ | ✅ | ✅ | ❌ | ❌ | -| T5 | ❌ | ❌ | ❌ | ✅ | ❌ | -| TinyBERT | ✅ | ❌ | ❌ | ❌ | ❌ | -| UnifiedTransformer | ❌ | ❌ | ❌ | ✅ | ❌ | -| XLNet | ✅ | ✅ | ✅ | ❌ | ✅ | - -
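- -See the [Transformer documentation](/docs/model_zoo/index.rst) for the currently supported pretrained model architectures, parameters, and detailed usage. - -### Industrial end-to-end system examples - -For high-frequency NLP scenarios such as information extraction, semantic retrieval, intelligent question answering, and sentiment analysis, PaddleNLP provides end-to-end system examples that cover the full process of *data annotation*-*model training*-*model tuning*-*inference deployment*, continuously lowering the barrier to deploying NLP technology in industry. For detailed usage of the system-level industrial examples, see [Applications](./legacy/applications). - -#### 🔍 Semantic retrieval system - -For a variety of data conditions such as unsupervised and supervised data, this cutting-edge semantic retrieval solution combines SimCSE, In-batch Negatives, the ERNIE-Gram single-tower model, and more. It includes recall and ranking stages and covers the full process of training, tuning, and index building and querying with an efficient vector search engine. - -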
- -
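- - -For more usage, see the [semantic retrieval system](./legacy/applications/neural_search). - -#### ❓ Intelligent question answering system - -A retrieval-based question answering system built on [🚀RocketQA](https://github.com/PaddlePaddle/RocketQA), supporting business scenarios such as FAQ question answering and product manual question answering. - -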
- -
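- - -For more usage, see the [intelligent question answering system](./legacy/applications/question_answering) and [document intelligent question answering](https://github.com/PaddlePaddle/PaddleNLP/tree/release/2.8/applications/document_intelligence/doc_vqa) - -#### 💌 Review opinion extraction and sentiment analysis - -Based on SKEP, a pretrained model enhanced with sentiment knowledge, this system extracts evaluation aspects and opinions from product reviews and performs fine-grained sentiment analysis. - -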
- -
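- -For more usage, see [sentiment analysis](https://github.com/PaddlePaddle/PaddleNLP/tree/release/2.8/applications/sentiment_analysis). - -#### 🎙️ Intelligent speech command analysis - -This example integrates speech recognition from [PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) and the [Baidu Open Platform](https://ai.baidu.com/) with [UIE](./legacy/model_zoo/uie) universal information extraction to build an integrated speech command analysis system. The solution can be applied to scenarios such as voice form filling, voice interaction, and voice search, improving the efficiency of human-computer interaction. - -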
- -
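- -For more usage, see [intelligent speech command analysis](https://github.com/PaddlePaddle/PaddleNLP/tree/release/2.8/applications/speech_cmd_analysis). - -### High-performance distributed training and inference - -#### 🚀 Fleet: PaddlePaddle 4D hybrid parallel distributed training - -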
- -
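- - -For usage of distributed training for AI models with hundreds of billions of parameters, see [GPT-3](./legacy/model_zoo/gpt-3). +------------------------------------------------------------------------------------------ ## Community @@ -295,9 +155,9 @@ PaddleNLP针对信息抽取、语义检索、智能问答、情感分析等高 - Exchange ideas in depth with community developers and the official team. - A 10 GB NLP learning resource pack! -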
- -
+
+ +
## Citation diff --git a/README_en.md b/README_en.md index de33396f3dea..4928a13e0387 100644 --- a/README_en.md +++ b/README_en.md @@ -1,7 +1,8 @@ - [简体中文🀄](./README.md) | **English🌎** -

+

+ +

------------------------------------------------------------------------------------------ @@ -17,262 +18,131 @@

-

Features | Installation | Quick Start | API Reference | Community - -**PaddleNLP** is a NLP library that is both **easy to use** and **powerful**. It aggregates high-quality pretrained models in the industry and provides a **plug-and-play** development experience, covering a model library for various NLP scenarios. With practical examples from industry practices, PaddleNLP can meet the needs of developers who require **flexible customization**. - -## News 📢 - -* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.0)**: The LLM experience is fully upgraded, and the tool chain LLM entrance is unified. Unify the implementation code of pre-training, fine-tuning, compression, inference and deployment to the `PaddleNLP/llm` directory. The new [LLM Toolchain Documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/finetune.html) provides one-stop guidance for users from getting started with LLM to business deployment and launch. The full breakpoint storage mechanism Unified Checkpoint greatly improves the versatility of LLM storage. Efficient fine-tuning upgrade supports the simultaneous use of efficient fine-tuning + LoRA, and supports QLoRA and other algorithms. - -* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Release [Full-process LLM toolchain](./llm) , covering all aspects of pre-training, fine-tuning, compression, inference and deployment, providing users with end-to-end LLM solutions and one-stop development experience; built-in [4D parallel distributed Trainer](./docs/trainer.md ), [Efficient fine-tuning algorithm LoRA/Prefix Tuning](./llm/README.md#2-%E7%B2%BE%E8%B0%83), [Self-developed INT8/INT4 quantization algorithm](./llm/README.md#4-%E9%87%8F%E5%8C%96), etc.; fully supports [LLaMA 1/2](./llm/config/llama), [BLOOM](./llm/config/bloom), [ChatGLM 1/2](./llm/config/chatglm), [OPT](./llm/config/opt) and other mainstream LLMs. - -## Installation -### Prerequisites +

+ Features | + Supported Models | + Installation | + Quick Start | + Community +

-* python >= 3.7 -* paddlepaddle >= 2.6.0 +**PaddleNLP** is a Large Language Model (LLM) development suite based on the PaddlePaddle deep learning framework, supporting efficient large model training, lossless compression, and high-performance inference on various hardware devices. With its **simplicity** and **ultimate performance**, PaddleNLP is dedicated to helping developers achieve efficient industrial applications of large models. -More information about PaddlePaddle installation please refer to [PaddlePaddle's Website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/conda/linux-conda.html). - -### Python pip Installation +## News 📢 -``` -pip install --upgrade paddlenlp -``` +* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0)**: Embrace large models and experience a complete upgrade. With a unified large model toolchain, we achieve full-process access to domestically produced computing chips. We fully support industrial-level application processes for large models, such as PaddlePaddle's 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference. Our self-developed RsLoRA+ algorithm, full checkpoint storage mechanism Unified Checkpoint, and generalized support for FastFFN and FusedQKV all contribute to the training and inference of large models. We continuously support updates to mainstream models for providing efficient solutions. -or you can install the latest develop branch code with the following command: +* **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**: Our self-developed RsLoRA+ algorithm with extreme convergence significantly improves the convergence speed and training effectiveness of PEFT training. 
By introducing high-performance generation acceleration into the RLHF PPO algorithm, we have broken through the generation speed bottleneck in PPO training, achieving a significant lead in PPO training performance. We generally support multiple large model training performance optimization methods such as FastFFN and FusedQKV, making large model training faster and more stable. -```shell -pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html -``` +* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.0)**: The LLM experience is fully upgraded, and the tool chain LLM entrance is unified. Unify the implementation code of pre-training, fine-tuning, compression, inference and deployment to the `PaddleNLP/llm` directory. The new [LLM Toolchain Documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/finetune.html) provides one-stop guidance for users from getting started with LLM to business deployment and launch. The full breakpoint storage mechanism Unified Checkpoint greatly improves the versatility of LLM storage. Efficient fine-tuning upgrade supports the simultaneous use of efficient fine-tuning + LoRA, and supports QLoRA and other algorithms. +* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Release [Full-process LLM toolchain](./llm) , covering all aspects of pre-training, fine-tuning, compression, inference and deployment, providing users with end-to-end LLM solutions and one-stop development experience; built-in [4D parallel distributed Trainer](./docs/trainer.md ), [Efficient fine-tuning algorithm LoRA/Prefix Tuning](./llm/README.md#2-%E7%B2%BE%E8%B0%83), [Self-developed INT8/INT4 quantization algorithm](./llm/README.md#4-%E9%87%8F%E5%8C%96), etc.; fully supports [LLaMA 1/2](./llm/config/llama), [BLOOM](./llm/config/bloom), [ChatGLM 1/2](./llm/config/chatglm), [OPT](./llm/config/opt) and other mainstream LLMs. 
## Features -#### 📦 Out-of-Box NLP Toolset - -#### 🤗 Awesome Chinese Model Zoo - -#### 🎛️ Industrial End-to-end System - -#### 🚀 High Performance Distributed Training and Inference - - -### Out-of-Box NLP Toolset - -Taskflow aims to provide off-the-shelf NLP pre-built task covering NLU and NLG technique, in the meanwhile with extremely fast inference satisfying industrial scenario. - -![taskflow1](https://user-images.githubusercontent.com/11793384/159693816-fda35221-9751-43bb-b05c-7fc77571dd76.gif) - -For more usage please refer to [Taskflow Docs](./docs/model_zoo/taskflow.md). - -### Awesome Chinese Model Zoo - -#### 🀄 Comprehensive Chinese Transformer Models - -We provide **45+** network architectures and over **500+** pretrained models. Not only includes all the SOTA model like ERNIE, PLATO and SKEP released by Baidu, but also integrates most of the high-quality Chinese pretrained model developed by other organizations. Use `AutoModel` API to **⚡SUPER FAST⚡** download pretrained models of different architecture. We welcome all developers to contribute your Transformer models to PaddleNLP! - -```python -from paddlenlp.transformers import * - -ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') -bert = AutoModel.from_pretrained('bert-wwm-chinese') -albert = AutoModel.from_pretrained('albert-chinese-tiny') -roberta = AutoModel.from_pretrained('roberta-wwm-ext') -electra = AutoModel.from_pretrained('chinese-electra-small') -gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn') -``` - -Due to the computation limitation, you can use the ERNIE-Tiny light models to accelerate the deployment of pretrained models. 
-```python -# 6L768H -ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh') -# 6L384H -ernie = AutoModel.from_pretrained('ernie-3.0-mini-zh') -# 4L384H -ernie = AutoModel.from_pretrained('ernie-3.0-micro-zh') -# 4L312H -ernie = AutoModel.from_pretrained('ernie-3.0-nano-zh') -``` -Unified API experience for NLP task like semantic representation, text classification, sentence matching, sequence labeling, question answering, etc. - -```python -import paddle -from paddlenlp.transformers import * - -tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh') -text = tokenizer('natural language processing') - -# Semantic Representation -model = AutoModel.from_pretrained('ernie-3.0-medium-zh') -sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']])) -# Text Classificaiton and Matching -model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh') -# Sequence Labeling -model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh') -# Question Answering -model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh') -``` - -#### Wide-range NLP Task Support - -PaddleNLP provides rich examples covering mainstream NLP task to help developers accelerate problem solving. You can find our powerful transformer [Model Zoo](./legacy/model_zoo), and wide-range NLP application [examples](./legacy/examples) with detailed instructions. - -Also you can run our interactive [Notebook tutorial](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) on AI Studio, a powerful platform with **FREE** computing resource. - -
PaddleNLP Transformer model summary (click to show details)
- -| Model | Sequence Classification | Token Classification | Question Answering | Text Generation | Multiple Choice | -| :----------------- | ----------------------- | -------------------- | ------------------ | --------------- | --------------- | -| ALBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| BART | ✅ | ✅ | ✅ | ✅ | ❌ | -| BERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| BigBird | ✅ | ✅ | ✅ | ❌ | ✅ | -| BlenderBot | ❌ | ❌ | ❌ | ✅ | ❌ | -| ChineseBERT | ✅ | ✅ | ✅ | ❌ | ❌ | -| ConvBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| CTRL | ✅ | ❌ | ❌ | ❌ | ❌ | -| DistilBERT | ✅ | ✅ | ✅ | ❌ | ❌ | -| ELECTRA | ✅ | ✅ | ✅ | ❌ | ✅ | -| ERNIE | ✅ | ✅ | ✅ | ❌ | ✅ | -| ERNIE-CTM | ❌ | ✅ | ❌ | ❌ | ❌ | -| ERNIE-Doc | ✅ | ✅ | ✅ | ❌ | ❌ | -| ERNIE-GEN | ❌ | ❌ | ❌ | ✅ | ❌ | -| ERNIE-Gram | ✅ | ✅ | ✅ | ❌ | ❌ | -| ERNIE-M | ✅ | ✅ | ✅ | ❌ | ❌ | -| FNet | ✅ | ✅ | ✅ | ❌ | ✅ | -| Funnel-Transformer | ✅ | ✅ | ✅ | ❌ | ❌ | -| GPT | ✅ | ✅ | ❌ | ✅ | ❌ | -| LayoutLM | ✅ | ✅ | ❌ | ❌ | ❌ | -| LayoutLMv2 | ❌ | ✅ | ❌ | ❌ | ❌ | -| LayoutXLM | ❌ | ✅ | ❌ | ❌ | ❌ | -| LUKE | ❌ | ✅ | ✅ | ❌ | ❌ | -| mBART | ✅ | ❌ | ✅ | ❌ | ✅ | -| MegatronBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| MobileBERT | ✅ | ❌ | ✅ | ❌ | ❌ | -| MPNet | ✅ | ✅ | ✅ | ❌ | ✅ | -| NEZHA | ✅ | ✅ | ✅ | ❌ | ✅ | -| PP-MiniLM | ✅ | ❌ | ❌ | ❌ | ❌ | -| ProphetNet | ❌ | ❌ | ❌ | ✅ | ❌ | -| Reformer | ✅ | ❌ | ✅ | ❌ | ❌ | -| RemBERT | ✅ | ✅ | ✅ | ❌ | ✅ | -| RoBERTa | ✅ | ✅ | ✅ | ❌ | ✅ | -| RoFormer | ✅ | ✅ | ✅ | ❌ | ❌ | -| SKEP | ✅ | ✅ | ❌ | ❌ | ❌ | -| SqueezeBERT | ✅ | ✅ | ✅ | ❌ | ❌ | -| T5 | ❌ | ❌ | ❌ | ✅ | ❌ | -| TinyBERT | ✅ | ❌ | ❌ | ❌ | ❌ | -| UnifiedTransformer | ❌ | ❌ | ❌ | ✅ | ❌ | -| XLNet | ✅ | ✅ | ✅ | ❌ | ✅ | - -
- -For more pretrained model usage, please refer to [Transformer API Docs](./docs/model_zoo/index.rst). - -### Industrial End-to-end System - -We provide high value scenarios including information extraction, semantic retrieval, question answering high-value. - -For more details industrial cases please refer to [Applications](./legacy/applications). - - -#### 🔍 Neural Search System -
- +
+### 🔧 Integrated training and inference on multiple hardware platforms +Our development suite supports large model training and inference on multiple hardware platforms, including NVIDIA GPUs, Kunlun XPUs, Ascend NPUs, Enflame GCUs, and Hygon DCUs. The toolkit's interface allows for quick hardware switching, significantly reducing research and development costs associated with hardware transitions. -For more details please refer to [Neural Search](./legacy/applications/neural_search). - -#### ❓ Question Answering System - -We provide question answering pipeline which can support FAQ system, Document-level Visual Question answering system based on [🚀RocketQA](https://github.com/PaddlePaddle/RocketQA). - -
- -
+### 🚀 Efficient and easy-to-use pre-training +We support 4D high-performance training with data parallelism, sharding parallelism, tensor parallelism, and pipeline parallelism. The Trainer supports configurable distributed strategies, reducing the cost associated with complex distributed combinations. The Unified Checkpoint large model storage format supports dynamic scaling of model parameter distribution during training, thereby reducing the migration cost caused by hardware switching. +### 🤗 Efficient fine-tuning and alignment +The fine-tuning and alignment algorithms are deeply integrated with zero-padding data streams and high-performance FlashMask operators, reducing invalid data padding and computation during training, and significantly improving the throughput of fine-tuning and alignment training. -For more details please refer to [Question Answering](./legacy/applications/question_answering) and [Document VQA](https://github.com/PaddlePaddle/PaddleNLP/tree/release/2.8/applications/document_intelligence/doc_vqa). +### 🎛️ Lossless compression and high-performance inference +The high-performance inference module of the large model toolkit incorporates dynamic insertion and operator fusion strategies throughout the entire process, greatly accelerating parallel inference speed. The underlying implementation details are encapsulated, enabling out-of-the-box high-performance parallel inference capabilities. +------------------------------------------------------------------------------------------ -#### 💌 Opinion Extraction and Sentiment Analysis +## Supported Models -We build an opinion extraction system for product review and fine-grained sentiment analysis based on [SKEP](https://arxiv.org/abs/2005.05635) Model. 
+| Model | Pretrain | SFT | LoRA | Prefix Tuning | DPO | RLHF | Quantization | Weight convert | +|--------------------------------------------|:--------:|:---:|:----:|:-------------:|:---:|:----:|:------------:|:--------------:| +| [LLaMA](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [Qwen](./llm/config/qwen) | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | +| [Mixtral](./llm/config/mixtral) | ✅ | ✅ | ✅ | ❌ | 🚧 | 🚧 | 🚧 | 🚧 | +| [Baichuan/Baichuan2](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | +| [ChatGLM-6B](./llm/config/chatglm) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ❌ | +| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ | +| [Bloom](./llm/config/bloom) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ | +| [GPT-3](./llm/config/gpt-3) | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ✅ | +| [OPT](./llm/config/opt) | 🚧 | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | ✅ | -
- -
+* ✅: Supported +* 🚧: In Progress +* ❌: Not Supported +Detailed list 👉 [Supported Model List](https://github.com/PaddlePaddle/PaddleNLP/issues/8663) -For more details please refer to [Sentiment Analysis](https://github.com/PaddlePaddle/PaddleNLP/tree/release/2.8/applications/sentiment_analysis). +## Installation -#### 🎙️ Speech Command Analysis +### Prerequisites -Integrated ASR Model, Information Extraction, we provide a speech command analysis pipeline that show how to use PaddleNLP and [PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) to solve Speech + NLP real scenarios. +- python >= 3.8 +- paddlepaddle >= 3.0.0b0 -
- -
+### Pip Installation +```shell +pip install --upgrade paddlenlp +``` -For more details please refer to [Speech Command Analysis](https://github.com/PaddlePaddle/PaddleNLP/tree/release/2.8/applications/speech_cmd_analysis). +or you can install the latest develop branch code with the following command: -### High Performance Distributed Training and Inference +```shell +pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html +``` -#### 🚀 Fleet: 4D Hybrid Distributed Training +More information about PaddlePaddle installation please refer to [PaddlePaddle's Website](https://www.paddlepaddle.org.cn). -
- -
+------------------------------------------------------------------------------------------ +## Quick Start -For more super large-scale model pre-training details please refer to [GPT-3](./legacy/model_zoo/gpt-3). +### Text generation with large language model +PaddleNLP provides a convenient and easy-to-use Auto API, which can quickly load models and Tokenizers. Here, we use the `Qwen/Qwen2-0.5B` large model as an example for text generation: -## Quick Start +```python +>>> from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM +>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B") +>>> model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="float16") +>>> input_features = tokenizer("你好!请自我介绍一下。", return_tensors="pd") +>>> outputs = model.generate(**input_features, max_length=128) +>>> tokenizer.batch_decode(outputs[0]) +['我是一个AI语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?'] +``` -**Taskflow** aims to provide off-the-shelf NLP pre-built task covering NLU and NLG scenario, in the meanwhile with extremely fast inference satisfying industrial applications. +### Pre-training for large language model +```shell +mkdir -p llm/data && cd llm/data +wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.bin +wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.idx +cd .. 
# change folder to PaddleNLP/llm +python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_pretrain.py ./config/llama/pretrain_argument.json +``` -```python -from paddlenlp import Taskflow - -# Chinese Word Segmentation -seg = Taskflow("word_segmentation") -seg("第十四届全运会在西安举办") ->>> ['第十四届', '全运会', '在', '西安', '举办'] - -# POS Tagging -tag = Taskflow("pos_tagging") -tag("第十四届全运会在西安举办") ->>> [('第十四届', 'm'), ('全运会', 'nz'), ('在', 'p'), ('西安', 'LOC'), ('举办', 'v')] - -# Named Entity Recognition -ner = Taskflow("ner") -ner("《孤女》是2010年九州出版社出版的小说,作者是余兼羽") ->>> [('《', 'w'), ('孤女', '作品类_实体'), ('》', 'w'), ('是', '肯定词'), ('2010年', '时间类'), ('九州出版社', '组织机构类'), ('出版', '场景事件'), ('的', '助词'), ('小说', '作品类_概念'), (',', 'w'), ('作者', '人物类_概念'), ('是', '肯定词'), ('余兼羽', '人物类_实体')] - -# Dependency Parsing -ddp = Taskflow("dependency_parsing") -ddp("9月9日上午纳达尔在亚瑟·阿什球场击败俄罗斯球员梅德韦杰夫") ->>> [{'word': ['9月9日', '上午', '纳达尔', '在', '亚瑟·阿什球场', '击败', '俄罗斯', '球员', '梅德韦杰夫'], 'head': [2, 6, 6, 5, 6, 0, 8, 9, 6], 'deprel': ['ATT', 'ADV', 'SBV', 'MT', 'ADV', 'HED', 'ATT', 'ATT', 'VOB']}] - -# Sentiment Analysis -senta = Taskflow("sentiment_analysis") -senta("这个产品用起来真的很流畅,我非常喜欢") ->>> [{'text': '这个产品用起来真的很流畅,我非常喜欢', 'label': 'positive', 'score': 0.9938690066337585}] +### SFT fine-tuning for large language model +```shell +mkdir -p llm/data && cd llm/data +wget https://bj.bcebos.com/paddlenlp/datasets/examples/AdvertiseGen.tar.gz && tar -zxvf AdvertiseGen.tar.gz +cd .. # change folder to PaddleNLP/llm +python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_finetune.py ./config/llama/sft_argument.json ``` -## API Reference +For more steps in the entire large model process, please refer to the [Large Model Full-Process Toolchain](./llm). -- Support [LUGE](https://www.luge.ai/) dataset loading and compatible with Hugging Face [Datasets](https://huggingface.co/datasets). For more details please refer to [Dataset API](https://paddlenlp.readthedocs.io/zh/latest/data_prepare/dataset_list.html). 
-- Using Hugging Face style API to load 500+ selected transformer models and download with fast speed. For more information please refer to [Transformers API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/index.html). -- One-line of code to load pre-trained word embedding. For more usage please refer to [Embedding API](https://paddlenlp.readthedocs.io/zh/latest/model_zoo/embeddings.html). +For more PaddleNLP content, please refer to: +- [Model Library](./legacy/model_zoo), which includes end-to-end usage of high-quality pre-trained models. +- [Multi-scenario Examples](./legacy/examples), to understand how to use PaddleNLP to solve various NLP technical problems, including basic techniques, system applications, and extended applications. +- [Interactive Tutorial](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995), to quickly learn PaddleNLP on the free computing platform AI Studio. -Please find all PaddleNLP API Reference from our [readthedocs](https://paddlenlp.readthedocs.io/). +------------------------------------------------------------------------------------------ ## Community @@ -285,14 +155,12 @@ To connect with other users and contributors, welcome to join our [Slack channel Scan the QR code below with your Wechat⬇️. You can access to official technical exchange group. Look forward to your participation.
- +
- - ## Citation -If you find PaddleNLP useful in your research, please consider cite +If you find PaddleNLP useful in your research, please consider citing ``` @misc{=paddlenlp, title={PaddleNLP: An Easy-to-use and High Performance NLP Library}, diff --git a/legacy/examples/code_generation/codegen/README.md b/legacy/examples/code_generation/codegen/README.md index 82a4891559d1..9fcd886ae374 100644 --- a/legacy/examples/code_generation/codegen/README.md +++ b/legacy/examples/code_generation/codegen/README.md @@ -164,7 +164,7 @@ print(result) #### Notes -- To use FastGeneration, set `use_fast=True` in [codegen_server.py](#配置参数说明); the first inference triggers compilation and takes some time. For FastGeneration's environment requirements, see [here](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/ops/README.md#%E4%BD%BF%E7%94%A8%E7%8E%AF%E5%A2%83%E8%AF%B4%E6%98%8E). +- To use FastGeneration, set `use_fast=True` in [codegen_server.py](#配置参数说明); the first inference triggers compilation and takes some time. - To use your own trained model, set `model_name_or_path` in [codegen_server.py](#配置参数说明) to the local model path. - To access the server from a local machine, replace the `127.0.0.1` above with the server's external IP. - If the message and error below appear, FastGeneration failed to start and the cause of the failure needs to be tracked down. Alternatively, set `use_fast=False` to disable FastGeneration acceleration, at the cost of slower inference.