Add new mistral #7425
Conversation
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #7425      +/-   ##
===========================================
+ Coverage    55.61%    55.74%    +0.12%
===========================================
  Files          620       623        +3
  Lines        96965     97450      +485
===========================================
+ Hits         53930     54322      +392
- Misses       43035     43128       +93

☔ View full report in Codecov by Sentry.
def set_input_embeddings(self, value):
    self.embed_tokens = value

def _prepare_decoder_attention_mask(
Please confirm that the current attention_mask follows our convention: both 2D and 4D masks are supported, the meaning of 1 and 0 is consistent, and the intokens strategy can be supported.
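For reference, an illustrative sketch (not the PR's actual implementation) of one common way a `_prepare_decoder_attention_mask`-style helper satisfies this convention, accepting either a 2D padding mask `[batch, seq_len]` (1 = keep, 0 = pad) or a pre-built 4D mask `[batch, 1, tgt_len, src_len]`, and combining it with a causal mask in additive form; names and shapes are assumptions:

```python
import paddle


def prepare_decoder_attention_mask(attention_mask, input_shape, dtype):
    batch_size, tgt_len = input_shape
    min_value = paddle.finfo(dtype).min

    # Causal part: future positions are masked with a large negative value.
    causal = paddle.triu(
        paddle.full([tgt_len, tgt_len], min_value, dtype=dtype), diagonal=1
    ).unsqueeze([0, 1])  # [1, 1, tgt_len, tgt_len]

    if attention_mask is None:
        return causal.expand([batch_size, 1, tgt_len, tgt_len])

    if attention_mask.ndim == 2:
        # 2D padding mask -> 4D additive mask: 0 where kept, min_value where padded.
        src_len = attention_mask.shape[-1]
        expanded = (1.0 - attention_mask[:, None, None, :].astype(dtype)) * min_value
        expanded = expanded.expand([batch_size, 1, tgt_len, src_len])
    else:
        # Already 4D: assume the same 1/0 convention and convert to additive form.
        expanded = (1.0 - attention_mask.astype(dtype)) * min_value

    return expanded + causal
```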
@wtmlon Is this no longer needed?
This Pull Request is stale because it has been open for 60 days with no activity.
llm/data.py (Outdated)
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
Move this into utils.
done
llm/data.py (Outdated)
from paddlenlp.peft import LoRAModel, PrefixModelForCausalLM


def get_convert_example(model):
Does Mistral have a chat_template? Has support for it been confirmed?
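A minimal sketch of how this question could be checked locally. The model path below is a placeholder, and the apply_chat_template call assumes PaddleNLP's chat-template support applies to this tokenizer (the exact signature may differ between releases), so treat it as illustrative rather than definitive:

```python
from paddlenlp.transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/mistral-7b-instruct")  # placeholder path

if getattr(tokenizer, "chat_template", None) is None:
    # No chat template configured: multi-turn data would fall back to the
    # plain source/target concatenation used by get_convert_example.
    print("no chat_template configured")
else:
    rendered = tokenizer.apply_chat_template("Hello, who are you?", tokenize=False)
    print(rendered)
```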
llm/data.py (Outdated)
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
Has it been confirmed that the training loss looks normal?
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .configuration import MistralConfig
Why is there no tokenizer file for Mistral?
The official Mistral release uses LlamaTokenizer directly: https://huggingface.co/mistralai/Mistral-7B-v0.3/blob/b67d6a03ca097c5122fa65904fce0413500bf8c8/tokenizer_config.json#L6183
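A small sketch of what this implies: since upstream Mistral reuses the Llama sentencepiece tokenizer, the checkpoint can be tokenized with PaddleNLP's LlamaTokenizer and no Mistral-specific tokenizer class is required. The directory below is a hypothetical local path, not a built-in alias:

```python
from paddlenlp.transformers import LlamaTokenizer

# Assumes the checkpoint directory contains the standard Llama tokenizer files
# (e.g. tokenizer.model); the path is a placeholder.
tokenizer = LlamaTokenizer.from_pretrained("path/to/mistral-7b")
print(tokenizer.tokenize("Mistral reuses the Llama sentencepiece tokenizer."))
```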
llm/mistral/lora_argument.json (Outdated)
@@ -0,0 +1,30 @@
{
Move this into the config directory: add a mistral folder under config containing the JSON files and a README. While you are at it, update the README under the llm directory as well.
done
Is the zero_padding strategy supported? Please also add DPO support.
Both are supported.
lgtm
PR types
PR changes
Description
Add the Mistral model.