
Add new mistral #7425

Merged (27 commits) on Jul 3, 2024
Conversation

@wtmlon (Collaborator) commented Nov 13, 2023

PR types

PR changes

Description

Add the Mistral model.

paddle-bot commented Nov 13, 2023

Thanks for your contribution!

codecov bot commented Nov 17, 2023

Codecov Report

Attention: Patch coverage is 81.20950% with 87 lines in your changes missing coverage. Please review.

Project coverage is 55.74%. Comparing base (2723138) to head (cdbd0eb).
Report is 230 commits behind head on develop.

Files with missing lines                          Patch %   Lines
paddlenlp/transformers/mistral/modeling.py        80.68%    84 Missing ⚠️
paddlenlp/peft/prefix/utils.py                    33.33%     2 Missing ⚠️
paddlenlp/transformers/mistral/configuration.py   95.23%     1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7425      +/-   ##
===========================================
+ Coverage    55.61%   55.74%   +0.12%     
===========================================
  Files          620      623       +3     
  Lines        96965    97450     +485     
===========================================
+ Hits         53930    54322     +392     
- Misses       43035    43128      +93     

☔ View full report in Codecov by Sentry.

def set_input_embeddings(self, value):
    self.embed_tokens = value

def _prepare_decoder_attention_mask(
Contributor

Please confirm that the current attention_mask follows our convention: it should support both 2D and 4D masks, with 1 and 0 carrying the same meaning in both, and it should work with the InTokens strategy.
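For context, the 2D-to-4D mask handling the reviewer asks about can be sketched as follows. This is a minimal illustration in NumPy rather than Paddle; the function name, the -1e9 fill value, and the exact broadcasting are illustrative assumptions, not PaddleNLP's actual implementation.

```python
import numpy as np

def prepare_decoder_attention_mask(attention_mask, seq_len):
    """Expand a mask to the 4D additive form [batch, 1, tgt_len, src_len].

    Accepts either a 2D padding mask [batch, src_len] or an already-4D
    mask. In both, 1 means "attend" and 0 means "masked", matching the
    convention the review comment asks to confirm.
    """
    if attention_mask.ndim == 2:
        # causal (lower-triangular) mask over query/key positions
        causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
        # broadcast the padding mask over the query axis and combine
        padding = attention_mask[:, None, None, :].astype(bool)
        combined = causal[None, None, :, :] & padding
    else:
        # already 4D: take it as-is
        combined = attention_mask.astype(bool)
    # additive form: 0 where attended, a large negative value where masked
    return np.where(combined, 0.0, -1e9)
```

With this shape, an InTokens/packed batch can also be expressed directly as a 4D block-diagonal mask, since the 4D branch passes the caller's mask through unchanged.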

@ZHUI (Collaborator) commented Jan 2, 2024

@wtmlon Is this no longer needed?

github-actions bot commented Mar 3, 2024

This Pull Request is stale because it has been open for 60 days with no activity.

github-actions bot added the stale label on Mar 3, 2024 and removed it on Jul 2, 2024
llm/data.py Outdated
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
Contributor

Move this into utils.

Collaborator Author

done

llm/data.py Outdated
from paddlenlp.peft import LoRAModel, PrefixModelForCausalLM


def get_convert_example(model):
Contributor

Does Mistral have a chat_template? Please confirm it is supported.
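As background on what chat_template support means here: Mistral's instruct models expect prompts rendered in the `[INST] ... [/INST]` format. A minimal sketch of that rendering is below; in practice this would come from the tokenizer's chat_template, so the helper name and the exact whitespace are illustrative assumptions only.

```python
def render_mistral_chat(messages):
    """Render a message list in Mistral's [INST] instruction format.

    A minimal sketch: user turns are wrapped in [INST] ... [/INST],
    assistant turns are emitted verbatim and closed with </s>. This is
    not the PaddleNLP API, just the expected shape of the prompt.
    """
    parts = ["<s>"]
    for msg in messages:
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        elif msg["role"] == "assistant":
            parts.append(f"{msg['content']}</s>")
    return "".join(parts)
```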


llm/data.py Outdated
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
Contributor

Have you confirmed that the training loss looks normal?

Collaborator Author

(screenshot of the loss curve)
No issues; this is an 8-GPU SFT run.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .configuration import MistralConfig
Contributor

Why is there no tokenizer file for Mistral?


@@ -0,0 +1,30 @@
{
Contributor

Put this in a config file: add a mistral directory under the config directory containing the json and a README. Also update the README under the llm directory.

Collaborator Author

done

@lugimzzz (Contributor) commented Jul 2, 2024

Does this support the zero_padding strategy? Also, please add DPO support.

@wtmlon (Collaborator, Author) commented Jul 2, 2024

> Does this support the zero_padding strategy? Also, please add DPO support.

Both are now supported.
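For readers unfamiliar with zero_padding: the idea is to pack several training sequences into one row without inter-sequence pad tokens, restarting position_ids per sequence so attention can be segmented per example. The sketch below illustrates that packing in plain Python; the function name, the 0 pad id, and the greedy fill are illustrative assumptions, not PaddleNLP's actual implementation.

```python
def zero_pad_pack(sequences, max_len):
    """Pack token sequences into one row with no pad tokens between them.

    Sequences are concatenated greedily until max_len would be exceeded;
    position_ids restart at 0 for each sequence, which is what lets the
    attention mask treat each packed example independently. Only the
    tail of the row is padded.
    """
    input_ids, position_ids = [], []
    for seq in sequences:
        if len(input_ids) + len(seq) > max_len:
            break  # this sequence would overflow the row
        input_ids.extend(seq)
        position_ids.extend(range(len(seq)))
    # pad only the remaining tail up to max_len (pad id 0 assumed)
    pad = max_len - len(input_ids)
    input_ids.extend([0] * pad)
    position_ids.extend([0] * pad)
    return input_ids, position_ids
```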

@lugimzzz (Contributor) left an approving review

lgtm

@wtmlon wtmlon merged commit af19dc4 into PaddlePaddle:develop Jul 3, 2024
9 of 11 checks passed
3 participants