[LLM] Add Yuan model #8654

Merged
merged 20 commits into from
Jul 11, 2024

Conversation

zhaogf01
Contributor

PR types

New features

PR changes

Models

Description

Added the model architecture, configuration, and related files for Yuan 2.0.

paddle-bot bot commented Jun 25, 2024

Thanks for your contribution!

@DrownFish19
Collaborator

For the lint issues, see link for how to fix them.

@zhaogf01
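Contributor Author

From the lint log, the files that caused the black, isort, and copyright_checker failures have already been modified. Do I still need to change anything? Or what exactly does this error refer to? I could not find the source file that triggers it.
Also, do the errors in test need to be handled? Which file does this error refer to? I could not quite work it out.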

@DrownFish19
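Collaborator

DrownFish19 commented Jun 25, 2024

> From the lint log, the files that caused the black, isort, and copyright_checker failures have already been modified. Do I still need to change anything? Or what exactly does this error refer to? I could not find the source file that triggers it.

These are formatting errors: the files in the PR must conform to the required format. Locally, run pip install pre-commit && pre-commit install, then format your code with pre-commit run --files XXX.py and push the result.

> Also, do the errors in test need to be handled? Which file does this error refer to? I could not quite work it out.

Please pull the latest code; the most recent commit has already fixed this.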

codecov bot commented Jun 26, 2024

Codecov Report

Attention: Patch coverage is 14.11043% with 560 lines in your changes missing coverage. Please review.

Project coverage is 55.42%. Comparing base (6d464bf) to head (3677479).
Report is 222 commits behind head on develop.

Files with missing lines                        Patch %   Lines
paddlenlp/transformers/yuan/modeling.py         13.37%    544 Missing ⚠️
paddlenlp/transformers/yuan/configuration.py    23.80%    16 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8654      +/-   ##
===========================================
- Coverage    55.74%   55.42%   -0.32%     
===========================================
  Files          623      626       +3     
  Lines        97456    98057     +601     
===========================================
+ Hits         54323    54351      +28     
- Misses       43133    43706     +573     


@zhaogf01
Contributor Author

How should the three checks that are currently failing be fixed?

# See the License for the specific language governing permissions and
# limitations under the License.

""" Yuan model tools"""
Contributor

I suggest wrapping this code in a function and avoiding magic numbers such as 2 and 24, hard-coded model paths, and the like;
these are better passed in as parameters.
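
A minimal sketch of the refactoring suggested above, assuming the literals 2 and 24 stand for a tensor-parallel degree and a layer count (the thread does not say what they denote). The names `parse_args`, `plan_regroup`, the flag names, and the path in the usage lines are hypothetical and are not taken from the PR.

```python
import argparse
from typing import List, Optional, Tuple


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Read the values that would otherwise be hard-coded in the script."""
    parser = argparse.ArgumentParser(description="Hypothetical weight-regrouping tool")
    parser.add_argument("--model_path", required=True, help="directory holding the checkpoint")
    parser.add_argument("--tp_degree", type=int, required=True, help="assumed meaning of the literal 2")
    parser.add_argument("--num_layers", type=int, required=True, help="assumed meaning of the literal 24")
    return parser.parse_args(argv)


def plan_regroup(model_path: str, tp_degree: int, num_layers: int) -> List[Tuple[str, int, int]]:
    """Return one (path, layer, rank) work item per layer and rank; no file is touched."""
    if tp_degree < 1 or num_layers < 1:
        raise ValueError("tp_degree and num_layers must be positive")
    return [(model_path, layer, rank) for layer in range(num_layers) for rank in range(tp_degree)]


# Usage: pass an explicit argument list instead of relying on sys.argv.
args = parse_args(["--model_path", "./yuan2-checkpoint", "--tp_degree", "2", "--num_layers", "24"])
work_items = plan_regroup(args.model_path, args.tp_degree, args.num_layers)
```

With the values exposed as parameters, the same function can be reused for a checkpoint with a different depth or parallel degree without editing the source.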

Contributor Author

Done. How should the warning in codecov be addressed?

Contributor

That should not block the merge; just ask a committer to merge it.

Contributor Author

@DesmonDay If there are no problems, could you please merge this? Thank you.

@@ -1,21 +1,21 @@
 exclude: 'model_zoo/gpt-3'
 repos:
 # For Python files
-- repo: https://github.com/psf/black.git
+- repo: https://gitee.com/wygfzren/black.git
Collaborator

Please revert this file to its original version; the change would affect how other contributors format their code.

Contributor Author

Done.

@@ -296,3 +296,5 @@
from .deberta_v2.configuration import *
from .qwen2 import *
from .qwen2_moe import *
from .yuan.modeling import *
Collaborator

Please import with from .yuan import *, and import modeling and configuration in the __init__.py under the yuan folder.
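
The package layout described in this comment can be sketched as follows. The snippet builds a throwaway package on disk so that it runs on its own; the package name `demo_models` and the empty `YuanConfig` and `YuanModel` classes are placeholders for illustration, not the PR's code.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a tiny package whose yuan/__init__.py re-exports its submodules."""
    pkg = root / "demo_models" / "yuan"
    pkg.mkdir(parents=True)
    # Parent package: a single star import of the sub-package.
    (root / "demo_models" / "__init__.py").write_text("from .yuan import *\n")
    # Sub-package: pull in configuration and modeling with relative imports.
    (pkg / "__init__.py").write_text("from .configuration import *\nfrom .modeling import *\n")
    (pkg / "configuration.py").write_text(
        '__all__ = ["YuanConfig"]\n\n\nclass YuanConfig:\n    pass\n'
    )
    (pkg / "modeling.py").write_text(
        "from .configuration import YuanConfig\n\n"
        '__all__ = ["YuanModel"]\n\n\n'
        "class YuanModel:\n    config_class = YuanConfig\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo_models = importlib.import_module("demo_models")
        # Names reachable from the top-level package after the star imports.
        exported = sorted(n for n in ("YuanConfig", "YuanModel") if hasattr(demo_models, n))
    finally:
        sys.path.remove(tmp)
```

The parent package then needs only one import line per model family, and the sub-package decides what it exposes.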

Contributor Author

Done.

from paddle.distributed import fleet
from paddle.nn import CrossEntropyLoss

from paddlenlp.transformers.conversion_utils import (
Collaborator

For functions and classes internal to the paddlenlp library, relative-path imports are recommended.

Contributor Author

Done.

return q_embed, k_embed


class YuanPreTrainedModel(PretrainedModel):
Collaborator

The model class needs to be renamed to YuanPretrainedModel here; otherwise auto import will raise an error. PaddleNLP's current import rule is the model's custom name (e.g. Qwen2, Yuan) + a fixed type suffix (e.g. PretrainedModel, ForCausalLM).
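
The naming rule stated in this comment can be illustrated with a small check. This is only a sketch of the rule as the reviewer words it, using the two suffixes the comment names; `follows_naming_rule` is a hypothetical helper and is not PaddleNLP's auto-import code.

```python
# The two fixed type suffixes named in the review comment.
FIXED_TYPE_SUFFIXES = ("PretrainedModel", "ForCausalLM")


def follows_naming_rule(class_name: str, model_name: str) -> bool:
    """True if class_name is exactly model_name followed by one fixed suffix."""
    return any(class_name == model_name + suffix for suffix in FIXED_TYPE_SUFFIXES)


# The original class name differs from the expected one only in letter case.
checks = {
    name: follows_naming_rule(name, "Yuan")
    for name in ("YuanPreTrainedModel", "YuanPretrainedModel", "YuanForCausalLM")
}
```

Because the comparison is case-sensitive, `YuanPreTrainedModel` (capital T) does not match, which is the mismatch the reviewer points out.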

Contributor Author

Done.

model_mappings.extend(layer_mappings)

init_name_mappings(mappings=model_mappings)
# base-model prefix "LlamaModel"
Collaborator

Please change or delete this line.

Contributor Author

Done.

            if module._padding_idx is not None:
                module.weight.data[module._padding_idx].zero_()

    def _set_gradient_checkpointing(self, module, value=False):
Collaborator

PaddleNLP uses recompute to control recomputation; see the recompute settings in llama for reference.

Contributor Author

Done.


hidden_states = inputs_embeds

if self.gradient_checkpointing and self.training:
Collaborator

The condition here should check the recompute parameter; see the llama code for implementation details.
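
A framework-free sketch of the control flow this comment asks for: the layer loop is gated on a `recompute` flag read from the config rather than on a `gradient_checkpointing` attribute. `DemoConfig`, `recompute_stand_in`, and `run_decoder` are hypothetical names; the stand-in only records which path was taken and does not perform real activation recomputation, which the thread says should follow the llama implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DemoConfig:
    recompute: bool = False  # config flag checked in place of gradient_checkpointing


def recompute_stand_in(layer: Callable[[float], float], hidden: float, trace: List[str]) -> float:
    """Placeholder for a framework recompute call: record the choice, then run the layer."""
    trace.append("recompute")
    return layer(hidden)


def run_decoder(layers, hidden: float, config: DemoConfig, training: bool, trace: List[str]) -> float:
    """Apply each layer, taking the recompute path only when enabled and training."""
    for layer in layers:
        if config.recompute and training:
            hidden = recompute_stand_in(layer, hidden, trace)
        else:
            trace.append("plain")
            hidden = layer(hidden)
    return hidden


layers = [lambda x: x + 1.0, lambda x: x * 2.0]
train_trace: List[str] = []
eval_trace: List[str] = []
train_out = run_decoder(layers, 1.0, DemoConfig(recompute=True), training=True, trace=train_trace)
eval_out = run_decoder(layers, 1.0, DemoConfig(recompute=True), training=False, trace=eval_trace)
```

Both calls produce the same value; only the path recorded in the trace differs, which reflects that recomputation trades compute for memory without changing results.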

Contributor Author

Done.

DrownFish19 changed the title from "Add yuan model" to "[LLM] Add Yuan model" on Jul 2, 2024
@zhaogf01
Contributor Author

zhaogf01 commented Jul 9, 2024

How should I fix the failing test checks?
Also, are the other changes acceptable?

@DrownFish19
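Collaborator

> How should I fix the failing test checks? Also, are the other changes acceptable?

Thank you very much for your contribution.

  1. The test case failures are caused by import einops in the Yuan modeling file. With dynamic-to-static conversion and the later inference pipeline in mind, we suggest replacing einops.rearrange with paddle.reshape operations rather than adding einops in setup.py.
  2. paddlenlp/transformers/yuan/utils/tensor_parallelism_tools.py is not a required file. If the complete model parameters are available, paddlenlp automatically splits them during model loading to implement model parallelism (tensor parallel).
  3. Once these changes are made and Test and CI pass, the PR can be merged.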

@zhaogf01
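Contributor Author

> How should I fix the failing test checks? Also, are the other changes acceptable?
>
> Thank you very much for your contribution.
>
>   1. The test case failures are caused by import einops in the Yuan modeling file. With dynamic-to-static conversion and the later inference pipeline in mind, we suggest replacing einops.rearrange with paddle.reshape operations rather than adding einops in setup.py.
>   2. paddlenlp/transformers/yuan/utils/tensor_parallelism_tools.py is not a required file. If the complete model parameters are available, paddlenlp automatically splits them during model loading to implement model parallelism (tensor parallel).
>   3. Once these changes are made and Test and CI pass, the PR can be merged.

1. Done.
2. This tool is still needed. The reason is that the reshape in Yuan 2.0's attention mixes q and k together. paddlenlp does split the parameters automatically during model loading, but because of this peculiarity of the Yuan 2.0 architecture, q_state and k_state end up mixed together at gather time. The tool regroups the weights in advance, which avoids the problem.
Thank you!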
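
A toy illustration of the argument in point 2, under a stated assumption: the attention concatenates the q and k projections along the last axis and then views the result as heads of width 2 * head_dim. That layout is an assumption made for this sketch and is not quoted from the PR. Columns are represented by string labels instead of weights, and `local_heads`, `shard`, `regroup`, and `gathered_heads` are hypothetical helpers, not the functions in tensor_parallelism_tools.py.

```python
from typing import List, Tuple

Columns = List[str]


def local_heads(q_cols: Columns, k_cols: Columns, head_dim: int) -> List[Columns]:
    """Concatenate q and k columns, then view the result as heads of width 2 * head_dim."""
    qk = q_cols + k_cols
    width = 2 * head_dim
    return [qk[i:i + width] for i in range(0, len(qk), width)]


def shard(cols: Columns, tp_degree: int) -> List[Columns]:
    """Split columns into tp_degree contiguous, equally sized shards."""
    size = len(cols) // tp_degree
    return [cols[r * size:(r + 1) * size] for r in range(tp_degree)]


def regroup(q_cols: Columns, k_cols: Columns, head_dim: int, tp_degree: int) -> Tuple[Columns, Columns]:
    """Reorder columns so each rank's local heads form a contiguous block of reference heads."""
    reference_heads = local_heads(q_cols, k_cols, head_dim)
    heads_per_rank = len(reference_heads) // tp_degree
    new_q: Columns = []
    new_k: Columns = []
    for rank in range(tp_degree):
        block = reference_heads[rank * heads_per_rank:(rank + 1) * heads_per_rank]
        flat = [col for head in block for col in head]
        half = len(flat) // 2
        new_q += flat[:half]  # becomes this rank's q shard
        new_k += flat[half:]  # becomes this rank's k shard
    return new_q, new_k


def gathered_heads(q_cols: Columns, k_cols: Columns, head_dim: int, tp_degree: int) -> List[Columns]:
    """Heads each rank builds from its own shards, listed in rank order."""
    heads: List[Columns] = []
    for q_part, k_part in zip(shard(q_cols, tp_degree), shard(k_cols, tp_degree)):
        heads += local_heads(q_part, k_part, head_dim)
    return heads


head_dim, num_heads, tp_degree = 2, 4, 2
q = [f"q{i}" for i in range(head_dim * num_heads)]
k = [f"k{i}" for i in range(head_dim * num_heads)]

reference = local_heads(q, k, head_dim)              # single-device layout
naive = gathered_heads(q, k, head_dim, tp_degree)    # automatic contiguous split
fixed = gathered_heads(*regroup(q, k, head_dim, tp_degree), head_dim, tp_degree)
```

In this toy setting the contiguous split yields the same columns but in a different head order than the single-device layout, while regrouping beforehand restores the reference order. That matches the kind of mismatch the author describes.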

DrownFish19 previously approved these changes Jul 10, 2024
Collaborator

LGTM

@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Jul 10, 2024
@PaddlePaddle PaddlePaddle unlocked this conversation Jul 10, 2024
@wawltor wawltor merged commit 1af227a into PaddlePaddle:develop Jul 11, 2024
8 of 11 checks passed
5 participants