[LLM]Update yuan model #8786

zhaogf01 · 2024-07-19T08:43:13Z

PR types

New features

PR changes

Models

Description

Adds the remaining Yuan 2.0 models (51B, 102B), fine-tuning (LoRA, SFT), pre-training, and auto_convert_from_torch.

paddle-bot · 2024-07-19T08:43:17Z

Thanks for your contribution!

codecov · 2024-07-19T09:14:57Z

Codecov Report

Attention: Patch coverage is 24.59016% with 138 lines in your changes missing coverage. Please review.

Project coverage is 55.37%. Comparing base (57000fa) to head (ec8ee56).
Report is 278 commits behind head on develop.

Files with missing lines	Patch %	Lines
paddlenlp/transformers/yuan/tokenizer.py	28.30%	76 Missing ⚠️
paddlenlp/transformers/yuan/modeling.py	10.29%	61 Missing ⚠️
paddlenlp/transformers/model_utils.py	87.50%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #8786      +/-   ##
===========================================
- Coverage    55.44%   55.37%   -0.07%     
===========================================
  Files          626      633       +7     
  Lines        98065    99888    +1823     
===========================================
+ Hits         54368    55311     +943     
- Misses       43697    44577     +880
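The percentages in the report can be re-derived from the raw counts. The sketch below is only a sanity check on the numbers quoted above; the patch size of 183 lines is inferred from the reported 24.59016% and 138 missing lines, it is not stated in the report.

```python
# Figures copied from the Codecov report above.
base_hits, base_lines = 54368, 98065
head_hits, head_lines = 55311, 99888


def coverage_pct(hits: int, lines: int) -> float:
    """Return line coverage as a percentage."""
    return 100.0 * hits / lines


base_cov = coverage_pct(base_hits, base_lines)
head_cov = coverage_pct(head_hits, head_lines)
delta = head_cov - base_cov

# Missing patch lines per file, as listed in the report table.
missing = 76 + 61 + 1

# Assumption: the patch touches 183 coverable lines (inferred, not reported).
patch_lines = 183
patch_cov = coverage_pct(patch_lines - missing, patch_lines)

print(f"base {base_cov:.2f}%  head {head_cov:.2f}%  delta {delta:+.2f}%")
print(f"patch coverage {patch_cov:.5f}% with {missing} lines missing")
```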

zhaogf01 · 2024-07-24T01:38:13Z

Could you please review this as soon as possible? Thanks.

ZHUI · 2024-07-24T03:15:52Z

llm/config/yuan/README.md

+
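+## 1. Model Introduction
+
+[Yuan 2.0](https://github.com/IEIT-Yuan/Yuan-2.0) is a new-generation foundation large language model released by Inspur Information (浪潮信息). Building on Yuan 1.0, Yuan 2.0 uses more diverse, high-quality pre-training data and instruction fine-tuning datasets to give the model stronger understanding in semantics, mathematics, reasoning, code, knowledge, and other areas.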


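Could this be an encoding display problem in the browser? The file looks fine both locally and in my git repository, and I am using UTF-8 encoding.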

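I checked on a phone and there is indeed no garbled text, but the text does come out garbled on a Mac. It may be a conflict between Windows and Mac. I suggest deleting the file and re-saving it with a tool such as Sublime, choosing UTF-8 (Unix) as the format.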
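The re-save suggested above can also be scripted. This is a minimal sketch that assumes the garbling comes from a byte-order mark or Windows line endings, which the thread does not confirm; `normalize_text_file` is a hypothetical helper, not part of PaddleNLP.

```python
from pathlib import Path


def normalize_text_file(path: Path, source_encoding: str = "utf-8-sig") -> bool:
    """Rewrite `path` as UTF-8 without a BOM and with LF line endings.

    Returns True if the file on disk was changed.
    """
    raw = path.read_bytes()
    # "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM if present.
    text = raw.decode(source_encoding)
    # Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    normalized = text.encode("utf-8")
    if normalized == raw:
        return False
    path.write_bytes(normalized)
    return True


# Example usage (path taken from the review thread):
# normalize_text_file(Path("llm/config/yuan/README.md"))
```

If the file was saved in a different encoding, such as GBK, pass that as `source_encoding`; decoding will raise `UnicodeDecodeError` rather than silently writing bad text.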

ZHUI · 2024-07-24T03:16:51Z

llm/config/yuan/lora_argument.json

@@ -0,0 +1,35 @@
+{
+    "model_name_or_path": "/workspace/yuan",


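This model can be uploaded to BOS or AI Studio, so there is no need to use a local path as the name.

How should the weights be uploaded to BOS or AI Studio? Is there a README or link that explains it?

Model data can be uploaded here in the model space. The upload can be done through paddlenlp as follows: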

# pip install aistudio_sdk tqdm
from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer

# Make sure to pass the correct dtype.
model_name_or_path = "IEITYuan/Yuan2-51B-hf"
dtype = "bfloat16"
repo_id = "user_id/Yuan2-51B-hf"  # user_id depends on the user who created the model
token = "xxxxxxxxxxx"  # get the token on AI Studio under "Personal Center - Access Tokens"

model = AutoModelForCausalLM.from_pretrained(model_name_or_path, dtype=dtype)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

# safetensors version
model.save_to_aistudio(
    repo_id=repo_id,
    token=token,
    private=True,
    license="Apache License 2.0",
    exist_ok=True,
    safe_serialization=True,
)

# non-safetensors version
model.save_to_aistudio(
    repo_id=repo_id,
    token=token,
    private=True,
    license="Apache License 2.0",
    exist_ok=True,
    safe_serialization=False,
)

tokenizer.save_to_aistudio(
    repo_id=repo_id,
    token=token,
    private=True,
    license="Apache License 2.0",
    exist_ok=True,
)

ZHUI · 2024-07-24T03:17:00Z

llm/config/yuan/pretrain_argument.json

@@ -0,0 +1,41 @@
+{
+    "model_name_or_path": "/workspace/yuan",


Pre-training may also require preparing data.

I tested with the dataset provided by PaddleNLP, and it works.

ZHUI · 2024-07-24T03:19:19Z

paddlenlp/transformers/yuan/modeling.py

@@ -249,7 +249,7 @@ def apply_rotary_pos_emb(q, k, cos, sin, position_ids):

 class YuanPretrainedModel(PretrainedModel):
    config_class = YuanConfig
-    base_model_prefix = "model"


Does this change need to stay compatible with the model parameters that were merged earlier?

ZHUI · 2024-07-24T03:20:04Z

paddlenlp/transformers/yuan/modeling.py

@@ -282,7 +287,7 @@ def _get_name_mappings(cls, config: YuanConfig) -> List[StateDictNameMapping]:
        if "YuanModel" not in config.architectures:
            for mapping in model_mappings:
                mapping[0] = "model." + mapping[0]
-                mapping[1] = "yuan." + mapping[1]
+                mapping[1] = "model." + mapping[1]


Hmm, isn't this prefix already fine as it is? Does it really need to be changed like this?
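One way to think about the compatibility question raised in these two threads: if checkpoints saved before this change carry parameter names with a `yuan.` prefix, they could be loaded under the new naming by renaming keys first. The sketch below is illustrative only. `rename_prefix` and the key names are hypothetical, and the thread does not say how (or whether) the PR handled older checkpoints.

```python
from typing import Dict, TypeVar

T = TypeVar("T")


def rename_prefix(state_dict: Dict[str, T], old: str = "yuan.", new: str = "model.") -> Dict[str, T]:
    """Return a copy of `state_dict` with a leading `old` prefix replaced by `new`.

    Keys that do not start with `old` are copied unchanged.
    """
    renamed: Dict[str, T] = {}
    for key, value in state_dict.items():
        if key.startswith(old):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


# Hypothetical parameter names; strings stand in for weight tensors.
old_checkpoint = {
    "yuan.embed_tokens.weight": "w0",
    "yuan.layers.0.self_attn.q_proj.weight": "w1",
    "lm_head.weight": "w2",
}
new_checkpoint = rename_prefix(old_checkpoint)
print(sorted(new_checkpoint))
```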

ZHUI · 2024-07-24T03:22:27Z

paddlenlp/transformers/yuan/modeling.py

@@ -249,7 +249,7 @@ def apply_rotary_pos_emb(q, k, cos, sin, position_ids):

 class YuanPretrainedModel(PretrainedModel):


Please add an `__all__` field here to restrict which models get imported. `__init__` uses `import *`, so a lot of other things get imported as well.
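The effect of `__all__` on `import *` can be seen with a throwaway in-memory module. This is a sketch of the general Python behavior, not the PR's actual `modeling.py`; the module name `yuan_modeling_demo` and the classes `YuanForCausalLM` and `InternalHelper` are placeholders chosen for illustration.

```python
import sys
import types

# Source of a stand-in module: it imports a helper library and defines
# three classes, but only lists two of them in __all__.
source = '''
import math  # incidental import that should not leak to importers

__all__ = ["YuanModel", "YuanForCausalLM"]

class YuanModel: ...
class YuanForCausalLM: ...
class InternalHelper: ...
'''

demo = types.ModuleType("yuan_modeling_demo")
exec(source, demo.__dict__)
sys.modules["yuan_modeling_demo"] = demo

# Simulate what a package __init__ doing `from .modeling import *` would receive.
namespace = {}
exec("from yuan_modeling_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['YuanForCausalLM', 'YuanModel']
```

Without the `__all__` line, the same star import would also bring in `math` and `InternalHelper`, which is the leakage the review comment is pointing at.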

zhaogf01 and others added 30 commits June 24, 2024 17:37

add yuan model

55e0c75

add yuan model settings

04aaaa0

fix conflict

9bad9a8

add readme

0f34595

update format

1a152c9

update readme

d9e2a4a

update readme

7e80c59

update for lint

c7577c4

update for lint

056cdcb

update for lint

52a9556

update for lint

afe3f4e

add fp16

8b7878b

update utils

aaf1fb7

fix bug

f2b0f3c

Merge branch 'PaddlePaddle:develop' into add-yuan-model

8fa2cd7

format

42f87f2

correct pre-commit

e409997

correct fp16

f5e6a57

delete rearrange

75c8acb

add pre_train

3677479

auto convert from torch

f19d8e2

support sft lora

8458335

add fine-tuning scripts

6773757

Merge branch 'PaddlePaddle:develop' into add-yuan-model

4543d92

update structure

2cf1c2b

fix yuantokenizer pad

c09d97f

update scripts

5f1d777

update readme

c08693e

update readme

bec1602

add pad_token_id

9248da4

paddle-bot bot added the contributor label Jul 19, 2024

paddle-bot bot assigned ZHUI Jul 19, 2024

zhaogf01 and others added 4 commits July 22, 2024 01:18

format

e3cc3d5

Merge branch 'PaddlePaddle:develop' into add-yuan-model

009a8ae

format

23ff253

update readme

743eb11

ZHUI reviewed Jul 24, 2024

zhaogf01 and others added 16 commits July 24, 2024 07:24

update for review

1c54d6c

update for review

be39a54

fix bug

dfa9eaf

update to CRLF

c85e550

format

c4ffc7e

support param convert

6c96d02

update sft config

46caee3

Merge remote-tracking branch 'upstream/dev_update_yuan_model' into ad…

7e349e8

…d-yuan-model

fix modeling

8ea204d

Merge remote-tracking branch 'paddlenlp-zhaogf01/add-yuan-model' into…

7957edf

… dev_update_yuan_model

Merge remote-tracking branch 'upstream/dev_update_yuan_model' into ad…

9163857

…d-yuan-model

fix qk fuse&spilt and fix fa

7792f3f

format

6da33a7

format

7afc6a6

format

89b291f

fix fa dtype

ec8ee56

DrownFish19 approved these changes Aug 10, 2024

wawltor merged commit 30fc639 into PaddlePaddle:develop Aug 12, 2024
9 of 12 checks passed
