[LLM]Update yuan model #8786
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #8786     +/-   ##
===========================================
- Coverage    55.44%   55.37%   -0.07%
===========================================
  Files          626      633       +7
  Lines        98065    99888    +1823
===========================================
+ Hits         54368    55311     +943
- Misses       43697    44577     +880
Please review this as soon as you can, thanks.
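## 1. Model introduction

[Yuan 2.0](https://github.com/IEIT-Yuan/Yuan-2.0) is a new-generation foundation large language model released by Inspur Information. Building on Yuan 1.0, Yuan 2.0 uses more diverse, high-quality pre-training data and instruction fine-tuning datasets, giving the model stronger understanding in semantics, mathematics, reasoning, code, knowledge, and other areas.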
The text is garbled.
Could this be a browser encoding display issue? It looks fine both locally and on my git, and I am using UTF-8 encoding.
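I checked on my phone and there is indeed no garbling, but the text is garbled on a Mac. This may be a Windows/Mac conflict. I suggest deleting the file and re-saving it with a tool such as Sublime, choosing UTF-8 (Unix) as the format.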
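The re-save suggested above can also be scripted. The following is a minimal sketch, not part of the PR: `resave_as_utf8_unix` is a hypothetical helper, and it assumes the file is already valid UTF-8 (possibly with a BOM and Windows line endings).

```python
from pathlib import Path


def resave_as_utf8_unix(path: str) -> None:
    """Rewrite a text file as UTF-8 without a BOM and with Unix (LF) line endings."""
    raw = Path(path).read_bytes()
    # "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM if present.
    text = raw.decode("utf-8-sig")
    # Normalize Windows (CRLF) and bare CR line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Write bytes directly so no platform-specific newline translation happens.
    Path(path).write_bytes(text.encode("utf-8"))
```

If the file was saved in some other encoding, the `decode` call raises `UnicodeDecodeError` rather than silently corrupting the text, which is usually the safer failure mode for a README.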
llm/config/yuan/lora_argument.json
Outdated
@@ -0,0 +1,35 @@
{
  "model_name_or_path": "/workspace/yuan",
This model can be uploaded to BOS or AI Studio; please don't use a local path as the name.
How should the weights be uploaded to BOS or AI Studio? Is there a README or link describing this?
Upload the model data in the AI Studio model space. The upload can be done through paddlenlp, as follows:

    # pip install aistudio_sdk tqdm
    from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer

    # Make sure to pass the correct dtype
    model_name_or_path = "IEITYuan/Yuan2-51B-hf"
    dtype = "bfloat16"
    repo_id = "user_id/Yuan2-51B-hf"  # user_id depends on the account that created the model
    token = "xxxxxxxxxxx"  # get the token on AI Studio under "Personal Center - Access Tokens"

    model = AutoModelForCausalLM.from_pretrained(model_name_or_path, dtype=dtype)
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

    # safetensors version
    model.save_to_aistudio(
        repo_id=repo_id,
        token=token,
        private=True,
        license="Apache License 2.0",
        exist_ok=True,
        safe_serialization=True,
    )

    # non-safetensors version
    model.save_to_aistudio(
        repo_id=repo_id,
        token=token,
        private=True,
        license="Apache License 2.0",
        exist_ok=True,
        safe_serialization=False,
    )

    tokenizer.save_to_aistudio(
        repo_id=repo_id,
        token=token,
        private=True,
        license="Apache License 2.0",
        exist_ok=True,
    )
@@ -0,0 +1,41 @@
{
  "model_name_or_path": "/workspace/yuan",
Same as above.
For pre-training, the data may still need to be prepared.
@@ -249,7 +249,7 @@ def apply_rotary_pos_emb(q, k, cos, sin, position_ids):

class YuanPretrainedModel(PretrainedModel):
    config_class = YuanConfig
    base_model_prefix = "model"
Does this change need to stay compatible with the previously merged model parameters?
done
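One way to keep previously merged parameters loadable after a `base_model_prefix` change is to rename checkpoint keys at load time. The sketch below is illustrative only: it assumes older checkpoints used a `yuan.` prefix and newer ones use `model.`, and `remap_prefix` is a hypothetical helper, not a PaddleNLP API or the approach taken in this PR.

```python
from typing import Dict


def remap_prefix(
    state_dict: Dict[str, object], old: str = "yuan.", new: str = "model."
) -> Dict[str, object]:
    """Return a copy of state_dict with a leading `old` prefix replaced by `new`."""
    remapped = {}
    for key, value in state_dict.items():
        # Only rewrite keys that start with the old prefix; leave the rest untouched.
        if key.startswith(old):
            key = new + key[len(old):]
        remapped[key] = value
    return remapped
```

Keys without the old prefix (for example a hypothetical `lm_head.weight`) pass through unchanged, so the helper is safe to apply to checkpoints that already use the new naming.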
@@ -282,7 +287,7 @@ def _get_name_mappings(cls, config: YuanConfig) -> List[StateDictNameMapping]:
         if "YuanModel" not in config.architectures:
             for mapping in model_mappings:
                 mapping[0] = "model." + mapping[0]
-                mapping[1] = "yuan." + mapping[1]
+                mapping[1] = "model." + mapping[1]
Hmm, isn't this prefix already fine as it is? Does it really need to be changed like this?
done
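To make the prefix discussion concrete, here is a minimal standalone sketch of the prefixing step in the hunk above. `add_prefixes` and its parameters are hypothetical names for illustration; the real logic lives inside `_get_name_mappings`, and the mapping entries here are plain lists of the form `[source_name, target_name, ...]`.

```python
from typing import List


def add_prefixes(
    model_mappings: List[List[str]],
    architectures: List[str],
    base_arch: str = "YuanModel",
    src_prefix: str = "model.",
    dst_prefix: str = "model.",
) -> List[List[str]]:
    """Prefix [source, target, ...] weight names unless the config is the bare base model."""
    if base_arch in architectures:
        # The bare base model stores weights without a wrapper prefix.
        return [list(m) for m in model_mappings]
    # Head models wrap the base model, so both sides of each mapping gain a prefix.
    return [[src_prefix + m[0], dst_prefix + m[1]] + list(m[2:]) for m in model_mappings]
```

The target-side prefix has to match `base_model_prefix` on the model class, which is why the two values are discussed together in this review.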
@@ -249,7 +249,7 @@ def apply_rotary_pos_emb(q, k, cos, sin, position_ids):

class YuanPretrainedModel(PretrainedModel):
Please add an __all__ field here to restrict which models are imported. The __init__ uses import *, so many other things get imported as well.
done
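The effect of `__all__` on `import *` can be seen in a small self-contained demonstration. This is a sketch under assumptions: the module `yuan_demo` is built in memory purely for illustration, and the exported names (`YuanModel`, `YuanForCausalLM`) are placeholders, not necessarily the list used in the PR.

```python
import sys
import types

# Build a throwaway module in memory so the example is self-contained.
source = '''
__all__ = ["YuanModel", "YuanForCausalLM"]

import math  # helper import that should not leak to importers

class YuanModel: ...
class YuanForCausalLM: ...
def apply_rotary_pos_emb(): ...
'''
demo = types.ModuleType("yuan_demo")
exec(source, demo.__dict__)
sys.modules["yuan_demo"] = demo

# Simulate what a package __init__ does with "from .modeling import *".
namespace = {}
exec("from yuan_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Without `__all__`, the star import would also pull in `math` and `apply_rotary_pos_emb`, since every public name in the module is exported by default.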
… dev_update_yuan_model
PR types
New features
PR changes
Models
Description
Adds the other Yuan 2.0 models (51B, 102B), fine-tuning (LoRA, SFT), pre-training, and auto_convert_from_torch.