add llama and nv-embed training #9323
Thanks for your contribution!

Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #9323      +/-   ##
===========================================
+ Coverage    52.86%   52.94%   +0.08%
===========================================
  Files          669      676       +7
  Lines       107240   107919     +679
===========================================
+ Hits         56688    57134     +446
- Misses       50552    50785     +233
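Single-GPU training is inefficient and limits batch_size, so multi-GPU training is recommended. Contrastive learning in particular benefits from a large batch_size spread across multiple GPUs. An example command:

```
python -m paddle.distributed.launch --gpus "1,2,3,4" train.py --do_train \
```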
`--gpus "1,2,3,4"`: shouldn't this start from 0?
OK, fixed.
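GPU device ids are zero-based, so four cards are numbered 0 through 3. A minimal sketch of building the `--gpus` value programmatically is shown below. The helper names `zero_based_gpu_ids` and `build_launch_command` are hypothetical and not part of this PR, and the command includes only the arguments visible in the excerpt above.

```python
def zero_based_gpu_ids(num_gpus):
    """Return a comma-separated list of device ids starting at 0 (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ",".join(str(i) for i in range(num_gpus))


def build_launch_command(num_gpus, script="train.py"):
    """Assemble the launch command as an argument list (hypothetical helper).

    Only the flags visible in the README excerpt are included; the remaining
    training flags are truncated in the source and are left out here.
    """
    return [
        "python",
        "-m",
        "paddle.distributed.launch",
        "--gpus",
        zero_based_gpu_ids(num_gpus),
        script,
        "--do_train",
    ]


if __name__ == "__main__":
    print(" ".join(build_launch_command(4)))
```

Passing the result to `subprocess.run` as a list avoids shell quoting issues around the comma-separated id string.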
### Single-GPU training

## Training
Keep the single-GPU training heading.
The current text covers both single-GPU and multi-GPU training under the "Training" heading. Do you mean that "Single-GPU training" should also get its own subheading?
Yes. Keep the previous single-GPU / multi-GPU training subheadings; that is clearer.
OK, fixed.
#
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#
Leave this as it was.
OK, fixed.
@@ -0,0 +1,517 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Where does this model definition come from? Was it moved over from mteb_models_nv.py?
For nv-embed training and testing, the previous version used two code files: modeling_nv.py for training and mteb_models_nv.py for testing.
This update merges the two files, so both training and testing now load the nv-embed weights through modeling_nv.py.
OK. Then where did the logic in mteb_models go? I see that file was deleted as well.
The logic from mteb_models.py was merged into models/modeling.py.
@@ -1,216 +0,0 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Why was this deleted?
The new version uses eval_mteb.py to evaluate models, so the previous evaluation folder was deleted.
The previous evaluation code was for T2Ranking. Why delete it? T2Ranking and MTEB do not conflict.
The evaluation folder has been restored.
@@ -87,21 +79,29 @@ def get_args():
     passage_prefix = ""

     if args.task_name == "QuoraRetrieval":
-        assert args.document_instruction != "document: ", "QuoraRetrieval requires a document instruction"
+        assert args.document_instruction != "document: ", f"QuoraRetrieval requires a document instruction"
There is no replacement field here, so don't use an f-string.
OK, fixed.
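An f-string with no replacement field evaluates to the same value as the plain literal, so the prefix only adds noise (flake8 reports this as F541). The sketch below shows one way to find such literals with the standard `ast` module. The function name `placeholderless_fstring_lines` is hypothetical, and this is an illustration rather than the check the project actually runs.

```python
import ast


def placeholderless_fstring_lines(source):
    """Return sorted line numbers of f-strings that contain no replacement field."""
    tree = ast.parse(source)

    # A format spec such as ">10" in f"{x:>10}" is stored as a nested JoinedStr.
    # Those nested nodes are not standalone f-strings, so remember and skip them.
    spec_ids = {
        id(node.format_spec)
        for node in ast.walk(tree)
        if isinstance(node, ast.FormattedValue) and node.format_spec is not None
    }

    lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.JoinedStr) and id(node) not in spec_ids:
            has_field = any(isinstance(v, ast.FormattedValue) for v in node.values)
            if not has_field:
                lines.append(node.lineno)
    return sorted(lines)


if __name__ == "__main__":
    sample = (
        'msg = f"QuoraRetrieval requires a document instruction"\n'
        'name = "x"\n'
        'ok = f"{name} is fine"\n'
    )
    print(placeholderless_fstring_lines(sample))
```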
lora_config = LoRAConfig.from_pretrained(args.peft_model_name_or_path)
lora_config.merge_weights = True
encode_model = LoRAModel.from_pretrained(
    encode_model, args.peft_model_name_or_path, lora_config=lora_config, dtype="bfloat16"
Is it OK to hardcode the dtype here?
Changed to `dtype=lora_config.dtype`.
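The idea behind this change is to take the dtype from the saved LoRA configuration rather than fixing it in the script. The sketch below illustrates that pattern with a stand-in config class. `StandInLoRAConfig` and `resolve_dtype` are hypothetical names, not PaddleNLP APIs, and the fallback for an unset dtype is an assumption added for illustration, not something the PR does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInLoRAConfig:
    """Hypothetical stand-in for a LoRA config object; not the PaddleNLP class."""

    dtype: Optional[str] = None
    merge_weights: bool = False


def resolve_dtype(config, fallback="float32"):
    """Prefer the dtype recorded in the config; use the fallback only if it is unset."""
    return config.dtype if config.dtype is not None else fallback


if __name__ == "__main__":
    saved = StandInLoRAConfig(dtype="bfloat16")
    saved.merge_weights = True
    # The dtype follows whatever the adapter was saved with, not a literal in the script.
    print(resolve_dtype(saved))
```

Reading the value from the config keeps the evaluation script consistent with however the adapter was trained, at the cost of having to decide what to do when the field is missing.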
@@ -1,11 +1,11 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
This should still go in the evaluation folder.
The new version had already deleted the evaluation folder. Do you mean we should keep the evaluation folder and put eval_mteb.py in it?
> This should still go in the evaluation folder.
OK, moved it there.
LGTM
Description
add llama and nv-embed training