add llama and nv-embed training #9323

Merged: 9 commits, Dec 16, 2024

Contributor

@Li-Z-Q Li-Z-Q commented Oct 28, 2024

Description

add llama and nv-embed training

paddle-bot bot commented Oct 28, 2024

Thanks for your contribution!

CLAassistant commented Oct 28, 2024

CLA assistant check
All committers have signed the CLA.

codecov bot commented Oct 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 52.94%. Comparing base (c9d5673) to head (8b14719).
Report is 154 commits behind head on develop.

Current head 8b14719 differs from pull request most recent head b538dea

Please upload reports for the commit b538dea to get more accurate results.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9323      +/-   ##
===========================================
+ Coverage    52.86%   52.94%   +0.08%     
===========================================
  Files          669      676       +7     
  Lines       107240   107919     +679     
===========================================
+ Hits         56688    57134     +446     
- Misses       50552    50785     +233     


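Single-GPU training is inefficient and limits batch_size to small values, so multi-GPU training is recommended. Contrastive learning in particular benefits from a large batch_size and multiple GPUs. An example command:

```
python -m paddle.distributed.launch --gpus "1,2,3,4" train.py --do_train \
```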
Collaborator

Shouldn't `--gpus "1,2,3,4"` start from 0?

Contributor Author

OK, fixed.
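
For reference, the point about zero-based device ids and the README's recommendation of a large batch_size can be sketched as follows. This is a minimal illustration, not code from the PR: the helper names `zero_based_gpu_ids` and `global_batch_size` are hypothetical, the per-device batch size of 8 is an arbitrary example value, and the batch size arithmetic assumes plain data parallelism without gradient accumulation. Only the command prefix visible in the README excerpt is reproduced; the remaining arguments are not shown in this thread.

```python
def zero_based_gpu_ids(num_gpus: int) -> str:
    """Return a comma-separated, zero-based device list such as "0,1,2,3"."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ",".join(str(i) for i in range(num_gpus))


def global_batch_size(per_device_batch_size: int, num_gpus: int) -> int:
    """Effective batch size under plain data parallelism (assumption: no accumulation)."""
    return per_device_batch_size * num_gpus


gpus = zero_based_gpu_ids(4)
# Only the prefix of the README command; its remaining arguments are elided here.
launch_prefix = f'python -m paddle.distributed.launch --gpus "{gpus}" train.py --do_train'
print(launch_prefix)
print(global_batch_size(per_device_batch_size=8, num_gpus=4))
```

Whether in-batch negatives are actually shared across devices depends on the training script, so the second function only describes the total number of samples processed per step.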

Comment on lines 26 to 23
### Single-GPU training

## Training
Collaborator

Keep the single-GPU training heading.

Contributor Author

The current text covers both single-GPU and multi-GPU training under the "Training" heading. Do you mean that "Single-GPU training" should also get its own subheading?

Collaborator

Yes. Keep the previous single-GPU training / multi-GPU training subheadings; that is clearer.

Contributor Author

OK, fixed.

Comment on lines 2 to 8
#
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#
Collaborator

Leave this as it was.

Contributor Author

OK, fixed.

@@ -0,0 +1,517 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Collaborator

Where does this model definition come from? Was it moved over from mteb_models_nv.py?

Contributor Author

For nv-embed training and evaluation, the previous version of the code used two files: modeling_nv.py for training and mteb_models_nv.py for evaluation.
This update merges the two files, so both training and evaluation now load the nv-embed weights through modeling_nv.py.

Collaborator

OK. Then where did the logic in mteb_models go? I see that file was deleted as well.

Contributor Author

The logic from mteb_models.py was merged into models/modeling.py.

@@ -1,216 +0,0 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

Why was this deleted?

Contributor Author

The new version of the code evaluates models with eval_mteb.py, so the previous evaluation folder was removed.

Collaborator

The previous evaluation code was for T2Ranking. Why delete it? T2Ranking and MTEB do not conflict.

Contributor Author

The evaluation folder has been restored.

@@ -87,21 +79,29 @@ def get_args():
passage_prefix = ""

if args.task_name == "QuoraRetrieval":
- assert args.document_instruction != "document: ", "QuoraRetrieval requires a document instruction"
+ assert args.document_instruction != "document: ", f"QuoraRetrieval requires a document instruction"
Collaborator

There are no replacement fields here, so don't use an f-string.

Contributor Author

OK, fixed.
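
To illustrate the review point: an f-string without replacement fields evaluates to exactly the same string as a plain literal, and pyflakes/flake8 reports it as F541 ("f-string is missing placeholders"). The sketch below is illustrative only; the function `check_document_instruction` is a hypothetical wrapper around the assertion quoted in the diff, not code from the PR.

```python
plain = "QuoraRetrieval requires a document instruction"
prefixed = f"QuoraRetrieval requires a document instruction"  # flagged by flake8 as F541
assert plain == prefixed  # the f prefix changes nothing without replacement fields


def check_document_instruction(task_name: str, document_instruction: str) -> None:
    """Hypothetical wrapper around the assertion from the diff, using a plain string."""
    if task_name == "QuoraRetrieval":
        assert document_instruction != "document: ", "QuoraRetrieval requires a document instruction"


def describe_requirement(task_name: str) -> str:
    """An f-string is warranted once a value is actually interpolated."""
    return f"{task_name} requires a document instruction"


print(describe_requirement("QuoraRetrieval"))
```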

lora_config = LoRAConfig.from_pretrained(args.peft_model_name_or_path)
lora_config.merge_weights = True
encode_model = LoRAModel.from_pretrained(
encode_model, args.peft_model_name_or_path, lora_config=lora_config, dtype="bfloat16"
Collaborator

Is it OK to hardcode the dtype here?

Contributor Author

Changed to dtype=lora_config.dtype.
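
The idea behind this change can be sketched without PaddleNLP: take the dtype from the loaded config object rather than from a string literal, so the loading code follows whatever the adapter was saved with. In the sketch below, `StubLoRAConfig` and `build_load_kwargs` are hypothetical stand-ins written for illustration; they are not the PaddleNLP `LoRAConfig` or `LoRAModel` APIs, and the default dtype is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass
class StubLoRAConfig:
    """Hypothetical stand-in for a loaded adapter config (not the PaddleNLP class)."""
    dtype: str = "float16"  # arbitrary example default
    merge_weights: bool = False


def build_load_kwargs(config: StubLoRAConfig) -> dict:
    """Build loader keyword arguments from the config instead of a hardcoded literal."""
    config.merge_weights = True
    # The dtype follows the config, so a config saved with another dtype still works.
    return {"lora_config": config, "dtype": config.dtype}


kwargs = build_load_kwargs(StubLoRAConfig(dtype="bfloat16"))
print(kwargs["dtype"])
```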

@@ -1,11 +1,11 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Collaborator

This should still go in the evaluation folder.

Contributor Author

The evaluation folder has already been removed in the new version of the code. Do you mean that the evaluation folder should be kept and eval_mteb.py placed inside it?

Contributor Author

> This should still go in the evaluation folder.

OK, moved it there.

Collaborator

DrownFish19 left a comment

LGTM

@DrownFish19 DrownFish19 merged commit c8aa7bf into PaddlePaddle:develop Dec 16, 2024
11 of 12 checks passed
5 participants