[AutoNLP]add predict #4967

Merged
merged 3 commits into from
Feb 24, 2023

Conversation

Contributor

@lugimzzz lugimzzz commented Feb 23, 2023

PR types

New features

PR changes

APIs

Description

Add a predict function.

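  • Rewrite _override_hp() so that hyperparameter keys no longer carry an artificial prefix such as TrainingArguments. Previously, when both a prompt model and a fine-tuning model were present, override_hp had to define both "TrainingArguments.max_steps": 5 and "PromptTuningArguments.max_steps".
  • Add a predict function and modify the evaluate function so that both align with trainer.
  • The prompttrainer evaluate function was not aligned with trainer, so get_eval_dataloader has been overridden.
    - The data preprocessing function has been factored out on its own.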
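
To illustrate the first bullet, here is a minimal sketch of prefix-free overrides. It is not the code from this PR: `TrainingArgs`, `PromptTuningArgs`, `soft_prompt_len`, and `apply_overrides` are hypothetical stand-ins, and the real `_override_hp()` may work differently. The idea is only that one flat `override_hp` dict can be applied to whichever argument class declares a matching field, so a key such as `max_steps` is written once.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class TrainingArgs:
    # Hypothetical stand-in for a fine-tuning argument class.
    max_steps: int = 100
    learning_rate: float = 5e-5


@dataclass
class PromptTuningArgs(TrainingArgs):
    # Hypothetical stand-in for a prompt-tuning argument class;
    # soft_prompt_len is an invented field used only for illustration.
    soft_prompt_len: int = 10


def apply_overrides(args, override_hp):
    """Return a copy of args with every matching, unprefixed key applied.

    Keys that the argument class does not declare are skipped, so the same
    flat dict can be reused for both kinds of model.
    """
    known = {f.name for f in fields(args)}
    matching = {k: v for k, v in override_hp.items() if k in known}
    return replace(args, **matching)


override_hp = {"max_steps": 5, "soft_prompt_len": 20}
finetune_args = apply_overrides(TrainingArgs(), override_hp)
prompt_args = apply_overrides(PromptTuningArgs(), override_hp)
```

With this shape, `max_steps` is set to 5 for both argument objects from a single entry, and the prompt-only key is ignored for the fine-tuning arguments.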

@paddle-bot
paddle-bot bot commented Feb 23, 2023

Thanks for your contribution!

@codecov
codecov bot commented Feb 23, 2023

Codecov Report

Merging #4967 (f0da8a1) into develop (f354fe6) will increase coverage by 1.31%.
The diff coverage is 95.91%.

@@             Coverage Diff             @@
##           develop    #4967      +/-   ##
===========================================
+ Coverage    46.32%   47.63%   +1.31%     
===========================================
  Files          448      453       +5     
  Lines        64694    65455     +761     
===========================================
+ Hits         29967    31177    +1210     
+ Misses       34727    34278     -449     
Impacted Files                                         Coverage Δ
...addlenlp/experimental/autonlp/auto_trainer_base.py  89.60% <83.33%> (-0.57%) ⬇️
...dlenlp/experimental/autonlp/text_classification.py  97.09% <96.55%> (-0.12%) ⬇️
paddlenlp/prompt/prompt_trainer.py                     68.78% <100.00%> (+2.11%) ⬆️
paddlenlp/transformers/chineseclip/modeling.py         82.94% <0.00%> (-2.54%) ⬇️
paddlenlp/transformers/ernie_vil/modeling.py           76.36% <0.00%> (-0.94%) ⬇️
paddlenlp/transformers/bert/modeling.py                89.71% <0.00%> (-0.58%) ⬇️
paddlenlp/utils/doc_parser.py                          11.26% <0.00%> (-0.04%) ⬇️
paddlenlp/transformers/__init__.py                     100.00% <0.00%> (ø)
paddlenlp/transformers/auto/modeling.py                82.74% <0.00%> (ø)
paddlenlp/transformers/auto/tokenizer.py               84.17% <0.00%> (ø)
... and 18 more


@@ -152,6 +153,11 @@ def get_test_dataloader(self, test_dataset):
        test_dataset = self._map_dataset(test_dataset)
        return super(PromptTrainer, self).get_test_dataloader(test_dataset)

    def get_eval_dataloader(self, eval_dataset: Optional[Dataset] = None) -> DataLoader:
Contributor Author

@LemonNoel could you review the prompt changes and check whether there are any problems?

Contributor

Calling evaluate multiple times in the same program may cause problems; the do_eval=True case needs to be verified.

Contributor Author

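I verified this by running it and by checking the trainer's internal code, and there is no problem. For now, prompt trainer does not support eval_dataset being a dictionary (that is, passing in multiple eval_datasets), so the code is not affected at the moment. If dictionary eval_dataset support is added later, the logic in get_eval_dataloader will need to be updated to match.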
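
The pattern discussed in this thread can be sketched roughly as follows. This is a simplified, self-contained illustration under stated assumptions, not the code merged in this PR: `BaseTrainer` and `PromptLikeTrainer` are hypothetical stand-ins, a plain list stands in for a `DataLoader`, and the explicit dictionary check is an assumption added to make the unsupported case visible.

```python
class BaseTrainer:
    """Hypothetical stand-in for a generic trainer."""

    def __init__(self, eval_dataset=None):
        self.eval_dataset = eval_dataset

    def get_eval_dataloader(self, eval_dataset=None):
        dataset = eval_dataset if eval_dataset is not None else self.eval_dataset
        if dataset is None:
            raise ValueError("evaluation requires an eval_dataset")
        return list(dataset)  # a list stands in for a real DataLoader


class PromptLikeTrainer(BaseTrainer):
    """Hypothetical stand-in for a prompt trainer that preprocesses its data."""

    def _map_dataset(self, dataset):
        # Preprocessing kept in its own method; it builds a new list and
        # leaves the stored dataset untouched.
        return [{"text": example, "mapped": True} for example in dataset]

    def get_eval_dataloader(self, eval_dataset=None):
        if eval_dataset is None:
            eval_dataset = self.eval_dataset
        if eval_dataset is None:
            raise ValueError("evaluation requires an eval_dataset")
        if isinstance(eval_dataset, dict):
            # Assumption: several named eval datasets are rejected outright.
            raise NotImplementedError("dict eval_dataset is not supported")
        return super().get_eval_dataloader(self._map_dataset(eval_dataset))


trainer = PromptLikeTrainer(eval_dataset=["a", "b"])
first = trainer.get_eval_dataloader()
second = trainer.get_eval_dataloader()  # a repeated call maps the raw data again
```

Because the mapped data is never written back to `self.eval_dataset`, calling the method twice in one program yields the same result each time in this sketch.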

Collaborator

@sijunhe sijunhe left a comment

lgtm!

@sijunhe sijunhe merged commit f38a255 into PaddlePaddle:develop Feb 24, 2023
@lugimzzz lugimzzz deleted the PREDICT branch February 24, 2023 07:24