
pretrainedModel add gconfig #6915

Merged: 5 commits into PaddlePaddle:develop on Sep 5, 2023

Conversation

wtmlon (Collaborator) commented Sep 4, 2023

PR types

PR changes

Description

Directory structure reorganized; PretrainedModel adapted to gconfig.

paddle-bot bot commented Sep 4, 2023

Thanks for your contribution!

codecov bot commented Sep 4, 2023

Codecov Report

Merging #6915 (6a605ba) into develop (2f3eac3) will decrease coverage by 0.06%.
Report is 15 commits behind head on develop.
The diff coverage is 67.34%.

@@             Coverage Diff             @@
##           develop    #6915      +/-   ##
===========================================
- Coverage    59.92%   59.87%   -0.06%     
===========================================
  Files          547      552       +5     
  Lines        81009    81452     +443     
===========================================
+ Hits         48546    48770     +224     
- Misses       32463    32682     +219     
Files Changed                                             Coverage Δ
...enlp/experimental/transformers/generation_utils.py      0.00% <0.00%>   (ø)
paddlenlp/transformers/utils.py                            61.51% <17.77%>  (-5.63%) ⬇️
paddlenlp/generation/streamers.py                          25.00% <25.00%>  (ø)
paddlenlp/peft/prefix/prefix_model.py                      60.17% <50.00%>  (-0.28%) ⬇️
paddlenlp/generation/utils.py                              67.51% <72.91%>  (ø)
paddlenlp/generation/logits_process.py                     73.12% <73.12%>  (ø)
paddlenlp/generation/configuration_utils.py                81.06% <81.06%>  (ø)
paddlenlp/generation/stopping_criteria.py                  81.08% <81.08%>  (ø)
paddlenlp/transformers/model_utils.py                      70.93% <90.00%>  (+0.18%) ⬆️
paddlenlp/generation/__init__.py                          100.00% <100.00%> (ø)
... and 3 more

... and 4 files with indirect coverage changes

llm/predictor.py Outdated
@@ -41,6 +41,7 @@
PretrainedModel,
PretrainedTokenizer,
)
from paddlenlp.transformers.generation_utils import GenerationConfig
Collaborator: Prefer importing from the paddlenlp.generation path; paddlenlp.transformers.generation_utils will be gradually deprecated going forward.
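
A minimal sketch of the suggested change (GenerationConfig is re-exported from the paddlenlp.generation package, per the compatibility shim quoted below):

# preferred: the new canonical import path
from paddlenlp.generation import GenerationConfig

# legacy path, to be phased out
# from paddlenlp.transformers.generation_utils import GenerationConfig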

Comment on lines 15 to 19
from paddlenlp.generation.configuration_utils import * # noqa: F401, F403
from paddlenlp.generation.logits_process import * # noqa: F401, F403
from paddlenlp.generation.stopping_criteria import * # noqa: F401, F403
from paddlenlp.generation.streamers import * # noqa: F401, F403
from paddlenlp.generation.utils import * # noqa: F401, F403
Collaborator: Could we just deprecate transformers/generation_utils outright and import everything from transformers/generation going forward?
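
One way to retire the legacy module while keeping old imports working during a transition period (a sketch only, not what this PR implements): keep the wildcard re-exports but emit a DeprecationWarning when the module is imported.

# hypothetical body for paddlenlp/transformers/generation_utils.py
import warnings

# warn once at import time, pointing callers at the new package
warnings.warn(
    "paddlenlp.transformers.generation_utils is deprecated; "
    "import from paddlenlp.generation instead.",
    DeprecationWarning,
    stacklevel=2,
)

from paddlenlp.generation.configuration_utils import *  # noqa: F401, F403
from paddlenlp.generation.logits_process import *  # noqa: F401, F403
from paddlenlp.generation.stopping_criteria import *  # noqa: F401, F403
from paddlenlp.generation.streamers import *  # noqa: F401, F403
from paddlenlp.generation.utils import *  # noqa: F401, F403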

llm/utils.py Outdated
@@ -26,6 +26,7 @@
from sklearn.metrics import accuracy_score

from paddlenlp.datasets import InTokensIterableDataset
from paddlenlp.generation import GenerationConfig
Collaborator: No rush to change this one; it needs to stay backward compatible for a while, so keep the original API.

llm/utils.py Outdated
Comment on lines 204 to 213
generation_config=GenerationConfig(
max_new_token=self.data_args.tgt_length,
decode_strategy="sampling",
top_k=self.gen_args.top_k,
top_p=self.gen_args.top_p,
bos_token_id=self.tokenizer.bos_token_id,
eos_token_id=self.tokenizer.eos_token_id,
pad_token_id=self.tokenizer.pad_token_id,
use_cache=True,
),
Collaborator: Keep the original API.

Comment on lines 906 to 910
logger.warning("`max_length` will be deprecated in future, use" " `max_new_token` instead.")
generation_config.max_new_token = generation_config.max_length

if generation_config.min_length != 0 and generation_config.min_new_token == 0:
logger.warning("`min_length` will be deprecated in future, use" " `min_new_token` instead.")
Collaborator suggested change:

- logger.warning("`max_length` will be deprecated in future, use" " `max_new_token` instead.")
- generation_config.max_new_token = generation_config.max_length
- if generation_config.min_length != 0 and generation_config.min_new_token == 0:
-     logger.warning("`min_length` will be deprecated in future, use" " `min_new_token` instead.")
+ logger.warning("`max_length` will be deprecated in future releases, use `max_new_token` instead.")
+ generation_config.max_new_token = generation_config.max_length
+ if generation_config.min_length != 0 and generation_config.min_new_token == 0:
+     logger.warning("`min_length` will be deprecated in future releases, use `min_new_token` instead.")
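
The two versions behave identically at runtime, since Python concatenates adjacent string literals; the suggestion merges each message into a single literal for readability and says "future releases" rather than "future".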

@@ -46,6 +46,7 @@
from paddle.utils.download import is_url as is_remote_url
from tqdm.auto import tqdm

from paddlenlp.generation import GenerationConfig, GenerationMixin
Collaborator: Replace all of these with relative imports.
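
For instance, assuming this hunk is in a module directly under paddlenlp/transformers/ (the coverage report lists paddlenlp/transformers/model_utils.py as changed), the relative form of the import above would be:

from ..generation import GenerationConfig, GenerationMixin  # relative import within the paddlenlp package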

sijunhe merged commit e183825 into PaddlePaddle:develop on Sep 5, 2023