Download refactor #8020
Merged
44 commits
66744bb
download
LOVE-YOURSELF-1 40b27c4
modified file
LOVE-YOURSELF-1 68b5f8c
modified from_pretrained
LOVE-YOURSELF-1 e342983
modified config
LOVE-YOURSELF-1 fcc392b
modified download
LOVE-YOURSELF-1 3aa76ab
test_tokenizer
LOVE-YOURSELF-1 d6dfcf0
Delete tests/transformers/from_pretrained/run.sh
LOVE-YOURSELF-1 0705617
Update test_tokenizer.py
LOVE-YOURSELF-1 f9c5af7
Update tokenizer_utils_base.py
LOVE-YOURSELF-1 275e52b
test_model
LOVE-YOURSELF-1 76cd0da
test_model
LOVE-YOURSELF-1 9bdc94e
test_model
LOVE-YOURSELF-1 df82769
Remove comments
LOVE-YOURSELF-1 5148bc6
Remove comments
LOVE-YOURSELF-1 6a0085b
add requirements
LOVE-YOURSELF-1 7006332
update bos download
JunnYu 620aacc
Update test_model.py
LOVE-YOURSELF-1 ae6169f
clear unused import
LOVE-YOURSELF-1 7268671
modified bug tokenizer_utils_base.py
LOVE-YOURSELF-1 fe24034
change safetensors
LOVE-YOURSELF-1 85f37cb
modified load generation config
LOVE-YOURSELF-1 37b3c25
add requestion
LOVE-YOURSELF-1 d8c552d
Update
JunnYu c22851a
modified error
LOVE-YOURSELF-1 e392644
fix bug
LOVE-YOURSELF-1 40842fd
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 b44f8ed
add \n
JunnYu a18ca41
Update __init__.py
LOVE-YOURSELF-1 03d5047
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 6bb0544
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 0364a65
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 b60d218
add requestion
LOVE-YOURSELF-1 850796f
modified download
LOVE-YOURSELF-1 8ce5dfe
Retest
LOVE-YOURSELF-1 af7bb9d
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 3109368
Update test_tokenizer.py
LOVE-YOURSELF-1 d25e6cd
Update requirements-dev.txt
LOVE-YOURSELF-1 ee497e5
Update requirements.txt
LOVE-YOURSELF-1 ed4d372
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 d829bc5
delete from_pretrained
LOVE-YOURSELF-1 eb06571
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 793784f
make superior
LOVE-YOURSELF-1 286b80a
Merge branch 'PaddlePaddle:develop' into download
LOVE-YOURSELF-1 119c648
Update run_pretrain_trainer.py
LOVE-YOURSELF-1
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,20 +20,11 @@ | |
from collections import defaultdict | ||
from typing import Dict, List, Type | ||
|
||
from huggingface_hub import hf_hub_download | ||
|
||
from ... import __version__ | ||
from ...utils.downloader import ( | ||
COMMUNITY_MODEL_PREFIX, | ||
get_path_from_url_with_filelock, | ||
url_file_exists, | ||
) | ||
from ...utils.download import resolve_file_path | ||
from ...utils.import_utils import import_module | ||
from ...utils.log import logger | ||
from ..aistudio_utils import aistudio_download | ||
from ..configuration_utils import PretrainedConfig | ||
from ..model_utils import PretrainedModel | ||
from ..utils import resolve_cache_dir | ||
|
||
__all__ = [ | ||
"AutoConfig", | ||
|
@@ -170,13 +161,6 @@ | |
config = AutoConfig.from_pretrained("bert-base-uncased") | ||
config.save_pretrained('./bert-base-uncased') | ||
""" | ||
subfolder = kwargs.get("subfolder", "") | ||
if subfolder is None: | ||
subfolder = "" | ||
from_aistudio = kwargs.pop("from_aistudio", False) | ||
from_hf_hub = kwargs.pop("from_hf_hub", False) | ||
cache_dir = kwargs.pop("cache_dir", None) | ||
cache_dir = resolve_cache_dir(from_hf_hub=from_hf_hub, from_aistudio=from_aistudio, cache_dir=cache_dir) | ||
|
||
if not cls.name2class: | ||
cls.name2class = {} | ||
|
@@ -192,72 +176,33 @@ | |
pretrained_model_name_or_path, *model_args, **kwargs | ||
) | ||
|
||
# From local dir path | ||
elif os.path.isdir(pretrained_model_name_or_path): | ||
config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.config_file) | ||
if not os.path.exists(config_file): | ||
# try to load legacy config file | ||
legacy_config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.legacy_config_file) | ||
if not os.path.exists(legacy_config_file): | ||
raise ValueError( | ||
f"config file<{cls.config_file}> or legacy config file<{cls.legacy_config_file}> not found" | ||
) | ||
subfolder = kwargs.get("subfolder", "") | ||
if subfolder is None: | ||
subfolder = "" | ||
from_aistudio = kwargs.pop("from_aistudio", False) | ||
from_hf_hub = kwargs.pop("from_hf_hub", False) | ||
cache_dir = kwargs.pop("cache_dir", None) | ||
|
||
logger.warning(f"loading legacy config file<{cls.legacy_config_file}> ...") | ||
config_file = legacy_config_file | ||
config_file = resolve_file_path( | ||
pretrained_model_name_or_path, | ||
[cls.config_file, cls.legacy_config_file], | ||
subfolder, | ||
cache_dir=cache_dir, | ||
from_hf_hub=from_hf_hub, | ||
from_aistudio=from_aistudio, | ||
) | ||
|
||
if os.path.exists(config_file): | ||
Review comment: Is the file guaranteed to exist here? If it does not exist, is the error raised inside get_file?
Reply: If the download fails, the error is raised inside get_file. If the repo does not contain the file, get_file returns None, and the error is raised here. |
||
config_class = cls._get_config_class_from_config(pretrained_model_name_or_path, config_file) | ||
logger.info("We are using %s to load '%s'." % (config_class, pretrained_model_name_or_path)) | ||
if config_class is cls: | ||
return cls.from_file(config_file) | ||
return config_class.from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs) | ||
elif from_aistudio: | ||
file = aistudio_download( | ||
repo_id=pretrained_model_name_or_path, | ||
filename=cls.config_file, | ||
subfolder=subfolder, | ||
cache_dir=cache_dir, | ||
) | ||
return cls.from_pretrained(os.path.dirname(file)) | ||
elif from_hf_hub: | ||
file = hf_hub_download( | ||
repo_id=pretrained_model_name_or_path, | ||
filename=cls.config_file, | ||
cache_dir=cache_dir, | ||
subfolder=subfolder, | ||
library_name="PaddleNLP", | ||
library_version=__version__, | ||
) | ||
# from local dir path | ||
return cls.from_pretrained(os.path.dirname(file)) | ||
|
||
# Assuming from community-contributed pretrained models | ||
return config_class.from_pretrained(config_file, *model_args, **kwargs) | ||
else: | ||
url_list = [COMMUNITY_MODEL_PREFIX, pretrained_model_name_or_path, cls.config_file] | ||
legacy_url_list = [COMMUNITY_MODEL_PREFIX, pretrained_model_name_or_path, cls.legacy_config_file] | ||
cache_dir = os.path.join(cache_dir, pretrained_model_name_or_path, subfolder) | ||
if subfolder != "": | ||
url_list.insert(2, subfolder) | ||
legacy_url_list.insert(2, subfolder) | ||
community_config_path = "/".join(url_list) | ||
legacy_community_config_path = "/".join(legacy_url_list) | ||
|
||
if not url_file_exists(community_config_path): | ||
if not url_file_exists(legacy_community_config_path): | ||
raise RuntimeError( | ||
f"Can't load Config for '{pretrained_model_name_or_path}'.\n" | ||
f"Please make sure that '{pretrained_model_name_or_path}' is:\n" | ||
"- a correct model-identifier of built-in pretrained models,\n" | ||
"- or a correct model-identifier of community-contributed pretrained models,\n" | ||
"- or the correct path to a directory containing relevant config files.\n" | ||
) | ||
logger.warning(f"loading legacy config file<{cls.legacy_config_file}> ...") | ||
community_config_path = legacy_community_config_path | ||
|
||
resolved_config_file = get_path_from_url_with_filelock(community_config_path, cache_dir) | ||
config_class = cls._get_config_class_from_config(pretrained_model_name_or_path, resolved_config_file) | ||
logger.info("We are using %s to load '%s'." % (config_class, pretrained_model_name_or_path)) | ||
if config_class is cls: | ||
return cls.from_file(resolved_config_file, **kwargs) | ||
|
||
return config_class.from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs) | ||
raise RuntimeError( | ||
f"Can't load config for '{pretrained_model_name_or_path}'.\n" | ||
f"Please make sure that '{pretrained_model_name_or_path}' is:\n" | ||
"- a correct model-identifier of built-in pretrained models,\n" | ||
"- or a correct model-identifier of community-contributed pretrained models,\n" | ||
"- or the correct path to a directory containing relevant config files.\n" | ||
) |
paddlenlp/experimental/model_utils.py: is this code covered by CI tests?
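No dedicated unit tests were added under the experimental directory, but new unit tests were added under transformers. Adding those unit tests causes CI to fail, although they run normally locally.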
@JunnYu Can CE cover this? The risk is fairly high for inference.
The CE on my side all runs in dynamic graph mode, so it does not touch the experimental part.