move DataLoader code to paddle.io #48699
Your PR was submitted successfully. Thank you for contributing to this open-source project!
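After offline discussion, this PR separates the old DataLoader content from the new DataLoader content and migrates the new DataLoader and its low-level dependencies into 2.0. The old DataLoader and its dependencies cannot be removed yet, because some distributed and quantization code still uses them; those calls must be removed first.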
import numpy as np

from .dataset import IterableDataset
from .sampler import RandomSampler, Sampler, SequenceSampler

__all__ = ["BatchSampler", "DistributedBatchSampler"]
Because of how fluid's public API list is defined, files migrated to 2.0 need the contents of their original __all__ list cleared.
Done. Thanks!
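The `__all__` convention requested in this thread can be illustrated outside Paddle. The sketch below builds a throwaway package in a temporary directory; `demo_io`, its file names, and the `star_import` helper are hypothetical and exist only for this illustration. It assumes the pattern described in the review: each submodule sets `__all__ = []`, and the package `__init__.py` is the one place that declares the public names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout mimicking the reviewed pattern (not Paddle code).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_io"
pkg.mkdir()

# Submodule: defines the class but exports nothing through `import *`.
(pkg / "batch_sampler.py").write_text(
    "__all__ = []\n"
    "class BatchSampler:\n"
    "    pass\n"
)

# Package __init__: the single place where the public API is declared.
(pkg / "__init__.py").write_text(
    "from .batch_sampler import BatchSampler\n"
    "__all__ = ['BatchSampler']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()


def star_import(module_name):
    """Return the names that `from module_name import *` would bind."""
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    return sorted(name for name in namespace if not name.startswith("__"))


package_names = star_import("demo_io")
submodule_names = star_import("demo_io.batch_sampler")

# An empty __all__ only affects star imports; explicit access still works.
explicit = importlib.import_module("demo_io.batch_sampler").BatchSampler

print(package_names)
print(submodule_names)
```

With this layout the star import from the package binds `BatchSampler`, while the star import from the submodule binds nothing, so the public surface is controlled from `__init__.py` alone.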
cmake/third_party.cmake
Outdated
set(WITH_FLASHATTN ON)
endif()
endif()
# if(WITH_GPU
Shouldn't this part be left unmodified?
Done. Thanks!
… mv_dataloader_to_io
from .sampler import RandomSampler
from .sampler import WeightedRandomSampler

__all__ = [
This file also needs __all__ = []; the API is already exposed in paddle/io/__init__.py.
Done, thanks!
python/paddle/io/reader.py
Outdated
# These value is used in getting data from another process
QUEUE_GET_TIMEOUT = 60

__all__ = ['DataLoader', 'default_collate_fn']
__all__ needs to be emptied here as well.
Done, thanks!
LGTM for setup.py
LGTM
LGTM
LGTM for docs
PR types: Others
PR changes: Others
Description
move DataLoader code to paddle.io
paddle.io.DataLoader code to paddle.io.reader
fluid/dataloader directory to paddle.io
multiprocess_utils.py to paddle.io
NOTE: code for DataLoader.from_generator and DataLoader.from_dataset remains under fluid.reader, because these APIs have been deprecated since Paddle 2.0.
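From a user's point of view, the move described above should not change the public import path, since the review notes that the API is exposed in paddle/io/__init__.py. The following is a minimal sketch of how one might check where the public names resolve in an installed Paddle build. The `PUBLIC_NAMES` list and the `resolve_public_api` helper are assumptions made for illustration, and the defining module reported for each name depends on the installed version.

```python
import importlib.util

# Names assumed to be re-exported by paddle/io/__init__.py.
PUBLIC_NAMES = ["DataLoader", "Dataset", "BatchSampler"]


def resolve_public_api():
    """Map each public name to its defining module, or return None without Paddle."""
    if importlib.util.find_spec("paddle") is None:
        return None
    import paddle.io

    return {name: getattr(paddle.io, name).__module__ for name in PUBLIC_NAMES}


locations = resolve_public_api()
if locations is None:
    print("Paddle is not installed; nothing to resolve.")
else:
    for name, module in locations.items():
        print(f"paddle.io.{name} is defined in {module}")
```

User code that writes `from paddle.io import DataLoader` keeps working regardless of which internal module holds the implementation.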