
[Auto Parallel]: Support std::vector<phi::Tensor> input and output for DistTensor. #56602

Merged: 10 commits merged into PaddlePaddle:develop on Sep 5, 2023

Conversation

@GhostScreaming (Contributor) commented on Aug 23, 2023

PR types: Others
PR changes: Others

Description

Pcard-73145

Support std::vector<phi::Tensor> inputs and outputs for DistTensor. The no_need_buffer type is also supported. The forward and backward passes of concat, broadcast_tensors, and unbind are verified. The following operators still need to be supported later (along with their backward ops, if any): check_finite_and_unscale, coalesce_tensor, meshgrid, update_loss_scaling, einsum.

Forward operators whose outputs are std::tuple<...> are not supported yet.

Add support for DistTensor type of std::vector<phi::Tensor> as input or output of operators.

The following test cases pass (see the sketch after the list):
1. concat: std::vector<phi::Tensor> -> phi::Tensor
2. unbind: phi::Tensor -> std::vector<phi::Tensor>
3. broadcast_tensors: std::vector<phi::Tensor> -> std::vector<phi::Tensor>
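
As a rough illustration of the three shapes above, here is a minimal C++ sketch. It assumes the generated paddle::experimental C++ API follows the ops' YAML definitions (concat(x, axis), unbind(input, axis), broadcast_tensors(input)) and that the aggregate header lives at paddle/phi/api/include/api.h; both may differ between Paddle versions.

#include <vector>
#include "paddle/phi/api/include/api.h"  // assumed location of the generated C++ API

using paddle::experimental::Tensor;

// Sketch only: exercises the three input/output shapes listed above.
void VectorTensorShapes(const Tensor& a, const Tensor& b) {
  // 1. concat: std::vector<phi::Tensor> -> phi::Tensor
  Tensor merged = paddle::experimental::concat({a, b}, /*axis=*/0);

  // 2. unbind: phi::Tensor -> std::vector<phi::Tensor>
  std::vector<Tensor> pieces = paddle::experimental::unbind(merged, /*axis=*/0);

  // 3. broadcast_tensors: std::vector<phi::Tensor> -> std::vector<phi::Tensor>
  std::vector<Tensor> outs = paddle::experimental::broadcast_tensors({a, b});
}

With the changes in this PR, the same calls are expected to work when the inputs hold DistTensor impls instead of DenseTensor impls.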
@GhostScreaming changed the title from "[WIP] Support std::vector<phi::Tensor> input and output for DistTensor." to "Support std::vector<phi::Tensor> input and output for DistTensor." on Sep 4, 2023
@GhostScreaming changed the title from "Support std::vector<phi::Tensor> input and output for DistTensor." to "[Auto Parallel]: Support std::vector<phi::Tensor> input and output for DistTensor." on Sep 4, 2023
paddle/phi/api/yaml/generator/dist_api_gen.py (resolved review thread, outdated)
phi::distributed::DistTensor* dist_tensor =
    static_cast<phi::distributed::DistTensor*>(tensor.impl().get());
intermidiate_tensor_.set_impl(
    std::make_shared<phi::distributed::DistTensor>(
Contributor: Constructing a new tensor here will trigger re-sharding; we still need to discuss this. If it passes this way, it is also fine to write it like this for now.

Contributor (Author): It has been verified on concat and there is no problem so far. How about adding a TODO first?

Contributor: No problem, I will handle it later.

paddle/phi/api/yaml/generator/dist_api_gen.py (6 resolved review threads)
paddle/phi/api/yaml/generator/dist_bw_api_gen.py (1 resolved review thread)
@LiYuRio (Contributor) left a comment: The minor issues can be fixed in the next PR.

    const TransformFlag& transform_flag,
    bool is_stride_kernel) {
  std::vector<std::shared_ptr<phi::distributed::DistTensor>> out;
  for (auto x : input) {
Contributor: This can be changed to auto& to reduce copies.
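
For reference, a minimal generic C++ sketch of this suggestion (not the actual Paddle code): iterating by value copies each element, here a shared_ptr with an atomic reference-count update per iteration, while const auto& binds a reference and avoids the copy.

#include <memory>
#include <vector>

void Iterate(const std::vector<std::shared_ptr<int>>& input) {
  for (auto x : input) {         // copies each shared_ptr (refcount bump per element)
    (void)x;
  }
  for (const auto& x : input) {  // binds a const reference, no copy
    (void)x;
  }
}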

        dense_tensor.meta().is_contiguous()))) {
  out.push_back(
      std::static_pointer_cast<phi::distributed::DistTensor>(tensor_in));
  continue;
Contributor: In theory this could be written in an else branch instead of using continue.
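
A generic sketch of the suggested restructure, with hypothetical stand-ins (Item, MakeContiguous) for the real Paddle types and checks; only the control flow is the point, an if/else pair instead of push_back followed by continue.

#include <vector>

struct Item { bool contiguous = true; };  // hypothetical placeholder for the tensor type
Item MakeContiguous(const Item& /*item*/) { return Item{true}; }  // hypothetical transform

std::vector<Item> Collect(const std::vector<Item>& inputs) {
  std::vector<Item> out;
  for (const auto& item : inputs) {
    if (item.contiguous) {
      out.push_back(item);                  // reuse as-is (previously: push_back + continue)
    } else {
      out.push_back(MakeContiguous(item));  // transform, then collect
    }
  }
  return out;
}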

@chenwhql chenwhql merged commit d2fedea into PaddlePaddle:develop Sep 5, 2023
BeingGod pushed a commit to BeingGod/Paddle that referenced this pull request Sep 9, 2023
[Auto Parallel]: Support std::vector<phi::Tensor> input and output for DistTensor. (PaddlePaddle#56602)

* [WIP] Support std::vector<phi::Tensor> input and output for DistTensor.
Concat forward and backward are verified.

* Polish code for new dist tensor implementation.

* Fix bug of DistTensor upgrade. Add support functions for std::vector<Tensor> -> std::vector<Tensor>.

* Add support for DistTensor type of std::vector<phi::Tensor> as input or output of operators.
The following test cases pass.
1. concat: std::vector<phi::Tensor> -> phi::Tensor
2. unbind: phi::Tensor -> std::vector<phi::Tensor>
3. broadcast_tensors: std::vector<phi::Tensor> -> std::vector<phi::Tensor>

* Polish code. Remove useless comments.

* Add update_loss_scaling in skip_op_lists.

* Polish code.
3 participants