
[Pten] Add reduce mean kernel, replace with mean API #37559

Merged · 24 commits · Nov 29, 2021

Conversation

@MingMingShangTian (Contributor) commented Nov 25, 2021

PR types

New features

PR changes

Others

Describe

  • Add the reduce_sum kernel for pten
  • Add the reduce_mean kernel for pten
  • Replace the original Mean API implementation with the reduce_mean kernel, and revert mean_op to its original implementation

@paddle-bot-old:
Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.


// call new kernel
pten::Mean<T>(dev_ctx, *pt_x.get(), pt_out.get());
y.device(place) = X.mean();
Contributor:
The logic in mean.cu also needs to be restored.

Contributor:
Also check that the kernel registration is restored to match the original.

Contributor (Author):
Done, thx

@@ -0,0 +1,962 @@
// Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
Contributor:
Naming the file with an op.cu.h suffix is not recommended.

Contributor (Author):
Renamed to reduce_cuda_impl.h.

Comment on lines +24 to +26
PD_DLL_DECL Tensor mean(const Tensor& x,
const std::vector<int64_t>& axis,
bool keep_dim);
Contributor:
Should we also provide an API overload with default argument values?

Contributor (Author):
The default-value overload will be added in a follow-up PR.

Comment on lines +36 to +40
PD_DLL_DECL Tensor sum(const Tensor& x,
const std::vector<int64_t>& axis,
DataType dtype,
bool keep_dim);

Contributor:
Same as above.

Contributor (Author):
The default-value overload will be added in a follow-up PR.

kernel_context.EmplaceBackAttr(dense_x->dtype());
kernel_context.EmplaceBackAttr(out_dtype);

// 4. InferShape
Contributor:
This comment should read InferMeta rather than InferShape.

Contributor (Author):
Done

Comment on lines +69 to +72
if (x.dtype() == pten::DataType::BOOL || x.dtype() == pten::DataType::INT32 ||
x.dtype() == pten::DataType::INT64) {
dtype = pten::DataType::INT64;
}
Contributor:
There is similar logic in InferMeta; could this logic live only in InferMeta or only in the kernel? If code generation is introduced later, this kind of special case may need its own configuration, which adds complexity to the configuration options.

Contributor (Author):
This will be optimized in a follow-up PR.

const std::vector<int64_t>& axis,
bool keep_dim) {
bool reduce_all = true;
std::set<int64_t> dims_set(axis.begin(), axis.end());
Contributor:
Why use set instead of unordered_set?

Contributor (Author):
The original kernel used set; unordered_set was not used in order to stay consistent.

void Reduce(const CUDAContext& dev_ctx,
const DenseTensor& x,
bool reduce_all,
std::vector<int64_t> dims,
Contributor:
Why not const std::vector<int64_t>& or DDim?

Contributor (Author):
Changed to const std::vector<int64_t>&.

// eg: x_dim = [2, 4, 6] origin_reduce_dims = [0, 1]
// --SetReduceDim--> x_dim = [8,6], reduce_dim = [0], left_dim = [1]
void SetReduceDim() {
std::set<int64_t> reduce_set;
Contributor:
Why not unordered_set?

Contributor (Author):
The original kernel used set; unordered_set was not used in order to stay consistent.

}

std::vector<int64_t> reduce_dim_temp(reduce_set.begin(), reduce_set.end());
std::sort(reduce_dim_temp.begin(), reduce_dim_temp.end());
Contributor:
Since a set is already used, is the sort still needed here?

Contributor (Author):
It should be removable; will confirm with the original author and submit a follow-up PR to fix it.

void Reduce(const DeviceContext& dev_ctx,
const DenseTensor& x,
bool reduce_all,
std::vector<int64_t> dims,
Contributor:
Why not const &?

Contributor (Author):
Done, thx

Comment on lines +60 to +61
pten::DenseTensor* out,
const std::vector<int64_t>& axis) {
Contributor:
Should out be placed last in the parameter list?

Contributor (Author):
The parameter order of this function is kept consistent with the original definition.

@MingMingShangTian MingMingShangTian changed the title Add reduce mean kernel, replace with mean API [Pten] Add reduce mean kernel, replace with mean API Nov 29, 2021
@chenwhql (Contributor) left a comment:
LGTM

@MingMingShangTian MingMingShangTian merged commit f9e9fd1 into PaddlePaddle:develop Nov 29, 2021
@MingMingShangTian MingMingShangTian deleted the pten_mean branch November 29, 2021 11:24
Zjq9409 pushed a commit to Zjq9409/Paddle that referenced this pull request Dec 10, 2021
…7559)

* add pten reduce kernel

* add reduce_sum kernel

* update attribute args and order

* make out dtype undefined

* fix empty input error

* merge develop branch

* rename sum as reduce function

* rename sum as reduce function

* fix reducekernelImpl args error

* add reduce cuda kernel

* modify dims type to const &

* remove unsed log

* fix reduce_all out eigen function error

* remove unused codes

* add the missing sum api define and testcase

* merge develop branch

* fix sum test axis value error

* replace pten mean kernel with reduce_mean

* revcover meam cuda to original implement