[Pten] Add reduce mean kernel, replace with mean API #37559
Conversation
Thanks for your contribution!
// call new kernel
pten::Mean<T>(dev_ctx, *pt_x.get(), pt_out.get());
y.device(place) = X.mean();
The logic in mean.cu also needs to be restored.
Also check that the kernel registration is restored to match the original form.
Done, thanks.
@@ -0,0 +1,962 @@
// Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
The file name should not carry the op.cu.h suffix.
Renamed to reduce_cuda_impl.h.
PD_DLL_DECL Tensor mean(const Tensor& x,
                        const std::vector<int64_t>& axis,
                        bool keep_dim);
Should we also provide an API entry point with default argument values?
The interface with default values will be added in a follow-up PR.
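For reference only, a minimal self-contained sketch of what such a default-value entry point could look like: a thin overload forwarding to the full signature. The Tensor stand-in, the forwarding overload, and the chosen defaults (empty axis meaning "reduce all dims", keep_dim = false) are assumptions for illustration, not the interface planned in the follow-up PR.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the real paddle Tensor, just so this sketch compiles on its own.
struct Tensor {};

// Full-signature API, as declared in the diff above.
Tensor mean(const Tensor& x, const std::vector<int64_t>& axis, bool keep_dim) {
  // ... the real implementation dispatches to the pten kernel ...
  (void)axis;
  (void)keep_dim;
  return x;
}

// Hypothetical defaulted entry point: forwards with an empty axis (assumed to
// mean "reduce over all dimensions") and keep_dim = false.
Tensor mean(const Tensor& x) { return mean(x, /*axis=*/{}, /*keep_dim=*/false); }

int main() {
  Tensor t;
  Tensor out = mean(t);  // calls the defaulted overload
  (void)out;
  return 0;
}
```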
PD_DLL_DECL Tensor sum(const Tensor& x,
                       const std::vector<int64_t>& axis,
                       DataType dtype,
                       bool keep_dim);
Same as above.
The interface with default values will be added in a follow-up PR.
paddle/pten/api/lib/math.cc (outdated)
kernel_context.EmplaceBackAttr(dense_x->dtype());
kernel_context.EmplaceBackAttr(out_dtype);

// 4. InferShape
This comment should say InferMeta.
Done
if (x.dtype() == pten::DataType::BOOL || x.dtype() == pten::DataType::INT32 ||
    x.dtype() == pten::DataType::INT64) {
  dtype = pten::DataType::INT64;
}
There is similar logic in InferMeta. Could this logic be handled only in InferMeta or in the kernel? If this code is auto-generated later, cases like this would need separate configuration, which increases the complexity of the configuration options.
This will be optimized in a follow-up PR.
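For context, the rule in the snippet above amounts to a small dtype-promotion helper. Below is a minimal standalone sketch of that rule; the enum stand-in and the helper name PromoteSumDType are made up for illustration, and where this rule should live (API layer, InferMeta, or kernel) is exactly the open question in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for pten::DataType, just so the sketch compiles on its own.
enum class DataType : int8_t { BOOL, INT32, INT64, FLOAT32, UNDEFINED };

// Hypothetical helper capturing the promotion in the snippet above:
// bool/int32/int64 inputs are accumulated as INT64, everything else keeps
// its own dtype.
DataType PromoteSumDType(DataType x_dtype) {
  if (x_dtype == DataType::BOOL || x_dtype == DataType::INT32 ||
      x_dtype == DataType::INT64) {
    return DataType::INT64;
  }
  return x_dtype;
}

int main() {
  assert(PromoteSumDType(DataType::BOOL) == DataType::INT64);
  assert(PromoteSumDType(DataType::FLOAT32) == DataType::FLOAT32);
  return 0;
}
```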
            const std::vector<int64_t>& axis,
            bool keep_dim) {
  bool reduce_all = true;
  std::set<int64_t> dims_set(axis.begin(), axis.end());
Why use std::set rather than std::unordered_set?
The original kernel used std::set; unordered_set was not used, to stay consistent with it.
void Reduce(const CUDAContext& dev_ctx,
            const DenseTensor& x,
            bool reduce_all,
            std::vector<int64_t> dims,
Why not const std::vector<int64_t>& or DDim?
Changed to const std::vector<int64_t>&.
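A minimal standalone illustration of the point being raised (not the Paddle function itself; the two function names are made up): taking a std::vector by value copies it on every call, while a const reference lets callers pass the same data without a copy.

```cpp
#include <cstdint>
#include <vector>

int64_t SumByValue(std::vector<int64_t> dims) {  // 'dims' is copied per call
  int64_t s = 0;
  for (auto d : dims) s += d;
  return s;
}

int64_t SumByConstRef(const std::vector<int64_t>& dims) {  // no copy
  int64_t s = 0;
  for (auto d : dims) s += d;
  return s;
}

int main() {
  std::vector<int64_t> dims = {0, 1};
  return static_cast<int>(SumByValue(dims) + SumByConstRef(dims));
}
```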
// eg: x_dim = [2, 4, 6] origin_reduce_dims = [0, 1]
// --SetReduceDim--> x_dim = [8,6], reduce_dim = [0], left_dim = [1]
void SetReduceDim() {
  std::set<int64_t> reduce_set;
Why not unordered_set?
The original kernel used std::set; unordered_set was not used, to stay consistent with it.
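The comment in the snippet above describes the dimension-folding idea: contiguous reduce axes are merged, so [2, 4, 6] reduced over {0, 1} becomes [8, 6] reduced over {0} with [1] left. Below is a simplified standalone sketch of that folding; it is not the real SetReduceDim, and the function name and signature are made up to show the idea.

```cpp
#include <cstdint>
#include <iostream>
#include <set>
#include <vector>

// Merge runs of adjacent axes that are all reduced (or all kept) into one
// folded axis, shrinking the working shape before the reduction runs.
void FoldReduceDims(const std::vector<int64_t>& x_dim,
                    const std::set<int64_t>& reduce_set,
                    std::vector<int64_t>* new_dim,
                    std::vector<int64_t>* reduce_dim,
                    std::vector<int64_t>* left_dim) {
  new_dim->clear();
  reduce_dim->clear();
  left_dim->clear();
  bool prev_is_reduce = false;
  for (size_t i = 0; i < x_dim.size(); ++i) {
    bool is_reduce = reduce_set.count(static_cast<int64_t>(i)) > 0;
    if (!new_dim->empty() && is_reduce == prev_is_reduce) {
      new_dim->back() *= x_dim[i];  // same kind as previous axis: merge
    } else {
      new_dim->push_back(x_dim[i]);  // start a new folded axis
      int64_t folded_axis = static_cast<int64_t>(new_dim->size()) - 1;
      (is_reduce ? reduce_dim : left_dim)->push_back(folded_axis);
    }
    prev_is_reduce = is_reduce;
  }
}

int main() {
  std::vector<int64_t> new_dim, reduce_dim, left_dim;
  FoldReduceDims({2, 4, 6}, {0, 1}, &new_dim, &reduce_dim, &left_dim);
  // Prints "8 6", matching the comment: reduce_dim = [0], left_dim = [1].
  for (auto d : new_dim) std::cout << d << " ";
  std::cout << "\n";
  return 0;
}
```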
}

std::vector<int64_t> reduce_dim_temp(reduce_set.begin(), reduce_set.end());
std::sort(reduce_dim_temp.begin(), reduce_dim_temp.end());
Since a std::set is used, isn't the sort unnecessary here?
Probably, yes; I will confirm with the original author and fix it in a follow-up PR.
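For context on why the sort looks redundant, here is a minimal check unrelated to the Paddle code itself: std::set<int64_t> iterates its keys in ascending order, so a vector constructed from its begin/end range is already sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

int main() {
  // std::set keeps keys in ascending order, so copying its range into a
  // vector already yields sorted data; the extra std::sort is a no-op.
  std::set<int64_t> reduce_set = {3, 0, 2};
  std::vector<int64_t> dims(reduce_set.begin(), reduce_set.end());
  assert(std::is_sorted(dims.begin(), dims.end()));  // holds: 0, 2, 3
  return 0;
}
```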
void Reduce(const DeviceContext& dev_ctx,
            const DenseTensor& x,
            bool reduce_all,
            std::vector<int64_t> dims,
Why not const&?
Done, thanks.
            pten::DenseTensor* out,
            const std::vector<int64_t>& axis) {
Should out be placed last in the parameter list?
The parameter order of this function is kept consistent with the original definition.
LGTM
…7559)
* add pten reduce kernel
* add reduce_sum kernel
* update attribute args and order
* make out dtype undefined
* fix empty input error
* merge develop branch
* rename sum as reduce function
* rename sum as reduce function
* fix reducekernelImpl args error
* add reduce cuda kernel
* modify dims type to const &
* remove unsed log
* fix reduce_all out eigen function error
* remove unused codes
* add the missing sum api define and testcase
* merge develop branch
* fix sum test axis value error
* replace pten mean kernel with reduce_mean
* revcover meam cuda to original implement
PR types: New features
PR changes: Others
Describe