【Hackathon No.37】Optimize the GPU compute performance of the argmin_argmax op for Paddle #256

Merged
merged 4 commits into from
Sep 29, 2022


thunder95
Contributor

Submitting the performance optimization design document for the argmin_argmax OP.

@paddle-bot

paddle-bot bot commented Sep 12, 2022

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.


## 1.1 Current state of Paddle

Current performance is shown in the table below (based on the PaddlePaddle develop branch):

There is an extra space in the middle of "PaddlePaddle" here that needs to be removed.

Contributor Author

Removed.


## 1.3 Comparative analysis

Currently, the API designs of Paddle and PyTorch are almost identical, and both are implemented on top of the CUB library.

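Can this explain why the performance gap is so large when PyTorch also uses CUB underneath? If the op is rewritten with reduce and the expected 4.5x speedup is achieved, there would still be a fairly large gap relative to PyTorch.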

Contributor Author

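@ZzSean I found that, besides CUB, the Paddle and PyTorch implementations also differ in some other details, and I have added these to the RFC. If I have missed anything, I would appreciate your guidance.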

Contributor Author

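@ZzSean In my tests, fastdivmod gave no noticeable performance improvement. Tuning the block configuration gave a fairly noticeable improvement, but the gap to PyTorch is still large. After re-reading the torch code, I found that newer versions of torch use reduce underneath.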
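
As a rough illustration of the reduce formulation mentioned above (this is not the torch or Paddle kernel code), argmax can be written as a reduction over (value, index) pairs with an associative combiner. The names `combine`, `argmax_reduce`, and `argmax_tree` are hypothetical, and the tie-breaking rule (smallest index wins) and the lack of NaN handling are assumptions of this sketch.

```python
from functools import reduce


def combine(a, b):
    """Associative combiner over (value, index) pairs.

    Keeps the pair with the larger value; on equal values keeps the
    smaller index (an assumption of this sketch).
    """
    if b[0] > a[0] or (b[0] == a[0] and b[1] < a[1]):
        return b
    return a


def argmax_reduce(values):
    """Argmax as one left-to-right reduction over (value, index) pairs."""
    if not values:
        raise ValueError("argmax of an empty sequence")
    pairs = [(v, i) for i, v in enumerate(values)]
    return reduce(combine, pairs)[1]


def argmax_tree(values):
    """Same result via a pairwise, tree-shaped reduction.

    Because combine is associative, neighbouring pairs can be merged
    level by level, which is roughly the shape of a parallel reduction.
    """
    if not values:
        raise ValueError("argmax of an empty sequence")
    pairs = [(v, i) for i, v in enumerate(values)]
    while len(pairs) > 1:
        merged = [combine(pairs[k], pairs[k + 1])
                  for k in range(0, len(pairs) - 1, 2)]
        if len(pairs) % 2:
            merged.append(pairs[-1])  # odd element carries over unchanged
        pairs = merged
    return pairs[0][1]
```

Since the combiner is associative, both functions return the same index for the same input; only the order in which pairs are merged differs.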

[1]. [OP Benchmark User Guide](https://github.com/PaddlePaddle/benchmark/blob/master/api/README.md)


PPYDDDD111

Does this need to be deleted?

Contributor Author

Written by mistake; deleted. @ZzSean


@ZzSean ZzSean left a comment


LGTM
