Optimize nearest_interp forward #38528
Update forked PaddlePaddle
Update my fork
update from PaddlePaddle
Update forked paddle repo
Update USERNAME/paddle
update Paddle USERNAME repo
update username repo
update local paddlepaddle
update paddlepaddle
… nearest_interp
Thanks for your contribution!
Sorry to inform you that b7fd119's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
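Moved the initialization of FastDivMod from the GPU side to the CPU side.
Please post the performance data measured after this change.
Done, see the PR description.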
For NCHW, this PR provides a faster 3D kernel; for NHWC, it optimizes the existing 1D kernel with fast division.
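To illustrate the fast-division idea mentioned above, here is a minimal host-side C++ sketch. It is not Paddle's actual `FastDivMod` code: the names `FastDivModSketch`, `NhwcIndex` and `DecomposeNhwc` are hypothetical, and the layout of the flat index is an assumption. The divisor's multiplier and shift are precomputed once on the CPU, so each per-element division in a 1D kernel becomes a multiply-high, an add and a shift.

```cpp
#include <cstdint>

// Hypothetical helper (not Paddle's API): division by a fixed 32-bit
// divisor d >= 1 using a precomputed multiplier and shift.
struct FastDivModSketch {
  uint32_t divisor;
  uint32_t multiplier;
  uint32_t shift;

  explicit FastDivModSketch(uint32_t d) : divisor(d), multiplier(0), shift(0) {
    // shift = ceil(log2(d)): the smallest s such that 2^s >= d.
    while ((uint64_t{1} << shift) < d) ++shift;
    const uint64_t one = 1;
    // multiplier = floor(2^32 * (2^shift - d) / d) + 1, which fits in 32 bits.
    multiplier =
        static_cast<uint32_t>(((one << 32) * ((one << shift) - d)) / d + 1);
  }

  uint32_t Div(uint32_t n) const {
    // High 32 bits of n * multiplier (what __umulhi computes on a GPU).
    const uint64_t hi = (static_cast<uint64_t>(n) * multiplier) >> 32;
    // The sum is kept in 64 bits so it cannot overflow for large n.
    return static_cast<uint32_t>((hi + n) >> shift);
  }

  uint32_t Mod(uint32_t n) const { return n - Div(n) * divisor; }
};

// Assumed NHWC output layout: flat = ((n * H + h) * W + w) * C + c.
struct NhwcIndex {
  uint32_t n, h, w, c;
};

// Splits a flat thread index into NHWC coordinates without '/' or '%'.
inline NhwcIndex DecomposeNhwc(uint32_t flat,
                               const FastDivModSketch& div_c,
                               const FastDivModSketch& div_w,
                               const FastDivModSketch& div_h) {
  NhwcIndex idx;
  idx.c = div_c.Mod(flat);
  uint32_t rest = div_c.Div(flat);
  idx.w = div_w.Mod(rest);
  rest = div_w.Div(rest);
  idx.h = div_h.Mod(rest);
  idx.n = div_h.Div(rest);
  return idx;
}
```

In a kernel, the three `FastDivModSketch` objects would be built once on the host and passed by value, which matches the earlier comment about moving the initialization from the GPU side to the CPU side.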
… nearest_interp
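LGTM. The 3D convolution approach is fairly novel, and its performance gain over the 1D version is more pronounced. Please put together the material from this optimization and share it within the team.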
LGTM
PR types
Performance optimization
PR changes
OPs
Describe
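Function
Optimized the forward computation of the nearest_interp operator.
Final result
Result: outperforms competing frameworks and far exceeds paddle-dev.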
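For context, the forward computation being optimized can be sketched as a plain CPU reference in C++. This is an illustrative sketch, not the PR's kernel: the function name `NearestInterpNchwRef` is hypothetical, and it assumes NCHW layout with `align_corners` disabled, so that each output pixel copies the input pixel at `floor(out_index * in_size / out_size)`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for nearest-neighbor interpolation (forward),
// assuming NCHW layout and align_corners = false.
std::vector<float> NearestInterpNchwRef(const std::vector<float>& in,
                                        int n, int c,
                                        int in_h, int in_w,
                                        int out_h, int out_w) {
  std::vector<float> out(static_cast<std::size_t>(n) * c * out_h * out_w);
  const float ratio_h = static_cast<float>(in_h) / out_h;
  const float ratio_w = static_cast<float>(in_w) / out_w;
  // The three loops (fused n*c, output row, output column) are independent,
  // which is what makes the problem easy to map onto a multi-dimensional grid.
  for (int nc = 0; nc < n * c; ++nc) {
    for (int oh = 0; oh < out_h; ++oh) {
      // Truncation acts as floor here because the value is non-negative.
      const int ih = std::min(static_cast<int>(ratio_h * oh), in_h - 1);
      for (int ow = 0; ow < out_w; ++ow) {
        const int iw = std::min(static_cast<int>(ratio_w * ow), in_w - 1);
        const std::size_t src =
            (static_cast<std::size_t>(nc) * in_h + ih) * in_w + iw;
        const std::size_t dst =
            (static_cast<std::size_t>(nc) * out_h + oh) * out_w + ow;
        out[dst] = in[src];
      }
    }
  }
  return out;
}
```

Upsampling a 1x1x2x2 input to 4x4 with this reference repeats each input pixel in a 2x2 block, which gives a simple check for any optimized kernel.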