Optimize nearest_interp forward #38528

Merged on Jan 25, 2022 (38 commits)

@AshburnLee (Contributor) commented Dec 28, 2021

PR types

Performance optimization

PR changes

OPs

Describe

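Function

Optimizes the forward computation of the nearest_interp operator.

Final results

[Image: Screenshot 2022-01-21 10 04 29]

Result: outperforms the competing product and far exceeds paddle-dev.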

@paddle-bot-old commented

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

@paddle-bot-old (bot) commented Jan 5, 2022

Sorry to inform you that b7fd119's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@AshburnLee (Contributor Author) commented

Moved the FastDivMod initialization from the GPU side to the CPU side.
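
For readers unfamiliar with the idea, here is a minimal host-side sketch of what a fast division helper of this kind typically does: it precomputes a multiplier and a shift for a fixed divisor once, so that later divisions reduce to a multiply, an add and a shift. Only the name FastDivMod comes from this thread; `FastDivModSketch`, its members, and the exact constants are assumptions for illustration (the standard multiply-and-shift scheme) and are not taken from Paddle's source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a FastDivMod-style helper; not Paddle's actual code.
struct FastDivModSketch {
  uint32_t divisor;
  uint32_t multiplier;  // "magic" constant derived from the divisor
  uint32_t shift;       // smallest s such that 2^s >= divisor

  // Runs once on the host (CPU). The filled struct can then be passed to a
  // kernel by value, so no GPU thread has to repeat this setup.
  explicit FastDivModSketch(uint32_t d) : divisor(d), multiplier(0), shift(0) {
    assert(d != 0);
    while (shift < 32 && (uint64_t{1} << shift) < d) ++shift;
    const uint64_t one = 1;
    multiplier =
        static_cast<uint32_t>(((one << 32) * ((one << shift) - d)) / d + 1);
  }

  // Quotient n / divisor using one multiply, one add and one shift.
  uint32_t Div(uint32_t n) const {
    const uint64_t hi = (static_cast<uint64_t>(n) * multiplier) >> 32;
    return static_cast<uint32_t>((hi + n) >> shift);
  }

  // Remainder n % divisor, recovered from the quotient.
  uint32_t Mod(uint32_t n) const { return n - Div(n) * divisor; }
};
```

Under these assumptions, constructing the helper on the CPU means the loop and the 64-bit division in the constructor run once per launch instead of once per thread.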

@JamesLim-sy (Contributor) commented

> Moved the FastDivMod initialization from the GPU side to the CPU side.

Please post the performance data measured after this change.

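@AshburnLee (Contributor Author) commented

> > Moved the FastDivMod initialization from the GPU side to the CPU side.
>
> Please post the performance data measured after this change.

Done, see the PR description.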

@AshburnLee (Contributor Author) left a comment

For NCHW, this PR provides a faster 3D kernel; for NHWC, it optimizes the existing 1D kernel with fast division.
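
The difference between the two paths can be sketched on the CPU. This is an illustrative reference under stated assumptions, not the PR's kernels: the function names are hypothetical, `align_corners = false` is assumed, and the loops merely stand in for GPU thread indices. In the NCHW sketch the three loop variables play the role of a 3D launch grid, so no index decomposition is needed; in the NHWC sketch a flat 1D index has to be split with `/` and `%`, which is where a fast division helper would be substituted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nearest source index along one axis. Assumes align_corners = false and
// ratio = in_size / out_size; both are illustrative assumptions.
inline int NearestSrcIndex(int out_idx, float ratio, int in_size) {
  const int src = static_cast<int>(ratio * out_idx);
  return src < in_size - 1 ? src : in_size - 1;
}

// NCHW, "3D" style: each loop stands in for one grid dimension, so the
// (plane, row, column) coordinate is available without any division.
std::vector<float> NearestNCHW3D(const std::vector<float>& in, int nc,
                                 int in_h, int in_w, int out_h, int out_w) {
  assert(in.size() == static_cast<std::size_t>(nc) * in_h * in_w);
  const float ratio_h = static_cast<float>(in_h) / out_h;
  const float ratio_w = static_cast<float>(in_w) / out_w;
  std::vector<float> out(static_cast<std::size_t>(nc) * out_h * out_w);
  for (int p = 0; p < nc; ++p) {            // would be the z dimension
    for (int oh = 0; oh < out_h; ++oh) {    // would be the y dimension
      const int ih = NearestSrcIndex(oh, ratio_h, in_h);
      for (int ow = 0; ow < out_w; ++ow) {  // would be the x dimension
        const int iw = NearestSrcIndex(ow, ratio_w, in_w);
        out[(static_cast<std::size_t>(p) * out_h + oh) * out_w + ow] =
            in[(static_cast<std::size_t>(p) * in_h + ih) * in_w + iw];
      }
    }
  }
  return out;
}

// NHWC, "1D" style: one flat index per output element, decomposed into
// (batch, row, column, channel) with integer division and modulo.
std::vector<float> NearestNHWC1D(const std::vector<float>& in, int n, int in_h,
                                 int in_w, int c, int out_h, int out_w) {
  assert(in.size() == static_cast<std::size_t>(n) * in_h * in_w * c);
  const float ratio_h = static_cast<float>(in_h) / out_h;
  const float ratio_w = static_cast<float>(in_w) / out_w;
  const int total = n * out_h * out_w * c;
  std::vector<float> out(static_cast<std::size_t>(total));
  for (int tid = 0; tid < total; ++tid) {
    const int ch = tid % c;
    const int ow = (tid / c) % out_w;
    const int oh = (tid / (c * out_w)) % out_h;
    const int b = tid / (c * out_w * out_h);
    const int ih = NearestSrcIndex(oh, ratio_h, in_h);
    const int iw = NearestSrcIndex(ow, ratio_w, in_w);
    out[tid] =
        in[((static_cast<std::size_t>(b) * in_h + ih) * in_w + iw) * c + ch];
  }
  return out;
}
```

Every output element in the 1D path pays for several divisions and modulo operations, which is why replacing them with precomputed multiply-and-shift arithmetic can matter on a GPU.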

Review threads (outdated, resolved):
paddle/fluid/operators/interpolate_v2_op.cu (5 threads)
paddle/fluid/platform/device/gpu/gpu_launch_config.h (2 threads)

@JamesLim-sy (Contributor) left a comment

LGTM. The 3D convolution approach is fairly novel, and its performance gain over the 1D version is more pronounced. Please put together the material from this optimization and present it to the team.

@ZzSean (Contributor) left a comment

LGTM

@ZzSean ZzSean merged commit 232bbce into PaddlePaddle:develop Jan 25, 2022
@AshburnLee AshburnLee deleted the nearest_interp branch January 25, 2022 05:22

4 participants