
Optimize performance of log_softmax #38992

Merged: 20 commits merged into PaddlePaddle:develop on Mar 14, 2022

Conversation

ZzSean (Contributor) commented on Jan 17, 2022

PR types

Performance optimization

PR changes

OPs

Describe

Optimize the performance of log_softmax by reusing the unified softmax logic.

| case | pytorch | paddle (before) | diff vs pytorch | paddle (after) | diff vs pytorch | speedup (before/after) |
|---|---|---|---|---|---|---|
| fp32, [16,1000], axis=-1 | 0.00933 | 0.01081 | worse (15.86%) | 0.00774 | better (20.54%) | 1.40 |
| fp16, [16,1000], axis=-1 | 0.01043 | 0.01507 | worse (44.49%) | 0.00938 | better (11.19%) | 1.61 |
| fp32, [32,12,128,128], axis=-1 | 0.15398 | 0.15366 | on par (0.21%) | 0.15498 | on par (0.65%) | 0.99 |
| fp16, [32,12,128,128], axis=-1 | 0.08158 | 0.08094 | on par (0.79%) | 0.08594 | worse (5.34%) | 0.94 |
| fp32, [15,16,33,33], axis=-1 | 0.00818 | 0.00995 | worse (21.64%) | 0.00968 | worse (18.34%) | 1.03 |
| fp16, [15,16,33,33], axis=-1 | 0.00818 | 0.01000 | worse (22.25%) | 0.00980 | worse (19.80%) | 1.02 |
| fp32, [128,128,16,16], axis=0 | 0.32817 | 1.16099 | worse (2.54x) | 0.14324 | better (1.29x) | 8.11 |
| fp16, [128,128,16,16], axis=0 | 0.30698 | 0.78857 | worse (1.57x) | 0.11444 | better (1.68x) | 6.89 |
| fp32, [512,896,4,12], axis=1 | 1.43943 | 15.33653 | worse (9.65x) | 1.05637 | better (36.26%) | 14.52 |
| fp16, [512,896,4,12], axis=1 | 1.13875 | 14.92736 | worse (12.11x) | 0.71325 | better (59.66%) | 20.93 |
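The "unified logic" means log_softmax shares softmax's reduction path (row max, then sum of exponentials) and only swaps the final normalization: subtracting the log-sum instead of dividing by the sum. A minimal single-row sketch of that shared math in illustrative C++ (not the actual CUDA kernel, which vectorizes and parallelizes these reductions):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable log_softmax over one row, structured the same way as
// softmax: (1) subtract the row max, (2) reduce a sum of exponentials,
// (3) normalize. Softmax would divide by `sum`; log_softmax subtracts
// log(sum) instead, so both ops can share steps (1) and (2).
std::vector<float> LogSoftmax(const std::vector<float>& x) {
  const float max_v = *std::max_element(x.begin(), x.end());
  float sum = 0.f;
  for (float v : x) sum += std::exp(v - max_v);
  const float log_sum = std::log(sum);
  std::vector<float> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = (x[i] - max_v) - log_sum;
  }
  return out;
}
```

Because the max subtraction happens before any `exp`, the computation never overflows for large inputs, which is the same stability trick the fused kernel relies on.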

@paddle-bot-old commented:

Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@paddle-bot-old commented:

Sorry to inform you that the CIs for commit c894654 passed more than 7 days ago. To prevent PR conflicts, please re-run all CIs manually.

@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Mar 9, 2022
@PaddlePaddle PaddlePaddle unlocked this conversation Mar 9, 2022
Avin0323 (Contributor) previously approved these changes on Mar 9, 2022:

LGTM for unity_build_rule.cmake

AshburnLee (Contributor) previously approved these changes on Mar 9, 2022:

LGTM.

ops::LogSoftmaxGradCUDNNKernel<float>,
ops::LogSoftmaxGradCUDNNKernel<double>,
ops::LogSoftmaxGradCUDNNKernel<plat::float16>);
#endif
A Contributor commented on the registration blocks above:

These three registration blocks could be simplified with a variadic macro.
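The reviewer's suggestion can be sketched as follows. This is a hypothetical stand-in, not Paddle's real registration API: `RegisterKernels` and `REGISTER_LOG_SOFTMAX_KERNELS` are illustrative names showing how one variadic macro call can replace three near-identical per-dtype registration blocks.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Toy registry standing in for the framework's kernel registration.
static std::vector<std::string> g_registry;

void RegisterKernels(const std::string& op,
                     std::initializer_list<std::string> dtypes) {
  for (const auto& d : dtypes) g_registry.push_back(op + "<" + d + ">");
}

// One variadic macro call replaces a repeated list of per-dtype
// registrations; #op stringizes the op name at the call site.
#define REGISTER_LOG_SOFTMAX_KERNELS(op, ...) \
  RegisterKernels(#op, {__VA_ARGS__})

void RegisterAll() {
  REGISTER_LOG_SOFTMAX_KERNELS(log_softmax, "float", "double", "float16");
  REGISTER_LOG_SOFTMAX_KERNELS(log_softmax_grad, "float", "double", "float16");
}
```

The win is purely in maintainability: adding a dtype becomes a one-token change instead of a new copy of the whole registration block.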

limin2021 (Contributor) previously approved these changes on Mar 9, 2022:

LGTM.

@@ -56,6 +65,11 @@ class LogSoftmaxOpMaker : public framework::OpProtoAndCheckerMaker {
"The dimension index of Input(x) to perform log_softmax,"
"default -1 for last dimension")
.SetDefault(-1);
AddAttr<bool>(
A Contributor commented:

Adding a new attribute is not recommended; modify the original CUDA kernel directly instead.

ZzSean (Contributor, Author) replied:

done, thx

@ZzSean ZzSean dismissed stale reviews from limin2021, AshburnLee, and Avin0323 via 4329528 March 10, 2022 03:58
const bool log_mode,
DenseTensor* out) {
PADDLE_THROW(
"This kernel is not supported when the dtype is bf16 and CUDNN_VERSION < "
A Contributor commented:

PADDLE_THROW should also specify the error type.
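The reviewer's point is that a bare `PADDLE_THROW(message)` loses the error category. A minimal illustrative sketch of the pattern, using stand-in names (`Unimplemented` and `PADDLE_THROW_DEMO` below are hypothetical, not Paddle's real `platform::errors` API, and the message is paraphrased from the diff):

```cpp
#include <stdexcept>
#include <string>

// Typed error stand-in: the category is encoded in the exception type and
// prefixed onto the message, so callers can both catch by type and log it.
struct Unimplemented : std::runtime_error {
  explicit Unimplemented(const std::string& msg)
      : std::runtime_error("Unimplemented: " + msg) {}
};

#define PADDLE_THROW_DEMO(err) throw(err)

void SoftmaxBf16Fallback() {
  // Before: PADDLE_THROW("This kernel is not supported ...") -- untyped.
  // After: the same message is wrapped in an explicit error type.
  PADDLE_THROW_DEMO(Unimplemented(
      "This kernel is not supported when the dtype is bf16 and the "
      "installed CUDNN version is too low."));
}
```

Typed errors let the framework map each failure to a stable error code instead of forcing callers to parse message strings.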

ZzSean (Contributor, Author) replied:

done, thx

Xreki (Contributor) left a comment:

LGTM

chenwhql (Contributor) left a comment:

LGTM for PADDLE_ENFORCE

@ZzSean ZzSean merged commit 250e254 into PaddlePaddle:develop Mar 14, 2022
@ZzSean ZzSean deleted the opt_logsoftmax branch November 7, 2022 03:03
6 participants