Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast #38500

Merged 1 commit on Jan 4, 2022

@AnnaTrainingG (Contributor) commented on Dec 27, 2021

PR types

Others

PR changes

OPs

Describe

Add OpFunctor and replace cast, scale, full, clip, bce_loss and abs_grad with elementwise_no_broadcast
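
As a rough illustration of the pattern named above, here is a minimal host-side C++ sketch. It assumes an op functor is a small callable struct, and that an elementwise launch without broadcasting applies it index by index to an input whose shape equals the output's. `CastFunctor`, `ClipFunctor`, and `ElementwiseNoBroadcast` are hypothetical names chosen for this example, not the signatures used in this PR or in Paddle, and the real kernels run on the GPU rather than in a CPU loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functor: converts one element from InT to OutT.
template <typename InT, typename OutT>
struct CastFunctor {
  OutT operator()(const InT x) const { return static_cast<OutT>(x); }
};

// Hypothetical functor: clamps one element to [min_value, max_value].
template <typename T>
struct ClipFunctor {
  T min_value;
  T max_value;

  ClipFunctor(T min_data, T max_data)
      : min_value(min_data), max_value(max_data) {
    assert(min_data <= max_data);  // a clip range must not be empty
  }

  T operator()(const T x) const {
    return x < min_value ? min_value : (x > max_value ? max_value : x);
  }
};

// Hypothetical stand-in for an elementwise launch without broadcasting:
// the output has as many elements as the input, and output element i
// depends only on input element i.
template <typename InT, typename OutT, typename Functor>
std::vector<OutT> ElementwiseNoBroadcast(const std::vector<InT>& in,
                                         Functor func) {
  std::vector<OutT> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = func(in[i]);
  }
  return out;
}
```

In this sketch, `ElementwiseNoBroadcast<int, float>(ints, CastFunctor<int, float>())` converts each `int` to `float`. Swapping in a different functor changes the op without touching the loop, which is presumably the appeal of the refactor.
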
Modifying cast in pten currently does not trigger the benchmark, so performance test results are added here:

| cast case | shape | type | old (µs) | new (µs) | speedup |
|---|---|---|---|---|---|
| case0 | [16, 1785] | bool | 1.328 | 1.413 | 0.94 |
| case1 | [16, 1] | int32->int64 | 1.27 | 1.239 | 1.03 |
| case2 | [16, 1, 513, 513] | int32->float32 | 43.09 | 42.229 | 1.02 |
| case3 | [30522, 1024] | float16->float32 | 246.57 | 236.28 | 1.04 |
| case4 | [1] | int64->float32 | 1.276 | 1.29 | 0.99 |
| case5 | [16, 16, 1] | float32->float16 | 1.269 | 1.312 | 0.97 |
| case6 | [16, 16, 1024] | float16->float32 | 2.025 | 1.842 | 1.10 |

case0 shows a slowdown, mainly because the case is small, so machine fluctuation accounts for a relatively large share of the measurement.
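
The speedup values in the table are consistent with dividing the old time by the new time, so a value below 1.0 means the new kernel measured slower. The PR does not state the formula, so treat that definition as an assumption. A minimal sketch of the arithmetic, with `Speedup`, `RoundTo2`, and `DeltaUs` as hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// Assumed definition: speedup = old time / new time.
// A result above 1.0 means the new kernel is faster.
inline double Speedup(double old_us, double new_us) {
  assert(new_us > 0.0);  // a measured time must be positive
  return old_us / new_us;
}

// Rounds to two decimals, the precision used in the table.
inline double RoundTo2(double value) {
  return std::round(value * 100.0) / 100.0;
}

// Absolute change in microseconds; positive means the new kernel is slower.
inline double DeltaUs(double old_us, double new_us) {
  return new_us - old_us;
}
```

For case0, `Speedup(1.328, 1.413)` rounds to 0.94, while `DeltaUs(1.328, 1.413)` is only about 0.085 µs. A difference that small on a kernel taking roughly 1.3 µs could plausibly come from run-to-run noise, which fits the explanation given above.
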

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@AnnaTrainingG changed the title from "Add OpFunctor and replace cast, scale, full, clip, bce_loss and abs_grad with elementwise_no_broadcast" to "Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast" on Jan 4, 2022
xingfeng01 previously approved these changes on Jan 4, 2022
Liu-xiandong previously approved these changes on Jan 4, 2022
@MingMingShangTian (Contributor) left a comment

LGTM for cast kernel

@MingMingShangTian (Contributor) left a comment

LGTM for the cast CUDA kernel.

@JamesLim-sy (Contributor) left a comment

I agree with this PR, but I think it would be better to write the constructor like this:

ScaleFunctor(InT scale_data, InT bias_data, bool is_bias_after_scale)
          : bias(bias_data), scale(scale_data), bias_after_scale(is_bias_after_scale) {}
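
For context, here is a minimal host-side sketch of a functor built around a member-initializer-list constructor like the one suggested. The member declarations and the `operator()` body are assumptions made for illustration, not code shown in this thread. The body follows the documented behavior of Paddle's scale op: `scale * x + bias` when `bias_after_scale` is true, otherwise `scale * (x + bias)`. Device qualifiers such as `__device__` are left out so the sketch compiles as plain C++, and `ScaleFunctorExample` is a hypothetical helper.

```cpp
#include <cassert>

template <typename InT>
struct ScaleFunctor {
  // Members are declared in the same order as the initializer list below.
  // C++ initializes members in declaration order, so keeping the two
  // aligned avoids surprises and compiler warnings.
  InT bias;
  InT scale;
  bool bias_after_scale;

  ScaleFunctor(InT scale_data, InT bias_data, bool is_bias_after_scale)
      : bias(bias_data),
        scale(scale_data),
        bias_after_scale(is_bias_after_scale) {}

  // Assumed semantics, following the scale op's documented behavior.
  InT operator()(const InT x) const {
    return bias_after_scale ? scale * x + bias : scale * (x + bias);
  }
};

// Usage example: both bias modes applied to the same input.
inline void ScaleFunctorExample() {
  ScaleFunctor<float> bias_after(2.0f, 1.0f, true);    // 2 * x + 1
  ScaleFunctor<float> bias_before(2.0f, 1.0f, false);  // 2 * (x + 1)
  assert(bias_after(3.0f) == 7.0f);
  assert(bias_before(3.0f) == 8.0f);
}
```

Initializing members in the initializer list, rather than assigning them in the constructor body, constructs each member once with its final value. That is likely the motivation behind the suggestion.
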

@AnnaTrainingG merged commit 6eac06e into PaddlePaddle:develop on Jan 4, 2022