
add leaky_relu forward and backward in activation_op.cu #31841

Merged

Conversation

@AnnaTrainingG (Contributor) commented Mar 24, 2021

PR types

Performance optimization

PR changes

OPs

Describe

PR description: This PR adds a vectorized implementation of the leaky_relu forward and backward passes, and introduces a new REGISTER_ACTIVATION_GPU_KERNEL macro for registering vectorized OPs. Compared with the original REGISTER_ACTIVATION_CUDA_KERNEL, the difference is that it calls ActivationGPUKernel (the vectorized implementation); a rough sketch of the vectorization idea is shown below.
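As a rough illustration only (not the PR's actual ActivationGPUKernel code; the kernel names, the float4 packing width, and the assumption that the element count is a multiple of 4 are all simplifications made here), a vectorized leaky_relu forward/backward pair could look like the following CUDA sketch:

```cuda
#include <cuda_runtime.h>

// Hypothetical sketch of vectorized leaky_relu: each thread loads a float4
// (4 elements) so loads/stores are 128-bit wide. Tail handling for sizes
// that are not a multiple of 4 is omitted for brevity.
__global__ void VecLeakyReluFwd(const float4* __restrict__ x,
                                float4* __restrict__ out,
                                float alpha, int num_vec) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < num_vec) {
    float4 v = x[idx];
    float4 r;
    r.x = v.x > 0.f ? v.x : alpha * v.x;
    r.y = v.y > 0.f ? v.y : alpha * v.y;
    r.z = v.z > 0.f ? v.z : alpha * v.z;
    r.w = v.w > 0.f ? v.w : alpha * v.w;
    out[idx] = r;
  }
}

// Backward: dx = dout * (x > 0 ? 1 : alpha), again 4 elements per thread.
__global__ void VecLeakyReluBwd(const float4* __restrict__ x,
                                const float4* __restrict__ dout,
                                float4* __restrict__ dx,
                                float alpha, int num_vec) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < num_vec) {
    float4 v = x[idx];
    float4 g = dout[idx];
    float4 r;
    r.x = g.x * (v.x > 0.f ? 1.f : alpha);
    r.y = g.y * (v.y > 0.f ? 1.f : alpha);
    r.z = g.z * (v.z > 0.f ? 1.f : alpha);
    r.w = g.w * (v.w > 0.f ? 1.f : alpha);
    dx[idx] = r;
  }
}
```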

leaky_relu now has vectorized support.
Performance test results are as follows:

leaky_relu backward (unit: us)

| data type | pytorch | paddle old | paddle 1.8 | paddle new | (new-pytorch)/pytorch | (new-old)/old | (new-1.8)/1.8 |
|---|---|---|---|---|---|---|---|
| float32 | 2.918 | 2.581 | 2.385 | 2.244 | -23.10% | -13.06% | -5.91% |
| float32 | 113.29 | 117.43 | 116.37 | 114.33 | 0.92% | -2.64% | -1.75% |
| float32 | 2844.5 | 3142.6 | 3157.2 | 2864.7 | 0.71% | -8.84% | -9.26% |
| float32 | 287.74 | 298.92 | 296.15 | 289.8 | 0.72% | -3.05% | -2.14% |
| float16 | 2.236 | 6.173 | 2.193 | 1.823 | -18.47% | -70.47% | -16.87% |
| float16 | 58.494 | 67.084 | 65.767 | 58.669 | 0.30% | -12.54% | -10.79% |
| float16 | 1438.8 | 4696.2 | 2158.5 | 1463.7 | 1.73% | -68.83% | -32.19% |
| float16 | 146.34 | 165.74 | 164.52 | 148.73 | 1.63% | -10.26% | -9.60% |
               
leaky_relu forward (unit: us)

| data type | pytorch | paddle old | paddle 1.8 | paddle new | (new-pytorch)/pytorch | (new-old)/old | (new-1.8)/1.8 |
|---|---|---|---|---|---|---|---|
| float32 | 2.846 | 2.49 | 1.824 | 1.834 | -35.56% | -26.35% | 0.55% |
| float32 | 77.749 | 78.571 | 81.747 | 77.263 | -0.63% | -1.66% | -5.49% |
| float32 | 1956.2 | 2103.8 | 2338.1 | 1944.9 | -0.58% | -7.55% | -16.82% |
| float32 | 197.6 | 201.29 | 210.73 | 196.41 | -0.60% | -2.42% | -6.80% |
| float16 | 2.186 | 5.964 | 1.814 | 1.691 | -22.64% | -71.65% | -6.78% |
| float16 | 47.231 | 48.692 | 47.756 | 41.399 | -12.35% | -14.98% | -13.31% |
| float16 | 1115.4 | 1352.2 | 1509.1 | 1024.7 | -8.13% | -24.22% | -32.10% |
| float16 | 116.08 | 120.26 | 120.52 | 104.09 | -10.33% | -13.45% | -13.63% |

For float16 inputs, the values are converted to float with half22float2 before computing, for precision reasons. Using float2 for the computation does not cause any performance drop; the measurements are listed below (a sketch of the conversion follows the tables):

forward (unit: us)

| shape | half2 | float2 |
|---|---|---|
| [160800, 1, 7, 9] | 75.977 | 75.646 |
| [3, 513, 31, 31] | 12.975 | 12.469 |
| [8, 512, 157, 157] | 734.99 | 735.69 |
| [4, 128, 319, 31] | 38.928 | 39.265 |

backward (unit: us)

| shape | half2 | float2 |
|---|---|---|
| [160800, 1, 7, 9] | 53.284 | 52.831 |
| [3, 513, 31, 31] | 8.9540 | 8.6850 |
| [8, 512, 157, 157] | 512.87 | 512.91 |
| [4, 128, 319, 31] | 26.944 | 27.039 |
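A minimal sketch of the float16 path described above, assuming the standard CUDA intrinsics __half22float2 and __float22half2_rn; the kernel name and half2 packing are illustrative and not taken from the PR:

```cuda
#include <cuda_fp16.h>

// Hypothetical sketch: load two fp16 values as a half2, convert to float2 for
// the actual arithmetic (precision), then round back to half2 for the store.
__global__ void VecLeakyReluFwdFP16(const half2* __restrict__ x,
                                    half2* __restrict__ out,
                                    float alpha, int num_vec) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < num_vec) {
    float2 v = __half22float2(x[idx]);  // fp16 pair -> fp32 pair
    float2 r;
    r.x = v.x > 0.f ? v.x : alpha * v.x;
    r.y = v.y > 0.f ? v.y : alpha * v.y;
    out[idx] = __float22half2_rn(r);    // round back to fp16 pair
  }
}
```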

@paddle-bot-old commented:

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

@AnnaTrainingG reopened this Mar 30, 2021
Xreki previously approved these changes Mar 31, 2021

@Xreki (Contributor) left a comment:

LGTM

@zhangting2020 (Contributor) left a comment:

LGTM

@zhangting2020 merged commit 4490e8a into PaddlePaddle:develop Apr 2, 2021