Add OpFunctor and replace cast, scale,clip, bce_loss and abs_grad with elementwise_no_broadcast #38500

AnnaTrainingG · 2021-12-27T12:53:02Z

PR types

Others

PR changes

OPs

Describe

Add OpFunctor and replace cast, scale, full, clip, bce_loss and abs_grad with elementwise_no_broadcast
Changes to cast in pten do not currently trigger the benchmark, so performance test results are added here:

case	shape	type	old / us	new / us	speedup (old / new)
case0	[16, 1785]	bool	1.328	1.413	0.94
case1	[16, 1]	int32->int64	1.27	1.239	1.03
case2	[16, 1, 513, 513]	int32->float32	43.09	42.229	1.02
case3	[30522, 1024]	float16->float32	246.57	236.28	1.04
case4	[1]	int64->float32	1.276	1.29	0.99
case5	[16, 16, 1]	float32->float16	1.269	1.312	0.97
case6	[16, 16, 1024]	float16->float32	2.025	1.842	1.10

case0 shows a performance drop, mainly because the case is small, so machine fluctuation accounts for a large share of the measured time.
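To illustrate the pattern the description refers to, here is a minimal host-side sketch of an op functor applied elementwise to an input of the same shape as the output (no broadcasting). It is not the code from this PR: `CastFunctor` and `ElementwiseNoBroadcast` below are hypothetical stand-ins written for illustration, and the real kernels run on the GPU rather than in a CPU loop.

```cpp
#include <vector>

// Hypothetical functor (an assumption for illustration): converts a single
// element from InT to OutT. All per-element logic of the op lives here.
template <typename InT, typename OutT>
struct CastFunctor {
  constexpr OutT operator()(InT x) const { return static_cast<OutT>(x); }
};

// Hypothetical host-side stand-in for an elementwise launch without
// broadcasting: the output has exactly one element per input element, and the
// functor is applied to each element independently.
template <typename OutT, typename InT, typename Functor>
std::vector<OutT> ElementwiseNoBroadcast(const std::vector<InT>& in,
                                         Functor func) {
  std::vector<OutT> out;
  out.reserve(in.size());
  for (const InT& x : in) {
    out.push_back(func(x));
  }
  return out;
}
```

With this shape, supporting another op such as scale or clip only requires a new functor; the loop (or, on the GPU, the launch code) is shared. A call would look like `ElementwiseNoBroadcast<float>(ints, CastFunctor<int, float>())`, where `ints` is a `std::vector<int>`.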

paddle-bot-old · 2021-12-27T12:54:01Z

Thanks for your contribution!
Please wait for the result of CI first. See Paddle CI Manual for details.

MingMingShangTian

LGTM for cast kernel

paddle/pten/kernels/gpu/cast_kernel.cu

MingMingShangTian

LGTM for cast cuda kernel.

JamesLim-sy

I agree with this PR, but I think it would be better to write it like this:

ScaleFunctor(InT scale_data, InT bias_data, bool is_bias_after_scale)
          : bias(bias_data), scale(scale_data), bias_after_scale(is_bias_after_scale) {}
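For context, the suggested constructor could sit in a functor like the sketch below. Only the constructor signature and the member names `bias`, `scale`, and `bias_after_scale` come from the review comment. The member declaration order, the `constexpr` qualifiers, and the body of `operator()` are assumptions made for illustration. The formula follows the documented meaning of `bias_after_scale` in Paddle's scale op, and the real functor would presumably also carry host/device qualifiers.

```cpp
// A sketch under stated assumptions, not the code merged in this PR.
template <typename InT>
struct ScaleFunctor {
  // Members are declared in the same order as the initializer list, because
  // C++ initializes members in declaration order regardless of list order.
  InT bias;
  InT scale;
  bool bias_after_scale;

  constexpr ScaleFunctor(InT scale_data, InT bias_data,
                         bool is_bias_after_scale)
      : bias(bias_data),
        scale(scale_data),
        bias_after_scale(is_bias_after_scale) {}

  // Assumed per-element computation:
  //   bias_after_scale == true  -> scale * x + bias
  //   bias_after_scale == false -> scale * (x + bias)
  constexpr InT operator()(InT x) const {
    return bias_after_scale ? scale * x + bias : scale * (x + bias);
  }
};
```

Initializing the members in the initializer list, as the reviewer proposes, avoids default-constructing them and then assigning in the constructor body.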

AnnaTrainingG changed the title ~~Add OpFunctor and replace cast, scale, full, clip, bce_loss and abs_grad with elementwise_no_broadcast~~ Add OpFunctor and replace cast, scale,clip, bce_loss and abs_grad with elementwise_no_broadcast Jan 4, 2022

xingfeng01 previously approved these changes Jan 4, 2022

Liu-xiandong previously approved these changes Jan 4, 2022

MingMingShangTian previously approved these changes Jan 4, 2022

AnnaTrainingG dismissed stale reviews from MingMingShangTian, Liu-xiandong, and xingfeng01 via f44ad4b January 4, 2022 07:51

AnnaTrainingG force-pushed the add_functor branch from f44ad4b to c7acf55 Compare January 4, 2022 07:54

MingMingShangTian reviewed Jan 4, 2022

paddle/pten/kernels/gpu/cast_kernel.cu (outdated, resolved)

MingMingShangTian reviewed Jan 4, 2022

paddle/pten/kernels/gpu/cast_kernel.cu (outdated, resolved)

MingMingShangTian reviewed Jan 4, 2022

paddle/pten/kernels/gpu/cast_kernel.cu (outdated, resolved)

update

3e1ad79

AnnaTrainingG force-pushed the add_functor branch from c7acf55 to 3e1ad79 Compare January 4, 2022 08:02

MingMingShangTian previously approved these changes Jan 4, 2022

MingMingShangTian dismissed their stale review via 3e1ad79 January 4, 2022 08:19

xingfeng01 approved these changes Jan 4, 2022

Liu-xiandong approved these changes Jan 4, 2022

JamesLim-sy approved these changes Jan 4, 2022

AnnaTrainingG merged commit 6eac06e into PaddlePaddle:develop Jan 4, 2022
