
[bf16] add bf16 kernel: layer_norm p_norm reduce_sum #39843

Merged (14 commits) on Mar 1, 2022

Conversation

@zhangbo9674 (Contributor) commented on Feb 23, 2022

PR types

New features

PR changes

OPs

Describe

Add bf16 kernels for layer_norm, p_norm, and reduce_sum.

Performance test for LayerNorm: with embed_dim=1024, we measured the compute performance of call_1024_kernel under the bf16 data type. (The optimization strategy for call_1024_kernel is described in PR39247.)


layer_norm forward + backward elapsed time:
- fp32: 0.01763439178466797 s
- bf16, with the 1024 kernel: 0.007885456085205078 s
- bf16, without the 1024 kernel: 0.008244991302490234 s
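The speedup above comes from running the kernel in bfloat16, which keeps float32's 8-bit exponent but only 7 mantissa bits. As a rough illustration of the numerics involved (a pure-Python sketch, not the PR's actual CUDA kernel; `to_bf16` and `layer_norm` are hypothetical names), bf16 can be emulated by truncating the low 16 bits of a float32, and layer_norm reduces to a mean/variance normalization over the last dimension:

```python
import math
import struct

def to_bf16(x: float) -> float:
    # Emulate bfloat16 by packing to IEEE-754 float32 and zeroing the
    # low 16 bits (simple truncation; real hardware typically uses
    # round-to-nearest-even, which differs in the last mantissa bit).
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def layer_norm(x, eps=1e-5):
    # Reference layer_norm over a single dimension: normalize the vector
    # to zero mean and (approximately) unit variance.
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]

# Powers of two survive the bf16 truncation exactly; most other values
# lose precision, e.g. 0.1 becomes 0.099609375.
print(to_bf16(1.0))   # 1.0
print(to_bf16(0.1))   # 0.099609375
print(layer_norm([1.0, 3.0]))
```

The accuracy trade-off is visible directly: values like 0.1 round to 0.099609375 under truncation, which is why bf16 kernels are usually paired with fp32 accumulation for the mean/variance reductions.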

@paddle-bot-old commented:

Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@zhiqiu (Contributor) left a comment:

LGTM

@ZzSean (Contributor) left a comment:

LGTM for op benchmark

@wanghuancoder (Contributor) left a comment:

LGTM

@zhangbo9674 zhangbo9674 merged commit ce8ed97 into PaddlePaddle:develop Mar 1, 2022
@zhangbo9674 zhangbo9674 deleted the dev/bf16_op_9 branch March 2, 2023 02:59
4 participants