[XPU] add some bf16 ops for kl3 #59263
Your PR was submitted successfully. Thank you for contributing to this open-source project!
LGTM
LGTM
@@ -31,16 +31,17 @@ float GetAbsMax(const Context& dev_ctx,
const float* input,
float* buffer_xpu,
int64_t numel) {
float buffer_cpu[6];
int max_ptr_size = phi::backends::xpu::get_xpu_max_ptr_size(-1);
float buffer_cpu[12]; // 12 is enough even for XPU3
Does XPU3 still need this optimization?
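If there is enough device memory, it is probably no longer needed, but I kept it just in case. It is not enabled unless that environment variable is set.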
PR types
New features
PR changes
OPs
Description
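AdamW uses a temporary buffer to obtain the maximum value. The findmax function returns a different number of results on kl2 and kl3, so the code was rewritten accordingly.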
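The change described above can be illustrated with a minimal host-side sketch. It is not the code from this PR: `ReduceAbsMax` and `kMaxPtrSizeUpperBound` are hypothetical names, and the sketch assumes that the number of partial results is 6 on kl2 and at most 12 on kl3, as the diff's old and new buffer sizes suggest. The idea is to size the host buffer for the larger case and reduce only the entries that the device actually reported.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed upper bound on the number of partial results, taken from the
// diff comment "12 is enough even for XPU3".
constexpr int kMaxPtrSizeUpperBound = 12;

// Hypothetical helper: reduce the partial maxima that were copied back
// from the device into one absolute maximum. `max_ptr_size` is the count
// reported at runtime (assumed 6 on kl2, up to 12 on kl3).
float ReduceAbsMax(const float* partial, int max_ptr_size) {
  assert(max_ptr_size > 0 && max_ptr_size <= kMaxPtrSizeUpperBound);
  float result = 0.0f;
  for (int i = 0; i < max_ptr_size; ++i) {
    result = std::max(result, std::fabs(partial[i]));
  }
  return result;
}
```

A caller would declare `float buffer_cpu[kMaxPtrSizeUpperBound]`, copy `max_ptr_size` floats from the device buffer, and pass both to the helper, so the same code path works for either result count.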