[AUTOGENERATED] [release/2.5] Cherry-pick PR-1767 #1794

rocm-mici · 2024-12-13T18:44:25Z

Cherry-pick of #1767

… kernel (pytorch#140259) (#1767)

It was reported that the layer norm backward pass on AMD was slightly less accurate than the equivalent NVIDIA implementation. On AMD we call into a helper kernel, `cuLoadWriteStridedInputs`, which processes strided input and accumulates the partial gradients into shared memory. In this kernel (pytorch#87635), `mean` and `rstd` were truncated from the `T_ACC` type to `T`, which causes numerical issues in the warp buffers created in this kernel. This PR uses the correct accumulator type for `mean` and `rstd`.

Note: Only AMD calls into this call stack for backwards layer norm, so this was not an issue for NV.

Pull Request resolved: pytorch#140259
Approved by: https://github.com/jianyuh

(cherry picked from commit 001f736)

Fixes #ISSUE_NUMBER
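To illustrate why truncating `mean` and `rstd` matters, here is a minimal Python sketch. It is not the kernel code. It assumes `T` is IEEE half precision (emulated with the standard library's `struct` format `"e"`), that the accumulator type is wider (a Python float here), and that the accumulated partial gradient term has the form `dout * (x - mean) * rstd`. The helper names `to_half`, `row_stats`, and `gamma_grad_terms` are hypothetical and exist only for this example.

```python
import math
import struct


def to_half(value):
    """Round a Python float to IEEE half precision (stand-in for casting to T)."""
    return struct.unpack("e", struct.pack("e", value))[0]


def row_stats(row, eps=1e-5):
    """Compute mean and reciprocal standard deviation in full precision."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return mean, 1.0 / math.sqrt(var + eps)


def gamma_grad_terms(dout_row, x_row, mean, rstd):
    """Per-element terms dout * (x - mean) * rstd (assumed form of the partial gradient)."""
    return [d * (v - mean) * rstd for d, v in zip(dout_row, x_row)]


# Inputs that are exactly representable in half precision, with a large offset
# so that half precision only resolves steps of 0.5 around the mean.
x_row = [1000.0, 1000.5, 1001.0, 1001.5, 1000.0]
dout_row = [1.0] * len(x_row)

mean, rstd = row_stats(x_row)

# Statistics kept in the accumulator type.
exact = gamma_grad_terms(dout_row, x_row, mean, rstd)

# Statistics truncated to half precision before use.
truncated = gamma_grad_terms(dout_row, x_row, to_half(mean), to_half(rstd))

max_err = max(abs(a - b) for a, b in zip(exact, truncated))
print("mean:", mean, "-> truncated:", to_half(mean))
print("max per-element error:", max_err)
```

In this sketch the row mean cannot be represented in half precision, so every `(x - mean)` term shifts by the rounding error before it is scaled by `rstd` and accumulated. Keeping both statistics in the wider type avoids that shift.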

rocm-repo-management-api · 2024-12-13T18:55:49Z

Jenkins build for da78ffb5ae9c5d8d1b745237839e04bd4a1fd01b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api · 2024-12-17T19:55:41Z

Jenkins build for da78ffb5ae9c5d8d1b745237839e04bd4a1fd01b commit is in progress
Links: Blue Ocean view / Build artifacts

rocm-mici mentioned this pull request Dec 13, 2024

[release/2.2] [ROCm] Correct numerical issues in layer norm backwards kernel (#140259) #1767

Merged
