[API change] `Scaled_dot_product_flash_attention_attributes` and `Scaled_dot_product_flash_attention_backward_attributes` now accept K and V tensors instead of K-transpose and V-transpose. This is a deviation from the backend API, made in response to feedback from multiple customers (see the sketch after this list).
[New API] Add the `tensor_like` python API, which accepts a DLPack-compatible tensor. This simplifies cudnn tensor creation (see the sketch after this list).
[New Feature] Setting the `CUDNN_FRONTEND_ATTN_DP_WORKSPACE_LIMIT` environment variable allows choosing between different optimized cudnn backend kernels. See docs/operations/mha for more details (a usage sketch follows this list).
[New Feature] Add RMSNorm and InstanceNorm forward and backward implementations.
[New Feature] Add alibi, padding, and layout support for the attention bprop node.
[New Feature] Introduce python bindings for plans. This allows validating the graph and filtering plans.
[Bug Fix] Fix relative includes of filenames in cudnn_frontend headers. This resolves compilation issues in certain toolchains.
[Bug Fix] Fix a segfault when dropout was set for some scaled dot product flash attention nodes.
[New samples] Add new samples for `apply_rope`, layernorm forward and backward, and rmsnorm forward and backward.
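A minimal sketch of the updated attention call through the python bindings, also showing `tensor_like` with a DLPack-compatible (here, PyTorch) tensor. The `pygraph` and `scaled_dot_product_flash_attention` entry points and their keyword arguments follow this release's python samples but are assumptions here, not a verbatim excerpt; graph finalization and execution are omitted.

```python
# Sketch only: assumes the cudnn_frontend python bindings ("import cudnn") and the
# scaled_dot_product_flash_attention graph API as used in this release's samples.
import cudnn
import torch

b, h, s, d = 2, 8, 1024, 64  # illustrative batch, heads, sequence length, head dim

# Q, K, V live on the GPU in (b, h, s, d) layout. K and V are passed as-is,
# with no pre-transposition, per the API change above.
q_gpu = torch.randn(b, h, s, d, dtype=torch.float16, device="cuda")
k_gpu = torch.randn(b, h, s, d, dtype=torch.float16, device="cuda")
v_gpu = torch.randn(b, h, s, d, dtype=torch.float16, device="cuda")

graph = cudnn.pygraph(
    io_data_type=cudnn.data_type.HALF,
    intermediate_data_type=cudnn.data_type.FLOAT,
    compute_data_type=cudnn.data_type.FLOAT,
)

# tensor_like builds a cudnn graph tensor from any DLPack-compatible tensor,
# picking up its dims, strides, and data type automatically.
q = graph.tensor_like(q_gpu)
k = graph.tensor_like(k_gpu)  # K, not K-transpose
v = graph.tensor_like(v_gpu)  # V, not V-transpose

o, stats = graph.scaled_dot_product_flash_attention(
    name="sdpa",
    q=q,
    k=k,
    v=v,
    is_inference=True,              # stats is only produced in training mode
    attn_scale=1.0 / (d ** 0.5),
    use_causal_mask=True,
)
o.set_output(True)

# Building, planning, and executing the graph are omitted from this sketch.
```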
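A small usage sketch for the workspace-limit variable. The variable name comes from the note above, but the value shown is purely illustrative; the accepted values and their semantics are documented in docs/operations/mha.

```python
import os

# Must be set before the attention backward graph is built so cudnn_frontend can
# choose among its optimized backend kernels under this workspace budget.
# The 256 MiB figure below is an illustrative assumption, not a documented default.
os.environ.setdefault("CUDNN_FRONTEND_ATTN_DP_WORKSPACE_LIMIT", str(256 * 1024 * 1024))
```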
66 changed files with 6,641 additions and 3,372 deletions.