
1.7.0-rc #111

Merged · 1 commit · Sep 23, 2024

Conversation


@Anerudhan Anerudhan commented Sep 19, 2024

cuDNN FE 1.7.0 release notes:

## New API

- Kernel cache support for dynamic graphs. Added new APIs to enable kernel cache support for graphs with dynamic shapes. Please refer to the [documentation](docs/dynamic_kernel_cache.md) for API details.

Added examples `Convolution fprop dynamic shape`, `CSBR Graph dynamic shape`, `Matmul dynamic shape`, and `Bias + Matmul dynamic shape` to showcase the use of dynamic shapes and the kernel cache. A minimal sketch follows below.
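Below is a minimal sketch of wiring up the kernel cache, assuming the `KernelCache` object and the `set_dynamic_shape_enabled`/`set_kernel_cache` graph setters used by the new dynamic-shape examples; the tensor shapes and matmul topology are illustrative only.

```
#include <memory>
#include <cudnn_frontend.h>

namespace fe = cudnn_frontend;

// One cache shared across rebuilds of the same graph topology with
// different shapes, so compiled kernels are reused between builds.
// (Type and setter names assumed from the dynamic-shape examples.)
static auto kernel_cache = std::make_shared<fe::KernelCache>();

std::shared_ptr<fe::graph::Graph>
build_matmul_graph(int64_t b, int64_t m, int64_t n, int64_t k) {
    auto graph = std::make_shared<fe::graph::Graph>();
    graph->set_io_data_type(fe::DataType_t::HALF)
        .set_compute_data_type(fe::DataType_t::FLOAT);

    // Opt the graph into dynamic shapes and attach the shared cache.
    graph->set_dynamic_shape_enabled(true).set_kernel_cache(kernel_cache);

    auto A = graph->tensor(fe::graph::Tensor_attributes()
                               .set_name("A")
                               .set_dim({b, m, k})
                               .set_stride({m * k, k, 1}));
    auto B = graph->tensor(fe::graph::Tensor_attributes()
                               .set_name("B")
                               .set_dim({b, k, n})
                               .set_stride({k * n, n, 1}));

    auto C = graph->matmul(A, B, fe::graph::Matmul_attributes().set_name("GEMM"));
    C->set_output(true);

    return graph;
}
```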

- Two new APIs to describe a plan in terms of its engine number and knobs:

```
error_t
get_plan_name(std::string &name) const;

error_t
get_plan_name_at_index(int64_t plan_index, std::string &name) const;
```

Note: this name can later be passed to `deselect_plan_by_name` if you run into any errors with a particular plan.
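For example, a minimal sketch (assuming `graph` has already been built and candidate plans created):

```
// Record a candidate plan's name so a problematic plan can later be
// deselected by name and the remaining plans used instead.
std::string plan_name;
if (graph.get_plan_name_at_index(0, plan_name).is_good()) {
    // ... if executing plan 0 later fails:
    graph.deselect_plan_by_name(plan_name);
}
```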

- Added an API to query a tensor's attributes from its UID in a graph:
`query_tensor_with_uid(int64_t const uid, Tensor_attributes &tensor) const;`
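A usage sketch, assuming `graph` is a built graph and `uid` was assigned to the tensor with `set_uid()` at construction time:

```
// Recover dims/strides/data type for the tensor registered under `uid`,
// e.g. to size device buffers before binding the variant pack.
fe::graph::Tensor_attributes attrs;
if (graph.query_tensor_with_uid(uid, attrs).is_good()) {
    auto dims = attrs.get_dim();
}
```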

## Improvements

- The SDPA fp16 bprop node can now compute dbias when the padding mask is enabled (requires cuDNN 9.4.0 and above).

- The SDPA fp8 (forward and bprop) nodes now support optional bias, dropout, and padding mask (requires cuDNN 9.4.0 and above).

- The Matmul fp8 node can now accept M, N, and K overrides (see the sketch after this list).

- Added new Python notebooks implementing BatchNorm and BatchNorm bprop using cuDNN.

- Updated [benchmark numbers](benchmark) with cuDNN 9.4.0 for fp16 and fp8 data types.

- Fixed compilation issues when `NV_CUDNN_DISABLE_EXCEPTION` is enabled.
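Regarding the fp8 matmul overrides above: a hedged sketch of what passing them might look like. The `set_m_override`-style setter names and the int32 override-tensor layout are assumptions based on the existing matmul attribute pattern, not confirmed by these notes; `graph` and the batch count `b` come from surrounding setup.

```
// Per-batch effective GEMM extents, so one fp8 matmul graph can serve
// ragged batches. Setter names below are assumed, not documented here.
auto m_override = graph->tensor(fe::graph::Tensor_attributes()
                                    .set_name("m_override")
                                    .set_dim({b, 1, 1})
                                    .set_stride({1, 1, 1})
                                    .set_data_type(fe::DataType_t::INT32));

auto mm_attrs = fe::graph::Matmul_attributes()
                    .set_name("fp8_matmul")
                    .set_m_override(m_override);
```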

## Bug fixes

- Fixed a crash when the output dimension of the dgrad node is not specified; this now returns an error message instead.

- Fixed incorrect stride inference for the SDPA stats tensor.

- Fixed a bug in the SDPA test when sliding window attention is enabled and the query sequence length (s_q) is greater than the key/value sequence length (s_kv); this case is now explicitly unsupported.

@Anerudhan force-pushed the 1.7.0-rc branch 2 times, most recently from faa82bf to ece77ca on September 19, 2024 at 21:22.