cudnn frontend v1.3 release notes. (#72)
[New API] Added new operations `sdpa_fp8_forward` and `sdpa_fp8_backward` to perform scaled dot product attention on fp8 tensors. See `docs/operations/Attention.md` for details and `samples/cpp/mha.cpp` for a C++ sample. Python bindings (pybinds) for the fp8 nodes are also added.

[New API] Added a new resample forward operation, along with a new sample `samples/cpp/resample.cpp` that shows its usage.

[New API] Added a new API `deselect_engines(std::vector<std::string> const &engine_names)` which blocks the named engine configs from running.

[New API] Added new APIs `select_numeric_notes` and `select_behavior_notes`, which let users select engine configs that have the given numeric and behavior notes, respectively.
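
The select/deselect behavior described in the two entries above can be pictured with a small standalone sketch. Nothing here calls cudnn_frontend: `EngineConfig`, the helper functions, and the engine names `eng0` to `eng2` are hypothetical, the note strings are used only as labels, and the rule that a config must carry every selected note is an assumption rather than something these notes state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EngineConfig:
    """Hypothetical stand-in for an engine config; not a cudnn_frontend type."""
    name: str
    numeric_notes: frozenset = field(default_factory=frozenset)
    behavior_notes: frozenset = field(default_factory=frozenset)


def drop_engines(configs, engine_names):
    """Block configs by name (the idea behind deselect_engines)."""
    blocked = set(engine_names)
    return [c for c in configs if c.name not in blocked]


def keep_with_numeric_notes(configs, notes):
    """Keep configs carrying every requested numeric note (assumed semantics)."""
    wanted = set(notes)
    return [c for c in configs if wanted <= c.numeric_notes]


def keep_with_behavior_notes(configs, notes):
    """Keep configs carrying every requested behavior note (assumed semantics)."""
    wanted = set(notes)
    return [c for c in configs if wanted <= c.behavior_notes]


# Illustrative candidates only; real engine names and notes come from the backend.
configs = [
    EngineConfig("eng0", frozenset({"TENSOR_CORE"}), frozenset({"RUNTIME_COMPILATION"})),
    EngineConfig("eng1", frozenset({"TENSOR_CORE"})),
    EngineConfig("eng2"),
]

remaining = drop_engines(configs, ["eng1"])
remaining = keep_with_numeric_notes(remaining, ["TENSOR_CORE"])
print([c.name for c in remaining])  # ['eng0']
```

Blocking by name and selecting by note compose as successive filters over the candidate list, so their order does not change the result in this sketch.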

[Python API] Added a custom exception `cudnnGraphNotSupportedException` to the Python API to distinguish graphs that are genuinely unsupported from programming errors.

[Python API] Added a new `backend_version_string` which returns the backend version in canonical form (e.g. 9.1.0) instead of a version number.
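
To illustrate the difference between a numeric version and its canonical form, here is a minimal sketch. It does not use the cudnn_frontend API: `canonical_version` is a hypothetical helper, and the integer encoding (major * 10000 + minor * 100 + patch for cuDNN 9.x, major * 1000 + minor * 100 + patch for 8.x) is an assumption not stated in these notes.

```python
def canonical_version(version_number: int) -> str:
    """Format an integer cuDNN-style version as 'major.minor.patch'.

    Assumed encoding (not taken from these release notes):
      9.x and later:   major * 10000 + minor * 100 + patch
      8.x and earlier: major * 1000  + minor * 100 + patch
    """
    if version_number >= 90000:
        major, rest = divmod(version_number, 10000)
    else:
        major, rest = divmod(version_number, 1000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


print(canonical_version(90100))  # 9.1.0
print(canonical_version(8907))   # 8.9.7
```

A string in this form avoids callers having to know which encoding a given backend release uses.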

[Bug Fix] Updated the workspace computation for the sdpa fprop node. Previously, workspace was calculated for alibi slopes regardless of whether the alibi mask was enabled.

[Bug Fix] Fixed deserialization of pass by values of half precision.
Anerudhan authored Apr 10, 2024
1 parent af9bc9e commit 1b0b5ea
Showing 50 changed files with 3,466 additions and 599 deletions.
6 changes: 5 additions & 1 deletion CMakeLists.txt
@@ -1,6 +1,6 @@
cmake_minimum_required(VERSION 3.17)

-project(cudnn_frontend VERSION 1.2.1)
+project(cudnn_frontend VERSION 1.3.0)

option(CUDNN_FRONTEND_SKIP_NLOHMANN_JSON "Defines whether FE should not include nlohmann/json.hpp." OFF)
option(CUDNN_FRONTEND_BUILD_SAMPLES "Defines if samples are built or not." ON)
@@ -39,6 +39,10 @@ target_link_libraries(

target_compile_features(cudnn_frontend INTERFACE cxx_std_17)

+# Make PCH for targets to link against
+add_library(_cudnn_frontend_pch INTERFACE)
+target_precompile_headers(_cudnn_frontend_pch INTERFACE ${PROJECT_SOURCE_DIR}/include/cudnn_frontend.h)
+
if (CUDNN_FRONTEND_BUILD_SAMPLES)
add_subdirectory(samples)
endif()
38 changes: 22 additions & 16 deletions README.FE.1.0.md
@@ -29,21 +29,24 @@ The steps involved in building and running a cudnn graph are as follows:
## APIs
FE v1.0 API follows a functional style of building a graph. Operations take in input tensors and return output tensors. This also allows composition of operations.

-| Purpose | C++ API | Python API |
-| --- | --- | --- |
-| Create tensor | tensor | tensor |
-| [Convolution Fprop](docs/operations/Convolutions.md) | conv_fprop <br>Conv_fprop_attributes | conv_fprop |
-| [Convolution Dgrad](docs/operations/Convolutions.md) | conv_dgrad <br>Conv_dgrad_attributes | conv_dgrad |
-| [Convolution Wgrad](docs/operations/Convolutions.md) | conv_wgrad <br>Conv_wgrad_attributes | conv_wgrad |
-| [Matrix Multiplication](docs/operations/Matmul.md) | matmul <br> Matmul_attributes | matmul |
-| [Pointwise Operations](docs/operations/Pointwise.md) | pointwise <br> Pointwise_attributes | - add<br>- bias<br>- rqsrt<br>- sub<br>- mul<br>- scale<br>- relu<br>- elu<br>- gelu<br>- cmp_gt |
-| [Batch Normalization](docs/operations/Normalizations.md) | batchnorm <br>Batchnorm_attributes | batchnorm |
-| [Batch Norm bprop](docs/operations/Normalizations.md) | batchnorm_backward <br>Batchnorm_backward_attributes | batchnorm_backward |
-| Generate stats of output| genstats <br>Genstats_attributes | genstats |
-| BN Finalize of stats | bn_finalize <br>BN_finalize_attributes | bn_finalize |
-| Dbn weight | dbn_weight <br>DBN_weight_attributes | dbn_weight |
-| [Scale dot product attention](docs/operations/Attention.md) | sdpa<br> SDPA_attributes | sdpa |
-| [Scale dot product attention backward](docs/operations/Attention.md) | sdpa_backward<br> SDPA_backward_attributes | sdpa_backward |
+| Purpose | C++ API | Python API |
+|--------------------------------------------------------------------------|------------------------------------------------------|--------------------------------------------------------------------------------------------------|
+| Create tensor | tensor | tensor |
+| [Convolution Fprop](docs/operations/Convolutions.md) | conv_fprop <br>Conv_fprop_attributes | conv_fprop |
+| [Convolution Dgrad](docs/operations/Convolutions.md) | conv_dgrad <br>Conv_dgrad_attributes | conv_dgrad |
+| [Convolution Wgrad](docs/operations/Convolutions.md) | conv_wgrad <br>Conv_wgrad_attributes | conv_wgrad |
+| [Matrix Multiplication](docs/operations/Matmul.md) | matmul <br> Matmul_attributes | matmul |
+| [Pointwise Operations](docs/operations/Pointwise.md) | pointwise <br> Pointwise_attributes | - add<br>- bias<br>- rqsrt<br>- sub<br>- mul<br>- scale<br>- relu<br>- elu<br>- gelu<br>- cmp_gt |
+| [Batch Normalization](docs/operations/Normalizations.md) | batchnorm <br>Batchnorm_attributes | batchnorm |
+| [Batch Norm bprop](docs/operations/Normalizations.md) | batchnorm_backward <br>Batchnorm_backward_attributes | batchnorm_backward |
+| Generate stats of output | genstats <br>Genstats_attributes | genstats |
+| BN Finalize of stats | bn_finalize <br>BN_finalize_attributes | bn_finalize |
+| Dbn weight | dbn_weight <br>DBN_weight_attributes | dbn_weight |
+| [Resampling](docs/operations/Resampling.md) | resample <br>Resample_attributes | resample |
+| [Scale dot product attention](docs/operations/Attention.md) | sdpa<br> SDPA_attributes | sdpa |
+| [Scale dot product attention backward](docs/operations/Attention.md) | sdpa_backward<br> SDPA_backward_attributes | sdpa_backward |
+| [Scale dot product attention FP8](docs/operations/Attention.md) | sdpa_fp8<br> SDPA_fp8_attributes | sdpa_fp8 |
+| [Scale dot product attention backward FP8](docs/operations/Attention.md) | sdpa_fp8_backward<br> SDPA_fp8_backward_attributes | sdpa_fp8_backward |

### Create Graph
Instantiate an object of class `cudnn_frontend::graph::Graph` which will house tensors and operations.
@@ -141,9 +144,12 @@ cudnn_frontend::Graph::build_plan_at_index(


### Filter plans (optional)
-Users can filter out plans against numerical, behavioral notes, or plans that do not provide desired functional correctness.
+Users can filter plans on numerical, behavioral notes, or plans that do not provide desired functional correctness.

```
+cudnn_frontend::graph::Graph& cudnn_frontend::graph::Plans::select_numeric_notes(std::vector<cudnn_frontend::NumericalNote_t> const&);
+cudnn_frontend::graph::Graph& cudnn_frontend::graph::Plans::select_behavior_notes(std::vector<cudnn_frontend::BehaviorNote_t> const&);
cudnn_frontend::graph::Graph& cudnn_frontend::graph::Plans::deselect_numeric_notes(std::vector<cudnn_frontend::NumericalNote_t> const&);
cudnn_frontend::graph::Graph& cudnn_frontend::graph::Plans::deselect_behavior_notes(std::vector<cudnn_frontend::BehaviorNote_t> const&);
cudnn_frontend::graph::Graph& cudnn_frontend::graph::Plans::deselect_workspace_greater_than(int64_t const workspace);