v1.5.0 release
[New feature] With cudnn backend 9.2.0 and above, Graph::check_support can determine support for runtime engines without invoking the nvrtc compiler. This lets users query the support surface of cudnn without triggering nvrtc compilation.
[New feature] The Python pip wheel now contains the necessary C++ development headers.
[New feature] Sliding window attention is now supported as an attribute to the sdpa forward and bprop node. Usage: sdpa_attributes.set_sliding_window_length(window_length)
[New feature] Bottom right aligned causal masking is now supported as an attribute to the sdpa forward and bprop node. Usage: sdpa_attributes.use_causal_mask_bottom_right(true)
[New feature] SDPA bprop attributes can select a deterministic algorithm using the use_deterministic_algorithm API.
[New feature] Users can now filter a graph's candidate execution plans by their shared memory usage in cudnn 9.2.0 and later.
[Bug fix] Fixed a runtime error that occurred when the chosen execution plan candidate was incorrectly set in the backend. This happened when check_support did not correctly filter by the workspace size.
[Bug fix] Selecting/deselecting by behavior and numerical notes has been fixed and now works as intended.
[Debugging] A new tool for easy reproduction of a failure using the json representation of the graph can be found here.
[Samples] Restructured the cpp samples into categories for easier navigation.
[Samples] Added a sample to showcase how different plans can be built in parallel in separate threads.
[Compilation enhancement] Added a new macro, CUDNN_FRONTEND_SKIP_NLOHMANN_JSON, as a compilation flag that removes nlohmann::json as a compilation dependency. With it set, users lose access to certain API functions that depend on the library, such as print, key, serialize, and deserialize.
[Enhancement] Serialization of resample operation is now supported.
[Enhancement] A bug template has been added for new GitHub issues.
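
To make the two new sdpa masking attributes listed above more concrete, here is a minimal pure-Python sketch of the masks they are commonly understood to describe. It does not call cudnn. The helper names (sliding_window_causal_mask, bottom_right_causal_mask, render) are hypothetical and exist only for this illustration. The bottom-right rule assumed here is the usual one (query i may attend key j when j <= i + s_kv - s_q). The sliding-window rule assumed here is the common causal definition (query i attends the last window_length keys up to and including itself); the exact boundary convention used by cudnn may differ, so check the attention documentation before relying on it.

```python
def sliding_window_causal_mask(s_q, s_kv, window_length):
    """Assumed semantics: query i may attend key j if j <= i and i - j < window_length."""
    return [
        [(j <= i) and (i - j < window_length) for j in range(s_kv)]
        for i in range(s_q)
    ]


def bottom_right_causal_mask(s_q, s_kv):
    """Causal mask whose diagonal ends at the bottom-right corner of the s_q x s_kv matrix."""
    offset = s_kv - s_q  # shift the diagonal so the last query sees every key
    return [[j <= i + offset for j in range(s_kv)] for i in range(s_q)]


def render(mask):
    """Draw a mask as text: 'x' means attention is allowed, '.' means masked out."""
    return "\n".join(
        "".join("x" if allowed else "." for allowed in row) for row in mask
    )


if __name__ == "__main__":
    print(render(sliding_window_causal_mask(4, 4, 2)))
    print()
    print(render(bottom_right_causal_mask(2, 4)))
```

With s_q = 2 and s_kv = 4, the bottom-right mask renders as "xxx." followed by "xxxx", whereas a top-left aligned causal mask would render as "x..." followed by "xx..". The difference only shows up when the query and key/value sequence lengths differ.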