Releases: NVIDIA/cudnn-frontend
v0.4 release
[New API] : Added a new function get_heuristics_list, which accepts a list of heuristic modes and returns a concatenated list of the engine heuristics.
[New Feature]: New heuristic mode (HEUR_MODE_FALLBACK) added to the backend. The sample has been updated to use it, which provides a generic way to access the fallback engines. FallbackEngineList is retained as a way to add custom engines in the frontend.
[New Feature]: Added support to set vectorization dimension and vectorization count attributes in the tensor descriptor.
[Rename]: setDataType in OperationBuilder is deprecated and replaced with the clearer setComputePrecision().
[CleanUp] : cudnnFindPlan and cudnnGetPlan now take an L-value operationGraph rather than an R-value as before.
[CleanUp] : cudnnFindPlan and time_sorted_plan return executionPlans_t (a vector of plans) instead of executionOptions_t (a vector of structs, each containing a plan and its time). This achieves compatibility with cudnnGet.
[Samples]: New sample added for DP4A.
[Samples]: ConvBiasScaleRelu sample.
[Bug fix]: Errata filter was erroneously filtering out unspecified engines.
MR for quick fix for graceful exit
[Maintenance] Adding status check on the cudnnBackendExecute during warm up.
[Maintenance] Adding status check on json_handle when loading from a file
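The concatenation behaviour described for get_heuristics_list in the v0.4 notes can be illustrated with a small standalone sketch. It does not use the frontend's real types or signature: EngineConfig, HeuristicQuery, concat_heuristics, toy_query, the mode strings and the engine names are all hypothetical stand-ins chosen for illustration. The only point shown is that results are appended in the order the modes are listed, so a fallback mode placed last contributes its engines last.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an engine configuration. The real frontend
// wraps backend descriptors rather than plain strings.
using EngineConfig = std::string;
using EngineConfigList = std::vector<EngineConfig>;

// Hypothetical callback that returns the engine configs for one heuristic mode.
using HeuristicQuery = std::function<EngineConfigList(const std::string& mode)>;

// Query each mode in the order given and append its results to one list.
EngineConfigList concat_heuristics(const std::vector<std::string>& modes,
                                   const HeuristicQuery& query) {
    EngineConfigList combined;
    for (const auto& mode : modes) {
        EngineConfigList configs = query(mode);
        combined.insert(combined.end(), configs.begin(), configs.end());
    }
    return combined;
}

// Toy query used only for illustration; mode and engine names are made up.
EngineConfigList toy_query(const std::string& mode) {
    if (mode == "heuristics_instant") return {"engine_a", "engine_b"};
    if (mode == "heuristics_fallback") return {"engine_fallback"};
    return {};  // Unknown modes contribute nothing.
}
```

Calling concat_heuristics({"heuristics_instant", "heuristics_fallback"}, toy_query) in this sketch yields the two instant-mode entries followed by the fallback entry.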
v0.3
[New feature] Support reduction operation in the frontend.
[New feature] Add engine runtime compilation filter in the frontend as a behavior filter.
[New feature] Adding fallback list for convBiasAct
[New feature Beta] Adding Errata filter with a sample.
[Samples] Add ConvBnstats and ConvColReduction tests
[Bug Fix] Clamp upper_clip to float max in the pointwise descriptor when computeType is float.
[Bug Fix] Compilation fix for newer gcc toolchain (gcc 9+).
[Bug Fix] Add operation tag to the Plan generated by cudnnFind and cudnnGet
[Maintenance] Added default fallback lists to newer versions of cudnn.
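The upper_clip clamp mentioned in the v0.3 bug fixes can be pictured with the following minimal sketch. It assumes the clip value is held as a double and that the clamp only applies when the compute type is float. ComputeType and clamp_upper_clip are hypothetical names for illustration, not the frontend's actual code.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical compute-type tag; the real frontend uses cuDNN data types.
enum class ComputeType { Float, Double };

// Limit upper_clip to the largest finite float when computing in float,
// so the value stays representable after narrowing. Other compute types
// pass the value through unchanged.
double clamp_upper_clip(double upper_clip, ComputeType type) {
    if (type == ComputeType::Float) {
        const double float_max =
            static_cast<double>(std::numeric_limits<float>::max());
        return std::min(upper_clip, float_max);
    }
    return upper_clip;
}
```

In this sketch a default of double max is reduced to float max for float compute, while an ordinary value such as 6.0 is returned unchanged.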