v2.7.2
This is a patch release containing the following changes to v2.7.1:
- Fixed segfaults in deconvolution backpropagation with ACL on AArch64-based processors (f02e6f3)
- Fixed code generation issues in Intel AVX2 convolution implementation (2ba2523, b60633f, 844326b, 2009164)
- Fixed correctness issues and runtime errors in deconvolution with binary post-ops on Intel GPUs (dd54d39)
- Improved performance of convolutions with a small number of channels and large spatial sizes on systems with Intel AMX (26f97dc, 4cb648d)
- Fixed runtime error in int8 convolutions with groups on Xe architecture based GPUs (e5a70f4)
- Improved inner product weight gradient performance on Xe architecture based GPUs (9e9b859, 12ec4e3)
- Improved batch normalization performance with threadpool threading (4fd5ab2)
- Improved inner product performance with binary post-ops in broadcast mode on Intel CPUs (d43c70d, 49ca4e1)
- Fixed segfaults and correctness issues in sum primitive with threadpool threading (ee7a321)
- Extended persistent cache API to cover engine objects (58481d6, 5f69dad, 16c0a95, 068071b)
- Added support for newer versions of Intel GPU drivers (7144393)
- Updated ITT API version to 3.23.0 (d23cc95)
- Fixed convolution correctness issue on Intel Data Center GPU Flex Series (365ac20)
- Fixed fp64 convolution correctness issue on Intel Data Center GPU MAX Series (9d4bf94, 6705403)
- Fixed correctness issues in reduction primitive with binary post-op on Intel GPUs (ae9d075, e3b80c5)
- Improved convolution performance on Intel Data Center GPU MAX Series (90be8d5, caf4863)
- Fixed build errors with ONEDNN_ENABLE_PRIMITIVE_GPU_ISA build option (de2db04)
- Fixed correctness issues in convolution with per-tensor binary post-ops on Intel CPUs (9cf9c18); see the usage sketch after this list
- Improved convolution performance on Intel Data Center GPU Flex Series (8b08a07)
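For context on the binary post-op items above: binary post-ops are attached to a primitive through `dnnl::post_ops` and `dnnl::primitive_attr`, and the extra tensor is supplied as an additional execution argument. Below is a minimal sketch using the v2.x C++ API; the shapes and the scalar (per-tensor) addend are illustrative assumptions, not part of this release.

```cpp
// Minimal sketch: convolution with a per-tensor binary_add post-op (oneDNN v2.x C++ API).
#include <unordered_map>
#include "oneapi/dnnl/dnnl.hpp"

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    // Illustrative shapes: N=1, IC=8, 14x14 spatial, OC=16, 3x3 kernel, stride 1, padding 1.
    memory::dims src_dims = {1, 8, 14, 14};
    memory::dims wei_dims = {16, 8, 3, 3};
    memory::dims dst_dims = {1, 16, 14, 14};

    auto src_md = memory::desc(src_dims, memory::data_type::f32, memory::format_tag::any);
    auto wei_md = memory::desc(wei_dims, memory::data_type::f32, memory::format_tag::any);
    auto dst_md = memory::desc(dst_dims, memory::data_type::f32, memory::format_tag::any);

    // Per-tensor binary post-op: a single f32 value broadcast over the whole destination.
    auto bin_md = memory::desc({1, 1, 1, 1}, memory::data_type::f32, memory::format_tag::nchw);
    post_ops ops;
    ops.append_binary(algorithm::binary_add, bin_md);
    primitive_attr attr;
    attr.set_post_ops(ops);

    // v2.x API: operation descriptor first, then primitive descriptor with attributes.
    auto conv_d = convolution_forward::desc(prop_kind::forward_inference,
            algorithm::convolution_direct, src_md, wei_md, dst_md,
            /*strides*/ {1, 1}, /*padding_l*/ {1, 1}, /*padding_r*/ {1, 1});
    auto conv_pd = convolution_forward::primitive_desc(conv_d, attr, eng);
    auto conv = convolution_forward(conv_pd);

    // Allocate memory in whatever layouts the selected implementation expects.
    auto src_mem = memory(conv_pd.src_desc(), eng);
    auto wei_mem = memory(conv_pd.weights_desc(), eng);
    auto dst_mem = memory(conv_pd.dst_desc(), eng);
    auto bin_mem = memory(bin_md, eng);

    // The binary post-op source is passed as an extra execution argument.
    std::unordered_map<int, memory> args = {
            {DNNL_ARG_SRC, src_mem},
            {DNNL_ARG_WEIGHTS, wei_mem},
            {DNNL_ARG_DST, dst_mem},
            {DNNL_ARG_ATTR_MULTIPLE_POST_OP(0) | DNNL_ARG_SRC_1, bin_mem}};
    conv.execute(strm, args);
    strm.wait();
    return 0;
}
```

The same attribute mechanism applies to the inner product and reduction cases mentioned above; only the primitive descriptor changes, while the post-op tensor is still passed via `DNNL_ARG_ATTR_MULTIPLE_POST_OP(n) | DNNL_ARG_SRC_1`.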