Commit
Fix overflow in add_dpbusd_epi32x2 (#268)
* Fix overflow in add_dpbusd_epi32x2

This patch fixes a 16-bit overflow in the *_add_dpbusd_epi32x2 functions, which can be triggered in rare cases depending on the NNUE weights. While the code leads to some slowdown on affected architectures (most notably avx2), the fix is simpler than some of the other options discussed in #4394.

Code suggested by Sopel97.

Result of "bench 4096 1 30 default depth nnue":

| Architecture        | master    | patch (gcc) | patch (clang) |
|---------------------|-----------|-------------|---------------|
| x86-64-vnni512      | 762122798 | 762122798   | 762122798     |
| x86-64-avx512       | 769723503 | 762122798   | 762122798     |
| x86-64-bmi2         | 769723503 | 762122798   | 762122798     |
| x86-64-ssse3        | 769723503 | 762122798   | 762122798     |
| x86-64              | 762122798 | 762122798   | 762122798     |

The following architectures will experience a ~4% slowdown due to an additional instruction in the middle of the hot path:

* x86-64-avx512
* x86-64-bmi2
* x86-64-avx2
* x86-64-sse41-popcnt (x86-64-modern)
* x86-64-ssse3
* x86-32-sse41-popcnt

official-stockfish/Stockfish@2c36d1e

* Unify type alias declarations

The commit unifies the declaration of type aliases by replacing all typedefs with the corresponding using statements.

closing #4412

No functional change

official-stockfish/Stockfish@564456a

# Conflicts:
#	source/misc.cpp
#	source/misc.h
#	source/movepick.h
#	source/position.h
#	source/usi.h
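
As a rough illustration of the kind of 16-bit overflow the first commit describes, the scalar C++ sketch below models a single lane of an 8-bit multiply-accumulate. It is not the Stockfish code: the helper names (`sat16`, `pair_product`, `combine_narrow`, `combine_wide`) are hypothetical, and it assumes the problem arises when two 16-bit intermediate sums are added with saturation before being widened to 32 bits.

```cpp
#include <cstdint>

// Clamp a 32-bit value into the int16_t range (models saturating 16-bit arithmetic).
inline std::int16_t sat16(std::int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<std::int16_t>(v);
}

// Hypothetical scalar model of one 16-bit lane: two unsigned-8 x signed-8
// products are summed and saturated to 16 bits.
inline std::int16_t pair_product(std::uint8_t a0, std::int8_t b0,
                                 std::uint8_t a1, std::int8_t b1) {
    return sat16(static_cast<std::int32_t>(a0) * b0 +
                 static_cast<std::int32_t>(a1) * b1);
}

// Assumed overflow-prone variant: add the two 16-bit sums with 16-bit
// saturation first, then widen. The clamp can silently lose magnitude.
inline std::int32_t combine_narrow(std::int16_t p0, std::int16_t p1) {
    return sat16(static_cast<std::int32_t>(p0) + p1);
}

// Assumed safe variant: widen each 16-bit sum to 32 bits before adding.
// This costs one extra widening step per call.
inline std::int32_t combine_wide(std::int16_t p0, std::int16_t p1) {
    return static_cast<std::int32_t>(p0) + static_cast<std::int32_t>(p1);
}
```

With inputs of 127 and weights of 100, each `pair_product` is 25400 and fits in 16 bits, but the sum of two such values (50800) does not: `combine_narrow` returns 32767 while `combine_wide` returns 50800. Whether a given network ever reaches such values depends on its weights, which is consistent with the commit's note that the overflow is rare.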