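The choice between full_masks.rs and bitmask.rs seems to be made at compile time (portable-simd/crates/core_simd/src/masks.rs, line 5 in 9bd30e7) based on compile-time presence of the avx512f target feature.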
When compiling for general x86-64 targets but doing runtime feature detection for avx512f, the current approach will generate suboptimal code for avx512, because it will use the full_masks.rs implementation instead of the bitmask.rs implementation.
I'd like to be able to use bitmasks in avx512 even when using runtime feature detection. It's probably out of scope (and, I think, undesirable) for std::simd to be responsible for runtime feature detection, so one option would be for std::simd to be refactored to expose both Mask and Bitmask as two separate types, perhaps both implementing the same set of traits. This way, callers could choose between a full-mask implementation and a bitmask implementation based on whatever logic they choose, including e.g. runtime feature detection.
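To make the caller-side choice concrete, here is a minimal sketch in stable Rust of picking a representation at run time. It does not use std::simd at all: `MaskRepr`, `repr_for`, and `detect_avx512f` are hypothetical names invented for illustration, and the only real API used is the standard `is_x86_feature_detected!` macro.

```rust
/// Hypothetical tag for the mask representation a caller wants to use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MaskRepr {
    /// One lane-width integer per lane (all ones or all zeros).
    Full,
    /// One bit per lane, matching AVX-512 mask registers.
    Bits,
}

/// Map "is avx512f available?" to a representation. Kept separate from
/// detection so the policy can be tested on any machine.
fn repr_for(has_avx512f: bool) -> MaskRepr {
    if has_avx512f {
        MaskRepr::Bits
    } else {
        MaskRepr::Full
    }
}

/// Run-time detection; always reports false on non-x86 targets.
fn detect_avx512f() -> bool {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx512f") {
            return true;
        }
    }
    false
}

fn main() {
    let repr = repr_for(detect_avx512f());
    println!("selected mask representation: {:?}", repr);
}
```

The point of the sketch is only that the decision lives in the caller, not in std::simd; a real version would dispatch to code paths built around the two mask types.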
Actually looking through the implementation of bitmask.rs, I see that in avx512 mode, Mask<T, LANES> is just a wrapper for LANES::BitMask, which is already accessible to users. So I guess the answer to my request is already available: it is "use LANES::BitMask if you want a guarantee of single-bit-per-lane". Perhaps some additional documentation is all that is required.
It's worth noting that when compiling for instruction sets that use bitmasks, the compiler often optimizes these lane-width masks to bitmasks (as far as LLVM is concerned, all masks are truncated to bitmasks anyway). The layout matters less than you might think: it only matters when writing masks to memory or to general-purpose registers in order to use them outside of SIMD operations, which is relatively rare.
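For readers unfamiliar with the two layouts, the following is a rough scalar model of them in stable Rust, not the std::simd implementation. `FullMask8`, `full_to_bitmask`, and `bitmask_to_full` are hypothetical names, and the 8-lane, 32-bit-lane shape is an arbitrary assumption.

```rust
use std::mem::size_of;

/// Full-mask layout (assumed shape: 8 lanes of i32).
/// A lane is -1 (all bits set) when true and 0 when false.
type FullMask8 = [i32; 8];

/// Pack a full mask into one bit per lane; lane 0 becomes bit 0.
fn full_to_bitmask(mask: &FullMask8) -> u8 {
    let mut bits = 0u8;
    for (lane, &m) in mask.iter().enumerate() {
        // A true lane is -1, so its sign bit is set.
        if m < 0 {
            bits |= 1 << lane;
        }
    }
    bits
}

/// Expand one bit per lane back into lane-width values.
fn bitmask_to_full(bits: u8) -> FullMask8 {
    let mut mask = [0i32; 8];
    for (lane, m) in mask.iter_mut().enumerate() {
        if (bits >> lane) & 1 == 1 {
            *m = -1;
        }
    }
    mask
}

fn main() {
    let full: FullMask8 = [-1, 0, -1, 0, 0, 0, 0, -1];
    let bits = full_to_bitmask(&full);
    println!("bitmask = {:#010b}", bits);
    println!(
        "full mask: {} bytes, bitmask: {} byte",
        size_of::<FullMask8>(),
        size_of::<u8>()
    );
    assert_eq!(bitmask_to_full(bits), full);
}
```

The size difference (32 bytes versus 1 byte in this model) is what shows up when a mask is stored to memory or moved to a general-purpose register; inside SIMD code the compiler is free to pick whichever form suits the target.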