Support avx512 bitmasks with dynamic feature detection #332

reinerp · 2023-03-07T22:05:21Z

The choice between full_masks.rs and bitmask.rs seems to be made at compile time (see the #[cfg_attr( attribute in portable-simd/crates/core_simd/src/masks.rs), based on compile-time presence of the avx512f target feature.

When compiling for general x86-64 targets but doing runtime feature detection for avx512f, the current approach will generate suboptimal code for avx512, because it will use the full_masks.rs implementation instead of the bitmask.rs implementation.
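To make the distinction concrete, here is a minimal sketch of the two kinds of check. The function names are made up for illustration; only cfg! and std::arch::is_x86_feature_detected! are real standard-library items. The first check is fixed when the crate is compiled, which is the kind of check the mask selection relies on; the second is evaluated on the machine that runs the binary.

```rust
/// Compile-time check: true only if this crate was built with avx512f
/// enabled, e.g. via `-C target-feature=+avx512f`.
fn avx512f_at_compile_time() -> bool {
    cfg!(target_feature = "avx512f")
}

/// Runtime check on x86/x86-64: asks the CPU that is actually running the code.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx512f_at_runtime() -> bool {
    std::arch::is_x86_feature_detected!("avx512f")
}

/// On other architectures the feature can never be present.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx512f_at_runtime() -> bool {
    false
}

fn main() {
    println!("compile time: {}", avx512f_at_compile_time());
    println!("runtime:      {}", avx512f_at_runtime());
}
```

A generic x86-64 build running on an avx512 machine reports false for the first check and true for the second, which is exactly the situation described above.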

I'd like to be able to use bitmasks in avx512 even when using runtime feature detection. It's probably out of scope (and, I think, undesirable) for std::simd to be responsible for runtime feature detection, so one option would be for std::simd to be refactored to expose both Mask and Bitmask as two separate types -- perhaps both implementing the same set of traits. This way, callers could choose between a full-mask implementation and a bitmask implementation based on whatever logic they choose, including e.g. runtime feature detection.
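As a rough sketch of what "two types implementing the same set of traits" could look like: everything below (LaneMask, FullMask8, BitMask8, count_selected) is hypothetical and written in plain Rust. None of it is std::simd API; it only illustrates the shape of the proposal, with the caller deciding which representation to use.

```rust
/// Hypothetical common interface for both mask representations.
trait LaneMask {
    const LANES: usize;
    /// Returns whether the given lane is selected.
    fn test(&self, lane: usize) -> bool;
    /// Counts the selected lanes.
    fn count_true(&self) -> usize {
        (0..Self::LANES).filter(|&lane| self.test(lane)).count()
    }
}

/// Hypothetical full mask: one i32 per lane, -1 for true and 0 for false.
struct FullMask8([i32; 8]);

/// Hypothetical bitmask: one bit per lane.
struct BitMask8(u8);

impl LaneMask for FullMask8 {
    const LANES: usize = 8;
    fn test(&self, lane: usize) -> bool {
        self.0[lane] != 0
    }
}

impl LaneMask for BitMask8 {
    const LANES: usize = 8;
    fn test(&self, lane: usize) -> bool {
        (self.0 >> lane) & 1 == 1
    }
}

/// The caller picks the representation, e.g. from a runtime feature check.
fn count_selected(pattern: u8, prefer_bitmask: bool) -> usize {
    if prefer_bitmask {
        BitMask8(pattern).count_true()
    } else {
        let mut lanes = [0i32; 8];
        for (lane, slot) in lanes.iter_mut().enumerate() {
            if (pattern >> lane) & 1 == 1 {
                *slot = -1;
            }
        }
        FullMask8(lanes).count_true()
    }
}

fn main() {
    println!("{}", count_selected(0b1010_0001, true));
}
```

In real code the prefer_bitmask flag would come from whatever logic the caller chooses, such as a runtime avx512f check.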


reinerp · 2023-03-08T01:55:27Z

Actually, looking through the implementation of bitmask.rs, I see that in avx512 mode, Mask<T, LANES> is just a wrapper for LANES::BitMask, which is already accessible to users. So I guess the answer to my request is already available: "use LANES::BitMask if you want a guarantee of single-bit-per-lane". Perhaps some additional documentation is all that is required.

calebzulawski · 2023-03-09T22:11:51Z

It's worth noting that when compiling for instruction sets that use bitmasks, the compiler often optimizes these lane-width masks to bitmasks (as far as LLVM is concerned, all masks are truncated to bitmasks anyway). The layout matters less than you might think--only when writing masks to memory or general purpose registers to use them outside of SIMD operations, which is relatively rare.
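For readers unfamiliar with the two layouts, the following is a small plain-Rust model of them. It is not the std::simd implementation; FullMask8, to_bitmask and from_bitmask are illustrative names. It shows that both layouts carry the same information and differ only in how they are stored, which is why the difference mostly shows up when a mask is written to memory or moved into a general purpose register.

```rust
/// Lane-width ("full") mask for 8 lanes of i32: -1 means true, 0 means false.
type FullMask8 = [i32; 8];

/// Packs a full mask into one bit per lane (lane 0 becomes bit 0).
fn to_bitmask(full: FullMask8) -> u8 {
    let mut bits = 0u8;
    for (lane, &value) in full.iter().enumerate() {
        if value != 0 {
            bits |= 1 << lane;
        }
    }
    bits
}

/// Expands a bitmask back into a full mask.
fn from_bitmask(bits: u8) -> FullMask8 {
    let mut full = [0i32; 8];
    for (lane, slot) in full.iter_mut().enumerate() {
        if bits & (1 << lane) != 0 {
            *slot = -1;
        }
    }
    full
}

fn main() {
    let full: FullMask8 = [-1, 0, -1, 0, 0, 0, 0, -1];
    let bits = to_bitmask(full);
    println!("{:#010b}", bits);
    println!(
        "full mask: {} bytes, bitmask: {} byte",
        std::mem::size_of::<FullMask8>(),
        std::mem::size_of::<u8>()
    );
    assert_eq!(from_bitmask(bits), full);
}
```

In this model the full mask occupies 32 bytes and the bitmask 1 byte, while the round trip between them is lossless.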

reinerp · 2023-03-13T02:41:49Z

That's a really great point, Caleb, and I think it pretty much entirely addresses my concern. Thanks!

reinerp added the C-feature-request label (Category: a feature request, i.e. not implemented / a PR) on Mar 7, 2023

reinerp linked a pull request Mar 8, 2023 that will close this issue

Document alternatives to Mask<T, LANES> that guarantee layout. Fixes #332 #333

Open
