Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support avx512 bitmasks with dynamic feature detection #332

Open
reinerp opened this issue Mar 7, 2023 · 3 comments · May be fixed by #333
Open

Support avx512 bitmasks with dynamic feature detection #332

reinerp opened this issue Mar 7, 2023 · 3 comments · May be fixed by #333
Labels
C-feature-request Category: a feature request, i.e. not implemented / a PR

Comments

@reinerp
Copy link

reinerp commented Mar 7, 2023

The choice between full_masks.rs and bitmask.rs seems to be made at compile time (

) based on compile-time presence of the avx512f target feature.

When compiling for general x86-64 targets but doing runtime feature detection for avx512f, the current approach will generate suboptimal code for avx512, because it will use the full_masks.rs implementation instead of the bitmask.rs implementation.

I'd like to be able to use bitmasks in avx512 even when using runtime feature detection. It's probably out of scope (and, I think, undesirable) for std::simd to be responsible for runtime feature detection, so one option would be for std::simd to be refactored to expose both Mask and Bitmask as two separate types -- perhaps both implementing the same set of traits. This way, callers could choose between a full-mask implementation and a bitmask implementation based on whatever logic they choose, including e.g. runtime feature detection.

@reinerp reinerp added the C-feature-request Category: a feature request, i.e. not implemented / a PR label Mar 7, 2023
@reinerp
Copy link
Author

reinerp commented Mar 8, 2023

Actually looking through the implementation of bitmask.rs, I see that in avx512 mode, Mask<T, LANES> is just a wrapper for LANES::BitMask, which is already accessible to users. So I guess the answer to my request is already available: it is "use LANES::BitMask if you want a guarantee of single-bit-per-lane". Perhaps some additional documentation is all that is required.

@calebzulawski
Copy link
Member

It's worth noting that when compiling for instruction sets that use bitmasks, the compiler often optimizes these lane-width masks to bitmasks (as far as LLVM is concerned, all masks are truncated to bitmasks anyway). The layout matters less than you might think--only when writing masks to memory or general purpose registers to use them outside of SIMD operations, which is relatively rare.

@reinerp
Copy link
Author

reinerp commented Mar 13, 2023

That's a really great point Caleb, and I think it pretty much entirely addresses my concern. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: a feature request, i.e. not implemented / a PR
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants