🚧 POC - support NaNs for SSE & AVX2 f32 #18
CodSpeed Performance Report: Merging #18
@varon I quickly wrote up what I pursued in this PR (it is more a POC than a proper PR). P.S.: I'll improve the phrasing tomorrow - just wanted to quickly push & share the code 🙃
When I run the benchmarks on my local machine I notice only a 3-4% regression 🤔
Is this superseded by #21?
Yes! This was some sort of a proof of concept, showcasing the utility of
This PR aims at handling NaNs (#16).
For the scalar implementation I check whether the scalar value is not equal to itself. This check will only be true if the scalar value is NaN, because in Rust (as in any IEEE 754 implementation) NaN is the only floating point value that compares unequal to itself.
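The self-inequality check can be sketched as follows. This is a minimal illustration, not the code from this PR; the helper name `first_nan_index` is hypothetical.

```rust
/// Hypothetical helper: returns the index of the first NaN in `data`, if any.
fn first_nan_index(data: &[f32]) -> Option<usize> {
    // NaN is the only f32 value for which `v != v` holds,
    // so this predicate fires exactly on NaN elements.
    data.iter().position(|&v| v != v)
}
```

For instance, `first_nan_index(&[1.0, f32::NAN, 3.0])` yields `Some(1)`, while a slice that only contains regular values or infinities yields `None`.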
For the SIMD implementation I used a transformation similar to the one used in #1. This transformation projects the NaNs to integer values that are either higher or lower than the "real" floating point values. The transformation leverages two's complement; for a visualization of the floating point bit layout, see https://observablehq.com/@rreusser/half-precision-floating-point-visualized
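A scalar sketch of such an order-preserving projection for `f32` is shown below. It assumes the common trick of flipping the lower 31 bits of negative values so that signed integer comparison matches float ordering; the exact SIMD code in this PR may differ, and the function name `f32_to_ordered_i32` is hypothetical.

```rust
/// Hypothetical helper: maps an f32 to an i32 such that comparing the
/// resulting integers (two's complement) orders the values like the floats.
fn f32_to_ordered_i32(v: f32) -> i32 {
    // Reinterpret the raw IEEE 754 bits as a signed integer.
    let bits = v.to_bits() as i32;
    // The arithmetic shift yields all ones for negative floats and all zeros
    // otherwise; the logical shift by 1 then clears the top bit so that the
    // sign bit itself is never flipped.
    let mask = (((bits >> 31) as u32) >> 1) as i32;
    // Non-negative floats are unchanged; negative floats get their
    // lower 31 bits inverted, which reverses their order.
    bits ^ mask
}
```

With this mapping, a NaN whose sign bit is 0 (e.g. `f32::from_bits(0x7FC0_0000)`) lands above the projection of `f32::INFINITY`, whereas a NaN whose sign bit is 1 (e.g. `f32::from_bits(0xFFC0_0000)`) lands below the projection of `f32::NEG_INFINITY`.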
Some remarks:
- `+inf` & `-inf` will get projected as well => this is indeed the case - see plot ⬇️
- For a NaN, all exponent bits should be set (`11111` for half precision) and the fraction should be non-zero. The sign bit may thus be 1 or 0, resulting in half of the NaNs getting projected above and the other half below the "real" floating point values. As a result, only 1 of the 2 checks (either > or <) will fire and detect the `NaN`. This might be a problem when we want to implement `argmin` and `argmax` as separate functions.

Paths I looked into but did not seem worthwhile: