Implement sum and difference of products using fma #8
Comments
Is this still a thing? I am wondering whether I want to give my time to create a PR for this, but as far as I understood it will take one additional instruction per method call:
vs
It will have higher precision and smaller errors. I am not sure, however, whether this (if introduced) should also be used in other API calls. EDIT: First call could be
It shouldn't replace that pattern in other code wholesale, just providing methods like
As a side note:
There are a few reasons that Rust doesn't do that automatically. Foremost, fma provides a different answer, and the Rust compiler will never apply an optimization that changes the result. Also, the default Rust target does not include the fma feature, so it will compile to a call to libc instead of to an fma instruction directly. As such, when not compiling for a specific target, fma can actually make your code significantly slower.

Even in the case where it does successfully compile to an FMA instruction, it's not a set-in-stone perf benefit. It depends on the architecture and also on the surrounding algorithm. If you have too many FMAs, you can saturate the limited number of FMA execution units and get worse performance because you've stalled the pipeline.

Also, I suspect there might be secondary reasons that converting to FMA provided that much of a speedup in your case. Even in the best case, just going to FMA shouldn't provide a 2x perf improvement. Perhaps the auto-vectorizer was able to figure out your code better because of its new structure with FMAs and vectorize some hot code.
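To make the "different answer" point concrete, here is a minimal sketch comparing a plain multiply-add with `f32::mul_add` on inputs picked so that the exact product does not fit in an `f32`. The function names (`unfused`, `fused`, `mul_add_if_native`, `rounding_demo`) are made up for illustration and are not part of ultraviolet. The `cfg!` gate assumes an x86 build, where the target feature is called `fma`.

```rust
/// Computes `a * b + c` with two roundings: one after the multiply, one after the add.
pub fn unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

/// Computes `a * b + c` with a single rounding via `f32::mul_add`.
pub fn fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Hypothetical helper: use `mul_add` only when the build enables the x86
/// `fma` target feature, otherwise fall back to a plain multiply and add.
/// The two branches can round differently, so results depend on build flags.
#[inline]
pub fn mul_add_if_native(a: f32, b: f32, c: f32) -> f32 {
    if cfg!(target_feature = "fma") {
        a.mul_add(b, c)
    } else {
        a * b + c
    }
}

/// Returns `(unfused, fused)` for inputs whose exact product is
/// `1 + 2^-11 + 2^-24`, which needs one bit more than an f32 can hold.
pub fn rounding_demo() -> (f32, f32) {
    let a = 1.0_f32 + 1.0 / 4096.0; // 1 + 2^-12
    let c = -(1.0_f32 + 1.0 / 2048.0); // -(1 + 2^-11)
    // Unfused: a * a rounds to 1 + 2^-11, so the sum is 0.0.
    // Fused: the exact product is kept, so the sum is 2^-24.
    (unfused(a, a, c), fused(a, a, c))
}
```

Whether `mul_add` becomes a single instruction depends on how the crate is built, for example with `RUSTFLAGS="-C target-feature=+fma"` or `-C target-cpu=native` on a CPU that supports it.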
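Nice insights above! Thank you for your input :-) I noticed that ultraviolet already uses fma in matrices for determinant, e.g. Would it make sense to use fma for vector's mag_sq() function as well?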
It used to use fma pretty much everywhere possible but ultimately it's not
worth it overall... Doing so and then benchmarking is how I discovered the
above insights ;D
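For readers following along, the two variants of `mag_sq()` under discussion might look roughly like the sketch below. The `Vec3` struct here is a hypothetical stand-in and not ultraviolet's actual definition, and `mag_sq_fma` is an invented name. Which variant is faster has to be benchmarked per target, as noted above.

```rust
/// Hypothetical stand-in for a 3D vector type; not ultraviolet's actual definition.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

impl Vec3 {
    /// Squared magnitude with plain multiplies and adds (five roundings).
    pub fn mag_sq(self) -> f32 {
        self.x * self.x + self.y * self.y + self.z * self.z
    }

    /// Squared magnitude with two chained `mul_add` calls (three roundings).
    /// Without the `fma` target feature each call may become a libm call.
    pub fn mag_sq_fma(self) -> f32 {
        self.x.mul_add(self.x, self.y.mul_add(self.y, self.z * self.z))
    }
}
```

The two methods agree exactly on small integer-valued inputs but can differ in the last bit on general inputs.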
On Sat, Mar 27, 2021, 3:03 PM joeftiger wrote:
Nice insights above! Thank you for your input :-)
I noticed that ultraviolet already uses fma in matrices for determinant
e.g.
Would it make sense to use fma for vector's mag_sq() function as well?
https://pharr.org/matt/blog/2019/11/03/difference-of-floats.html
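The technique described in the linked post (attributed there to Kahan) can be sketched in Rust roughly as follows. The function names are chosen here for illustration and are not an existing ultraviolet API. The idea is to compute `c * d`, recover its rounding error with one fma, compute `a * b - cd` with a second fma, and add the error back.

```rust
/// Computes `a * b - c * d` while avoiding catastrophic cancellation,
/// using two fused multiply-adds.
pub fn difference_of_products(a: f32, b: f32, c: f32, d: f32) -> f32 {
    let cd = c * d; // rounded product
    let err = (-c).mul_add(d, cd); // cd - c * d: rounding error of the product
    let dop = a.mul_add(b, -cd); // a * b - cd with a single rounding
    dop + err
}

/// Computes `a * b + c * d` by negating one factor of the second product.
pub fn sum_of_products(a: f32, b: f32, c: f32, d: f32) -> f32 {
    difference_of_products(a, b, -c, d)
}
```

Compared with the naive `a * b - c * d`, this costs one extra floating-point operation and is only fast when `mul_add` compiles to a hardware FMA instruction.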