[AMD] Map PreciseSqrtOp and PreciseDivFOp to LLVM instructions #3369

Merged Mar 14, 2024 (1 commit)

Conversation

jayfurmanek
Contributor

Here we map PreciseSqrtOp and PreciseDivFOp to LLVM instructions for the AMD backend.

These "Precise" ops are currently defined as round-to-nearest-even, which is the default rounding mode of the corresponding LLVM instructions on the AMD backend.
Alternatively, we could call into the AMD ocml.bc. That works for sqrt, but __ocml_div_{rm}_f32 is currently unimplemented.
If further "Precise" math ops are added that use different rounding modes or otherwise don't map to LLVM ops, we can revisit this.
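
To make the round-to-nearest-even definition concrete, here is a small host-side reference model in Python. It is only a sketch for checking expected f32 results, not the lowering added by this PR. The helper names (`to_f32`, `precise_div_f32`, `precise_sqrt_f32`) are hypothetical. It assumes IEEE 754 binary32/binary64 and the default host rounding mode, and it relies on binary64 carrying enough precision (53 bits, at least 2*24+2) that rounding a double quotient or square root to f32 gives the correctly rounded f32 result.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value.

    struct packs through a C float conversion, which rounds to nearest,
    ties to even, under the default rounding mode (assumed here).
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


def precise_div_f32(a: float, b: float) -> float:
    """Reference f32 division with round-to-nearest-even (hypothetical helper)."""
    return to_f32(to_f32(a) / to_f32(b))


def precise_sqrt_f32(x: float) -> float:
    """Reference f32 square root with round-to-nearest-even (hypothetical helper)."""
    return to_f32(math.sqrt(to_f32(x)))


def f32_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an integer, for exact comparisons."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


if __name__ == "__main__":
    # Exact ties between two neighbouring f32 values go to the even mantissa.
    print(to_f32(1.0 + 2.0**-24) == 1.0)
    print(to_f32(1.0 + 3 * 2.0**-24) == 1.0 + 2.0**-22)
    print(hex(f32_bits(precise_div_f32(1.0, 3.0))))
    print(hex(f32_bits(precise_sqrt_f32(2.0))))
```

Comparing bit patterns rather than printed decimals avoids confusing a formatting difference with a rounding difference when checking kernel output against the reference.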

@jayfurmanek jayfurmanek requested a review from ptillet as a code owner March 13, 2024 18:41
@jayfurmanek jayfurmanek changed the title [AMD] Map PreciseSqrtOp and PreciseDivOp to LLVM instructions [AMD] Map PreciseSqrtOp and PreciseDivFOp to LLVM instructions Mar 13, 2024
@ThomasRaoux ThomasRaoux merged commit 62893c1 into triton-lang:main Mar 14, 2024
4 checks passed
htyu pushed a commit to htyu/triton that referenced this pull request Mar 20, 2024
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024