Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add __divtf3 #622

Merged
merged 5 commits into from
Sep 24, 2024
Merged

Add __divtf3 #622

merged 5 commits into from
Sep 24, 2024

Conversation

tgross35
Copy link
Contributor

@tgross35 tgross35 commented May 25, 2024

Division for f128. Still pretty buggy

@tgross35
Copy link
Contributor Author

tgross35 commented May 26, 2024

Unfortunately this algorithm has an exponent rounding error with subnormals for f128, llvm/llvm-project#93401.

__divtf3(0x00000000000000000000000000000001, 0x0001ffffffffffffffffffffffffffff): std: 0x3f8e0000000000000000000000000001, builtins: 0x3f8f0000000000000000000000000001

@tgross35
Copy link
Contributor Author

tgross35 commented Jul 27, 2024

The f128 issue is proving pretty tricky to fix. To help slim down the surface area, I am going to split the generic refactor to another PR.

@tgross35
Copy link
Contributor Author

tgross35 commented Aug 4, 2024

f128 division is working with full iterations 🎉 I need to figure out what is wrong with the half-width iterations, and do some more documentation.

@tgross35
Copy link
Contributor Author

It looks like there are some platform-specific issues with the system libraries:

  • Incorrect NaN handling on i686 Linux, __divtf3(0x00000000000000000000000000000000, 0x80000000000000000000000000000000): std: 0x00000000000000000000000000000000, builtins: 0x7fff8000000000000000000000000000. 0/-0 should be NaN, the system symbols are reporting 0.
  • Incorrect handling of subnormals on aarch64 Linux, __divtf3(0x00000000000000000000000000000001, 0x0001ffffffffffffffffffffffffffff): std: 0x3f8f0000000000000000000000000001, builtins: 0x3f8e0000000000000000000000000001. Builtins is correct here, the system symbol gets the exponent wrong.

@beetrees
Copy link
Contributor

beetrees commented Aug 17, 2024

The i686 issue appears to be an ABI issue: GCC aligns f128 arguments to 16 bytes on the stack, whereas Clang/LLVM does not. Looking at the 32-bit x86 System V ABI specification, GCC's behaviour appears to be correct:

Padding may be needed to increase the size of each parameter to enforce alignment according to the values in Table 2.1.

I believe the relevant LLVM issue is llvm/llvm-project#77401.

@tgross35
Copy link
Contributor Author

Interesting, I am surprised that doesn't show up for the other functions.

I suspect that the aarch64 division issue is another case of llvm/llvm-project#91840 that is still waiting on your fix to be merged.

@tgross35
Copy link
Contributor Author

I think this is ready for a look when you get the chance @Amanieu.

f128 seems to require one more half-width iteration than expected, and this still isn't as tidy as I would like. However, tests pass so as-is seems reasonable enough to unblock division in std as well as #614. I plan to cycle back for the other fixes.

@tgross35 tgross35 marked this pull request as ready for review August 19, 2024 21:16
@tgross35 tgross35 requested a review from Amanieu August 19, 2024 21:16
@tgross35
Copy link
Contributor Author

Even with the extra iteration performance isn't bad. Not sure why we are that much faster here.

div_f128/compiler-builtins
                        time:   [261.10 µs 262.32 µs 264.06 µs]
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) high mild
  9 (9.00%) high severe
div_f128/system         time:   [329.93 µs 330.71 µs 331.47 µs]
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

f32 and f64 are essentially unchanged.

Copy link
Member

@Amanieu Amanieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, though I didn't check the algorithm in detail.

src/int/big.rs Outdated Show resolved Hide resolved
Float division requires some shift operations on big integers; implement
right shift here.
Add some bounds to integer types that allow our function trait bounds to
be slightly less verbose. Also clarify documentation and remove a
redundant operation.
Float division currently has a separate `div32` and `div64` for `f32`
and `f64`, respectively. Combine these to make use of generics. This
will make it easier to support `f128` division, and reduces a lot of
redundant code.

This includes a simplification of division tests.
Use the new generic division algorithm to expose `__divtf3` and
`__divkf3`.
@tgross35
Copy link
Contributor Author

Thanks for taking a look! I'll have some more improvements to do here but at least this gets us off the ground.

@tgross35 tgross35 enabled auto-merge September 24, 2024 16:39
@tgross35 tgross35 merged commit e86ef27 into rust-lang:master Sep 24, 2024
24 checks passed
@tgross35 tgross35 deleted the f128-div branch September 24, 2024 16:40
@tgross35 tgross35 restored the f128-div branch September 24, 2024 16:40
@tgross35 tgross35 deleted the f128-div branch September 24, 2024 16:40
tgross35 added a commit to tgross35/rust that referenced this pull request Sep 24, 2024
tgross35 added a commit to tgross35/rust that referenced this pull request Sep 27, 2024
This includes the following, which adds `__divtf3`

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692

The `cc` bump [1] was previously included but was reverted due to
problems updating.

[1]: rust-lang/compiler-builtins#690
tgross35 added a commit to tgross35/rust that referenced this pull request Sep 28, 2024
This includes the following which add `__divtf3` and `__powtf2`, and do
some feature cleanup:

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692
- rust-lang/compiler-builtins#614
- rust-lang/compiler-builtins#694

The `cc` bump [1] was previously included but was reverted due to
problems updating.

[1]: rust-lang/compiler-builtins#690
tgross35 added a commit to tgross35/rust that referenced this pull request Sep 28, 2024
This includes the following which add `__divtf3` and `__powtf2`, and do
some feature cleanup:

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692
- rust-lang/compiler-builtins#614
- rust-lang/compiler-builtins#694

The `cc` bump [1] was previously included but was reverted due to
problems updating.

[1]: rust-lang/compiler-builtins#690
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 28, 2024
Update compiler-builtins to 0.1.130

This includes the following which add `__divtf3` and `__powtf2`, and do some feature cleanup:

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692
- rust-lang/compiler-builtins#614
- rust-lang/compiler-builtins#694

The `cc` bump [1] was previously included but was reverted due to problems updating.

[1]: rust-lang/compiler-builtins#690
cberner pushed a commit to cberner/rust that referenced this pull request Sep 28, 2024
This includes the following which add `__divtf3` and `__powtf2`, and do
some feature cleanup:

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692
- rust-lang/compiler-builtins#614
- rust-lang/compiler-builtins#694

The `cc` bump [1] was previously included but was reverted due to
problems updating.

[1]: rust-lang/compiler-builtins#690
RalfJung pushed a commit to RalfJung/miri that referenced this pull request Oct 3, 2024
Update compiler-builtins to 0.1.130

This includes the following which add `__divtf3` and `__powtf2`, and do some feature cleanup:

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692
- rust-lang/compiler-builtins#614
- rust-lang/compiler-builtins#694

The `cc` bump [1] was previously included but was reverted due to problems updating.

[1]: rust-lang/compiler-builtins#690
cberner pushed a commit to cberner/rust that referenced this pull request Oct 13, 2024
This includes the following which add `__divtf3` and `__powtf2`, and do
some feature cleanup:

- rust-lang/compiler-builtins#622
- rust-lang/compiler-builtins#692
- rust-lang/compiler-builtins#614
- rust-lang/compiler-builtins#694

The `cc` bump [1] was previously included but was reverted due to
problems updating.

[1]: rust-lang/compiler-builtins#690
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants