Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for total_cmp (on f32/f64) #72599

Closed
6 tasks done
golddranks opened this issue May 26, 2020 · 33 comments
Closed
6 tasks done

Tracking Issue for total_cmp (on f32/f64) #72599

golddranks opened this issue May 26, 2020 · 33 comments
Assignees
Labels
A-floating-point Area: Floating point numbers and arithmetic B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@golddranks
Copy link
Contributor

golddranks commented May 26, 2020

This is a tracking issue for the APIs f32::total_cmp, f64::total_cmp ( #72568 )
The feature gate for the issue is #![feature(total_cmp)].

Overview of the API

  • Implements method total_cmp on f32 and f64. This method implements a float comparison that, unlike the standard partial_cmp, is total (defined on all values) in accordance to the IEEE 754 (rev 2008) §5.10 totalOrder predicate.
  • The method has an API similar to cmp: pub fn total_cmp(&self, other: &Self) -> crate::cmp::Ordering { ... }.

Steps

  • Implement the RFC
  • FCP
  • Stabilization PR
    • Update docs to not promise more than we can.

Unresolved Questions

  • The exact ordering required by IEEE 754, revisions 2008 and 2019, isn't followed on some Tier 2 platforms, namely, older MIPS systems. The problem is that on those platforms, the meaning of the "signaling NaN" and "quiet NaN" bit is reversed. This means the order of those NaNs is reverse on those platfroms. Newer MIPS chips have a flag for the user to choose the behaviour. Should we add a special mention about this in the documentation, or just ignore the problem, as it's 1) not very common (being Tier 2 and all) 2) minor (reversal of order of sNaN and qNaN is unlikely to matter in real-life situations, and in those kinds of situations, the user is usually aware of the problem) 3) going away. (The newer systems addressing this problem.)

Implementation history

#72568

@golddranks golddranks added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label May 26, 2020
@JohnTitor JohnTitor added B-unstable Blocker: Implemented in the nightly compiler and unstable. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 26, 2020
@hanna-kruppe
Copy link
Contributor

Another reason to ignore the "signaling bit is flipped on older MIPS" issue: as far as I know, there is essentially no way to actually tell signaling and quiet NaNs apart in current or near-future Rust. You can of course inspect the bit in question with to_bits, but nothing tells you how that bit is interpreted by the hardware. Debug printing, f32::classify, etc. do not distinguish sNaN from qNaN, and Rust neither exposes the distinction between quiet/signaling comparisons not any way to check for floating point exceptions those predicates may raise depending on sNaN/qNaN operands.

@sfackler
Copy link
Member

Yeah, I think that the best we can do is just stick something in the docs noting the issue.

@est31
Copy link
Member

est31 commented Jun 14, 2020

The exact ordering required by IEEE 754, revisions 2008 and 2019, isn't followed on some Tier 2 platforms, namely, older MIPS systems. The problem is that on those platforms, the meaning of the "signaling NaN" and "quiet NaN" bit is reversed

There are multiple bit pattern definitions of "signaling NaN" and "quiet NaN":

  • The "flipped" definition of older MIPS standards
  • The definitions of IEEE 754-2008 and IEEE 754-2019

The two definitions disagree, and no implementation of totalOrder can be conformant with both. But the current implementation of totalOrder is conformant with the IEEE definition. Why should you use the MIPS standard definition to determine conformance to an IEEE standard?

I'd filed a PR in 2017 to add some docs on NaN to the {from,to}_bits functions but closed it because I thought it wouldn't matter due to those devices being phased out. #46246

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jun 15, 2020

At present, the behavior of this comparator is unspecified, since the bits inside of a NaN are not guaranteed to be preserved across operations (including loads and stores) by LLVM (#73328). This includes the sign bit as well (#55131). We will need to fix these issues this before stabilizing total_cmp, and we should call this out explicitly in the documentation.

@KodrAus KodrAus added A-floating-point Area: Floating point numbers and arithmetic Libs-Tracked Libs issues that are tracked on the team's project board. labels Jul 29, 2020
@RalfJung
Copy link
Member

RalfJung commented Sep 9, 2020

The exact ordering required by IEEE 754, revisions 2008 and 2019, isn't followed on some Tier 2 platforms, namely, older MIPS systems. The problem is that on those platforms, the meaning of the "signaling NaN" and "quiet NaN" bit is reversed. This means the order of those NaNs is reverse on those platfroms.

What does this mean? Does the same bit pattern compare differently? Or is the ordering, when implemented uniformly across platforms, specified in terms of whether a NaN is signalling or not, and thus that cross-platform implementation is inconsistent with the signaling vs quiet distinction on that platform?

Basically, I am wondering which spec is being violated when we just ignore the fact that these MIPS systems have a "weird" definition of signaling NaNs.


Cc rust-lang/unsafe-code-guidelines#237 as the place where I am trying to collect all open questions around floating-point semantics.

@RalfJung
Copy link
Member

RalfJung commented Sep 9, 2020

Oh I see... the order is specified in terms of whether NaNs are signaling or not, but implemented disregarding whatever "signalling NaN" means on the current platform (assuming the standardized meaning instead).

So I guess this technically means that the implementation is not quite conforming (but this seems rather minor compared to how badly behaved i686 floats are^^ see #72327, #73288). Not sure what would be more surprising, documenting this as "not fully spec-compliant on that platform" or trying to fix the implementation. Does any other language try to work around this by changing the implementation on that platform?

@est31
Copy link
Member

est31 commented Sep 9, 2020

Oh I see... the order is specified in terms of whether NaNs are signaling or not, but implemented disregarding whatever "signalling NaN" means on the current platform (assuming the standardized meaning instead).

So I guess this technically means that the implementation is not quite conforming

Not conforming to which spec? All IEEE 754 spec versions that define totalOrder (versions 2008 and onwards) also define what signaling NaN means. If you see a definition foo in a document A reference another definition bar, did they mean the definition bar from document B, or did they mean the definition bar that's also contained in A? I'd say the latter. Note that to my knowledge, there is no totalOrder definition in the document B (older versions of the MIPS spec in this instance). If that were the case, one could arguably say that Rust's implementation conflicts the platform.

So I believe that Rust is always spec compliant on older mips platforms, but yes there is possible confusions for readers of the spec of totalOrder.

@clarfonthey
Copy link
Contributor

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

@leonardo-m
Copy link

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

An alternative is to add f64/f32 methods like:

fn total_fp_ord(self) -> i64 {
    let bi = self.to_bits() as i64;
    bi^ (((bi >> 63) as u64) >> 1) as i64
}

With this method you can use max_by_key instead of max_by.
But the wrapper idea could be better.

@scottmcm
Copy link
Member

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

That was brought up in my original PR: #53938 (comment)

@l0calh05t
Copy link

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

I implemented something similar a while ago: https://github.com/l0calh05t/totally-ordered-rs

@l0calh05t
Copy link

l0calh05t commented Jun 10, 2021

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

An alternative is to add f64/f32 methods like:

fn total_fp_ord(self) -> i64 {
    let bi = self.to_bits() as i64;
    bi^ (((bi >> 63) as u64) >> 1) as i64
}

With this method you can use max_by_key instead of max_by.
But the wrapper idea could be better.

I just benchmarked this approach (and a few others) and transforming the keys themselves instead of providing a comparison is much more efficient when using sort_by_cached_key (even when compared to sort_unstable_by, as there is no equivalent sort_unstable_by_cached_key):

Sort f32/Partial/65536  time:   [3.2772 ms 3.2940 ms 3.3112 ms]
Sort f32/Total/65536    time:   [4.6182 ms 4.6388 ms 4.6584 ms]
Sort f32/Cached key/65536
                        time:   [2.7210 ms 2.7433 ms 2.7684 ms]
Sort f32/In place key/65536
                        time:   [1.7014 ms 1.7027 ms 1.7042 ms]
Sort f64/Partial/65536  time:   [2.8154 ms 2.8275 ms 2.8414 ms]
Sort f64/Total/65536    time:   [4.8437 ms 4.8712 ms 4.8938 ms]
Sort f64/Cached key/65536
                        time:   [3.0629 ms 3.0803 ms 3.0989 ms]
Sort f64/In place key/65536
                        time:   [1.8246 ms 1.8481 ms 1.8689 ms]

Partial is the standard values.sort_unstable_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));.
Total uses a wrapper / total_cmp.
Cached key uses total_fp_ord with sort_by_cached_key (all other sorts are unstable).
In place key transforms the keys in place, reinterprets the slice as a slice of i32/i64, runs a standard unstable integer sort, then transforms them back (this approach is the fastest, but providing a safe/clean interface is complicated, especially in a sort_by_key context).

@golddranks
Copy link
Contributor Author

@l0calh05t Do you have the benchmark code uploaded somewhere?

@l0calh05t
Copy link

l0calh05t commented Jun 10, 2021

@RalfJung RalfJung changed the title Tracking Issue for total_cmp Tracking Issue for total_cmp (on f32/f64) Jun 10, 2021
@scottmcm
Copy link
Member

(even when compared to sort_unstable_by, as there is no equivalent sort_unstable_by_cached_key):

That's because sort_by_cached_key actually calls sort_unstable_by internally -- it basically builds a Vec<(Key, usize)>, and having that index means that it's stable because there are no different-but-equivalent things being compared.

@l0calh05t
Copy link

(even when compared to sort_unstable_by, as there is no equivalent sort_unstable_by_cached_key):

That's because sort_by_cached_key actually calls sort_unstable_by internally -- it basically builds a Vec<(Key, usize)>, and having that index means that it's stable because there are no different-but-equivalent things being compared.

Yeah, I wondered if that was the reason why and checked the code after posting. TBH, I would have expected a sort with elements in two separate Vecs/slices, but that would best be implemented on writable random-access zip iterators and I don't think we have anything like that yet, but that is a discussion for somewhere else.

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting.

We agreed that the old MIPS bug is not an issue; we came to the same conclusion regarding to_bits and from_bits.

(However, we'd ask that the documentation be updated to not guarantee the ordering of different kinds of NaNs.)

We also agreed that #73328 shouldn't be a blocker.

@rfcbot merge

@rfcbot
Copy link

rfcbot commented Jan 26, 2022

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jan 26, 2022
@scottmcm
Copy link
Member

(However, we'd ask that the documentation be updated to not guarantee the ordering of different kinds of NaNs.)

Maybe give it a platform-specific caveat, or something? That ordering is part of the IEEE behaviour of the operation, so it'd be nice to keep it for the usual case.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Jan 26, 2022
@joshtriplett
Copy link
Member

@scottmcm If it's only an issue on specific targets, we could document that it's an issue on those targets, sure.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 30, 2022
…lett

review the total_cmp documentation

The documentation has been restructured to split out a brief summary
paragraph out from the following elaborating paragraphs.

I also attempted my hand at wording improvements and adding articles
where I felt them missing, but being non-native english speaker these
may need more thorough review.

cc rust-lang#72599
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 31, 2022
…lett

review the total_cmp documentation

The documentation has been restructured to split out a brief summary
paragraph out from the following elaborating paragraphs.

I also attempted my hand at wording improvements and adding articles
where I felt them missing, but being non-native english speaker these
may need more thorough review.

cc rust-lang#72599
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Feb 5, 2022
@rfcbot
Copy link

rfcbot commented Feb 5, 2022

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Feb 10, 2022
@scottmcm scottmcm added the E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. label Feb 26, 2022
@scottmcm
Copy link
Member

scottmcm commented Feb 26, 2022

Looks like, with #93403 now merged, this just needs someone to make a stabilization PR!

Feel free to @rustbot claim if you'd like to work on that.

@golddranks
Copy link
Contributor Author

golddranks commented Feb 26, 2022

A big thank you ya'll!!!

As the contributor of the initial PR and this issue two years back, (and unfortunately, not having much time to work on Rust in between) I'd very much like to push it over the goal line! I'll get back to this tomorrow.
@rustbot claim

@golddranks
Copy link
Contributor Author

golddranks commented Mar 22, 2022

Just a note, I've been extremely busy at work for some weeks now for deadlines at end of this week. I'll get back to this next week. I want to do a thorough review about the points discussed about the documented / to be documented guarantees before sending the PR. @RalfJung also had some concerns; I don't believe that this API is a bigger cause for concern than some other API's (I'm thinking of to_bits/from_bits) we've already stabilized, but I want to review them nevertheless to be sure of the full picture.

@golddranks
Copy link
Contributor Author

(As a further note the reason we want to be careful is because the semantics of floats in general are a complicated interplay at least between:

  • guarantees / lack of guarantees (guarantees by Rust, guarantees by LLVM, guarantees by target platforms)
  • compiler optimizations
  • platform-specific hardware implementation differences
  • differences between runtime and const evaluation)

@RalfJung
Copy link
Member

My concerns are mostly around what is mentioned at the top -- that some platforms are not spec-compliant. I don't quite see how that relates to to/from_bits.

However, since you mention to/from_bits -- total_cmp should not be a const fn for the same reason that to/from_bits are not. (Currently total_cmp is not even unstably const so this is fine.)

@golddranks
Copy link
Contributor Author

So, I read a bunch of GitHub threads and the existing documentation. Here's my current understanding:

About the behaviour of floats:

The runtime behaviour isn't always guaranteed with NaNs. (And because of LLVM-specific stuff, with subnormals too.) This can be because:

  • Differences in hardware implementations
  • LLVM not guaranteeing to keep the payload (the sign bit or the signaling bit or the "data" payload)
    • Especially LLVM doing const folding and other optimizations might affect different code paths differently
    • Some arithmetic operations (sign flipping comes to mind) don't care preserving NaN signs to achieve efficient implementation; not wanting to separate logic for NaNs

What do we do about it?

While unfortunate, the LLVM shenanigans haven't historically prevented Rust from stabilizing operations that expose NaN payloads: to/from_bits and is_sign_negative are earlier examples, where the payloads are exposed. total_cmp is just another such an example. However, they DO prevent making them const, where determinism and platform-independence are requirements, but that is out of consideration here. (So, this was a point I was worried about when I said I want to review the points discussed, but it turns out that only the constification is considered a problem.)

As for the differences in hardware implementations, we want to be sure to document them to not over-promise and reduce surprises.

What's still a bit unclear for me, how strongly we should warn ALSO against the indeterminism introduced by LLVM in the docs of operations that easily expose the payload? The issue #73328 is about this.

About the current documentation:

The current documentation mentions about platform differences, but only in some items (namely, from_bits). However, even with a cursory read, I identified multiple items that could benefit at least a mention about the issue, redirecting to the explanation of from_bits for further explanation.

The current documentation doesn't mention about issues caused by LLVM. Again, that's #73328 .

My proposal:

  • The current blocker for stabilization of total_cmp is updating the documentation WRT the platform differences, so first of all, I'm just going do that to get this in to Rust 1.61. from_bits already provides a good reference, so I think that referring mostly to that is going to be fine for the "main" explanation. However, I'm going to add, how the issue affects total_cmp. Expect a documentation PR later today.
  • As I said, some other items could benefit from a similar explanation. I'm going to open an issue, and then work on that.
  • After that, we should tackle Document guarantees (or lack thereof) regarding sign, quietness, and payload of NaNs #73328 . While not being overly specific about the details, an explanation about LLVM's non-guarantees of conserving/propagating NaN payload is definitely warranted. I now see that this was considered as a totally separate issue from stabilizing total_cmp. (Another point that wasn't clear to me), but I see it as worth solving, especially as we are adding operations that allow observing/poking the NaN payload bits.

@golddranks
Copy link
Contributor Author

Oops, so looking into that, it seems that #93403 was just that missing documentation PR! Sorry for the fuss :D I'll send a stabilization PR then.

@golddranks
Copy link
Contributor Author

golddranks commented Apr 4, 2022

There's not a lot of time left to merge the stabilization PR ( #95431 ) if we want to get total_cmp in 1.61. It seems like @dtolnay is busy, is there anyone here with merge rights that could review instead? Also, since there are people that might have insight about the guarantees and semantics around floats, I'd appreciate comments in #95483 too!

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Apr 4, 2022
…cottmcm

Stabilize total_cmp

Stabilises `total_cmp` for Rust 1.61.0. Tracking issue: rust-lang#72599
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Apr 4, 2022
…cottmcm

Stabilize total_cmp

Stabilises `total_cmp` for Rust 1.61.0. Tracking issue: rust-lang#72599
@JohnTitor
Copy link
Member

Triage: I'm going to close this as the stabilization and doc improvement PRs were merged already 🎉

workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
Stabilize total_cmp

Stabilises `total_cmp` for Rust 1.61.0. Tracking issue: rust-lang/rust#72599
@paulabrudanandrei
Copy link

What will happen to PartialCmp and PartialEq?

@golddranks
Copy link
Contributor Author

What will happen to PartialCmp and PartialEq?

Nothing. They continued existing happily ever after.

In case you are confused what this issue was about, it was about adding method total_cmp for floats. The existence of that method doesn't affect the semantics of PartialCmp and PartialEq whatsoever. The method is there if you explicitly decide to use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-floating-point Area: Floating point numbers and arithmetic B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests