Handling of float inputs for PluralRules and FixedDecimal #191
Plural rules selection on floating points needs to know the number of trailing zeros (minimumFractionDigits). That quantity could come from the user, or it could come from locale data (such as for currencies). Let's revisit this in advance of v1 after NumberFormat is finished, and see if that covers all of the use cases. |
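To illustrate why the trailing zeros matter, here is a minimal sketch that derives two CLDR plural operands (`i`, the integer part, and `v`, the count of visible fraction digits) from a decimal string and applies the English cardinal rule ("one" when `i = 1` and `v = 0`). The `Operands` struct and both function names are hypothetical and written for this example only; they are not the ICU4X API.

```rust
/// The two plural operands the English cardinal rule needs (CLDR names).
#[derive(Debug, PartialEq)]
struct Operands {
    /// Integer part of the number.
    i: u64,
    /// Number of visible fraction digits, counting trailing zeros.
    v: usize,
}

/// Parses a plain non-negative decimal string such as "1" or "1.0".
/// Hypothetical helper: no exponents, signs, or grouping separators.
fn operands_from_str(s: &str) -> Option<Operands> {
    let (int_part, frac_part) = match s.split_once('.') {
        Some((int_part, frac_part)) => (int_part, frac_part),
        None => (s, ""),
    };
    if !frac_part.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let i = int_part.parse::<u64>().ok()?;
    Some(Operands { i, v: frac_part.len() })
}

/// English cardinal rule from CLDR: "one" if i = 1 and v = 0, else "other".
fn english_cardinal(op: &Operands) -> &'static str {
    if op.i == 1 && op.v == 0 {
        "one"
    } else {
        "other"
    }
}

fn main() {
    for input in ["1", "1.0", "2"] {
        let op = operands_from_str(input).expect("valid decimal string");
        println!("{input} -> {}", english_cardinal(&op));
    }
}
```

The inputs "1" and "1.0" are the same number as an `f64`, yet they select different categories ("1 day" versus "1.0 days"), which is the information a bare float cannot carry.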
Related: #340 |
I want to touch base with you regarding my plans to benchmark It looks like the way to create a As we discussed in #810 we need to support some form of I am planning to set up three different benchmarks:
I think these numbers should provide an interesting perspective into the performance of I wanted to gather any other thoughts/advice you have about this plan. |
We discuss extensively in #166 the many reasons why we don't support floats in ICU4X. I would not be surprised if reinstating Zibi's Aside from the concerns I've raised about floats' lack of ability to represent trailing zeros, ECMA-402 formats floats to formatted decimals before performing plural rule selection (e.g., by applying |
I understand that. I'm not saying that if And importantly, (sorry to not be as clear about this) I'm interested in your opinion about the best ways to create As you mentioned, I wasn't around for all of the early |
Once #166 is fixed, there will be two ways to create FixedDecimal from float:
You can do (2) now; for example code, see #724 (I expect to migrate that code to a documentation example). Of course, this comes at the cost of adding a dependency on Option (1) almost works, except our |
That would be the holy grail. The problem is that it isn't actually possible, because floats are stored in base 2, but FixedDecimal is base 10 (and PluralRules needs base 10 as well), so we need to perform a base conversion. Converting to a string is the most common way to express a float in base 10, but crates like |
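The base conversion described above can be observed with the standard library alone. This sketch uses Rust's `Display` formatting for `f64`, which produces the shortest decimal string that parses back to the same value. It stands in for a dedicated conversion crate and is not what ICU4X does internally; `shortest_decimal` is a hypothetical helper name.

```rust
/// Converts a float to base 10 via std formatting, which yields the
/// shortest decimal string that round-trips to the same f64.
fn shortest_decimal(x: f64) -> String {
    format!("{}", x)
}

fn main() {
    // 0.1 has no exact base-2 representation; asking for 20 fraction
    // digits exposes the value that is actually stored.
    println!("{:.20}", 0.1_f64);

    // The shortest round-trip form hides that representation error.
    println!("{}", shortest_decimal(0.1));

    // The sum of two floats is a different f64 than the literal 0.3.
    println!("{}", shortest_decimal(0.1 + 0.2));

    // Trailing zeros are not part of the float, so "1.0" comes back as "1".
    println!("{}", shortest_decimal(1.0));

    // Parsing the string recovers the original bits.
    let s = shortest_decimal(0.1 + 0.2);
    assert_eq!(s.parse::<f64>().unwrap(), 0.1 + 0.2);
}
```

Whichever algorithm performs the conversion, the result is a sequence of base-10 digits, so going through a string (or an equivalent digit buffer) is a conversion step rather than avoidable overhead.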
Thanks for all of that background! It's very helpful. |
We're also planning to benchmark NumberFormat in Gecko backed by FixedDecimalFormat. ICU4C NumberFormat supports formatting directly from a double, and that is the most common use case in Gecko; int64_t and string inputs are used for BigInt representations. I get that what NumberFormat does is separate from the concerns here about PluralRules, but I think we'll eventually want to support NumberFormat from double for performance reasons. |
To be clear: I support the idea of adding APIs for correctly supporting floats in principle, but there are technical challenges that are not yet resolved. We discussed those technical issues in our meeting last Thursday. See #166.
Ultimately there should not be a performance difference in the general case if the double-to-decimal conversion happens in userland or in our library. There is simply not a way to "just format a double". The standing conclusion of #166 is to leave it up to the caller to choose how they want to perform the conversion, such that they can tune their size/speed tradeoffs, etc. |
Sorry, I missed that ICU4C first converts to an ASCII format before doing anything locale-specific, so you're right, it doesn't matter whether ICU4X accepts a double. |
Right. More specifically, ICU4C uses https://github.com/google/double-conversion in the general case, which converts a double to an ASCII string, with a fast path for small doubles. I would consider the small double fast path to be in scope for ICU4X if we wanted it. |
It's still unclear to me what our final vision is for handling significant digits, fraction digits etc. using a Ryu doesn't look like it takes any inputs to Our current FFI doesn't expose any functionality to change these properties. I don't see any documentation or discussions specifically about this (but sorry if I missed them). As you have brought up, we need to differentiate |
Let's try to discuss options for double-to-decimal in #166, and keep this thread focused on the problem as it relates specifically to PluralRules. |
@sffc am I correct that the only way to get PluralOperands from
We have not yet secured a route that wouldn't require stringification. It may involve waiting for some intermediate representation that Does that match your mental model? |
#166 has most of the discussion on this, and it was fixed a few months ago by Manish. You can now do |
Ah, thank you! I'm wondering why Manish chose I also think that https://unicode-org.github.io/icu4x-docs/doc/fixed_decimal/decimal/struct.FixedDecimal.html#method.new_from_f64 and https://unicode-org.github.io/icu4x-docs/doc/fixed_decimal/decimal/enum.DoublePrecision.html would benefit from examples that show a different result than |
Yeah, I'd like to go through and improve the docs. I don't remember why new_from_f64 was chosen over the alternatives. |
No reason
@sffc, thoughts? I'm against conversion from a tuple, it's not very Rusty (just use a function). Somewhat okay with TryFrom but unsure if we should be picking that default for users. Plus all of this is behind ryu and it's less noticeable if a conversion trait impl is behind a feature.
No strong reason. |
As expressed elsewhere on multiple occasions, I am strongly opposed to anything that involves making an implicit default value for |
Ah, sorry, I hadn't recalled it. Personally I have no strong opinion there. |
Understood. This refutes my drive to |
Given the emerging design for rounding in FixedDecimal, I'd like to propose a change to the Currently, the DoublePrecision enum is:

pub enum DoublePrecision {
    Integer,
    Magnitude(i16, RoundingMode),
    SignificantDigits(u8, RoundingMode),
    Floating,
}

I'd like to change it to:

pub enum DoublePrecision {
    IntegerExact,
    Magnitude(i16),
    SignificantDigits(u8),
    Floating,
}

Semantics:
This covers my main concern, which is that the first choice for people using floats should be that they specify the precision of the float. However, if they want behavior that isn't in the prescribed list, they can use |
This works for me. |
wfm too |
To discuss: which rounding mode to use implicitly on the Magnitude and SignificantDigits options. Three choices:
The code size and performance difference between "halfTrunc" and "halfEven" is likely small, but I haven't measured it. |
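For readers comparing the two named modes, here is a small sketch of how "halfEven" and "halfTrunc" differ only on exact ties. It works on non-negative integers rounded to a multiple of a step, which is enough to show the tie-breaking logic. It assumes "halfTrunc" means ties go toward zero, and the `TieBreak` enum and `round_to_step` function are hypothetical names, not FixedDecimal APIs.

```rust
use std::cmp::Ordering;

/// Tie-breaking strategies; both round to nearest when there is no tie.
#[derive(Clone, Copy)]
enum TieBreak {
    /// Ties go to the neighbor whose quotient is even.
    HalfEven,
    /// Ties go toward zero (assumed meaning of "halfTrunc").
    HalfTrunc,
}

/// Rounds `value` to the nearest multiple of `step` (`step` must be > 0).
/// Sketch only: does not guard against overflow near `u64::MAX`.
fn round_to_step(value: u64, step: u64, tie: TieBreak) -> u64 {
    let lower = value / step * step;
    let rem = value - lower;
    // Compare the distance to the lower and upper neighbors.
    let round_up = match rem.cmp(&(step - rem)) {
        Ordering::Greater => true,
        Ordering::Less => false,
        Ordering::Equal => match tie {
            TieBreak::HalfTrunc => false,
            TieBreak::HalfEven => (lower / step) % 2 == 1,
        },
    };
    if round_up {
        lower + step
    } else {
        lower
    }
}

fn main() {
    for value in [25, 35, 36] {
        println!(
            "{value}: halfEven={} halfTrunc={}",
            round_to_step(value, 10, TieBreak::HalfEven),
            round_to_step(value, 10, TieBreak::HalfTrunc),
        );
    }
}
```

The two modes agree everywhere except on a tie next to an odd quotient (35 here), where the half-even branch does one extra parity check.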
Another option to consider: we could split the function into four,
An advantage here is that |
Scratch that. There's a fast path for integer-valued floats that are less than MAX_SAFE_VALUE, but for integer-valued floats larger than that, we need ryu. The fast path is just casting the float to a u64, which clients can do externally. The fast path that we use in ICU is for f64 with fewer than ~12 significant digits. We could in the future add a ryu-less function or variant that implements the fast path and fixes the f64 to ~12 digits. |
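The integer fast path mentioned above, done on the client side, might look like the following sketch. It assumes the bound referred to as MAX_SAFE_VALUE is 2^53 - 1 (the largest integer below which every integer is exactly representable as an `f64`), and `integer_fast_path` is a hypothetical helper rather than an existing ICU4X function.

```rust
/// Assumed bound: 2^53 - 1. Every integer up to this value is exactly
/// representable as an f64.
const MAX_SAFE_INTEGER: f64 = 9_007_199_254_740_991.0;

/// Returns the float as a u64 when it is a non-negative, integer-valued
/// float within the safe range; otherwise returns None so the caller can
/// fall back to a full double-to-decimal conversion.
/// Note: -0.0 passes the checks and is returned as 0, losing its sign.
fn integer_fast_path(x: f64) -> Option<u64> {
    if x.is_finite() && x >= 0.0 && x <= MAX_SAFE_INTEGER && x.fract() == 0.0 {
        // The cast is exact because of the checks above.
        Some(x as u64)
    } else {
        None
    }
}

fn main() {
    for x in [42.0, 1.5, -1.0, 1e300, f64::NAN] {
        println!("{x:?} -> {:?}", integer_fast_path(x));
    }
}
```

The resulting `u64` can then be passed to whichever integer constructor the caller already uses, so no float-specific dependency is needed for this case.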
In the original PR I first added and then removed handling of float inputs: f589e62
The rationale laid out by @sffc is that a float is likely a bad input for operands because it may lose fractional precision, and if someone has a float they should use FixedDecimal before they construct operands.
I'd like to see how integration of FixedDecimal goes and see what we'll end up with in terms of ergonomics and performance.
I'll test it on integration into fluent-rs, where we have FluentNumber, plural rules etc.
The (very mocky) code we have now is here: https://github.com/projectfluent/fluent-rs/blob/master/fluent-bundle/src/types/number.rs#L123-L236
Once we have this all plugged in, I'd like to try to switch to icu4x and see how it works.