Add support for `e` oprand to PluralOperands #228

zbraniecki · 2020-09-04T17:45:16Z

In https://unicode-org.atlassian.net/browse/CLDR-12010 CLDR adds e operand. We should implement it as well.

The text was updated successfully, but these errors were encountered:

sffc · 2020-09-05T05:15:00Z

In FixedDecimal, I would like to add an exponent field, similar to how @echeran did it in ICU4J.

zbraniecki · 2020-12-03T21:55:45Z

I started looking into this, because it'll block our upgrade to CLDR-38.

In 38, the only use of e I found is in French:

e = 0 and i != 0 and i % 1000000 = 0 and v = 0 or e != 0..5 @integer 1000000, 1e6, 2e6, 3e6, 4e6, 5e6, 6e6, … @decimal 1.0000001e6, 1.1e6, 2.0000001e6, 2.1e6, 3.0000001e6, 3.1e6, …

Can I get some help in understand what is the intended behavior here?

Should we calculate e for an input value 1000 to be 3? Or should it only be calculated for an input value of "1e3"? @echeran can you help?

sffc · 2020-12-04T23:03:23Z

@echeran @macchiati What is the status of "c" versus "e" for the new operand?

echeran · 2020-12-05T00:00:48Z

Yeah, the operand e is only being used in French. It represents the suppressed exponent in compact numbers, so yes, for input value 1000, e = 3 and the formatted output is 1K. And for input value 120000, if the compact number becomes 120K, then e is still 3.

It's not quite ready for general use because of a couple of issues. First, I implemented it for scientific notation as well, which means for 1000, then e = 3 for formatted output 1e3. But for input value 120000, e = 5 for formatted output 1.2e5.

Here are the unit tests for computing the e plural operand, as well as applying a FormattedNumber with the correct e values to the plural rules to obtain the proper plural keyword:

	testCompactDecimalSuppresedExponent	testCompactDecimalPluralKeyword
ICU4J	link	link
ICU4C	link	link

Rule data in CLDR have only been added for French because it's not quite ready for general use for a couple of reasons. The first reason is that I implemented the suppressed exponent computation for scientific notation as well. And it turns out that French actually treats plurals on compact numbers differently than it does plurals for scientific notation. So when we write out the plural samples, and try to parse them into a number type rich enough to support e, we need to further disambiguate a compact number from a scientific number when we see something like 1e6 and go to parse. So it was proposed to add a c operand to represent compact number suppressed exponent (ex: 1c6), and leave e for scientific notation suppressed exponent (1e6).

The other reason/issue on the ICU4C/J side is making sure we use a rich enough number type to hold e and c -- ex: FormattedNumber -- consistently across ICU and CLDR code that is also fair game to expose publicly. In other words, we shouldn't continue to use doubles in ICU4C (ex: in plural sample parsing unit tests), since it is obviously not capable of encoding this extra info we need.

For the implementation of plural operand c, I haven't started, but I am not blocked on that.

As for the change of how ICU parses plural sample strings into numbers, that will cause an API change, so @macchiati and I need to draft that before proceeding on that work.

zbraniecki · 2020-12-05T00:34:57Z

@echeran am I correct to interpret your words and the tests that only string input (or FixedDecimal) can make e anything else than 0?

If the input (in Rust) is u64, isize, or f64, it will never need e and if it is 1000000 it will produce:

PluralOperands {
    n: 1000000,
    i: 1000000,
    v: 0,
    w: 0,
    f: 0,
    t: 0,
    e: 0
}

Am I correct?

In which case, the only operation plurals crate itself carries about is FromStr for PluralOperands and then separately FromStr for FixedDecimal.
Everything else sets e to 0, right?

sffc · 2020-12-05T01:02:05Z

Once the open questions are hammered out, FixedDecimal should be equipped with an e (or c) field. impl From<&FixedDecimal> for PluralOperands should not try to "guess" what e (or c) should be.

echeran · 2020-12-05T19:22:49Z

@echeran am I correct to interpret your words and the tests that only string input (or FixedDecimal) can make e anything else than 0?

Yep, that's right, strings/FixedDecimals/any other rich-enough-type are necessary to distinguish 120000.0 from 120K from 1.2e5 from 12e4, etc. If you start off with the knowledge of the differences in these representations, it's a lossy operation to use a u64/long/f64/double, which can only see all those representations as the same.

If the input (in Rust) is u64, isize, or f64, it will never need e

Yes, this is an assumption we must make when implementing the function. We must clearly publicize the distinction (warning? caveat?) in the doc strings for the function so it is clear to users what tradeoffs they are making by using it.

and if it is 1000000 it will produce:

PluralOperands {
    n: 1000000,
    i: 1000000,
    v: 0,
    w: 0,
    f: 0,
    t: 0,
    e: 0
}

Am I correct?

Yes, that's right. That's all that is safe for us to infer from i64/u64. And we will probably need a c operand, as well -- c will be the new e.

In which case, the only operation plurals crate itself carries about is FromStr for PluralOperands and then separately FromStr for FixedDecimal.
Everything else sets e to 0, right?

Yep! And perhaps you were already thinking about this, but if you already start with an implementation of the conversion FromStr for FixedDecimal, then FromStr for PluralOperands should be a simple composition of FromStr for FixedDecimal with From<&FixedDecimal> for PluralOperands.

zbraniecki added T-enhancement Type: Nice-to-have but not required A-scope Area: Project scope, feature coverage v1 C-pluralrules Component: Plural rules labels Sep 4, 2020

sffc assigned zbraniecki Sep 11, 2020

sffc added this to the 2020 Q4 milestone Sep 17, 2020

zbraniecki mentioned this issue Dec 4, 2020

Add support for E PluralOperand #407

Merged

sffc mentioned this issue Dec 4, 2020

NumberFormatter & co. scope (unified vs modular) #275

Open

zbraniecki closed this as completed in #407 Jan 9, 2021

sffc mentioned this issue Jan 28, 2021

Implement FixedDecimalFormat #467

Closed

sffc removed the v1 label Apr 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for `e` oprand to PluralOperands #228

Add support for `e` oprand to PluralOperands #228

zbraniecki commented Sep 4, 2020

sffc commented Sep 5, 2020

zbraniecki commented Dec 3, 2020

sffc commented Dec 4, 2020

echeran commented Dec 5, 2020

zbraniecki commented Dec 5, 2020

sffc commented Dec 5, 2020

echeran commented Dec 5, 2020

Add support for e oprand to PluralOperands #228

Add support for e oprand to PluralOperands #228

Comments

zbraniecki commented Sep 4, 2020

sffc commented Sep 5, 2020

zbraniecki commented Dec 3, 2020

sffc commented Dec 4, 2020

echeran commented Dec 5, 2020

zbraniecki commented Dec 5, 2020

sffc commented Dec 5, 2020

echeran commented Dec 5, 2020

Add support for `e` oprand to PluralOperands #228

Add support for `e` oprand to PluralOperands #228