Implement InitializeDateTimeFormat #2072

NorbertGarfield · 2022-05-14T18:54:36Z

This Pull Request partially fixes #1562.

It changes the following:

Adds IsValidTimeZoneName
Adds CanonicalizeTimeZoneName
Implements BasicFormatMatcher
Implements BestFitFormatMatcher
Implements InitializeDateTimeFormat
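
A minimal sketch of what the first two operations do, assuming ECMA-402's description: a name is valid if it matches a Zone or Link name of the IANA Time Zone Database case-insensitively, Link names resolve to their Zone, and Etc/UTC and Etc/GMT canonicalize to UTC. The ZONES table and both function signatures are illustrative assumptions, not the code from this PR.

```rust
// Tiny stand-in for the IANA Time Zone Database: (name, canonical result).
// Hypothetical and incomplete; a real implementation would consult full tz data.
const ZONES: &[(&str, &str)] = &[
    ("UTC", "UTC"),
    ("Etc/UTC", "UTC"),
    ("Etc/GMT", "UTC"),
    ("Europe/Berlin", "Europe/Berlin"),
    ("Asia/Kolkata", "Asia/Kolkata"),
    ("Asia/Calcutta", "Asia/Kolkata"), // Link name resolving to its Zone
];

/// True if `time_zone` matches a known name, ignoring ASCII case.
fn is_valid_time_zone_name(time_zone: &str) -> bool {
    ZONES
        .iter()
        .any(|(name, _)| name.eq_ignore_ascii_case(time_zone))
}

/// Returns the case-regularized canonical name, or None for unknown input.
fn canonicalize_time_zone_name(time_zone: &str) -> Option<&'static str> {
    ZONES
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(time_zone))
        .map(|(_, canonical)| *canonical)
}
```

With this table, "europe/berlin" is valid and canonicalizes to "Europe/Berlin", while an unknown name is rejected.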

codecov · 2022-05-14T19:06:28Z

Codecov Report

Merging #2072 (90a7a25) into main (bad686d) will increase coverage by 0.61%.
The diff coverage is 72.96%.

@@            Coverage Diff             @@
##             main    #2072      +/-   ##
==========================================
+ Coverage   43.81%   44.43%   +0.61%     
==========================================
  Files         217      217              
  Lines       19560    19978     +418     
==========================================
+ Hits         8571     8877     +306     
- Misses      10989    11101     +112

Impacted Files	Coverage Δ
boa_engine/src/context/mod.rs	`31.85% <ø> (+0.91%)`	⬆️
boa_engine/src/object/mod.rs	`20.43% <14.28%> (+0.32%)`	⬆️
boa_engine/src/context/icu.rs	`37.50% <20.00%> (-42.50%)`	⬇️
boa_engine/src/builtins/intl/mod.rs	`65.51% <53.84%> (+4.55%)`	⬆️
boa_engine/src/builtins/intl/date_time_format.rs	`78.11% <76.03%> (+21.86%)`	⬆️
boa_engine/src/builtins/function/mod.rs	`24.34% <0.00%> (-4.87%)`	⬇️
...ngine/src/syntax/parser/statement/try_stm/catch.rs	`43.20% <0.00%> (-2.47%)`	⬇️
...oa_engine/src/syntax/parser/statement/block/mod.rs	`41.46% <0.00%> (-2.44%)`	⬇️
boa_engine/src/syntax/parser/mod.rs	`36.76% <0.00%> (-2.30%)`	⬇️
... and 8 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bad686d...90a7a25.

raskad · 2022-05-15T02:24:25Z

VM implementation

Test result	main count	PR count	difference
Total	90,186	90,186	0
Passed	54,651	54,687	+36
Ignored	23,273	23,273	0
Failed	12,262	12,226	-36
Panics	0	0	0
Conformance	60.60%	60.64%	+0.04%

Fixed tests (36):

test/intl402/DateTimeFormat/constructor-options-order-timedate-style.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order-timedate-style.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-style-conflict.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-style-conflict.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-timeStyle-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-timeStyle-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-numberingSystem-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-numberingSystem-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters-fractionalSecondDigits.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters-fractionalSecondDigits.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-dateStyle-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-dateStyle-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-fractionalSecondDigits-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-fractionalSecondDigits-invalid.js (previously Failed)
test/intl402/DateTimeFormat/timezone-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/timezone-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-calendar-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-calendar-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order-fractionalSecondDigits.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order-fractionalSecondDigits.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-timeZoneName-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-timeZoneName-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-calendar-numberingSystem-order.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-calendar-numberingSystem-order.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order-dayPeriod.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-order-dayPeriod.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-dayPeriod-invalid.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-dayPeriod-invalid.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters-dayPeriod.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters-dayPeriod.js (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters-timedate-style.js [strict mode] (previously Failed)
test/intl402/DateTimeFormat/constructor-options-throwing-getters-timedate-style.js (previously Failed)

NorbertGarfield · 2022-05-15T10:16:12Z

@raskad, thanks for the update. I fixed these panicking tests locally, but I don't know how to gather that VM implementation report. My current method is to run the failing script directly via cargo run -- /path/to/failing_test.js.

Also, the Locale::from_bytes call panicked when parsing the "i-klingon" locale, so I had to add a fallback to my previous implementation to separate the extensions from the locale ID. Please let me know if you have another idea.

raskad · 2022-05-15T12:39:07Z

> @raskad, thanks for the update. I fixed these panicking tests locally, but I don't know how to gather that VM implementation report. My current method is to run the failing script directly via cargo run -- /path/to/failing_test.js.

You are right, we should describe how to gather this report locally.
I always do it like this:

git checkout main
cargo run --release --bin boa_tester -- run -vv -o results/main
git checkout <your-branch>
cargo run --release --bin boa_tester -- run -vv -o results/<your-branch>
cargo run --release --bin boa_tester -- compare results/main/latest.json results/<your-branch>/latest.json

You may want to redirect the output of the boa_tester run commands to some file, because printing ~200k lines to stdout can be really slow.

I will review the PR in a bit ;)

jedel1043 · 2022-05-15T18:34:31Z

@NorbertGarfield Do note, however, that if you're running boa_tester on Windows you'll probably have some additional panicking tests caused by #1570. You can just ignore them in the meantime 😅

boa_engine/src/builtins/intl/date_time_format.rs

zbraniecki

Hi.

First of all, this is so exciting for us. We're building ICU4X particularly to target ECMA-402 needs well, and so far we've mostly been testing it in the context of SpiderMonkey's use in Firefox.
Your work on boa+ecma-402 is a perfect fit and I'd love for us to collaborate more closely to meet your needs.

For this particular PR, I left some drive-by comments, and my overarching feedback, as a maintainer of the locid component, is that you should never have to work with strings. Any place where you substr or look for -u or anything like that suggests that either icu_locid should provide that for you, or you're doing something wrong. Generally speaking, I strongly believe the whole implementation should always operate on a Locale type and only stringify/parse on output/input.
If something is missing or the API we have is suboptimal, let's fix it. The timing is perfect, as we're warming up to 1.0 stability but still have some wiggle room left.

boa_engine/src/builtins/intl/mod.rs

boa_engine/src/builtins/intl/tests.rs

jedel1043 · 2022-05-31T03:13:56Z

@zbraniecki Thank you for your review! As you can see, our Intl implementation using icu4x is still very lacking, but I hope we can be of help by being early adopters of the library!

On another note, we had a small problem with Locale when we implemented the BestAvailableLocale operation. The algorithm described there requires the steps:

b. Let pos be the character index of the last occurrence of "-" (U+002D) within candidate. If that character does not occur, return undefined.
c. If pos ≥ 2 and the character "-" occurs at index pos - 2 of candidate, decrease pos by 2.
d. Let candidate be the substring of candidate from position 0, inclusive, to position pos, exclusive.

If we translate this to a high level operation, the spec requires removing the last word from the locale. Unfortunately, Locale doesn't expose its internal data structures, so we have to convert to a String and parse again if we want to reduce the Locale by one word. So, would it be too much to ask for a pop_last_word method in Locale?

If that's not possible, then a way to deconstruct the internals of each extension would be good, because that would allow us to do something like: mem::take -> deconstruct -> modify -> recreate modified using unchecked_*.
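
For reference, steps b to d quoted above can be sketched directly on a string. This is only an illustration of the spec text, assuming a well-formed (ASCII) BCP 47 tag; shorten_candidate is a hypothetical helper name, not something from this PR or from icu_locid.

```rust
/// One round of steps b-d of BestAvailableLocale on a plain string.
/// Returns None when there is no "-" left to cut at (the "return undefined" case).
fn shorten_candidate(candidate: &str) -> Option<&str> {
    // b. Position of the last "-" in the candidate.
    let mut pos = candidate.rfind('-')?;
    // c. If another "-" sits two positions earlier, the subtag in between is a
    //    single character (an extension singleton), so cut before it as well.
    if pos >= 2 && candidate.as_bytes()[pos - 2] == b'-' {
        pos -= 2;
    }
    // d. Keep everything before `pos`.
    Some(&candidate[..pos])
}
```

Applied repeatedly, "de-DE-u-co-phonebk" becomes "de-DE-u-co", then "de-DE", then "de", and finally the function returns None.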

zbraniecki · 2022-05-31T03:26:05Z

Uuh, that's really unfortunate. This seems to want you to operate on a stringified version of Locale, which I think we should avoid. There are other issues with it (for example, "sr-Cyrl" should not be shortened to "sr"), and it also makes the unreasonable attempt to cut subtags out of extensions.

I think this operation should instead operate on a [variants, region, script, language] array, removing each one in order after removing all extensions.

I don't think we should be adding APIs to handle treating Locale as a string and make semantic operations based on positions of subtags. I think this is a change we should apply to ECMA-402 spec. CC @sffc for validation and we can file an issue there if he doesn't point out that I am missing something obvious.

sffc · 2022-05-31T04:58:17Z

The spec text only requires that behavior be consistent with what would happen if you did the string operations on the locale string. You can implement it differently, so long as the result is the same. In this case, BCP47 guarantees that variant always comes after region which always comes after script which always comes after language, and you can build that assumption into your code.
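
A minimal sketch of that idea, assuming extensions have already been stripped: because the components always appear in the order language, script, region, variants, dropping the right-most component of a structured value gives the same result as cutting the last subtag off the string. LangId here is a hypothetical stand-in struct, not the icu_locid type, and best_available is an illustrative name.

```rust
/// Hypothetical stand-in for a parsed language identifier (no extensions).
#[derive(Debug, Clone, PartialEq, Eq)]
struct LangId {
    language: String,
    script: Option<String>,
    region: Option<String>,
    variants: Vec<String>,
}

impl LangId {
    fn new(language: &str, script: Option<&str>, region: Option<&str>, variants: &[&str]) -> Self {
        Self {
            language: language.to_owned(),
            script: script.map(str::to_owned),
            region: region.map(str::to_owned),
            variants: variants.iter().map(|v| (*v).to_owned()).collect(),
        }
    }

    /// Drops the right-most component: last variant, then region, then script.
    /// Returns false once only the language is left.
    fn shorten(&mut self) -> bool {
        if self.variants.pop().is_some() {
            return true;
        }
        if self.region.take().is_some() {
            return true;
        }
        self.script.take().is_some()
    }
}

/// Tries the candidate, then successively shorter versions of it.
fn best_available<'a>(available: &'a [LangId], mut candidate: LangId) -> Option<&'a LangId> {
    loop {
        if let Some(found) = available.iter().find(|l| **l == candidate) {
            return Some(found);
        }
        if !candidate.shorten() {
            return None;
        }
    }
}
```

For example, with "de" and "zh-Hant" available, a request for "zh-Hant-TW" falls back to "zh-Hant", while "fr-FR" finds nothing.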

jedel1043 · 2022-05-31T08:32:20Z

I don't think Language Identifiers are the problem per se; I can easily manipulate LanguageIdentifiers using the provided API, and a canonicalized Language Identifier cannot become invalid by following the spec algorithm. The problem is extensions, which could become invalid if that algorithm is followed as stated; e.g., with hi-t-en-h0-hybrid it would try to match with hi-t-en-h0, which has an invalid t extension since the h0 key ought to have a value.
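
To make the failure mode concrete, here is a rough sketch of a check for a t-extension key left without a value. It is a simplification under stated assumptions: a key is taken to be a letter followed by a digit, any other multi-character subtag inside the t extension counts as a value (or part of the transformed language), and has_dangling_tkey is a hypothetical helper, not an icu_locid API.

```rust
/// A t-extension field key: exactly one ASCII letter followed by one ASCII digit.
fn is_tkey(subtag: &str) -> bool {
    let bytes = subtag.as_bytes();
    bytes.len() == 2 && bytes[0].is_ascii_alphabetic() && bytes[1].is_ascii_digit()
}

/// True if some key inside a `-t-` extension is not followed by a value.
fn has_dangling_tkey(tag: &str) -> bool {
    let mut in_t = false;
    let mut pending_key = false;
    for subtag in tag.split('-') {
        if subtag.len() == 1 {
            // A singleton starts a new extension; a key still waiting for its
            // value at this point is dangling.
            if pending_key {
                return true;
            }
            in_t = subtag.eq_ignore_ascii_case("t");
            continue;
        }
        if in_t {
            if is_tkey(subtag) {
                if pending_key {
                    return true; // two keys in a row
                }
                pending_key = true;
            } else {
                pending_key = false; // a value (or part of the tlang)
            }
        }
    }
    pending_key
}
```

With this check, hi-t-en-h0-hybrid passes, while the shortened hi-t-en-h0 is flagged.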

Support both '-u-' and 'u-' extension patterns.

zbraniecki · 2022-05-31T20:07:39Z

> The problem is extensions, which could become invalid if that algorithm is followed as stated; e.g., with hi-t-en-h0-hybrid it would try to match with hi-t-en-h0, which has an invalid t extension since the h0 key ought to have a value.

I claim with the authority given to me by the ultimate powers (Shane in this case) that you should, for all purposes, treat this as a LanguageIdentifier. It's just that ICU4X is the first time we provide an API surface to the spec that identifies LI as a separate thing from Locale.

I hereby recommend:

1. In this code, take LI out of Locale and in the algo just take out subtag after subtag and compare.
2. File an issue in ECMA-402 to request additional precision in the algo to cut out extensions.

I'd particularly appreciate if you did (2) as I know it's easy to stop on (1) and move on :) Thank you!

jedel1043 · 2022-05-31T23:00:40Z

@zbraniecki I opened an issue in the spec and in the meantime we'll follow your suggestions and treat the input as a Language Identifier. Thank you for all your assistance!

Improve test semantics Fetch default timezone from the system

jedel1043

I think the main problem with the approach this PR takes is the many possible ways to lose correctness because of the types used in several definitions throughout the implementation.

For example, the DateTimeFormat type is defined as:

pub struct DateTimeFormat {
    initialized_date_time_format: bool,
    locale: JsString,
    calendar: JsValue,
    numbering_system: JsValue,
    time_zone: JsString,
    weekday: JsValue,
    era: JsValue,
    year: JsValue,
    month: JsValue,
    day: JsValue,
    day_period: JsString,
    hour: JsValue,
    minute: JsValue,
    second: JsValue,
    fractional_second_digits: JsString,
    time_zone_name: JsValue,
    hour_cycle: JsValue,
    pattern: JsString,
    bound_format: JsString,
    date_style: JsValue,
    time_style: JsValue,
}

So, what would happen if someone creates an instance of that type as follows?

let date_time_format = DateTimeFormat {
    initialized_date_time_format: true,
    locale: "en-US".into(),
    calendar: "gregori".into(), // Notice the typo here
    numbering_system: "arab".into(),
    ...
};

We can't really tell what would happen in this case without scanning the whole module, and that hurts the maintainability of the code.

In the next few days I'll implement an API that consolidates the requirements for each Intl spec object into a collection of traits and types that preserve the correctness of our code.
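
To illustrate the kind of type-level guarantee meant here, a minimal sketch: if the calendar is an enum parsed once at the boundary, the "gregori" typo can never end up inside the struct. Calendar, its three variants, and TypedDateTimeFormat are hypothetical names for illustration only, not the API that was later added to Boa.

```rust
use std::str::FromStr;

/// Hypothetical closed set of calendars; a real list would be much longer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Calendar {
    Gregory,
    Buddhist,
    Japanese,
}

impl FromStr for Calendar {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "gregory" => Ok(Self::Gregory),
            "buddhist" => Ok(Self::Buddhist),
            "japanese" => Ok(Self::Japanese),
            other => Err(format!("unknown calendar: {other}")),
        }
    }
}

/// Only already-validated values can be stored here.
#[derive(Debug)]
struct TypedDateTimeFormat {
    locale: String,
    calendar: Option<Calendar>,
}

impl TypedDateTimeFormat {
    /// Parses the raw option once; invalid input is rejected at construction.
    fn new(locale: &str, calendar: Option<&str>) -> Result<Self, String> {
        let calendar = calendar.map(|s| s.parse::<Calendar>()).transpose()?;
        Ok(Self {
            locale: locale.to_owned(),
            calendar,
        })
    }
}
```

Constructing the value with "gregori" now fails immediately with an error instead of producing an object whose behavior depends on the rest of the module.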

zbraniecki · 2022-06-01T20:56:22Z

You may want to track my work on icu_preferences, which targets this problem space: unicode-org/icu4x#1833

I had a design brainstorming session with Shane last week and will push a new revision soon, but I think it may simplify your logic quite a bit.

jedel1043 · 2023-11-29T17:59:00Z

There's a new Intl API in place for implementing Intl services more easily. This is waiting on the author to decide if they want to rebase on that or if we want to do it from scratch.

NorbertGarfield · 2023-12-27T07:46:24Z

@jedel1043, sorry for the late response. I'll try to implement it from scratch. I'm closing this PR.

Razican reviewed May 15, 2022

boa_engine/src/builtins/intl/date_time_format.rs (outdated)

raskad requested changes May 16, 2022

NorbertGarfield force-pushed the garfield/init-date-time branch 2 times, most recently from cb12567 to a6fbb8a Compare May 16, 2022 09:05

NorbertGarfield requested review from Razican and raskad May 16, 2022 09:06

jedel1043 reviewed May 16, 2022

boa_engine/src/builtins/intl/date_time_format.rs (outdated)

NorbertGarfield force-pushed the garfield/init-date-time branch 4 times, most recently from 0dca742 to 6e5f442 Compare May 30, 2022 17:58

zbraniecki reviewed May 30, 2022

boa_engine/src/builtins/intl/mod.rs (outdated)

boa_engine/src/builtins/intl/mod.rs (outdated)

boa_engine/src/builtins/intl/tests.rs (outdated)

NorbertGarfield added 5 commits May 31, 2022 14:36

Implement InitializeDateTimeFormat

3e028ab

Fallback to fetch unicode extensions

04bb46e

Replace maps with pattern matching

025c5c8

Update UnicodeExtensionComponents

b530402

Support both '-u-' and 'u-' extension patterns.

Implement bounds for dyn BoaProvider

617c5f4

NorbertGarfield force-pushed the garfield/init-date-time branch from 6e5f442 to 617c5f4 Compare May 31, 2022 14:46

NorbertGarfield and others added 3 commits May 31, 2022 16:46

Switch to ICU for extension parsing

93b7ea9

Fix ResourceProvider impls for &dyn BoaProvider

b2764e5

Update test with supported styles

ae85769

NorbertGarfield requested a review from jedel1043 May 31, 2022 18:22

jedel1043 mentioned this pull request May 31, 2022

BestAvailableLocale operation requires additional clarifications/changes to handle extensions tc39/ecma402#685

Closed

Adjust dependencies to improve compilation times

90a7a25

Improve test semantics Fetch default timezone from the system

jedel1043 requested changes Jun 1, 2022

jedel1043 added the waiting-on-author Waiting on PR changes from the author label Nov 29, 2023

NorbertGarfield closed this Dec 27, 2023

jedel1043 removed the waiting-on-author Waiting on PR changes from the author label Sep 3, 2024
