Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diagnostics icu4x based list formatting. #104047

Merged
merged 8 commits into from
Nov 19, 2022

Conversation

crlf0710
Copy link
Member

@crlf0710 crlf0710 commented Nov 6, 2022

This adds a new kind of DiagnosticArg and add the ability to convert it to a FluentValue::Custom. When emitting fluent output, it makes use of ListFormatter from icu4x project to convert it to localized version of list following the fluent locale, as a kind of eager translation.

Tested locally with locales like en, ja, etc, and they work fine. Though neither zh-CN nor zh-Hans works correctly, it seems they fallback to und locale somehow, emitting only comma-based list. I believe this is an internal issue of icu4x itself.(Works fine after #104047 (comment))

Would love to hear what others think.

r? @davidtwco
cc @Manishearth

@rustbot rustbot added T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) A-testsuite Area: The testsuite used to check the correctness of rustc A-translation Area: Translation infrastructure, and migrating existing diagnostics to SessionDiagnostic S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 6, 2022
@rustbot
Copy link
Collaborator

rustbot commented Nov 6, 2022

rustc_error_messages was changed

cc @davidtwco, @compiler-errors, @JohnTitor, @estebank, @TaKO8Ki

@rust-log-analyzer

This comment has been minimized.

@crlf0710
Copy link
Member Author

crlf0710 commented Nov 6, 2022

Mmmm, seems there's non-thread-safe things within icu_list::ListFormatter... @Manishearth

@Manishearth
Copy link
Member

@crlf0710 you need the sync feature on the icu_provider crate.

@Manishearth
Copy link
Member

Manishearth commented Nov 6, 2022

So I think you should simply generate for zh and zh-Hant, since zh-Hans is the default for zh. Unsure why script fallback doesn't take care of that though, cc @sffc

@Manishearth
Copy link
Member

Also, cc @zbraniecki: do you think this is the right way to use list formatter with fluent? This is only for lists of untranslated placeholders.

@crlf0710
Copy link
Member Author

crlf0710 commented Nov 6, 2022

@crlf0710 you need the sync feature on the icu_provider crate.

Thanks! In the two recent commits, I updated the features flags to include sync for parallel compiler. And also included zh into datagen locale selection.

Copy link
Member

@davidtwco davidtwco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak to any of the icu4x usage or the baked data stuff, but the integration into DiagnosticArg and other rustc infrastructure looks good to me.

@zbraniecki
Copy link

Also, cc @zbraniecki: do you think this is the right way to use list formatter with fluent? This is only for lists of untranslated placeholders.

LGTM so far, but I'll wait to see how it's hooked into Fluent (this PR only hooks in Memoizable)

Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICU4X usage seems good too

@@ -49,6 +51,7 @@ impl<'source> Into<FluentValue<'source>> for DiagnosticArgValue<'source> {
match self {
DiagnosticArgValue::Str(s) => From::from(s),
DiagnosticArgValue::Number(n) => From::from(n),
DiagnosticArgValue::StrListSepByAnd(l) => fluent_value_from_str_list_sep_by_and(l),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zbraniecki this is where it gets hooked in to fluent

@Manishearth
Copy link
Member

@zbraniecki it's already hooked into Fluent; the lists eventually produce a FluentValue

@sffc
Copy link

sffc commented Nov 7, 2022

The fallback algorithm visits zh instead of zh-Hans since Hans is the default script for zh.

Upstream issues to improve documentation and tooling around this: unicode-org/icu4x#2683, unicode-org/icu4x#2243, unicode-org/icu4x#834

@davidtwco davidtwco mentioned this pull request Nov 8, 2022
84 tasks
@bors

This comment was marked as resolved.

@crlf0710
Copy link
Member Author

Rebased forward.

Copy link
Member

@davidtwco davidtwco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me after last comment

compiler/rustc_baked_icu_data/src/lib.rs Outdated Show resolved Hide resolved
@crlf0710 crlf0710 force-pushed the icu_based_list_formatting branch 2 times, most recently from 177cfbd to d75c76d Compare November 16, 2022 02:52
@Manishearth
Copy link
Member

@bors r=davidtwco

@bors
Copy link
Contributor

bors commented Nov 16, 2022

📌 Commit d75c76d has been approved by davidtwco

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 16, 2022
Manishearth added a commit to Manishearth/rust that referenced this pull request Nov 16, 2022
…, r=davidtwco

Diagnostics `icu4x` based list formatting.

This adds a new kind of `DiagnosticArg` and add the ability to convert it to a `FluentValue::Custom`. When emitting fluent output, it makes use of `ListFormatter` from `icu4x` project to convert it to localized version of list following the fluent locale, as a kind of eager translation.

Tested locally with locales like `en`, `ja`, etc, and they work fine. <del>Though neither `zh-CN` nor `zh-Hans` works correctly, it seems they fallback to `und` locale somehow, emitting only comma-based list. I believe this is an internal issue of `icu4x` itself.</del>(Works fine after rust-lang#104047 (comment))

Would love to hear what others think.

r? `@davidtwco`
cc `@Manishearth`
Manishearth added a commit to Manishearth/rust that referenced this pull request Nov 18, 2022
…, r=davidtwco

Diagnostics `icu4x` based list formatting.

This adds a new kind of `DiagnosticArg` and add the ability to convert it to a `FluentValue::Custom`. When emitting fluent output, it makes use of `ListFormatter` from `icu4x` project to convert it to localized version of list following the fluent locale, as a kind of eager translation.

Tested locally with locales like `en`, `ja`, etc, and they work fine. <del>Though neither `zh-CN` nor `zh-Hans` works correctly, it seems they fallback to `und` locale somehow, emitting only comma-based list. I believe this is an internal issue of `icu4x` itself.</del>(Works fine after rust-lang#104047 (comment))

Would love to hear what others think.

r? ``@davidtwco``
cc ``@Manishearth``
@Manishearth
Copy link
Member

@bors rollup=iffy

this will break naïve rollups because it introduces a new dep

Manishearth added a commit to Manishearth/rust that referenced this pull request Nov 18, 2022
…, r=davidtwco

Diagnostics `icu4x` based list formatting.

This adds a new kind of `DiagnosticArg` and add the ability to convert it to a `FluentValue::Custom`. When emitting fluent output, it makes use of `ListFormatter` from `icu4x` project to convert it to localized version of list following the fluent locale, as a kind of eager translation.

Tested locally with locales like `en`, `ja`, etc, and they work fine. <del>Though neither `zh-CN` nor `zh-Hans` works correctly, it seems they fallback to `und` locale somehow, emitting only comma-based list. I believe this is an internal issue of `icu4x` itself.</del>(Works fine after rust-lang#104047 (comment))

Would love to hear what others think.

r? `@davidtwco`
cc `@Manishearth`
@Manishearth
Copy link
Member

@bors r=davidtwco rollup=maybe

lockfile problems were inherent to this PR after merge, did a force push with fixes

@bors
Copy link
Contributor

bors commented Nov 18, 2022

📌 Commit 2721e40 has been approved by davidtwco

It is now in the queue for this repository.

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 18, 2022
…earth

Rollup of 8 pull requests

Successful merges:

 - rust-lang#102977 (remove HRTB from `[T]::is_sorted_by{,_key}`)
 - rust-lang#103378 (Fix mod_inv termination for the last iteration)
 - rust-lang#103456 (`unchecked_{shl|shr}` should use `u32` as the RHS)
 - rust-lang#103701 (Simplify some pointer method implementations)
 - rust-lang#104047 (Diagnostics `icu4x` based list formatting.)
 - rust-lang#104338 (Enforce that `dyn*` coercions are actually pointer-sized)
 - rust-lang#104498 (Edit docs for `rustc_errors::Handler::stash_diagnostic`)
 - rust-lang#104556 (rustdoc: use `code-header` class to format enum variants)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 3181006 into rust-lang:master Nov 19, 2022
@rustbot rustbot added this to the 1.67.0 milestone Nov 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc A-translation Area: Translation infrastructure, and migrating existing diagnostics to SessionDiagnostic S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants