Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal(i18n): drop global and local string refs into it; use local IcuMessage objs instead #10614

Closed
brendankenny opened this issue Apr 20, 2020 · 2 comments
Assignees
Labels

Comments

@brendankenny
Copy link
Member

brendankenny commented Apr 20, 2020

Google doc for commenting: https://docs.google.com/document/d/1NbernqIn2wJRavpjjov_DN2JgugXMNBB030c6FiUA1k/edit

Goals:

  • Drop global state which can cause unexpected behavior and undesirable interactions between multiple LH runs
  • Allow for easy serialization of localizable strings at any LH stage (i18n strings in artifacts from -G throw errors when run through -A #9269)
  • Simplify i18n pipeline
  • Simplifications allow easier-to-understand (possible) extensibility in the future
  • No changes for LHR consumers
  • No changes for audits

See @exterkamp's ideas around appending and nesting messages :)

Current pipeline

We currently have ~3 paths for localizing strings:

  1. Get a string reference, store it in the LHR, and it's automatically replaced with a localized string at the end of the run via i18n.replaceIcuMessageInstanceIds(). This is used for e.g. all localized audit results.
  2. Get a string reference, call i18n.getFormatted() on it directly. This is used for e.g. when we want to put a localized audit name in a log message.
  3. Use a serialized form of a message ID and replacement values to replace a string in an already-localized LHR, leaving the existing string in place if the message ID and/or values are no longer recognized by the running version of Lighthouse. This is swap-locale.js.

(3) works great, so if we want to get rid of global state, let's just use that and get rid of the first two :)

Proposal:

  • Continue to use the UIStrings, const str_ = i18n.createMessageInstanceIdFn(__filename, UIStrings), and messages identified by ${__filename} | ${varname}.
  • str_ now returns an IcuMessage instead of a string:
  • type IcuMessage = {
      /** The id for this message in the locale json files. */
      id: string;
      values?: Record<string, string | number>;
      /** The original string given in 'UIStrings', optionally used as a backup if no locale message can be found. */
      uiStringMessage?: string,
    };
  • Wherever possible, use the IcuMessage type where expected directly. Use the type divide we already have in several places between initial and final values of things to also represent pre- and post-localization (e.g. LH.Config.Json vs LH.Config, LH.Audit.Product, vs LH.Audit.Result).
  • Where not possible or really inconvenient, use an automatic type transform that allows (or removes) IcuMessage anywhere a string is allowed, e.g. LH.Icu<LH.Audit.Result>, LH.FormattedIcu<LH.Audit.Product>, etc. (turns out this is very rarely needed).

Pros/cons of _icuMessageInstanceMap

There are two main benefits to the global object (_icuMessageInstanceMap) that I'm aware of:

  1. Localized messages are deduped, allowing sharing of a string when there are no replacement values or when the replacement values are exactly the same.
  2. The reference to the localizable string returned from i18n.js is also a string, making it easy on the type system to mix e.g. core audits in different stages of being localized and different states of things being localized (e.g. an LHR has the same type before and after i18n.replaceIcuMessageInstanceIds() localizes any strings in it)

There may be other benefits that I'm forgetting, but the i18n pipeline has also evolved considerably since we first designed it so other advantages may have fallen by the wayside since then.

OTOH:

  1. Localized messages are deduped: this doesn't save much memory in practice. We don't have that many strings to localize and V8 is already aggressively deduping strings, which are the major source of bytes in there.
  2. It can be annoying dealing with the same type except strings have been replaced with IcuMessages (or vice versa) and you have to check which is which before using, but:
    • for the most part our code follows a very predictable path (yay) from non-localized to localized
    • audits are all localized today, and
    • you could make a good argument that representing localized vs not yet localized with the type system (and having that checked) is actually pretty useful. This also lets us decide to do things in the future like force all new core audits to be localized via the type system.

fixes #9269
closes #10520

@connorjclark
Copy link
Collaborator

connorjclark commented Apr 20, 2020

Drop global state which can cause unexpected behavior and undesirable interactions between multiple LH runs

Can you say more on this? I get that the global icu map retains information from past runs (if you run more than run LH audit in a process), but since the only time the map is read from is by using an ID that is generated specific to the run, I don't see how this creates actual problems.

Allow for easy serialization of localizable strings at any LH stage (#9269)

Not arguing against the other pros you listed, but this is the largest problem IMO. And it's worth mentioning that #10520 fixes this problem with lesser complexity.

@brendankenny
Copy link
Member Author

fixed in #10630

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants