Attendees: Daniel Ehrenberg (DE) Felipe Balbontin (FB) Frank Tang (FT) Leo Balter (LB) Myles C. Maxfield (MM) Nathan Hammond (NH) Romulo Cintra (RC) Shane Carr (SC) Steven Loomis (SL) Zibi Braniecki (ZB) Jeff Walden (JW)
- ECMA-402 editing and management
- Welcome Leo! ECMA-402 editor?
- LB: Experience in test262, tc39… I’ve confirmed I have at least 1 day a week set all year long for ecma402 as editor. My plans for this editorship is to help to get spec clean. Help everyone merge things in. Give reports to Ecma402 and general TC39 meetings, keeping track of the progress being done for the ecma402 work.
- ECMA-402 chair?
- SC: The chair position is for organizing these meetings, agendas, giving the update to TC39, making sure the meetings are well-attended, that we hear everyone's perspectives, organizing the agenda, moving through the backlog, making sure reviews happen when necessary. These are all responsibilities that I'm happy to contribute as much as I can in 2019. I did a lot of work on the agenda for this meeting; Frank sent the PR, I went through and revised it a lot. I'm excited to take the lead on those responsibilities.
- Dan will be able to stick around in 2019, but doesn't need a title.
- Thanks to support from Google i18n team
- Division of responsibilities
- Welcome Leo! ECMA-402 editor?
- Overflow from Last Meeting (see inline)
- Bugs/PRs against ECMA-402
- Stage 3 proposals:
- Intl.Locale:
- Intent to Ship approved in Chrome
- FT: I haven't flipped the bit yet, but the plan is to do this now, in M74. My assumption is that we're not going to change the spec, maybe bug fixing. Does anyone have concerns about the spec?
- DE: Sounds like no concerns
- Intl.Segmenter:
- Switch to break iteration (PRs)
- Richard Gibson seems to be absent; we have had trouble getting in touch with him
- DE: Do we have consensus in this group to switch to a break-based API, as Mark Davis recommended? It sounds like I need to take this on as a champion, as we don't seem to be getting this contribution. Note that this is after removing line breaking; we're only talking about word, grapheme and sentence breaks.
- FT: Is this the only remaining issue? There are those other PRs, like #57 from Richard Gibson, and #61. Are all the issues addressed, or are we only talking about changing this?
- DE: It's tied in together. We need to fix RG's issues, and when we change to a break-based API, those issues will change too.
- FT: So you're thinking that there is one big issue but several different aspects?
- SL: Where can I find a summary of the rationale for removing line breaking?
- MM: I'll give a summary of last month. To do line breaking, you need to know where the breaks are, but also geometry. So that makes sense to do in Houdini.
- SL: So is Houdini adding its own API for line break?
- MM: We'll be discussing it next week at the Houdini/CSS F2F.
- Last month's notes: https://docs.google.com/document/d/1bK5G3een3eDxvcDI9v2CQGAKY9M_CX3NaV6AorSZmE8/edit
- DE: Back to the issues, at what point to we resolve them?
- FT: I'm not sure what the next steps are. I would like to see someone send a PR. What is RG thinking about? We should talk to him.
- DE: I'll take this as an action item, to talk to him and make sure we have a solution proposed.
- SC: It's unfortunate that the changes are coming when Intl.Segmenter is in Stage 3, but it's the way it is. Dan was the original champion, he can work this out with Frank and Richard.
- Unified NumberFormat:
- Add method to get list of supported units?
- Tabled for next week
- Discussion: How important to this proposal are (a) unit long names and (b) obscure units? More precisely, to what extent are they worth the increased data size in the browser engine?
- How feasible is shipping this feature in various browsers given the size increase. Pre-meeting discussion:
- SGN: Note: There's still ongoing discussions about the acceptable binary size increase in Chrome -- I don't think we have a conclusive answer yet to be presented here.
- DE: I hope we can hear from other browsers about this as well, or at least get them thinking about it. Maybe we should file an issue about it as well.
- SC: There has been pushback within Chrome against increasing the data size in V8 when implementing this feature. It's not clear what the next steps are with this feature, in terms of making the case on to what extend we want to add additional features. I wanted to discuss, what is our position, as specification writers. I'd like to hear about SpiderMonkey's position here, and consider whether we should make changes
- MM: How much extra size is it?
- SC: Approximately 600kb
- FT: Depends how many locales we include. For Chrome's current locales, it would be 750kb. We have 3-4 styles of the string, and 140 units. There are so many units there
- DE: Is this data not already included in mobile operating systems?
- MM: I'd have to check how much of this is already available on iOS.
- ZB: Are some of these already supported, and some of the new units that we'd be adding?
- SC: CLDR doesn't provide any particular hierarchy of whether units are important or not. One path is removing "obscure units", and another path is, within the units themselves, cutting out the long form and just keeping the short and narrow forms. The long form has a larger data size footprint. To go back to Zibi's question, CLDR approves units by release; units need a user who requests them, but it doesn't decide which ones are important. If we categorize them, we'd be the first to include this.
- ZB: Do we include all of the CLDR units in the spec?
- SC: Yes.
- FT: If we don't expose things, it's a bug. There man be some discrepancies. In my opinion, the long form is not useful. The long form is like "meters", which is more like creating a sentence than exposing a united value.
- SL: Would it be useful for selection of units?
- MM: I think long name is more useful than obscure units.
- FT: This name would not be appropriate for selection, more for printing like "5 meters"; it's not meaningful standalone
- ZB: Jeff may provide more feedback. It seems to be that Mozilla will have a problem adding 750kb, but to cut the size, excluding some styles or narrowing down the list of units. I think there are 4-5 major units that are most important, and others are not a priority.
- MM: I agree with ZB, and I also wanted to say that based on what iOS includes, we would not be OK with increasing the iOS footprint. Let's make the specification vague enough to give implementers the choice of which data to ship.
- ZB: Could we fall back to the unit identifier string?
- SC: The unit identifier string is not intended to be human-readable. Maybe we could find something in ISO that would be a reasonable fallback. But the root bundle is compact and lightweight and could be a reasonable fallback.
- ZB: Some units serve as their own fallback. So if some units don't have data in the locale, we can fall back to the unit itself.
- DE: That doesn't make sense; we should fall back to english.
- ZB: I meant to use the ISO.
- DE: I don't think there's a standard…
- SC: We discussed this in depth previously; our conclusion was that there was no reasonable fallback outside of locale data. Implementations can use the data in the root bundle if they need a fallback. Can you open a GitHub issue about ISO?
- SL: An implementation needs to be able to provide the subset that it can support and decide on. If there's a hierarchy, we should maintain it on the CLDR side and not not in ECMA. As far as the names, I looked into the GNU units library, and it was not an exact match, but that could be a worthwhile comparison.
- FT: A good text to speech engine would read the short forms well.
- ZB: I also wanted to discuss implementations including their own units. But I think it's important for us to be able to implement a subset of the units. I would prefer not to have a freeflow. The second is if we allow browsers to go on their own list, Chrome will be the default and Mozilla will receive bug reports if we lack units that Chrome has.
- MM: Having a set that is required is a good idea.
- SC: If we do include long names, how should we decide on which units to include? We can agree that digital units and distance units are important, but I'm not comfortable with deciding on the rest of them. We should think hard on how we do this; I would leave this to CLDR.
- SL: We could make the specification include the long form, but just have implementations fall back to the short form. If it's too burdensome, we could leave it out of the spec.
- MM: Data size constraints should affect the spec.
- ZB: I understand SC's concern about selecting units, but implementers will have an uphill battle and will have to justify the units being added. So I think we will hit this whether we like it or not. So we should have answers sooner rather than later.
- DE: How big would the data be if we omit long data?
- SC: It got down to 300kb, if we omit long names. We could probably remove even more if we get rid of the display names, which aren't exposed anyway, and we are in the process of that experiment.
- ZB: We are affected differently, since we use this for our UI and use it in 100 locales. I think we could select the long tail of locales and exclude units for them, while including other locales' data.
- FT: The long form occupies more than half of the size.
- DE: Seems like a lot of this depends on whether 300kb is acceptable and whether OSs have this data already. Do Android and iOS have this data? Maybe we can follow up on this in an issue.
- ZB: For Mozilla, it's easier for us to expand the data gradually over time.
- FT: Until your boss asks you to put Firefox on a doorbell!
- SC: I'll post a link to an issue where we can follow up about these questions.
- Link: tc39/proposal-unified-intl-numberformat#39
- How feasible is shipping this feature in various browsers given the size increase. Pre-meeting discussion:
- Add method to get list of supported units?
- Intl.RelativeTimeFormat:
- Update about ICU 64 implementation to support Intl.RelativeTimeFormat from sffc.
- Intl.ListFormat:
- Update about ICU 64 implementation to support Intl.ListFormat from sffc.
- BigInt.prototype.toLocaleString and Intl.NumberFormat
- Should overloading between BigInt and Number be supported? (discussion)
- DE: I made a case that we should try to split out the two methods. But then we ran out of time. So I will have to wait for next meeting.
- FT: So we are blocked until the next TC39?
- DE: Yes. But it is fine to ship BigInt.prototype.toLocaleString.
- FT: But the way you wrote the PR, it will impact NumberFormat.
- DE: Yes, but it won't impact BigInt.prototype.toLocaleString. Would you like me to split that out?
- FT: If the PR gets shipped, it means that the NumberFormat will be able to take a BigInt.
- DE: Yeah, the two features are grouped. We should ship part of the PR but not the other part of the PR.
- DE: Does anyone here have a strong opinion on overloading? (no)
- MB: FWIW, I’ve come around to being okay with a separate Intl.BigIntFormat. It’s a bit unfortunate for the NumberFormat case IMHO, but having separate BigInt and Number classes is the most consistent way forward across the platform.
- SC: I don't think Intl.BigIntFormat has been proposed. The current question is whether to add Intl.NumberFormat.prototype.formatBigInt or whether to overload Intl.NumberFormat.prototype.format.
- MB: I stand corrected; s/class/method/.
- Intl.Locale:
- Stage 2 proposals:
- intl-datetime-style:
- v8 implement "January 25, 2019" spec behind flag --harmony-intl-datetime-style Design Doc
- DE: I will propose this for Stage 3 in March
- DateTimeFormat formatRange:
- Update about ICU 64 implementation to support DateTimeFormat.formatRange from sffc.
- Thanks Shane!
- Overflowed rangeCollapse option
- FB: (summarizes issue linked above)
- DE: What are the imagined use cases?
- FB: The idea is if, for some reason, you still want to display the month twice.
- DE: Is there a Google demand for the rangeCollapse option?
- FB: This is a new API so we don't have internal users right now.
- SC: Alignment (spreadsheet layout) is a use case.
- MM: If there is no existing implementation of something like this, it’s a strong signal it isn’t necessary. People have been getting by without it since the beginning of software engineering
- DE: I agree; I would omit this unless there is a clear use case.
- FB: The ICU implementation is easy to make, but it would be better to add it to TR35 first.
- FB: Is everyone OK with leaving this out for now?
- SL: Yes, I think it's important to have the use case.
- Reviewers for stage 3?
- Leo (as editor), Sathya Gunasekaran, Kevin Smith, Nathan Hammond
- We intend to propose this for Stage 3
- SC: Is the spec text finished?
- FB: I just need to update it based on some decisions we made here and then it will be done.
- SC: I'd make those changes and notify them.
- MM: When designing this API, was any consideration given to being able to implement it on top of NSDateIntervalFormatter?
- FB: So I have looked at that API and it seems this can be implemented over it, but I did not take a very close look. I can look in more detail.
- NH: NSDateIntervalFormatter has some deviations from ICU: For date ranges <24h, it will reorder your arguments for you, >24h it will delegate down to ICU. You can use dateStyle: none, timeStyle: none and end up with… Nothing. If you specify dateStyle: none or timeStyle: none, and it ends up having this in the output, it will end up using it. I think it's possible that we could get people to fix it. But I do not see that happening on a short timescale. The other thing is it is kind of an enumerable set of things that's weird, so even if we don't ship something that works 100% of the time, we can write in edge case behavior in JavaScriptCore to work around it.
- SC: Does NSDateFormatter have support for formatToParts?
- NH: Yes, but limited. It does not identify the source of the date.
- FB: That's the previous ICU behavior as well; Shane just fixed it. So in theory could this feature be added to NSDateFormatter?
- NH: Hypothetically; I don't write NSDateFormatter. I wouldn't know who to talk to.
- FT: Peter Edberg is the person to talk to at Apple (in the ICU TC).
- NH: I can say, with confidence, we want this. I have tons of use cases for this myself. I would love to have this. So I am volunteering to review this.
- FB: What should be the way forward with respect to your concerns about NSDateIntervalFormatter?
- NH: I don't think that we should adjust the spec; we should use the ICU behavior, and not build the spec around the NSDateIntervalFormatter. Treat that as an unfortunate deviation.
- FB: It's true that ICU will not enforce a particular order in the dates. SL and MD said that this was an overlook. We decided last meeting to throw a RangeError if the start date is after the end date in time. That's the most conservative option. Is that fine?
- NH: Yes that's fine.
- SL: I agree with making it an error. CLDR may want this fixed/specified as well. We should file a ticket on CLDR for that.
- LB: Let's make sure reviews are ideally within 2 weeks before the TC39 meetings.
- Update about ICU 64 implementation to support DateTimeFormat.formatRange from sffc.
- intl-datetime-style:
- Stage 1 proposal:
- Intl.DisplayNames:
- Promoted to Stage 1 in January, hooray!
- Changed to use option "type" to and reduce to one common method of()
- Passing array of codes, and return array of names.
- Type: "language", "region", "script", "currency", "dateField", "dateSymbol"
- Possible to add other types
- Plan to propose to move to Stage 2 in March 26-28 TC39.
- Question: Which strings should be included?
- Is this data available in various OSes? Will the feature be shippable in practice?
- How are we deciding what's included and excluded?
- SC: For Chrome, we won't be shipping any more data, since we already send this.
- MM: What is different about this kind of data than any other translation result?
- FT: This is just a mapping to a particular thing
- SL: It's like the units; mapping fixed things rather than units
- FT: Translate means you're doing many fancy things. This doesn't do machine learning.
- ZB: This is exposing data that we already have. You can get this by parsing the result of Intl.DateTimeFormat; we want a more basic API to expose just the display names. This is a way to expose the data we already have without hacking around it.
- MM: Websites will have to translate all the strings they have. Why not have them translate these tokens also?
- FT: If you read the vision statement in the proposal: It will reduce the download size of applications. One big thing I had to fight, seven years ago, is how to reduce the size. All the language names was huge. If it were built into the browser, it wouldn't need to be downloaded. Instead, there would be a standardized name. It will also help to reduce their translation cost.
- RC: In my opinion, this is a great benefit for developers. Internationalized websites, keeping track of all of these common things, made it kind of difficult sometimes. Having this util will be a be a big benefit. Not just currencies but country lists, long forms that we need to send. It's difficult to maintain these.
- FB: We have difficulty creating a reusable language picker because of the data size. This will help to create a widget like that.
- SL: So, it can bring translation avoidance to do rework of other translated content that's not specific to the application. So each application doesn't have to repeat the translation of the regions.
- ZB: There is some thought about adding a language picker as an HTML widget.
- DE: Is that active?
- ZB: More as Twitter conversations
- MM: I'm not sure if it's worth adding and supporting this forever, but I'm not going to hold it back.
- SL: This is one thing that applications already have the burden. If you look at applications, this is a major item that's common to translated applications, carrying the list of names. These are things that change commonly, whereas the browser can keep this up to date.
- MM: How often does this change? A language picker would be only the supported languages.
- SL: E.g., FRYOM -> North Macedonia . Maintainers of operating systems, browsers, etc, will keep themselves up to date.
- RC: We have many forms for updating customers, and we have to keep these forms up to date. It's a big burden.
- FT: For each page downloaded, there's an opportunity cost of latency. Everyone's thinking about, how can I reduce my download cost, so it's the latency of every page load.
- MM: We've done performance analysis on many web pages, and we haven't noticed language pickers as a big issue.
- ZB: The thing that makes this relevant to provide as a core API, and not other names, is because it is a common thing that many websites will translate the same way. The list of languages and regions is a clear win, even if it increases the size a bit. (The W3C recommends providing a language picker.)
- FT: HTTP has Accept-Language:, and browsers have to have the localization anyway. For regions, there are only 240, so it's a pretty closed set. Currency has to be there already for NumberFormat.
- MM: My concern was not about number of bytes in the binary, but rather the human cost of supporting a new API forever.
-
- FT: Another question: Which data should we include. I excluded emoji name (as Dan suggested), as this would have a lot of data and there's no ICU API for this right now.
- MM: I am worried that people will use this as a way to reimplement date formatter, where we already have better tools for that. Many of the arguments we made on this call are not applicable to dateSymbol, as they don't change over time.
- FT: If you want to create an HTML calendar widget. We care less about how people could misuse it and more about how can people properly use it.
- MM: You mentioned Google Calendar. This site could include their strings for the days of the week, just like they find any other localized strings. Worrying about misuse is a philosophical difference, where the WebKit team worries a lot about this, as we've seen many cases of misuse in the past.
- SL: This provides a basis for libraries to implement things. This is a low-level primitive.
- MM: Low-level primitives make sense when there's a lot of possibilities in the ecosystem that could happen on top of it. I'm not sure that this thing matches that criteria. I'm not sure there are many different things that this API could be used for, besides a picker for calendars and trying to reimplement a date formatter.
- DE: I feel like I'm missing something; given how this is a very widely reused form/widget, do you think we should have an HTML element for this purpose?
- MM: I think the data should be shipped with the application.
- SL: Websites shouldn't replicate data which is always the same.
- MM: That makes sense, but the other side is--for dateSymbol, this is how websites are doing it today, and it's not a lot of data.
- SL: But it is a lot of data across languages and across calendars.
- MM: Presumably, the user picks a localization and then downloads data just for that localization.
- SL: But you're saving effort for managing all that data.
- MM: They have to manage lots of other kinds of data as well.
- DE: Maybe we could broaden the pool of web developers that we talk to about this API. We have one on this call, who said that it would be very useful, but we could talk to more to understand whether their needs match SL's or MM's estimate
- MM: That sounds like a good idea.
- SL: With the Japanese era change, there will be a big roll-out that will be hard to manage. We could centralize this.
- MM: Is era name included in this proposal?
- FT: Not here.
- SL: Dialect? https://github.com/tc39/proposal-intl-displaynames/issues/20 nl-BE - Flemish , not (Dutch in Belgium)
- SL: I think we need an option for this, whether you want to see "Flemish" or "Dutch (Belgium)"
- Intl.DisplayNames:
- Other topics: 2. Polyfills 1. Progress on Globalize.js wrappers? 2. Note, excellent work with https://github.com/wessberg/intl-list-format and https://github.com/wessberg/intl-relative-time-format !
- Future meetings
- March 21th, 2019, 16:00 UTC