Initial implementation of all ICU calendars #1245

Merged
merged 14 commits into from
Feb 8, 2021

justingrant
Collaborator

@justingrant justingrant commented Jan 7, 2021

This PR adds initial implementations for all supported ICU non-ISO calendars, and proposes a few possible changes to the Temporal API to address issues found while building those calendars. The goal of this PR is to validate the Temporal calendar API and to identify potential issues we may need to discuss and/or resolve. To set expectations, the calendar implementations themselves pass basic tests and match the output of Intl.DateTimeFormat, but are not intended to have production-ready accuracy or performance. Also, this PR has some gaps and unfinished parts, but I wanted to get it into GitHub so we can start discussion and resolve open issues. On a personal note, getting this PR into GitHub was helpful to keep my mind off the chaos in Washington today! Fixes #1210. Fixes #1349.

At a high level, here's what's in this PR:

  • Calendar implementations for all ICU non-ISO calendars: buddhist, chinese, coptic, dangi, ethioaa, ethiopic, gregory, hebrew, indian, islamic, islamic-umalqura, islamic-tbla, islamic-civil, islamic-rgsa, islamicc, japanese, persian, roc. The implementations themselves are kinda hacky. Conversion of ISO->calendar parses the output of DateTimeFormat.formatToParts. But calendar->ISO is even hackier: the code tries to roughly estimate an ISO date from the calendar date, then roundtrips back to calendar and then bisects the difference between estimate and actual values until we end up with the right values. This is inefficient and slow, but it guarantees that this PR's results will always match DateTimeFormat's results and it's probably good enough for an interim implementation before a real production polyfill is built later this year.
  • New fields for Temporal objects:
    • monthCode - this is a year-independent, string-valued property that describes the month. For ICU calendars, it follows the iCalendar convention of ${month} for non-leap months, and is ${previousMonth}L for leap months (e.g. 'M5L' for Hebrew Adar I)
    • eraYear - the era-relative year, as a number
  • Implements the month field as the 1-based numeric index of the month in the current year. For example, in a Chinese leap year the leap month with monthCode M7L will have month === 8. Unlike monthCode, month is not guaranteed to refer to the same month name in every year for lunisolar calendars.
  • Implements the year field as a signed value relative to an "anchor era" for each calendar, e.g. "AD" for Gregorian. This avoids potential confusion and security issues that could result from years counting backwards for years in eras like BC or the ROC calendar's Before R.O.C. era.
  • Enables calendars to control validation of input fields. This enables developers to use year or an eraYear/era pair to specify the year in from or with, and it enables developers to use either month or monthCode to specify the month.
  • Adds basic tests for each calendar, including conversion, field access, addition, and until. Note that the tests will be somewhat brittle because they will break as soon as a few calendar-related Chromium ICU bugs are fixed. (see the tests for bug links).
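
To make the conversion strategy above concrete, here's a minimal sketch of the idea, not the code in this PR. It assumes a calendar whose years count forward inside the search window, reads only the `year`, `month` and `day` parts, and uses a plain binary search instead of the estimate-then-bisect approach described above. All function names (`epochDayToCalendar`, `calendarToEpochDay`, etc.) are hypothetical.

```javascript
const MS_PER_DAY = 86400000;
const formatters = new Map();

// Cache one formatter per calendar; constructing Intl.DateTimeFormat is expensive.
function getFormatter(calendarId) {
  if (!formatters.has(calendarId)) {
    formatters.set(
      calendarId,
      new Intl.DateTimeFormat(`en-US-u-ca-${calendarId}`, {
        timeZone: 'UTC',
        year: 'numeric',
        month: 'numeric',
        day: 'numeric'
      })
    );
  }
  return formatters.get(calendarId);
}

// ISO -> calendar: parse the numeric parts produced by formatToParts.
function epochDayToCalendar(epochDay, calendarId) {
  const fields = {};
  const parts = getFormatter(calendarId).formatToParts(new Date(epochDay * MS_PER_DAY));
  for (const { type, value } of parts) {
    if (type === 'year' || type === 'month' || type === 'day') {
      fields[type] = parseInt(value, 10);
    }
  }
  return fields;
}

// Orders two {year, month, day} bags; negative means a is earlier than b.
function compareFields(a, b) {
  return a.year - b.year || a.month - b.month || a.day - b.day;
}

// calendar -> ISO: binary search over epoch days inside [lo, hi].
// Assumption: calendar fields increase monotonically across the window.
function calendarToEpochDay(
  target,
  calendarId,
  lo = Date.UTC(1900, 0, 1) / MS_PER_DAY,
  hi = Date.UTC(2100, 0, 1) / MS_PER_DAY
) {
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const cmp = compareFields(epochDayToCalendar(mid, calendarId), target);
    if (cmp === 0) return mid;
    if (cmp < 0) lo = mid + 1;
    else hi = mid - 1;
  }
  throw new RangeError('date not found in search window');
}

function epochDayToIsoString(epochDay) {
  return new Date(epochDay * MS_PER_DAY).toISOString().slice(0, 10);
}
```

Calendars that report the year through other part types, or that have backwards-counting eras inside the window, would need extra handling that this sketch leaves out.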

Here's a summary of issues related to this PR. I marked those that already contain possible fixes in this PR. There's quite a few issues here, so even if the code here can fix some of these issues, it may make sense to split those fixes out into separate PRs. For questions about specific issues, please discuss in those GH issues to keep our discussions in one spot!

Here are smaller issues that I expect to be relatively non-controversial:

These issues I expect to require more discussion to resolve.

The items below were handled in separate PRs:

  • Documentation updates for new getters like monthCode and for changes to the Calendar API

@codecov

codecov bot commented Jan 7, 2021

Codecov Report

Merging #1245 (57b83f7) into main (4811056) will increase coverage by 0.19%.
The diff coverage is 94.15%.

@@            Coverage Diff             @@
##             main    #1245      +/-   ##
==========================================
+ Coverage   95.06%   95.26%   +0.19%     
==========================================
  Files          19       19              
  Lines        9695    11121    +1426     
  Branches     1526     1824     +298     
==========================================
+ Hits         9217    10594    +1377     
- Misses        470      522      +52     
+ Partials        8        5       -3     
| Flag | Coverage Δ |
| --- | --- |
| test262 | 48.33% <10.83%> (-5.97%) ⬇️ |
| tests | 90.98% <89.77%> (+0.16%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| polyfill/lib/calendar.mjs | 94.11% <94.13%> (+5.07%) ⬆️ |
| polyfill/lib/ecmascript.mjs | 95.87% <100.00%> (+<0.01%) ⬆️ |
| polyfill/lib/plaindate.mjs | 94.60% <0.00%> (+1.24%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@justingrant
Collaborator Author

justingrant commented Jan 7, 2021

Calendars 101

While building this PR I had to learn a lot about various calendars. Here's a summary of what I've learned and some links and notes in case you'd like to learn more. I may update this comment with more information as needed. We may want to adapt this comment into the Temporal docs at some point.

Calendar Types

  • Solar - Examples: gregory, persian, ethiopic. These calendars have the same months every year, and the months are aligned with solar seasons. Months are typically 30-31 days long. A single leap day is added every 4 years, with some possible exceptions to the 4-year rule to better align with the solar year being ~365.2425 days long. In ICU solar calendars, leap days are always added to the same month, e.g. February for Gregorian. Some solar calendars (e.g. ethiopic, coptic, ethioaa) have twelve 30-day months and a special 5-day (6 in leap years) thirteenth month. All other ICU solar calendars have only 12 months.
  • Lunar - Calendars have months aligned with lunar cycles. Months are 29 or 30 days long. Islamic calendars (islamic, islamic-umalqura, islamic-tbla, islamic-civil, islamic-rgsa, islamicc) are the only ICU calendars that are lunar. Because there are 12.4 lunar cycles per year, these calendars' months will "move around" the solar year. For example, the Islamic month of Ramadan can happen in any solar season. A lunar calendar is used as the default civic calendar of one country (Saudi Arabia) and is widely used in the Islamic world for many purposes.
  • Lunisolar - Calendar months are aligned with lunar cycles (like lunar calendars) and have a leap month inserted every few years to avoid the "months move around the solar year" problem of lunar calendars. Leap months add a lot of complexity because the number of months per year is variable. This means that you can't cleanly express a lunisolar month as a single ordinal index and expect that index to be the same in every year. For example, the Hebrew month Elul is the 12th month in non-leap years and is the 13th month in leap years. The way that leap months are allocated is different depending on the calendar. For example, Hebrew always adds its leap month, called "Adar I", between the normally-fifth month of Shevat and the normally-sixth month of Adar, which is called "Adar II" in leap years. Much of the complexity in this PR involves dealing with lunisolar calendars. Perhaps because of this complexity, there is no country in the world that currently uses a lunisolar calendar for civic or commercial purposes. This is important because almost all software is written to support civic or commercial use cases! Instead, lunisolar calendars are typically used as an adjunct to the Gregorian calendar. The Gregorian calendar is typically used for civic and economic life while the lunisolar calendar is used for religious, cultural, or ceremonial purposes (e.g. to determine dates of Hebrew religious festivals, or to determine the astrological impact of one's birthday in the Chinese zodiac.) This implies, IMHO, that lunisolar calendars should occupy a lower-priority role for Temporal and 402-- still supported, but they're likely to be used an order of magnitude less than lunar calendars, which in turn will be used in software several orders of magnitude less than solar calendars.
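
To illustrate why a lunisolar month can't be a fixed ordinal, here's a small sketch for the Hebrew case described above. It's an illustration under stated assumptions, not the code in this PR: it accepts month codes like "5L" (with an optional leading "M"), uses the standard 19-year leap cycle for positive years, and the helper names `isHebrewLeapYear` and `hebrewMonthIndex` are hypothetical.

```javascript
// Hebrew leap years are years 3, 6, 8, 11, 14, 17 and 19 of a 19-year cycle.
// Assumes a positive year number.
function isHebrewLeapYear(year) {
  return (7 * year + 1) % 19 < 7;
}

// Maps a year-independent month code to the 1-based month index in a given year.
function hebrewMonthIndex(monthCode, year) {
  const match = /^M?(\d+)(L?)$/.exec(monthCode);
  if (!match) throw new RangeError(`invalid monthCode: ${monthCode}`);
  const base = Number(match[1]);
  const isLeapMonth = match[2] === 'L';
  const leapYear = isHebrewLeapYear(year);

  if (isLeapMonth) {
    // The only Hebrew leap month (Adar I) follows the normally-fifth month.
    if (base !== 5 || !leapYear) {
      throw new RangeError(`${monthCode} does not exist in year ${year}`);
    }
    return base + 1;
  }
  if (base < 1 || base > 12) throw new RangeError(`invalid monthCode: ${monthCode}`);
  // In leap years, every month after the fifth shifts down by one slot.
  return leapYear && base > 5 ? base + 1 : base;
}
```

With this sketch, `hebrewMonthIndex('12', 5781)` gives 12 for Elul in a non-leap year, while `hebrewMonthIndex('12', 5782)` gives 13 in a leap year.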

Eras

All ICU calendars except hebrew, chinese, and dangi (the Korean calendar that's essentially identical to Chinese) use eras. In this PR, each era is identified by a string "era code".

An "era" is defined by the following metadata:

  • The ISO date where year numbering starts. For example, the Gregorian AD era started in the ISO year 1. Note that the starting date may be mid-year in ISO and/or calendar dates. For example, May 1, 2019 was the start of the Reiwa era in the japanese calendar, but April 30 was the last day of the previous Heisei era.
  • A localizable name for the era, e.g. "Anno Domini" or "A.D."
  • Whether numbering starts with 0 (like buddhist calendars) or 1 (like Gregorian and most other calendars)
  • Whether year numbering goes up as time gets later (like AD) or goes down as time gets later (like BC or the roc calendar's "Before R.O.C" era).
  • How to handle negative years. In calendars that lack a backwards-counting era, negative year numbers must be accepted for the oldest era.

BTW, determining this metadata was a useful exercise in this PR. Even though the implementation may be thrown away, the structure of this metadata may be helpful long-term.

The most common pattern in ICU calendars is for calendars to either have a single era that accepts negative years, or one era for positive years and another era for negative years. The only ICU exceptions are japanese (where a new era is typically started for each emperor's reign) and ethiopic (where there are two eras: one that starts from the creation of the world in 5493 BCE, and another starting in 8 CE). This means that for all ICU calendars but these two, it's obvious which era to use as the "anchor era" used to determine the value of year.

This PR proposes two ways to specify a year: a signed value relative to a per-calendar "anchor era", and a pair of values era (the era code) and eraYear (the era-relative year). Only japanese has any significant ambiguity about which era should be the anchor. Initial feedback is that using the Meiji era (started in 1868 CE) as the anchor may be a good idea, because that era typically marks the beginning of modern Japanese history. Other ICU calendars have an obvious anchor era to use.
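
Here's a rough sketch of how the two ways of specifying a year could be reconciled. The era codes, the `ERAS` table and `resolveYear` are all hypothetical names for illustration, and the sketch assumes that year 1 of a backwards-counting era immediately precedes year 1 of the anchor era (so it maps to signed year 0). That numbering is an assumption, not something this PR settles.

```javascript
// Hypothetical per-calendar era metadata: one anchor era plus an optional
// backwards-counting era for the years before it.
const ERAS = {
  gregory: [
    { code: 'ad', reversed: false },
    { code: 'bc', reversed: true }
  ],
  roc: [
    { code: 'minguo', reversed: false },
    { code: 'before-roc', reversed: true }
  ]
};

// Accepts either a signed `year` or an `era`/`eraYear` pair and returns the signed year.
function resolveYear({ year, era, eraYear }, calendarId) {
  if (era === undefined && eraYear === undefined) {
    if (year === undefined) throw new TypeError('year or era/eraYear is required');
    return year;
  }
  if (era === undefined || eraYear === undefined) {
    throw new TypeError('era and eraYear must be provided together');
  }
  const eraInfo = (ERAS[calendarId] || []).find((e) => e.code === era);
  if (!eraInfo) throw new RangeError(`unknown era ${era} for calendar ${calendarId}`);

  // Assumption: reversed-era year 1 is signed year 0, year 2 is -1, and so on.
  const resolved = eraInfo.reversed ? 1 - eraYear : eraYear;
  if (year !== undefined && year !== resolved) {
    throw new RangeError('year conflicts with era/eraYear');
  }
  return resolved;
}
```

Under these assumptions, `resolveYear({ era: 'bc', eraYear: 44 }, 'gregory')` returns -43, and passing eraYear without era throws, matching the "both should be required" inclination below.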

It's an open issue whether we should accept eraYear without era. My initial inclination is "no"-- both should be required. If the user wants era-less years, they should use the year field instead.

ICU Supported Calendars

| Calendar | Type | Notes / Quirks |
| --- | --- | --- |
| gregory | solar | Same as ISO, but with a BC era for negative years. The only potential complexity is that the Gregorian calendar was introduced in 1582. For dates before then, should it apply the same rules? (This is called "proleptic") |
| buddhist | solar | Despite the ID name, I believe the ICU implementation is the "Thai Solar Calendar", not the buddhist lunisolar calendar |
| chinese, dangi | lunisolar | Leap months are duplicated versions of regular months (e.g. "second eighth month"). Any month other than the first or last (12th) can be duplicated. The Korean dangi calendar seems to be identical to the chinese calendar except the ID. |
| ethioaa, ethiopic, coptic | solar | These three calendars are all based on Orthodox/Coptic calendars and behave identically except for eras. The ethioaa calendar's only era starts on -005492-07-17. The ethiopic calendar adds another era starting at +000008-08-27. The coptic calendar has a normal era starting at 0284-08-29 and counts negative years backwards before that. (The other two use negative years before the start of the oldest era.) These calendars all have 12 30-day months and a 5-6 day 13th month. Leap days are added every 4 years, like the Julian calendar, without exceptions for century years like the Gregorian calendar. This means that Ethiopic months will slowly creep around the solar year at a rate of about 3 days per 400 years. |
| islamic, islamic-umalqura, islamic-tbla, islamic-civil, islamic-rgsa, islamicc | lunar | These calendars are identical in behavior (and implementation in this PR) except they use slightly different ways to determine when months start and end based on where in the world the sightings of the moon were made. Islamic lunar calendars all have twelve months of 29 or 30 days. Because it's a lunar calendar that's not synced with the solar year, "leap years" don't really exist-- although in the "tabular" implementation of islamic calendars, "leap days" are sometimes added to the 12th month to keep the calendar in sync with the cycles of the moon. |
| hebrew | lunisolar | This calendar is used to date religious festivals and Jewish history. There are 12 regular months and a leap month inserted every few years in a 19-year cycle. The leap month, called "Adar I", is inserted between the 5th and 6th month. In leap years, the normally-6th month is called "Adar II". The fact that the name of a non-leap month changes in a leap year is important for localization purposes, but not for Temporal purposes. Months in this calendar are generally the same length in every year (29 or 30 days, depending on the month), but two months (Heshvan and Kislev) are either 29 or 30 days long depending on the year. |
| indian | solar | This is a simple, straightforward calendar. There are 12 months. Months 2-6 are 31 days long, months 7-12 are 30 days long, and the first month is 30 or 31 days depending on whether it's a leap year or not. |
| persian | solar | This, like indian, is a straightforward calendar. It's used as a civic calendar in Afghanistan and Iran. Months are 31 or 30 days long, except the 12th month, which has 29 days in non-leap years and 30 days in a leap year. |
| japanese | solar | The Japanese calendar is identical to Gregorian except there are over 200 different eras, each mostly corresponding to the rule of a particular emperor. This makes it the only ICU calendar with more than 2 eras. This PR only implements a few most-recent eras, starting with the Meiji era beginning in 1868. Also, Japanese eras don't start on Jan 1, as discussed above in the "eras" section. So a date in January and a date in December of the same calendar year can have different eras. Japanese is also the only ICU calendar that is likely to add more eras in the next few decades. |
| roc | solar | The Taiwanese calendar is very simple: it's identical to Gregorian except that year numbering starts in 1912 CE instead of 1 CE |
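
As a quick check on the "about 3 days per 400 years" drift mentioned for the Ethiopic and Coptic calendars, here's a small sketch that counts leap years under the Julian-style rule (every 4th year) and the Gregorian rule over one 400-year span. The helper names are hypothetical, and the year numbers are just an arbitrary 400-year window.

```javascript
// Julian-style rule: a leap day every 4 years, no century exceptions.
const isJulianLeapYear = (year) => year % 4 === 0;

// Gregorian rule: century years are leap only when divisible by 400.
const isGregorianLeapYear = (year) =>
  (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;

// Counts the leap years in the inclusive range [startYear, endYear].
function countLeapYears(isLeap, startYear, endYear) {
  let count = 0;
  for (let year = startYear; year <= endYear; year++) {
    if (isLeap(year)) count++;
  }
  return count;
}

const julianLeapDays = countLeapYears(isJulianLeapYear, 1, 400);
const gregorianLeapDays = countLeapYears(isGregorianLeapYear, 1, 400);

// Each extra leap day moves the calendar one day relative to the Gregorian year.
const driftDaysPer400Years = julianLeapDays - gregorianLeapDays;
console.log(julianLeapDays, gregorianLeapDays, driftDaysPer400Years);
```

The difference comes entirely from the three century years in each 400-year span that the Gregorian rule skips.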

Hindu Calendar

One notable calendar that's not currently supported by ICU is the Hindu Calendar. This is a lunisolar calendar that is by far the most challenging calendar that I researched. http://cldr.unicode.org/development/development-process/design-proposals/chinese-calendar-support#TOC-5.-Hindu-luni-solar-calendar-old-or-new-with-several-variants-: has a good overview of the complexity. The TL;DR is that leap months are not simply added like Chinese or Hebrew. They can also be removed, or in rare cases combined! So a particular month can either be:

  • normal
  • missing
  • added
  • combined - e.g. the normally-3rd and normally-4th months are now considered one month

This calendar is a good stress test for any month-handling scheme that we may use. If it works for Hindu, it should work anywhere! This calendar is one of the reasons I'd suggest using free-form strings for month codes instead of being more prescriptive about the string format.

@justingrant
Collaborator Author

justingrant commented Jan 7, 2021

General Principles

Here's some general principles underlying the design of this PR:

  • The API should make it easy to write cross-calendar code, even for developers who are non-ISO-calendar novices. The best way to make it easy to write cross-calendar code is to minimize the differences between ISO and non-ISO code and to maximize the number of assumptions that ISO calendar developers make that will also apply to non-ISO calendars. For example, the reason why this PR proposes making year into a signed number is to match ISO behavior and therefore make it easier and more reliable to write cross-calendar code.
  • The API should try to avoid complicating ISO code. Most developers will never work with non-ISO calendars, so adding an ergonomics tax to their work should be avoided wherever possible. This is the reason why, for example, I recommend making month a numeric index (like ISO-using and non-lunisolar developers will expect) and monthCode a lunisolar-compatible string. If month were a string, then (few) lunisolar developers would benefit while (100x more) non-lunisolar developers would have to do extra work and would encounter bugs like '1' !== 1 that don't occur with a numeric month.
  • {year, month, day, calendar} (or {year, monthCode, day, calendar} or {era, eraYear, month, day, calendar} or {era, eraYear, monthCode, day, calendar}) should all be sufficient for round-trip serialization of any date in any calendar. This makes non-ISO serialization compatible with ISO serialization and avoids cross-calendar bugs. Additional properties may be supported too but should not be required.
  • {month, day, calendar, year} or {monthCode, day, calendar} or {month, day, calendar, era, eraYear} should be sufficient for round-trip serialization of any PlainMonthDay in any calendar. This makes non-ISO serialization compatible with ISO serialization and avoids cross-calendar bugs. Additional properties may be supported too but should not be required. (Note that the year is a reference year here.)
  • Additional optional properties should be supported for both input (from and with) and output (e.g. getters). However, per above, optional properties should never be required to initialize a Temporal object. Note that this is a change from the current Temporal implementation of the Japanese calendar that requires the era field.
  • era, eraYear, and monthCode properties should be offered by all supported calendars, including ISO, to make it easier to write cross-calendar code using these properties. In ISO, chinese, hebrew, and dangi, era should be undefined.
  • Lunisolar calendars and backwards-counting eras represent a tiny fraction of software that will be written. We should not optimize the API around these unusual cases. Instead, we should make these cases possible and make it easy for developers who aren't thinking of these unusual cases to still write code that works with them.
  • We should be aware that non-ISO calendars may be a vector for security issues and attacks. We should anticipate how attackers may introduce non-ISO data into apps and try to design the API defensively. This is another reason why ISO-like behavior is a good idea wherever possible.
  • Leverage existing standards around calendar data where possible, notably the iCalendar month-code string format (e.g. "5L" for Adar I, "7L" for a repeated seventh month in the Chinese calendar, or "4" for April).

@Louis-Aime

Louis-Aime commented Jan 7, 2021

A short but important note about the gregory calendar:

Same as ISO, but with a BC era for negative years. The only potential complexity is that the Gregorian calendar was introduced in 1582. For dates before then, should it apply the same rules? (This is called "proleptic")

In my opinion, the gregory calendar should be as specified by Gregory XIII in 1582, that is (in short):

  • from ISO 1582-10-15 on, dates are as of ISO 8601 (except for week numbering);
  • before that date, the Julian calendar was used with the era defined by Dionysius Exiguus, so that the day before ISO 1582-10-15 was 4 Oct. 1582, and the epoch of this calendar, 1 Jan. 1, is ISO 0000-12-30;
  • the proleptic Gregorian calendar should be rendered as iso8601.

With Temporal, thanks to the custom calendars, it is possible to define any calendar historically used in Western Europe, with a switching date between 1582-10-15 (Rome and Spain) and 1752-09-14 (U.K.) or even later.

@ptomato
Collaborator

ptomato commented Jan 12, 2021

In my opinion, the gregory calendar should be as specified by Gregory XIII in 1582, that is (in short):

  • from ISO 1582-10-15 on, dates are as of ISO 8601 (except for week numbering);
  • before that date, the Julian calendar was used with the era defined by Dionysius Exiguus, so that the day before ISO 1582-10-15 was 4 Oct. 1582, and the epoch of this calendar, 1 Jan. 1, is ISO 0000-12-30;
  • the proleptic Gregorian calendar should be rendered as iso8601.

I strongly disagree with this, sorry. It's not within scope for Temporal to specify how any of the calendars other than iso8601 behave, but in this proof of concept the gregory calendar should follow whatever rules are currently used for the gregory calendar in Intl.DateTimeFormat, which is in fact the proleptic Gregorian calendar:

> new Date(Date.parse('1582-10-15T00:00Z') - 1).toLocaleString('en', {timeZone: 'UTC', calendar: 'gregory'})
'10/14/1582, 11:59:59 PM'

It would be very bad if the results differed by 10 days depending on whether you used Date.toLocaleString or one of the Temporal toLocaleString methods.

@cjtenny
Collaborator

cjtenny commented Jan 12, 2021

Justin, thanks for the awesome extra background and notes around this. Very helpful for diving in.

Re: new fields,

  • monthCode: A no-brainer to me, looks very reasonable.
  • eraYear: Makes sense, though I'm ambivalent about the name.
  • monthType: I think there's some room to simplify this from language and information perspectives. Linguistically, "leap" vs "regular" might not be the best distinction - something boolean like "isIntercalaryMonth" could help. Informationally, it seems unbalanced to me to have "isLeapYear" and "isIntercalaryMonth". Especially given that calendars can have intercalary months, days, and weeks. Maybe an "isIntercalary" field could help, though I'm not sure that's the friendliest name. Alternatively, since it's redundant with monthCode, it could be removed. What use cases do you see for it?
  • regularMonth: Agree w/ removal, seems redundant esp. if keeping both code & type.

edit: @ptomato linked me to #1235, which had slipped my mind. Will comment on role? separately.

I think the anchor era should be an observable property, otherwise it will be a magic number/string/other in documentation.

@justingrant
Collaborator Author

In my opinion, the gregory calendar should be as specified by Gregory XIII in 1582, that is (in short):

  • from ISO 1582-10-15 on, dates are as of ISO 8601 (except for week numbering);
  • before that date, the Julian calendar was used with the era defined by Dionysius Exiguus, so that the day before ISO 1582-10-15 was 4 Oct. 1582, and the epoch of this calendar, 1 Jan. 1, is ISO 0000-12-30;
  • the proleptic Gregorian calendar should be rendered as iso8601.

I strongly disagree with this, sorry. It's not within scope for Temporal to specify how any of the calendars other than iso8601 behave, but in this proof of concept the gregory calendar should follow whatever rules are currently used for the gregory calendar in Intl.DateTimeFormat, which is in fact the proleptic Gregorian calendar:

@Louis-Aime - To clarify, @ptomato isn't saying your feedback is bad, only that there's a division of responsibility. Temporal is responsible for the Calendar API and for defining requirements and invariants that all calendars should follow. But others are responsible for the actual calculations of each built-in calendar, including decisions like whether 'gregory' is proleptic or not. For this reason, I intentionally avoided making calculation decisions in this PR-- I simply used the calculations already available in Intl.DateTimeFormat.formatToParts. This doesn't mean your feedback is bad (thanks for submitting it!), but that the right place to send similar feedback is upstream of Temporal.

That said, if any of your feedback may have impact on the Temporal API, then it's definitely welcome to file here. For example, your feedback above got me wondering if developers would be able to successfully pass calendar-specific options (e.g. a Gregorian->Julian transition date) from PlainDate.from through to the calendar. It turns out that this is not possible today, so I filed #1253 to discuss adding this capability. This is an example of the kind of feedback that's very useful because it helps stress-test the Temporal API and identify gaps. Keep it coming! ;-)

@sffc, if @Louis-Aime (or others) has feedback about the 'gregory' calendar's calculations, where's the right place to file an issue? If it was just a bug report, I'd point him at the Chromium bug reporting site, but in this case it sounds like more of a design issue so I wasn't sure.

@sffc
Collaborator

sffc commented Jan 13, 2021

The "gregory" calendar should be proleptic Gregorian, as in ICU. It should be straightforward to implement a Julian-Gregorian calendar with a change date in userland, building on top of the built-in "julian" and "gregory".
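
As a rough illustration of the userland idea, here's a minimal sketch of the date arithmetic only, not a Temporal.Calendar implementation and not code from this PR. It assumes positive Julian Day Numbers, a default changeover at ISO 1582-10-15, and astronomical numbering for Julian years before 1 (so 1 BC is year 0). The function names are hypothetical.

```javascript
const MS_PER_DAY = 86400000;
const JDN_OF_UNIX_EPOCH = 2440588; // Julian Day Number of 1970-01-01

// Days since 1970-01-01 for a proleptic Gregorian (ISO) date.
function isoToEpochDay(year, month, day) {
  const d = new Date(0);
  // setUTCFullYear avoids Date.UTC's remapping of years 0-99 to 1900-1999.
  d.setUTCFullYear(year, month - 1, day);
  return d.getTime() / MS_PER_DAY;
}

// Julian Day Number -> Julian calendar date, using a standard integer algorithm.
function jdnToJulian(jdn) {
  const e = 4 * (jdn + 1401) + 3;
  const h = 5 * Math.floor((e % 1461) / 4) + 2;
  const day = Math.floor((h % 153) / 5) + 1;
  const month = ((Math.floor(h / 153) + 2) % 12) + 1;
  const year = Math.floor(e / 1461) - 4716 + Math.floor((14 - month) / 12);
  return { year, month, day };
}

// Hybrid calendar: Julian fields before the changeover, ISO fields from it onward.
function isoToHybrid(year, month, day, changeover = { year: 1582, month: 10, day: 15 }) {
  const epochDay = isoToEpochDay(year, month, day);
  const changeDay = isoToEpochDay(changeover.year, changeover.month, changeover.day);
  if (epochDay >= changeDay) return { year, month, day };
  return jdnToJulian(epochDay + JDN_OF_UNIX_EPOCH);
}
```

For example, `isoToHybrid(1582, 10, 14)` yields 4 Oct. 1582, and passing `{ year: 1752, month: 9, day: 14 }` as the changeover models the U.K. switch.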

Collaborator

@ptomato ptomato left a comment

Thanks, this is a lot of work, I think it meets the goal of validating that the current design of the calendar API is flexible enough to accommodate all the different calendars, with a few minor modifications. That is an important milestone.

I would like to make sure that we make these modifications with as few API additions as possible.

Separately, I would like to start merging parts of this so that we don't have the whole thing blocked on the resolution of the open discussions. It seems that #1227 and #1229 would be good places to start on that. Would you be able to start by splitting those out or is it something that @cjtenny or I should do?

polyfill/lib/calendar.mjs
fields = ES.ToRecord(fields, [
  ['day'],
  ['era', undefined],
  ['eraYear', undefined],
Collaborator

I think the ISO calendar should not observably access era, eraYear, or monthCode, since those fields have no meaning in ISO calendar space (and would not be included in implementations without Intl.) Maybe this part should move to impl['gregory']?

Collaborator Author

Hmm, let's discuss this in context of your comment below about whether to offer these properties on the ISO calendar.

  return undefined;
},
eraYear(date) {
  if (!HasSlot(date, ISO_YEAR)) date = ES.ToTemporalDate(date, GetIntrinsic('%Temporal.PlainDate%'));
  return GetSlot(date, ISO_YEAR);
Collaborator

I think all of these should be undefined in the ISO calendar. (assuming we have fields month and monthCode in non-ISO calendars; if not, then things might need to be different)

Collaborator Author

I think that it'd be better to have valid values for the ISO calendar, because it'd make it easier to write cross-calendar code. Otherwise, writing cross-calendar libraries would require a lot more special casing and would introduce more bugs. This is especially true for monthCode which is required for PlainMonthDay in lunisolar calendars.

Here's what I had in mind:

  • monthCode - always ${month}
  • monthType - always 'regular'
  • eraYear - always year
  • era - undefined (just like era-less calendars like Hebrew and Chinese)
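
A minimal sketch of what those ISO values could look like, assuming the field names proposed in this thread (monthType in particular is still under discussion) and a hypothetical helper name:

```javascript
// Derives the proposed extra fields for a date in the ISO calendar.
// The ISO calendar has no leap months and no eras, so every value is trivial.
function isoExtraFields({ year, month }) {
  return {
    monthCode: `${month}`, // same as the numeric month, as a string
    monthType: 'regular', // ISO never has leap months
    eraYear: year, // no eras, so the era-relative year is the year itself
    era: undefined // like other era-less calendars
  };
}

console.log(isoExtraFields({ year: 2021, month: 1 }));
```

Cross-calendar code could then read monthCode or eraYear without first checking which calendar it was handed.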

What do you think?

polyfill/lib/calendar.mjs
const val1 = GetSlot(this, slot);
const val2 = GetSlot(other, slot);
if (val1 !== val2) return false;
// optimization: avoid calling getters if slots are equal
Collaborator

As discussed in #1239 I don't think this should change.

Collaborator Author

Let's discuss #1239 in this week's meeting. Seems like there are a number of questions in that issue (and perhaps a dependency on #1203 and perhaps other issues too), and I suspect the answer to those questions will resolve what to do here too. Cool?

@ptomato
Collaborator

ptomato commented Jan 13, 2021

In my opinion, the gregory calendar should be as specified by Gregory XIII in 1582, that is (in short):

  • from ISO 1582-10-15 on, dates are as of ISO 8601 (except for week numbering);
  • before that date, the Julian calendar was used with the era defined by Dionysius Exiguus, so that the day before ISO 1582-10-15 was 4 Oct. 1582, and the epoch of this calendar, 1 Jan. 1, is ISO 0000-12-30;
  • the proleptic Gregorian calendar should be rendered as iso8601.

I strongly disagree with this, sorry. It's not within scope for Temporal to specify how any of the calendars other than iso8601 behave, but in this proof of concept the gregory calendar should follow whatever rules are currently used for the gregory calendar in Intl.DateTimeFormat, which is in fact the proleptic Gregorian calendar:

@Louis-Aime - To clarify, @ptomato isn't saying your feedback is bad, only that there's a division of responsibility. Temporal is responsible for the Calendar API and for defining requirements and invariants that all calendars should follow. But others are responsible for the actual calculations of each built-in calendar, including decisions like whether 'gregory' is proleptic or not. For this reason, I intentionally avoided making calculation decisions in this PR-- I simply used the calculations already available in Intl.DateTimeFormat.formatToParts. This doesn't mean your feedback is bad (thanks for submitting it!), but that the right place to send similar feedback is upstream of Temporal.

That said, if any of your feedback may have impact on the Temporal API, then it's definitely welcome to file here. For example, your feedback above got me wondering if developers would be able to successfully pass calendar-specific options (e.g. a Gregorian->Julian transition date) from PlainDate.from through to the calendar. It turns out that this is not possible today, so I filed #1253 to discuss adding this capability. This is an example of the kind of feedback that's very useful because it helps stress-test the Temporal API and identify gaps. Keep it coming! ;-)

@sffc, if @Louis-Aime (or others) has feedback about the 'gregory' calendar's calculations, where's the right place to file an issue? If it was just a bug report, I'd point him at the Chromium bug reporting site, but in this case it sounds like more of a design issue so I wasn't sure.

Apologies if I came across badly in my comment.

@justingrant
Collaborator Author

Thanks, this is a lot of work, I think it meets the goal of validating that the current design of the calendar API is flexible enough to accommodate all the different calendars, with a few minor modifications. That is an important milestone.

Agreed. It seems like the Calendar API is on the right track with a few mods.

I would like to make sure that we make these modifications with as few API additions as possible.

Me too!

Separately, I would like to start merging parts of this so that we don't have the whole thing blocked on the resolution of the open discussions. It seems that #1227 and #1229 would be good places to start on that. Would you be able to start by splitting those out or is it something that @cjtenny or I should do?

After spending so much time on this PR, I'm backed up on other things so would love help! Feel free to pull any subset from this PR that you think would be useful to split out. #1227 is simple and self-contained so is a great candidate. #1229 and #1235 may be somewhat inter-related so it may make sense to tackle both in the same PR.

@ptomato
Collaborator

ptomato commented Jan 13, 2021

Separately, I would like to start merging parts of this so that we don't have the whole thing blocked on the resolution of the open discussions. It seems that #1227 and #1229 would be good places to start on that. Would you be able to start by splitting those out or is it something that @cjtenny or I should do?

After spending so much time on this PR, I'm backed up on other things so would love help! Feel free to pull any subset from this PR that you think would be useful to split out. #1227 is simple and self-contained so is a great candidate. #1229 and #1235 may be somewhat inter-related so it may make sense to tackle both in the same PR.

#1227 is separated out and merged now. If you don't mind, I will rebase this PR every time we merge something so that we don't accumulate merge conflicts.

@justingrant
Collaborator Author

#1227 is separated out and merged now. If you don't mind, I will rebase this PR every time we merge something so that we don't accumulate merge conflicts.

Cool! Rebase after every PR sounds good, but I'm working on some bug fixes now (the current code doesn't handle overflow correctly in some cases) that I was planning to commit later today, so I can save both of us some work by rebasing as part of that checkin. Sound OK?

Also, #1253 might be another candidate for splitting out. It's similar to #1229 and #1235, except for options instead of fields.

@justingrant
Collaborator Author

I just pushed a commit that should fix behavior of the overflow option. Also rebased and addressed some review feedback. Added a bunch of tests. Notes about expected behavior:

  • Non-lunisolar calendars: monthCode must be a numeric string. '4L' will throw regardless of overflow.
  • Chinese/Dangi calendars: if monthCode ends in L but it's not a real month in that year, then constrain will clamp to the last day of the previous month. 1L, 12L, and 13L always throw because those months never exist in any year.
  • Hebrew calendar: if monthCode is 5L (Adar I) in a non-leap year, then constrain will clamp to the last day of the previous month (30 Shevat). Any other leap code will always throw.
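
The Chinese/Dangi rule above could be sketched roughly as follows. This is an illustration only, not the PR's code: `resolveLeapMonthCode`, its `leapMonth` option (the numbered month that is followed by a leap month in the year being resolved, or `undefined` if there is none), and the `clampToLastDay` flag are all hypothetical names. It also assumes that `reject` throws wherever `constrain` would clamp.

```javascript
// Hypothetical helper, not part of the polyfill.
// monthCode uses the notation from the notes above, e.g. '4' or '4L'.
function resolveLeapMonthCode(monthCode, { lunisolar = true, leapMonth, overflow = 'constrain' } = {}) {
  const match = /^(\d{1,2})(L?)$/.exec(monthCode);
  if (!match) throw new RangeError(`Invalid monthCode: ${monthCode}`);
  const month = Number(match[1]);
  const isLeap = match[2] === 'L';
  if (month < 1 || month > 13) throw new RangeError(`Invalid monthCode: ${monthCode}`);
  if (!isLeap) return { monthCode, clampToLastDay: false };

  // Non-lunisolar calendars reject leap codes regardless of overflow.
  if (!lunisolar) throw new RangeError(`${monthCode} is not valid in this calendar`);
  // 1L, 12L and 13L never exist in any year, so they always throw.
  if (month === 1 || month >= 12) throw new RangeError(`${monthCode} never exists`);
  // The leap month really exists this year: keep it unchanged.
  if (month === leapMonth) return { monthCode, clampToLastDay: false };
  // Assumption: 'reject' throws where 'constrain' would clamp.
  if (overflow === 'reject') throw new RangeError(`${monthCode} does not exist in this year`);
  // 'constrain': fall back to the last day of the preceding (non-leap) month.
  return { monthCode: String(month), clampToLastDay: true };
}
```

For example, `resolveLeapMonthCode('4L', { leapMonth: 2 })` falls back to month `'4'` with `clampToLastDay` set, while the same call with `overflow: 'reject'` throws a RangeError.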

@cjtenny
Collaborator

cjtenny commented Jan 14, 2021

I think this may not address #1229 after all - though it adds new coercion types for monthCode and eraYear and changes default values to enable other calendars to work. I can't figure out how it would allow a custom calendar to receive those fields pre-coercion or to do the validation on its own. I've got a commit for #1229 that does do that, which is almost done and I'll push it out in the morning (PST). It doesn't overlap with these changes much at all so it should be a simple rebase, which @ptomato or I can tackle (I don't yet have push bits on this repo, so I can't rebase here directly).

Edit: I believe it will also address #1253 , will take a closer look in the morning.

@justingrant
Collaborator Author

I can't figure out how it would allow a custom calendar to receive those fields pre-coercion or to do the validation on its own.

AFAIK, the data flow in this PR works like this:

  1. from() on a particular type calls the calendar's fields method to get the field names for that type. The calendar can add or remove field names.
  2. ES.ToRecord is then called with that list of field names (using undefined defaults for all built-in fields). It coerces and alpha-orders the built-in fields, then appends custom fields to the end of the record without coercion.
  3. The calendar's xxxFromFields method is called with the record from (2). That xxxFromFields method is responsible for validating and coercing the input before calling the type constructor.

Which step above is problematic?

Here's an example of that workflow (from ES.ToTemporalDate).

      const fieldNames = ES.CalendarFields(calendar, ['day', 'month', 'year']);
      const fields = ES.ToTemporalDateFields(item, fieldNames);
      return ES.DateFromFields(calendar, fields, constructor, overflow);
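
To make the three steps concrete, here is a small self-contained sketch of the same flow. Everything in it is a hypothetical stand-in (`toRecord`, `toyCalendar`, the `season` field, and the `from` wrapper are not polyfill names); it only mirrors the shape described above: the calendar extends the field list, built-in fields are coerced in alphabetical order, custom fields are appended uncoerced, and the calendar validates the result.

```javascript
// Hypothetical stand-ins for the three steps; none of these names come from the polyfill.
const BUILT_IN_COERCIONS = new Map([
  ['day', Number], ['era', String], ['eraYear', Number],
  ['month', Number], ['monthCode', String], ['year', Number],
]);

// Step 2: coerce built-in fields in alphabetical order, then append custom fields untouched.
function toRecord(item, fieldNames) {
  const record = {};
  const builtIn = fieldNames.filter((name) => BUILT_IN_COERCIONS.has(name)).sort();
  const custom = fieldNames.filter((name) => !BUILT_IN_COERCIONS.has(name));
  for (const name of builtIn) {
    const value = item[name];
    record[name] = value === undefined ? undefined : BUILT_IN_COERCIONS.get(name)(value);
  }
  for (const name of custom) record[name] = item[name];
  return record;
}

// A toy calendar that asks for one extra, calendar-specific field.
const toyCalendar = {
  // Step 1: the calendar may add (or remove) field names.
  fields(names) {
    return [...names, 'season'];
  },
  // Step 3: the calendar validates the record, including its own custom field.
  dateFromFields(fields, overflow) {
    if (fields.year === undefined || fields.day === undefined) {
      throw new TypeError('year and day are required');
    }
    if (fields.season !== undefined && typeof fields.season !== 'string') {
      throw new TypeError('season must be a string');
    }
    return { ...fields, overflow };
  },
};

function from(item, calendar, overflow = 'constrain') {
  const fieldNames = calendar.fields(['day', 'month', 'year']);
  const fields = toRecord(item, fieldNames);
  return calendar.dateFromFields(fields, overflow);
}

const sketchResult = from({ year: '2021', month: 2, day: '5', season: 'winter' }, toyCalendar);
```

In this sketch the string inputs `'2021'` and `'5'` reach the calendar as numbers, while `season` arrives exactly as the caller passed it.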

Collaborator

@cjtenny cjtenny left a comment

Comments part 2; stepping out for lunch and a pickup trip to the library then I'll be back later this afternoon for part 3.

polyfill/lib/calendar.mjs
}
}
if (monthCode === undefined) monthCode = this.getMonthCode(year, month);
return { ...calendarDate, day, month, monthCode, year, eraYear };
Collaborator

Will this lead to inconsistent era output re previous comment? As far as I can tell it doesn't matter, but trying to understand why an undefined era is inserted above with an eraYear. Ideally both or neither should be present if possible, even though I think this is all impl-internal.

Also, should the hebrew calendar have constant era 'am' or 'ah'?

Collaborator Author

It shouldn't cause any harm. For internal-only processing, I generally followed the rule of always filling in eraYear === year for non-era-using calendars in order to avoid special-case code in cases where eraYear is used. As you noted, I don't think this is observable behavior.

Collaborator Author

Also, should the hebrew calendar have constant era 'am' or 'ah'?

I assumed that hasEra should be true for calendars whose default .toLocaleDateString('en-US-u-ca-...') output contains an era, and false for calendars where it doesn't. The latter group is only ISO, Hebrew, and Chinese/Dangi. So that's why there's no era in Hebrew. But this is also really easy to change. @sffc, @Manishearth - do you guys have any opinions about which calendars should and should not use eras?

polyfill/lib/calendar.mjs
const matchingEra =
era === undefined ? undefined : this.eras.find((e) => e.name === era || e.genericName === era);
if (!matchingEra) throw new RangeError(`Era ${era} (ISO year ${eraYear}) was not matched by any era`);
if (eraYear < 1 && matchingEra.reverseOf) {
Collaborator

does this properly reject era: 'ce', eraYear: 0?

Collaborator Author

The convention I used (at least I think I did!) was the same as what legacy Date does with era-less input: out of range values are allowed and evaluated relative to the epoch of the era. So era: 'ce', eraYear: 0 is treated as ISO year 0 and will be stored as 1 BCE. An exception is reverse eras, which I require to be positive because -1 BCE is clearly a bug.

We could also choose to reject 0 CE inputs. I don't feel strongly either way.

Collaborator

Hm, I think if we're in a 'constrain' context that makes sense, but a calendar should reject an eraYear that's not valid for the context of a given era in a 'reject' context. It seems inconsistent to coerce eraYear 0 to CE 1 otherwise.

Collaborator Author

Sorry, I wasn't clear. The current behavior for all eras is not to constrain eraYear, but rather to evaluate eraYear as a signed value in context of the era's epoch. So 0 CE => 1 BCE, and Meiji 100 => Showa 42.
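
As a rough illustration of "evaluate eraYear as a signed value in context of the era's epoch", here is a year-granularity sketch. It is not the PR's implementation: the era tables, `anchorIsoYear`, and the function names are hypothetical, and it ignores the fact that Japanese eras start mid-year. It does keep the one exception mentioned earlier, that reverse eras require a positive eraYear.

```javascript
// Hypothetical era tables. anchorIsoYear is the ISO year of eraYear 1.
const GREGORIAN_ERAS = [
  { name: 'ce', anchorIsoYear: 1 },
  { name: 'bce', anchorIsoYear: 0, reverse: true }, // 1 BCE is ISO year 0
];
const JAPANESE_ERAS = [
  { name: 'reiwa', anchorIsoYear: 2019 },
  { name: 'heisei', anchorIsoYear: 1989 },
  { name: 'showa', anchorIsoYear: 1926 },
  { name: 'taisho', anchorIsoYear: 1912 },
  { name: 'meiji', anchorIsoYear: 1868 },
];

// Interpret eraYear as an offset from the era's first year, without range checks.
function toIsoYear(eras, era, eraYear) {
  const entry = eras.find((e) => e.name === era);
  if (!entry) throw new RangeError(`Unknown era: ${era}`);
  if (entry.reverse) {
    // Reverse eras count backwards and must stay positive.
    if (eraYear < 1) throw new RangeError(`eraYear must be positive for ${era}`);
    return entry.anchorIsoYear + 1 - eraYear;
  }
  return entry.anchorIsoYear + eraYear - 1;
}

// Find the era that actually contains an ISO year.
function fromIsoYear(eras, isoYear) {
  const forward = eras
    .filter((e) => !e.reverse)
    .sort((a, b) => b.anchorIsoYear - a.anchorIsoYear);
  const match = forward.find((e) => e.anchorIsoYear <= isoYear);
  if (match) return { era: match.name, eraYear: isoYear - match.anchorIsoYear + 1 };
  const reverse = eras.find((e) => e.reverse);
  if (!reverse) throw new RangeError(`No era covers ISO year ${isoYear}`);
  return { era: reverse.name, eraYear: reverse.anchorIsoYear + 1 - isoYear };
}

function normalizeEraYear(eras, era, eraYear) {
  return fromIsoYear(eras, toIsoYear(eras, era, eraYear));
}

normalizeEraYear(GREGORIAN_ERAS, 'ce', 0);      // { era: 'bce', eraYear: 1 }
normalizeEraYear(JAPANESE_ERAS, 'meiji', 100);  // { era: 'showa', eraYear: 42 }
```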

That's why I'm a little skeptical about constrain behavior: it's ambiguous what it should do. Constrain to the first/last day of the era? Or keep the current behavior?

Collaborator Author

BTW, one reason for the current behavior is that it makes it cleaner to deal with dates in years where eras end mid-year. For example, Jan 1 Reiwa 1 is technically in the previous era because Reiwa started May 1. By always treating eraYear as a signed value without constraints, it includes handling the Jan 1 Reiwa 1 case too. I think there was discussion of this in another issue but I forget which.
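
The Jan 1 Reiwa 1 case can be sketched at date granularity like this. Again, this is a hypothetical illustration rather than the PR's code: `resolveJapaneseDate` and the two-entry era table are made up for the example, and only Heisei and Reiwa are included to keep it short.

```javascript
// Hypothetical table with the ISO start date of each era (newest first).
const ERA_STARTS = [
  { name: 'reiwa', start: '2019-05-01' },
  { name: 'heisei', start: '1989-01-08' },
];

const pad2 = (n) => String(n).padStart(2, '0');

// Treat eraYear as an offset from the era's first calendar year, build the ISO
// date, then report whichever era that ISO date really falls in.
function resolveJapaneseDate(era, eraYear, month, day) {
  const entered = ERA_STARTS.find((e) => e.name === era);
  if (!entered) throw new RangeError(`Unknown era: ${era}`);
  const isoYear = Number(entered.start.slice(0, 4)) + eraYear - 1;
  const isoDate = `${isoYear}-${pad2(month)}-${pad2(day)}`;
  // ISO date strings with 4-digit years compare correctly as plain strings.
  const actual = ERA_STARTS.find((e) => e.start <= isoDate);
  if (!actual) throw new RangeError(`${isoDate} is before the earliest era in this sketch`);
  return {
    isoDate,
    era: actual.name,
    eraYear: isoYear - Number(actual.start.slice(0, 4)) + 1,
  };
}

resolveJapaneseDate('reiwa', 1, 1, 1); // { isoDate: '2019-01-01', era: 'heisei', eraYear: 31 }
resolveJapaneseDate('reiwa', 1, 5, 1); // { isoDate: '2019-05-01', era: 'reiwa', eraYear: 1 }
```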

Collaborator

Hm, I think this is tricky. There's no clear Temporal spec for this, but it does seem in the spirit of constrain vs reject that Meiji 100 would be a RangeError when treated strictly, and Showa 42 when treated loosely. Same for 0 CE vs 0 BCE, for example. My preference would be not to evaluate eraYear as a signed value in the context of an epoch; that's what year is for. This could lead to developer mistakes, too, though it's admittedly a little obscure. Let me see if I can find that other discussion after reviewing the tests, before I suggest any changes.

Collaborator Author

If OK with you, I'd prefer to handle this question as a follow-up PR, because this case seems ambiguous enough that we may want to discuss it with a wider group before committing to a course of action. To summarize, there are a few questions to answer:

  1. How to handle out-of-bounds eraYear with reject?
  2. How to handle out-of-bounds eraYear with constrain?
  3. How to handle out-of-bounds months/days (but in-bounds eraYear) with reject?
  4. How to handle out-of-bounds months/days (but in-bounds eraYear) with constrain?

AFAIK, we already decided (3) and (4) in #1300, which was to accept those values without exceptions. So unless there's new information I'd prefer to leave those closed decisions closed.

Changes in response to review feedback by @cjtenny:
* Indian calendar estimates are now exact
* Exact estimates are more efficient now
* Fixed some wrong comments
Collaborator

@cjtenny cjtenny left a comment

Part 3, now onto the tests.


const helperJapanese = ObjectAssign(
{},
// NOTE: For convenience, this hacky implementation only supports the most
Collaborator

@cjtenny cjtenny Feb 5, 2021

I'll file a (non-blocking, separate) issue to mirror Intl's supported eras in the polyfill tomorrow, after someone decides on the THIS-IS-NOT-A-SPEC-CHANGE issue tag

Collaborator Author

This is actually an old comment that I inherited from @ptomato but the reason for only showing 5 eras is out of date. In our meeting with the ICU4X folks last week, we decided that @sffc and @Manishearth would do more research and choose between two options:

  • only supporting Meiji and newer eras, and use ce/bce for older dates
  • supporting all CLDR eras

The reason why neither of these is an obvious winner is that older eras are apparently both infrequently used and are also frequently subject to revision by historians... and in some cases are not precisely defined down to the month/day/year. This makes it hard to use older eras for Temporal features that require eras to start and end on precise dates. Let's use #526 for more discussion. If @sffc and @Manishearth decide that "use all the CLDR eras" is the way we should go, then there's definitely a work item needed to populate those into the polyfill, but it's not a guarantee that this is needed.

In the meantime I'll clarify this comment to match the latest state of things.

Collaborator

@cjtenny cjtenny left a comment

I'm so sorry to ask a little more of you here, but... could you add just a few test cases for Temporal.PlainMonthDay.from that cover #1239 / #1316 ? I think that'd be enough tests for now. Thank you so much again for doing all of this work, sorry to pick over it with such a fine-tooth comb.

@justingrant justingrant force-pushed the hebrew-calendar branch 4 times, most recently from 4ea9823 to ba854d2 Compare February 5, 2021 07:47
* Fix Indian calendar tests in Node 12/14
* Add new tests for Indian leap year and Indian ICU bugs
* Refactor ICU bug checking to be more generic
* Update outdated comments
* Remove redundant validation
@justingrant
Collaborator Author

I'm so sorry to ask a little more of you here, but... could you add just a few test cases for Temporal.PlainMonthDay.from that cover #1239 / #1316 ? I think that'd be enough tests for now. Thank you so much again for doing all of this work, sorry to pick over it with such a fine-tooth comb.

No worries, this is a great review. I'll add these tests tomorrow.

Also, sorry for all the commit spam tonight. It's painful when bugs only repro in Node 12 or 14 and the only place I can repro them is in CI!

* Ensure that invalid `month` is constrained or rejected per options
* Ensure that invalid `monthCode` is always rejected (fixes tc39#1349)
* Add tests for PlainMonthDay to verify the above and to ensure that
  leap days and leap months are constrained accurately
@justingrant
Collaborator Author

I'm so sorry to ask a little more of you here, but... could you add just a few test cases for Temporal.PlainMonthDay.from that cover #1239 / #1316 ? I think that'd be enough tests for now. Thank you so much again for doing all of this work, sorry to pick over it with such a fine-tooth comb.

No worries, this is a great review. I'll add these tests tomorrow.

Done! At this point, all tests are passing and I believe that all review feedback has either been addressed or I've suggested to punt the feedback to a follow-up. So should be ready to merge if you like what you see!

BTW, adding those PlainMonthDay tests was a really good suggestion, because adding these tests exposed a few bugs, including that I wasn't constraining/rejecting month and monthCode properly. These tests also helped me find a bug with the ISO calendar (#1349) which the latest commit fixes too. Let me know if you'd like me to pull that fix out into a separate PR, because AFAIK it's the only observable change to ISO behavior in this PR.

@justingrant
Collaborator Author

Ah, crap, I spoke too soon. One of the Test262 tests is failing. I'll push a fix later this afternoon-- my laptop is about to run out of battery.

@justingrant
Collaborator Author

I fixed the last failing 262 test, which was a weird one: PlainMonthDay({month: Infinity, day}) was failing because the implementation of ES.ToTemporalMonthDay was creating the month code as MInfinity which is obviously wrong and was caught by the new validation logic I added to catch mismatches. See the git comments of the last commit and its code comments for more info. This change required some tweaks to ecmascript.mjs which should definitely be reviewed, as I'm not sure if the solution I picked is a good-enough one.
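
To show why this slipped through, here is a minimal sketch of the failure mode. The function names are hypothetical and this is not the code in ecmascript.mjs; the exact month code format (for example, any zero padding) is also not something this sketch tries to match. It only demonstrates that string interpolation happily turns `Infinity` into a bogus code unless the month is validated first.

```javascript
// Naive construction: any number is interpolated, including Infinity.
function naiveMonthCode(month) {
  return `M${month}`;
}

// Hypothetical stricter variant: only positive integers produce a code.
function checkedMonthCode(month) {
  if (!Number.isInteger(month) || month < 1) {
    throw new RangeError(`Invalid month: ${month}`);
  }
  return `M${month}`;
}

naiveMonthCode(Infinity); // 'MInfinity'
// checkedMonthCode(Infinity) throws a RangeError instead.
```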

Anyway, LMK if any more changes are required.

@cjtenny
Collaborator

cjtenny commented Feb 7, 2021

Awesome Justin, thanks for all the work here. I'll be on tomorrow morning and creating follow up issues, hopefully merging!

The previous commit failed the `PlainMonthDay({month: Infinity, day})`
test because 'MInfinity' is obviously invalid. This commit changes the
"calendar omitted" path of ToTemporalMonthDay (which previously
added a `monthCode`) to add a `year` (1972) instead which won't
cause the MInfinity problem.

While we're in the neighborhood, this commit also adds one more test
for round-tripping PlainMonthDay values to/from ISO strings.
@justingrant
Collaborator Author

I fixed the last failing 262 test, which was a weird one: PlainMonthDay({month: Infinity, day}) was failing because the implementation of ES.ToTemporalMonthDay was creating the month code as MInfinity which is obviously wrong and was caught by the new validation logic I added to catch mismatches.

I realized this morning that there's a much simpler solution (only a 2-line change in ecmascript.mjs) to this problem, so I just replaced last night's commit with this simpler fix.

Collaborator

@cjtenny cjtenny left a comment

Thanks again for all your work here, @justingrant ! I'm sorry for the late-in-the-weekend review, my brain was feeling like molasses all weekend so I wanted to take a few extra passes. I've gone over the changes enough times now that I'm confident they're correct :) Congratulations on this beastly change.

@cjtenny cjtenny merged commit c6df297 into tc39:main Feb 8, 2021
@justingrant
Collaborator Author

Thanks, glad to help! @cjtenny I assume you're going to open follow-up issues? Just wanted to make sure that I'm not on the hook for those. ;-)

While this approach is nicer than dd913bdda6e160bfdad3b13cd0124ae54147ab9d, either requires a one-line spec change to 3.j.i. I think it's fine to keep for now and not block merging, I'll follow up in another issue to decide if it meets the bar for spec changes - or if it also can be fixed only in the case of intl. Great catch on finding this.

Cool, thanks. FWIW, this was actually an easy problem to find because it was just pure luck that obviously-bogus MInfinity didn't cause test failures previously. Once better validation was added this failure popped right up without adding new tests. The hard part was figuring out how to fix it given that "missing calendar" (where monthCode is optional) and "ISO calendar used explicitly" (where monthCode is required) look exactly the same from the perspective of the ISO calendar implementation. The year trick is kinda hacky but seems like the least intrusive way to deal with it.

I assume there's already a spec change needed to add the M prefix to monthCode (e.g. 12.1.25 ResolveISOMonth and 12.1.31 ISOMonthCode) so maybe this spec change could be combined with that one?

@cjtenny
Collaborator

cjtenny commented Feb 8, 2021

Sounds good on both counts. Yeah, I'll take care of them :) I've been irrationally waiting to turn my emacs buffer of issues into GH issues in the hopes that someone else will come up with a tag for really-not-changing-for-stage3 issues, but I need to bite the bullet 😅

Successfully merging this pull request may close these issues.

Invalid ISO month codes (e.g. 'M15') don't throw, but should
invalid calendar identifier chinese
6 participants