Thoughts on valid inputs for date pattern selection #605

Closed
sffc opened this issue Apr 3, 2021 · 4 comments
Labels: C-datetime (Component: datetime, calendars, time zones), R-fixed-elsewhere (Resolution: issue was fixed outside the core repo)

@sffc
Member

sffc commented Apr 3, 2021

CC @gregtatum

This is not an actionable issue; it's a way for me to write down some thoughts to answer the question of how we determine a valid input to date time selection.

NB: I am not dealing with Weeks in this post.

Lengths

There are four lengths I would like to propose:

  1. Long = full name, no abbreviations.
  2. Medium = full name, abbreviations allowed.
  3. Short = short name, possibly numeric.
  4. Short-Lossy = as short as possible, even if ambiguities are possible. Known as "Narrow" in CLDR in components where it is supported.

Here are examples in en-US in each of the four components:

  • Date
    1. Long: April 2, 2021
    2. Medium: Apr 2, 2021
    3. Short: 4/2/2021
    4. Short-Lossy: 4/2/21
  • Time
    • I actually can't think of cases where length is useful for time. The only variation seems to be whether the hour has a leading zero, and that seems like a localizer decision, not a programmer decision.
  • Weekday
    1. Long: Thursday
    2. Medium: Thu
    3. Short: Th
    4. Short-Lossy: T (ambiguous with Tuesday)
  • Time Zone
    1. Long: Central Daylight Time
    2. Short: CDT
    • It seems like we'd only need two lengths for time zone.
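To make the difference between Short and Short-Lossy concrete, here is a minimal Rust sketch. The `Length` enum, the `WEEKDAYS_EN` table, and both functions are hypothetical names for illustration, not an ICU4X API; the strings are the en-US weekday names at the four lengths listed above.

```rust
use std::collections::HashSet;

/// The four proposed lengths (hypothetical names, not an existing API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Length {
    Long,
    Medium,
    Short,
    ShortLossy,
}

/// en-US weekday names, Sunday first, indexed by [weekday][length].
const WEEKDAYS_EN: [[&str; 4]; 7] = [
    ["Sunday", "Sun", "Su", "S"],
    ["Monday", "Mon", "Mo", "M"],
    ["Tuesday", "Tue", "Tu", "T"],
    ["Wednesday", "Wed", "We", "W"],
    ["Thursday", "Thu", "Th", "T"],
    ["Friday", "Fri", "Fr", "F"],
    ["Saturday", "Sat", "Sa", "S"],
];

/// Name of the weekday (0 = Sunday) at the requested length.
pub fn weekday_en(day: usize, length: Length) -> &'static str {
    WEEKDAYS_EN[day][length as usize]
}

/// True if two different weekdays share a name at this length.
pub fn is_lossy(length: Length) -> bool {
    let mut seen = HashSet::new();
    // `insert` returns false on a duplicate, which makes `all` stop early.
    !WEEKDAYS_EN
        .iter()
        .all(|names| seen.insert(names[length as usize]))
}
```

With this table, `is_lossy(Length::ShortLossy)` is true because Tuesday and Thursday both render as "T", while the other three lengths stay unambiguous.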

Field Filters

The valid field filters for each component might be:

  1. Date
    1. Era, Year, Month, Day
    2. Year, Month, Day (era may be optionally added by ICU depending on calendar system)
    3. Era, Year, Month
    4. Year, Month (optional era)
    5. Era, Year
    6. Year (optional era)
    7. Era (standalone)
    8. Month, Day
    9. Month (standalone)
    10. Day (standalone -- unclear if this is needed)
  2. Time
    1. Hour, Minute, Second, Fractional Second (day period may be optionally added by ICU depending on user preference for hour cycle)
    2. Hour, Minute, Second (optional day period)
    3. Hour, Minute (optional day period)
    4. Hour (optional day period)
    • Is there a valid use case for displaying a time of day without hours? Maybe these are all the valid inputs.
  3. Weekday
    1. Weekday displayed
  4. Time Zone
    1. Time zone displayed
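One way to read "valid field filters" is as a closed set that any requested field combination must map onto. The sketch below does this for the four time filters listed above; `TimeFields` and `time_fields` are hypothetical names, and treating every other combination as invalid is an assumption taken from the list as written.

```rust
/// The four time field filters from the list above (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TimeFields {
    HourMinuteSecondFractionalSecond,
    HourMinuteSecond,
    HourMinute,
    Hour,
}

/// Maps requested fields onto a valid filter, or `None` if the combination
/// is not in the list (for example, minutes without hours).
pub fn time_fields(
    hour: bool,
    minute: bool,
    second: bool,
    fractional_second: bool,
) -> Option<TimeFields> {
    match (hour, minute, second, fractional_second) {
        (true, true, true, true) => Some(TimeFields::HourMinuteSecondFractionalSecond),
        (true, true, true, false) => Some(TimeFields::HourMinuteSecond),
        (true, true, false, false) => Some(TimeFields::HourMinute),
        (true, false, false, false) => Some(TimeFields::Hour),
        _ => None,
    }
}
```

Of the 16 possible on/off combinations, only these four are accepted; the day period is left out because it is added automatically depending on the hour cycle.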

Cartesian Product and Compression

Naively, the cartesian product is:

((4 date lengths) * (10 date field filters) + (1 if date disabled)) * ((1 time length) * (4 time field filters) + (1 if time disabled)) * ((4 weekday widths) * (1 weekday field filters) + (1 if weekday disabled)) * ((2 time zone widths) * (1 time zone field filters) + (1 if time zone disabled)) = 3075 valid inputs.
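The arithmetic can be checked with a few lines of Rust. This is only a sketch of the count above; the function names are made up for illustration.

```rust
/// Choices for one component: every (length, field filter) pair,
/// plus one for the component being disabled.
pub fn choices(lengths: u32, field_filters: u32) -> u32 {
    lengths * field_filters + 1
}

/// Product over the four components, using the counts from the formula above.
pub fn valid_input_count() -> u32 {
    let date = choices(4, 10); // 41
    let time = choices(1, 4); // 5
    let weekday = choices(4, 1); // 5
    let time_zone = choices(2, 1); // 3
    date * time * weekday * time_zone
}
```

41 * 5 * 5 * 3 = 3075. Note that this total still counts the single combination in which all four components are disabled.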

Now, we obviously don't want to ship 3075 patterns per locale. We could employ techniques along the lines of what we described yesterday as CLDR's "compression algorithm", like glue patterns (#585), or we could come up with our own compression technique in ICU4X.

@sffc sffc added the question (Unresolved questions; type unclear) and C-datetime (Component: datetime, calendars, time zones) labels Apr 3, 2021
@gregtatum
Member

Would the Rust API for the components be an enumeration of the combinations for each component? This seems like it would simplify the process of building a valid components bag.

e.g. Something like...

struct ComponentsBag {
  date: Option<DateComponents>,
  time: Option<TimeComponents>,
  weekday: Option<Length>,
  time_zone: Option<Length>
}

enum DateLength {...}

enum DateComponents {
  EraYearMonthDay(DateLength),
  YearMonthDay(DateLength),
  EraYearMonth(DateLength),
  YearMonth(DateLength),
  EraYear(DateLength),
  Year(DateLength),
  Era(DateLength),
  MonthDay(DateLength),
  Month(DateLength),
  Day(DateLength),
}

Time fields

I think time is missing "Minute, Second", which is in the CLDR data. I would also think that "Second, Fractional Second", and "Minute, Second, Fractional Second" would be valid as well.

@sffc
Member Author

sffc commented Apr 6, 2021

> Would the Rust API for the components be an enumeration of the combinations for each component?

That's one way to do it, yes. I like that. Or I guess conceptually, I was thinking of DateComponents and DateLength as two separate options in the bag, rather than nesting them.
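As a rough sketch of the non-nested variant: fields and length are separate options in the bag, and a helper pairs them up. All names here are hypothetical, and falling back to `Length::Medium` when no length is given is purely an assumption for illustration.

```rust
/// Hypothetical length enum; the variants follow the four proposed lengths.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Length {
    Long,
    Medium,
    Short,
    ShortLossy,
}

/// Date field filters without a nested length.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DateFields {
    EraYearMonthDay,
    YearMonthDay,
    EraYearMonth,
    YearMonth,
    EraYear,
    Year,
    Era,
    MonthDay,
    Month,
    Day,
}

/// Flat bag: which fields to show and how long to render them are
/// two independent options.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct FlatComponentsBag {
    pub date_fields: Option<DateFields>,
    pub date_length: Option<Length>,
}

impl FlatComponentsBag {
    /// Pairs the fields with a length. A length without fields is ignored;
    /// fields without a length fall back to Medium (an assumed default).
    pub fn date(&self) -> Option<(DateFields, Length)> {
        self.date_fields
            .map(|fields| (fields, self.date_length.unwrap_or(Length::Medium)))
    }
}
```

The trade-off, as far as this sketch shows it: the flat form lets a caller set a length with no fields, so the code needs a rule for that case, whereas the nested enum cannot express it at all.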

> I think time is missing "Minute, Second", which is in the CLDR data. I would also think that "Second, Fractional Second", and "Minute, Second, Fractional Second" would be valid as well.

We need those for durations, but do we need them for clock times?

@gregtatum
Member

Initial stab at the JSON backer for this proposal.

{
  "preferred_hour_cycle": "H11H12",
  "glue": {
    "weekday-time_zone": "{0} {1}", // It's technically possible to generate this Bag :-/
    "date-time-long": "{1} 'at' {0}",
    "date-time-medium": "{1} 'at' {0}",
    "date-time-short": "{1}, {0}",
    "date-time-shortLossy": "{1}, {0}"
  },
  "time": {
    "glue": null,
    "h11_h12": {
      "glue": null,
      "components": {
        "hourMinuteSecondFractionalSecond": null, // Fractional seconds isn't in CLDR?
        "hourMinuteSecond": "h:mm:ss a",          // "2:05:00 PM"
        "hourMinute": "h:mm a",                   // "2:05 PM"
        "hour": "h a",                            // "2 PM"

        // Optional specializations. These only get matched against if there is no Date
        // component.
        "weekdayHourMinuteSecondFractionalSecond": null,
        "weekdayHourMinuteSecond": null,
        "weekdayHourMinute": null,
        "weekdayHour": null
      }
    },
    "h23_h24": {
      "glue": null,
      "components": {
        "hourMinuteSecondFractionalSecond": null, // Fractional seconds isn't in CLDR?
        "hourMinuteSecond": "HH:mm:ss",           // "14:05:00"
        "hourMinute": "HH:mm",                    // "14:05"
        "hour": "HH",                             // "14"

        // Optional specializations. These only get matched against if there is no Date
        // component.
        "weekdayHourMinuteSecondFractionalSecond": null,
        "weekdayHourMinuteSecond": null,
        "weekdayHourMinute": null,
        "weekdayHour": null
      }
    }
  },
  "long": {
    "date": {
      "glue": {
        // Use the abbreviated version for glued patterns.
        "era": "{1} G"
      },
      "components": {
        // Required fields:
        "yearMonthDay": "MMMM d, y", // "January 20, 2020" (note that CLDR only contains "yMMMd",
                                     // but that the field expansion yields this)
        "yearMonth": "MMMM y",       // "January 2020"
        "year": "y",                 // "2020" (lowercase "y"; uppercase "Y" is the week-based year)
        "era": "GGGG",               // "Anno Domini", this will be used as appends.
        "monthDay": "MMMM d",        // "January 20"
        "month": "MMMM",             // "January"
        "day": "d",                  // "20"

        // Era is relying on appends, but could be customized.
        "eraYearMonthDay": null,
        "eraYearMonth": null,
        "eraYear": null,

        // These are additional date customizations available.
        "weekdayEraYearMonthDay": null,
        "weekdayYearMonthDay": null,
        "weekdayEraYearMonth": null,
        "weekdayYearMonth": null,
        "weekdayEraYear": null,
        "weekdayYear": null,
        "weekdayEra": null,
        "weekdayMonthDay": null,
        "weekdayMonth": null,
        "weekdayDay": null
      }
    },
    // Stand-alone weekday.
    "weekday": "EEEE", // "Tuesday"
    // What customization should we tie in here?
    "time_zone": {
      // TODO
    }
  },

  // These are just copies of the same patterns above, but show a complete example
  "medium": {
    "date": {
      "glue": {},
      "components": {
        "yearMonthDay": "MMMM d, y",
        "yearMonth": "MMMM y",
        "year": "y",
        "era": "GGGG",
        "monthDay": "MMMM d",
        "month": "MMMM",
        "day": "d"
      }
    },
    "weekday": "EEEE",
    "time_zone": {}
  },
  "short": {
    "date": {
      "glue": {},
      "components": {
        "yearMonthDay": "MMMM d, y",
        "yearMonth": "MMMM y",
        "year": "y",
        "era": "GGGG",
        "monthDay": "MMMM d",
        "month": "MMMM",
        "day": "d"
      }
    },
    "weekday": "EEEE",
    "time_zone": {}
  },
  "shortLossy": {
    "date": {
      "glue": {},
      "components": {
        "yearMonthDay": "MMMM d, y",
        "yearMonth": "MMMM y",
        "year": "Y",
        "era": "GGGG",
        "monthDay": "MMMM d",
        "month": "MMMM",
        "day": "d"
      }
    },
    "weekday": "EEEE",
    "time_zone": {}
  }
}
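For illustration, here is a minimal Rust sketch of how a "date-time-*" glue pattern from the JSON above could be applied. It follows the CLDR convention that {0} is the time and {1} is the date, and that single quotes delimit literal text. `date_time_glue` and `apply_glue` are hypothetical helpers, not an existing API, and the parser assumes well-formed patterns.

```rust
/// Looks up the date-time glue for a length key, mirroring the
/// "date-time-*" entries in the JSON above.
pub fn date_time_glue(length: &str) -> Option<&'static str> {
    match length {
        "long" | "medium" => Some("{1} 'at' {0}"),
        "short" | "shortLossy" => Some("{1}, {0}"),
        _ => None,
    }
}

/// Substitutes the formatted time for {0} and the formatted date for {1}.
/// Text inside single quotes is copied literally; '' is a literal apostrophe.
pub fn apply_glue(glue: &str, time: &str, date: &str) -> String {
    let mut out = String::new();
    let mut chars = glue.chars().peekable();
    let mut in_quote = false;
    while let Some(c) = chars.next() {
        match c {
            '\'' => {
                if chars.peek() == Some(&'\'') {
                    // Doubled quote: emit one apostrophe.
                    out.push('\'');
                    chars.next();
                } else {
                    in_quote = !in_quote;
                }
            }
            '{' if !in_quote => {
                // Read the placeholder index; `take_while` also consumes the '}'.
                let index: String = chars.by_ref().take_while(|&d| d != '}').collect();
                match index.as_str() {
                    "0" => out.push_str(time),
                    "1" => out.push_str(date),
                    // Unknown placeholders are passed through unchanged.
                    _ => {
                        out.push('{');
                        out.push_str(&index);
                        out.push('}');
                    }
                }
            }
            _ => out.push(c),
        }
    }
    out
}
```

For example, gluing the long date "January 20, 2020" and the time "2:05 PM" with the "long" glue gives "January 20, 2020 at 2:05 PM". Only the two component patterns and one glue string need to be stored, rather than a separate pattern for every date and time combination.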

@sffc sffc added the R-fixed-elsewhere Resolution: issue was fixed outside the core repo label Oct 21, 2021
@gregtatum
Member

The design doc for this discussion is here: https://docs.google.com/document/d/18v9fQcDvHDkG_7Hx6rDt1r3Mq6_JMecgORgOH4yXAWU/edit

I will follow up by filing new actionable issues.

@sffc sffc removed the question Unresolved questions; type unclear label Oct 21, 2021
@sffc sffc removed the v1 label Apr 1, 2022