Should identifiers that resolve to UTC always be canonicalized? #21

Open
justingrant opened this issue May 19, 2023 · 3 comments

@justingrant
Collaborator

The core proposition of this proposal is that ECMAScript shouldn't change user-inputted identifiers. If the user provides Asia/Calcutta, then they should get Asia/Calcutta back. Ditto for Asia/Kolkata.

But should this rule also apply to time zone identifiers that resolve to UTC? There are 14 aliases for UTC. Most are obscure, but Etc/UTC (and perhaps Etc/GMT) will probably show up frequently, especially in inputs from other systems that follow IANA's practice of making "Etc/UTC", not "UTC", the canonical identifier for this Zone.

Here are a few reasons I'd expect users to check whether a time zone is UTC:

  1. Programs may want to behave differently in UTC, e.g. hide the local time in a UI because UTC doesn't represent a real place on Earth.
  2. Programs may check for non-UTC identifiers as a signal of whether they need to do additional processing, e.g. to check if there might have been a time zone transition.
  3. Programs may check for UTC identifiers as a signal that they can use a "fast path" in their code, e.g. using Instant instead of ZonedDateTime in calculations or output.

I think we have these options:

  1. Canonicalize, because === 'UTC' is a frequently-used pattern in existing code and (unlike location-based zones) we never have to worry about it being broken in future TZDB releases.
  2. Don't canonicalize. This makes UTC-resolving IDs consistent with all others, so that users don't get in the habit of using === to compare any IDs, including UTC.
  3. Don't canonicalize, but recognize UTC's special role by adding another TimeZone method like TimeZone.p.isUTC to check whether an identifier or time zone object is UTC. tz.isUTC() would be shorthand for tz.equals('UTC'), which doesn't save much ergonomically, but would be much more discoverable in IDE auto-complete and in docs.

My current preference is for option (2) because it's simplest and most consistent, and then we can always add (3) later if too many users are still getting tripped up by using === 'UTC'.

But I wanted to get others' feedback too.

Here are the identifiers that resolve to 'UTC' in ECMAScript:

Etc/GMT+0    
Etc/GMT-0    
Etc/GMT0     
Etc/Greenwich
Etc/UCT      
Etc/Universal
Etc/Zulu     
GMT+0        
GMT-0        
GMT0         
Greenwich    
UCT          
UTC          
Universal    
Zulu         
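
To make the trade-off between the options concrete, here is a minimal sketch of a UTC check built from the list above. It assumes that the three identifiers ECMA-402 already special-cases (Etc/UTC, Etc/GMT, GMT) should count as well. `UTC_IDS` and `resolvesToUTC` are hypothetical names for illustration and are not part of any spec or of this proposal.

```javascript
// Identifiers listed in this issue as resolving to UTC, plus the three
// (Etc/UTC, Etc/GMT, GMT) that ECMA-402 already special-cases.
// Stored upper-cased so that lookups can be case-insensitive.
const UTC_IDS = new Set([
  'Etc/GMT+0', 'Etc/GMT-0', 'Etc/GMT0', 'Etc/Greenwich', 'Etc/UCT',
  'Etc/Universal', 'Etc/Zulu', 'GMT+0', 'GMT-0', 'GMT0', 'Greenwich',
  'UCT', 'UTC', 'Universal', 'Zulu',
  'Etc/UTC', 'Etc/GMT', 'GMT',
].map((id) => id.toUpperCase()));

// Hypothetical helper (an assumption, not a spec API):
// returns true if `id` is one of the UTC-resolving identifiers above.
function resolvesToUTC(id) {
  return UTC_IDS.has(String(id).toUpperCase());
}

// Under option (2), a user-supplied "Etc/UTC" would be returned unchanged,
// so a strict comparison misses it while the helper does not.
const userSuppliedId = 'Etc/UTC';
console.log(userSuppliedId === 'UTC');      // false
console.log(resolvesToUTC(userSuppliedId)); // true
console.log(resolvesToUTC('Asia/Kolkata')); // false
```

A built-in such as the tz.isUTC() idea in option (3) would remove the need for user code to maintain a list like this.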
@ljharb
Member

ljharb commented May 19, 2023

2 seems best for the reasons indicated.

@ptomato
Contributor

ptomato commented Jun 20, 2023

Just to be clear, this is separate from the special treatment of Etc/UTC, Etc/GMT, and GMT which would need to remain for backwards compatibility? (https://tc39.es/ecma402/#sec-canonicalizetimezonename)

@justingrant
Collaborator Author

> Just to be clear, this is separate from the special treatment of Etc/UTC, Etc/GMT, and GMT which would need to remain for backwards compatibility? (https://tc39.es/ecma402/#sec-canonicalizetimezonename)

@ptomato If I'm understanding your question correctly, I think the answer is "yes, it's separate", but I'll explain more below to make sure.

This proposal makes canonicalization mostly invisible to ECMAScript programs, because we'll stop canonicalizing user-supplied time zone IDs.

However, if ECMAScript supplies the ID, then it will still be canonicalized. There are two places where this happens:

  1. SystemTimeZoneIdentifier, exposed via Temporal.Now, and also via Intl.DateTimeFormat.prototype.resolvedOptions().timeZone when no timeZone option was supplied
  2. AvailableCanonicalTimeZoneNames, exposed via Intl.supportedValuesOf('timeZone')

In these two abstract operations (AOs), only "UTC" will be returned, never "Etc/UTC", "Etc/GMT", "GMT", "Zulu", etc. This will match existing behavior of these AOs.

However, this special case only deals with canonicalization of IDs that come from ECMAScript itself. If the user supplies an ID like "Etc/UTC", "Etc/GMT", "GMT", or "Zulu", then we won't change it. The user will get back the ID they put in (albeit case-normalized, if needed). This is different from the current behavior of ECMA-402 implementations.
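
The split described above can be modeled with a small sketch. It is an illustration only, not the spec text: the `CANONICAL` table is a tiny hand-written stand-in for data a real engine would derive from the IANA Time Zone Database, and both function names are hypothetical.

```javascript
// Tiny illustrative table: known identifier -> canonical identifier.
// Hand-written for this sketch; a real engine derives it from TZDB.
const CANONICAL = new Map([
  ['UTC', 'UTC'],
  ['Etc/UTC', 'UTC'],
  ['Etc/GMT', 'UTC'],
  ['GMT', 'UTC'],
  ['Zulu', 'UTC'],
]);

// Case-insensitive index: upper-cased ID -> properly cased spelling.
const BY_UPPER = new Map(
  [...CANONICAL.keys()].map((id) => [id.toUpperCase(), id])
);

// User-supplied IDs: case-normalize only, never canonicalize.
function normalizeUserSuppliedId(id) {
  const cased = BY_UPPER.get(String(id).toUpperCase());
  if (cased === undefined) throw new RangeError(`Unknown time zone: ${id}`);
  return cased;
}

// Engine-supplied IDs (system time zone, supported values): canonicalize.
function canonicalizeEngineSuppliedId(id) {
  return CANONICAL.get(normalizeUserSuppliedId(id));
}

console.log(normalizeUserSuppliedId('etc/utc'));      // Etc/UTC
console.log(canonicalizeEngineSuppliedId('Etc/UTC')); // UTC
```

The first call models what a user would get back for an ID they passed in; the second models what Temporal.Now or Intl.supportedValuesOf('timeZone') would report.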

My hope is that during Stage 3, we can get implementation and user feedback to make sure that this change in behavior doesn't cause any major problems. And if it does, we can revert some of the change, e.g. to always canonicalize UTC-resolving IDs. But I agree with Jordan that it probably makes sense to start with the most consistent behavior, which is (2) above: don't canonicalize any user-supplied IDs.

Does this answer your question?
