Document leap second handling (in FAQ article)? #309

Closed
trevorld opened this issue Nov 28, 2022 · 1 comment

Comments

@trevorld
Contributor

trevorld commented Nov 28, 2022

Perhaps it would make sense to document how {clock} handles leap seconds? Perhaps in the FAQ article?

I'm observing that on my computer {clock} parses leap seconds as NA values (and issues a warning), and that differences between UTC times are in POSIX seconds (instead of SI/metric seconds).

| Package | Parsing leap seconds | Difference between UTC times |
|---|---|---|
| {clock} | as NA | POSIX seconds |
| {nanotime} | as next second | POSIX seconds |
| base::as.POSIXct() | as next second | POSIX seconds |
| base::as.POSIXlt() | as leap second | POSIX seconds (probably after conversion to POSIXct()) |

# `base::.leap.seconds` contains (the second after) all UTC leap seconds (so far)
# Because of the leap second at the end of 2005, these are three distinct UTC seconds
leap_before <- "2005-12-31T23:59:59"
leap_second <- "2005-12-31T23:59:60"
leap_after <- "2006-01-01T00:00:00"

## (on my computer) {clock} won't parse, differences are POSIX seconds
library("clock")
year_month_day_parse(leap_second, precision = "second")
#> Warning: Failed to parse 1 string at location 1. Returning `NA` at that location.
#> <year_month_day<second>[1]>
#> [1] NA
naive_time_parse(leap_second)
#> Warning: Failed to parse 1 string at location 1. Returning `NA` at that location.
#> <clock_naive_time[1]>
#> [1] NA
sys_time_parse(leap_second)
#> Warning: Failed to parse 1 string at location 1. Returning `NA` at that location.
#> <clock_sys_time[1]>
#> [1] NA
# Difference of two metric seconds but (on some computers) one POSIX second
sys_time_parse(leap_after) - sys_time_parse(leap_before)
#> <duration<second>[1]>
#> [1] 1

## {nanotime} will parse as next second, differences are POSIX seconds
library("nanotime")
as.nanotime(paste0(leap_second, "Z"))
#> [1] 2006-01-01T00:00:00+00:00
as.nanotime(paste0(leap_after, "Z")) - as.nanotime(paste0(leap_before, "Z"))
#> [1] 00:00:01

## as.POSIXct() will parse as next second, differences are POSIX seconds
as.POSIXct(leap_second, format = "%FT%T", tz = "UTC") |> format(format = "%F %T")
#> [1] "2006-01-01 00:00:00"
as.POSIXct(leap_after, format = "%FT%T", tz = "UTC") - as.POSIXct(leap_before, format = "%FT%T", tz = "UTC")
#> Time difference of 1 secs

## as.POSIXlt() will correctly parse leap second...
## but differences are POSIX seconds (presumably converted to `POSIXct` before differencing)
as.POSIXlt(leap_second, format = "%FT%T", tz = "UTC") |> format(format = "%F %T")
#> [1] "2005-12-31 23:59:60"
as.POSIXlt(leap_after, format = "%FT%T", tz = "UTC") - as.POSIXlt(leap_before, format = "%FT%T", tz = "UTC")
#> Time difference of 1 secs

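
To make the POSIX-versus-SI distinction concrete, here is a minimal sketch (not taken from {clock} or any of the packages above) that adds back the leap seconds listed in `base::.leap.seconds`. The helper name `si_seconds_between()` is hypothetical. It assumes scalar inputs with `from <= to`, and that every leap second so far has been an inserted (positive) one.

```r
# Hypothetical helper: elapsed SI seconds between two POSIXct instants.
# Assumes length-1 inputs, from <= to, and only positive leap seconds.
si_seconds_between <- function(from, to) {
  from <- as.numeric(from)
  to <- as.numeric(to)
  # `.leap.seconds` holds the second *after* each leap second, so a leap
  # second falls inside the interval when its entry lies in (from, to].
  leaps <- as.numeric(.leap.seconds)
  n_leaps <- sum(leaps > from & leaps <= to)
  (to - from) + n_leaps
}

before <- as.POSIXct("2005-12-31 23:59:59", tz = "UTC")
after <- as.POSIXct("2006-01-01 00:00:00", tz = "UTC")

# One POSIX second apart, but two SI seconds once 23:59:60 is counted
si_seconds_between(before, after)
#> [1] 2
```
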
@DavisVaughan
Member

I'm not sure we should really use any of the current implementations as a good reference source. They all seem kind of hand-wavy, and allow parsing of non-leap seconds too. Basically it seems like they all have some simple special handling of "60s".

The main thing is that with POSIXct, leap seconds are completely ignored, full stop. See ?POSIXct:

"POSIXct" times used by R do not include leap seconds on any platform.

So really it is just a matter of what to do during parsing.

The "right" solution is to only allow 60s when parsing if you are actually on a leap second date. Then you need a way to store it, and you have to make a decision about what to do with it when converting to sys-time or naive-time (and from there, Date and POSIXct), which don't support leap seconds. <date> includes a utc_clock class that can handle this, and it actually parses correctly by checking against the actual leap seconds to see if it corresponds to a real leap second or not:
https://github.com/HowardHinnant/date/blob/50acf3ffd8b09deeec6980be824f2ac54a50b095/include/date/tz.h#L2022-L2032
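
As a rough illustration of that validation idea in base R (a sketch only, not how `<date>` or clock implements it), a parsed `60` can be checked against `base::.leap.seconds`. The helper name `is_real_leap_second()` is hypothetical, and the sketch relies on the behaviour shown in this thread: `strptime()` keeps `60` in the `sec` field, and `as.POSIXct()` rolls it over to the next second.

```r
# Hypothetical helper: TRUE only when a ":60" timestamp matches a real
# UTC leap second recorded in base::.leap.seconds.
is_real_leap_second <- function(x, format = "%Y-%m-%dT%H:%M:%S") {
  lt <- strptime(x, format = format, tz = "UTC")
  has_60 <- !is.na(lt$sec) & lt$sec >= 60
  # Converting to POSIXct rolls 23:59:60 over to the following second,
  # which is exactly what `.leap.seconds` stores for each leap second.
  rolled <- as.numeric(as.POSIXct(lt))
  has_60 & rolled %in% as.numeric(.leap.seconds)
}

# The end of 2005 had a leap second; the end of 2006 did not
is_real_leap_second(c("2005-12-31T23:59:60", "2006-12-31T23:59:60"))
```

A parser built this way would accept the first string and reject the second, rather than treating every `60` the same.
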

When going from utc_clock -> sys-time / naive-time, date maps leap seconds to the nearest possible moment in time before the leap second, which is reasonable.
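
A base R approximation of that mapping, at second precision, might look like the following. This is a sketch under assumptions: `clamp_leap_second()` is a hypothetical name, it does not check whether the `60` is a real leap second, and it relies on `as.POSIXct()` rolling `60` over to the next second as shown elsewhere in this thread.

```r
# Hypothetical helper: convert to POSIXct, mapping a ":60" second to the
# last representable second before it (hh:mm:59) instead of rolling forward.
clamp_leap_second <- function(x, format = "%Y-%m-%dT%H:%M:%S") {
  lt <- strptime(x, format = format, tz = "UTC")
  is_60 <- !is.na(lt$sec) & lt$sec >= 60
  ct <- as.POSIXct(lt)          # ":60" has rolled over to the next second
  ct[is_60] <- ct[is_60] - 1    # step back to hh:mm:59
  ct
}

format(clamp_leap_second("2005-12-31T23:59:60"), "%F %T", tz = "UTC")
#> [1] "2005-12-31 23:59:59"
```
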

I may include this in the future, but leap seconds are a little complicated: they are included in the text form of the time zone database (which clock uses now) but not in the binary form of the time zone database on Mac (which we may switch to in the future for performance). So I'd have to come up with a way to deal with that.

For now I will add some docs about this to the FAQ, as you suggest.

# Note that 2006 was not a leap second year, but parsing "sort of works" anyway

format <- "%Y-%m-%d %H:%M:%S"

# POSIXlt allows 60s, but that rolls over when converting to POSIXct
lubridate::fast_strptime("2006-12-31 23:59:60", format)
#> [1] "2006-12-31 23:59:60 UTC"
lubridate::fast_strptime("2006-12-31 23:59:60", format, lt = FALSE)
#> [1] "2007-01-01 UTC"

# Can't represent 61s in POSIXlt, so lubridate rolls over even in the POSIXlt
lubridate::fast_strptime("2006-12-31 23:59:61", format)
#> [1] "2007-01-01 00:00:01 UTC"
lubridate::fast_strptime("2006-12-31 23:59:61", format, lt = FALSE)
#> [1] "2007-01-01 00:00:01 UTC"

# But it thinks this is garbage?
lubridate::fast_strptime("2006-12-31 23:59:62", format)
#> [1] NA
lubridate::fast_strptime("2006-12-31 23:59:62", format, lt = FALSE)
#> [1] NA


# POSIXlt allows 60s, rolls over when converting to POSIXct
strptime("2006-12-31 23:59:60", format, tz = "UTC")
#> [1] "2006-12-31 23:59:60 UTC"
as.POSIXct(strptime("2006-12-31 23:59:60", format, tz = "UTC"))
#> [1] "2007-01-01 UTC"

# POSIXlt can't handle 61s, so base R says this is NA
strptime("2006-12-31 23:59:61", format, tz = "UTC")
#> [1] NA
as.POSIXct(strptime("2006-12-31 23:59:61", format, tz = "UTC"))
#> [1] NA


# Rolls over for 60s, errors on 61s
nanotime::as.nanotime("2006-12-31T23:59:60Z")
#> [1] 2007-01-01T00:00:00+00:00
try(nanotime::as.nanotime("2006-12-31T23:59:61Z"))
#> Error in RcppCCTZ::parseDouble(x, fmt = format, tzstr = tz) : 
#>   Parse error on 2006-12-31T23:59:61Z

Created on 2023-04-21 with reprex v2.0.2.9000
