Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug roundtripping datetime.time objects after midnight in eastern hemisphere timezones (#2417) #2438

Merged

Conversation

YannickJadoul
Copy link
Collaborator

@YannickJadoul YannickJadoul commented Aug 26, 2020

Closes #2417; see description there of what goes wrong.

Many, many thanks to @henryiii and @bstaletic for helping with debugging and fixing this one!

EDIT: Added reviewable.io at @bstaletic's request. Let's give it a try?


This change is Reviewable

@YannickJadoul YannickJadoul requested a review from henryiii August 26, 2020 22:31
Copy link
Collaborator

@bstaletic bstaletic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, thanks for giving Reviewable a chance. Plain github reviews get really annoying (to me) as soon as pulls are not trivial.

With @YannickJadoul's permission, I did take the opportunity to talk about some basic Reviewable features. Everyone reviewing this PR, please open any of the reviewable links in this PR and review it there. (The github comments will appear in the main thread in reviewable. Reviewable comments do appear in github, but might also appear on the github diff page.)

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @henryiii and @YannickJadoul)


include/pybind11/chrono.h, line 153 at r1 (raw file):

        if (!PyDateTimeAPI) { PyDateTime_IMPORT; }

        // Declare these special duration types so the conversions happen with the correct primitive types (int)

I know this comment was here before, but to me it's more confusing than helpful. Should we remove it?

Non-blocking, because it was there. (Reviewable has these little buttons where you can see/set "disposition" about others'/your comments.)


include/pybind11/chrono.h, line 160 at r1 (raw file):

        auto us = duration_cast<us_t>(src.time_since_epoch() % seconds(1));
        if (us.count() < 0)
            us += seconds(1);

Is this actually correct? If we had produced -3us, then adding 1s produces 999'997us. Do we want that or 3us?


include/pybind11/chrono.h, line 165 at r1 (raw file):

        // > If std::time_t has lower precision, it is implementation-defined whether the value is rounded or truncated.
        // (https://en.cppreference.com/w/cpp/chrono/system_clock/to_time_t)
        std::time_t tt = system_clock::to_time_t(time_point_cast<system_clock::duration>(src - us));

The above comment might be a good enough reason to completely avoid the C library and rely completely on Chrono.

Copy link
Collaborator Author

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @bstaletic and @henryiii)


include/pybind11/chrono.h, line 153 at r1 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

I know this comment was here before, but to me it's more confusing than helpful. Should we remove it?

Non-blocking, because it was there. (Reviewable has these little buttons where you can see/set "disposition" about others'/your comments.)

To be honest, I just kept this, indeed. From the comment, I believe the type of the microseconds argument from PyDateTime_FromDateAndTime is matched.
I'm not a <chrono> expert, as you know, but it does sound like there was a reason to create this type, so I'm hesitant to just remove it?


include/pybind11/chrono.h, line 160 at r1 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

Is this actually correct? If we had produced -3us, then adding 1s produces 999'997us. Do we want that or 3us?

I think it is. What's happening is that a time on the eastern hemisphere becomes a date + time from before 1970 in UTC (though std::mktime in the load method, which interprets a tm struct as local time). At that point we still add the microseconds which is always a positive amount in Python's datetime, but the total stays negative. So we need to take the modulo of the time vs. seconds, but make sure it's positive

Example:

  • Suppose it's 1 minute, 2 seconds, 3 microseconds after midnight, in GMT+1. That's 62 000 003 µs after midnight. So without date we just add a dummy 1970/01/01 (because Python has separate types for date/time/datetime, but C++ not), and it becomes 1970/01/01 00:01:02.000003 (or, 62 000 003 µs after the UNIX epoch).
  • std::mktime converts that GMT+1 date and time to UTC. Which becomes 1969-12-31 23:01. So, that's 3537999997 before UNIX epoch or -3 537 999 997 after UNIX epoch (= 62 000 003 µs - 3 600 000 000 µs (= 1 hour)).
  • Converting back, you take out the microseconds, so -3 537 999 997 µs % 1000 000 = -999 997 µs. But Python wants a positive amount of microseconds, so we need to make it positive, so you add another second (1 000 000 µs). Just taking the absolute value, 999 997 µs, would not give you the thing you put in, because things work differently in the negative numbers (i.e., -3 537 999 997 is 3 MORE than -3 538 000 000, so it's -3 538 s + 3 µs).
  • Then you subtract it, because we first added it, and because "If std::time_t has lower precision, it is implementation-defined whether the value is rounded or truncated.". But after subtracting the amount of microseconds, you know it's now divisible by 1000 000 (i.e., a second), and ready for tm and localtime again (which will convert the "almost an hour before 1970" back to our local GMT+1 time).

include/pybind11/chrono.h, line 165 at r1 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

The above comment might be a good enough reason to completely avoid the C library and rely completely on Chrono.

Yes, if we can, that would be great, agreed! But time and calenders are horribly complex.

The only thing I found (when we were looking into this together, yesterday evening), would be to add years, months, days to 1970. But this is not the same, because they do not take the calendar into account: a duration of a month or year is defined as the average time, independent of which point in time it's being added to (I thínk; happy to be proven wrong!).

Example: the difference between the first of January and the first of February is 1 "calendar month", just like the difference between the first of February and the first of March. But in duration, one is 31 days, the other 28 (or 29) days. So you can't just add the average duration and hope to get at the right date. Because if you would add the average duration of a month (30.42 days in a regular year) to the 1st of January, you'd end up somewhere just before noon on the 31st of January, and not midnight the 1st of February. That's why adding calendar-months/years is "context-dependent" (by lack of a better term). I don't think std::chrono::duration takes that into account (but I could be wrong).

Happy to be proven wrong on this, though. If <chrono> handles this, I'd very gladly ditch C's old structs and functions!

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bstaletic and @YannickJadoul)


include/pybind11/chrono.h, line 153 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

To be honest, I just kept this, indeed. From the comment, I believe the type of the microseconds argument from PyDateTime_FromDateAndTime is matched.
I'm not a <chrono> expert, as you know, but it does sound like there was a reason to create this type, so I'm hesitant to just remove it?

You need these, but they are pretty standard things to add for chrono, so not sure if the comment is all that helpful.


include/pybind11/chrono.h, line 165 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

Yes, if we can, that would be great, agreed! But time and calenders are horribly complex.

The only thing I found (when we were looking into this together, yesterday evening), would be to add years, months, days to 1970. But this is not the same, because they do not take the calendar into account: a duration of a month or year is defined as the average time, independent of which point in time it's being added to (I thínk; happy to be proven wrong!).

Example: the difference between the first of January and the first of February is 1 "calendar month", just like the difference between the first of February and the first of March. But in duration, one is 31 days, the other 28 (or 29) days. So you can't just add the average duration and hope to get at the right date. Because if you would add the average duration of a month (30.42 days in a regular year) to the 1st of January, you'd end up somewhere just before noon on the 31st of January, and not midnight the 1st of February. That's why adding calendar-months/years is "context-dependent" (by lack of a better term). I don't think std::chrono::duration takes that into account (but I could be wrong).

Happy to be proven wrong on this, though. If <chrono> handles this, I'd very gladly ditch C's old structs and functions!

We already use some things from chrono, so it seems like we shouldn't be trying to mix?


tests/test_chrono.py, line 83 at r1 (raw file):

    datetime.time(5, 59, 59, 0),
    datetime.time(5, 59, 59, 1),
])

This doesn't actually fail when run without the changes to pybind11. Is there a way to trigger this failure?

@YannickJadoul YannickJadoul requested a review from henryiii August 27, 2020 13:35
Copy link
Collaborator Author

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @bstaletic and @henryiii)


include/pybind11/chrono.h, line 153 at r1 (raw file):

Previously, henryiii (Henry Schreiner) wrote…

You need these, but they are pretty standard things to add for chrono, so not sure if the comment is all that helpful.

OK, I'll remove the comment, then!


include/pybind11/chrono.h, line 165 at r1 (raw file):

Previously, henryiii (Henry Schreiner) wrote…

We already use some things from chrono, so it seems like we shouldn't be trying to mix?

It would be nice, yes. Do you know how to convert a "calendar" time stamp to a chrono time_point (taking leap years and month lengths etc into account) ? I'd love to ditch the old stuff and go full modern C++ :-)


tests/test_chrono.py, line 83 at r1 (raw file):

Previously, henryiii (Henry Schreiner) wrote…

This doesn't actually fail when run without the changes to pybind11. Is there a way to trigger this failure?

Setting your timezone to UTC/GMT+x will make all tests between 00:00:00 and 0x:00:00 fail, if they have a non-zero amount of milliseconds (otherwise, the bug doesn't get triggered).

The example on https://en.cppreference.com/w/cpp/chrono/c/mktime seems to show you can change the TZ environment variable on POSIX? setenv("TZ", "/usr/share/zoneinfo/America/New_York", 1); // POSIX-specific
So if you want to join me here, "/usr/share/zoneinfo/Europe/Brussels" would hopefully demonstrate.

Copy link
Collaborator Author

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on @bstaletic and @henryiii)


tests/test_chrono.py, line 83 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

Setting your timezone to UTC/GMT+x will make all tests between 00:00:00 and 0x:00:00 fail, if they have a non-zero amount of milliseconds (otherwise, the bug doesn't get triggered).

The example on https://en.cppreference.com/w/cpp/chrono/c/mktime seems to show you can change the TZ environment variable on POSIX? setenv("TZ", "/usr/share/zoneinfo/America/New_York", 1); // POSIX-specific
So if you want to join me here, "/usr/share/zoneinfo/Europe/Brussels" would hopefully demonstrate.

Oh, yes. And to get the 5, 59, 59, 1 to fail, you need to be over in "/usr/share/zoneinfo/Asia/Pyongyang" or "/usr/share/zoneinfo/Asia/Tokyo" or "/usr/share/zoneinfo/Australia/Sydney" or ... well, your favorite holiday destination that's at least GMT+6 on the first of January.

@henryiii
Copy link
Collaborator

Oops! Sorry, didn't mean to push to this branch. 😳

@henryiii henryiii force-pushed the fix-chrono-eastern-timezone branch from 8361044 to af8efcf Compare August 27, 2020 13:53
tests/test_chrono.py Outdated Show resolved Hide resolved
@henryiii henryiii force-pushed the fix-chrono-eastern-timezone branch from af8efcf to 13d2275 Compare August 27, 2020 13:58
Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 3 files reviewed, 3 unresolved discussions (waiting on @bstaletic, @henryiii, and @YannickJadoul)


tests/test_chrono.py, line 83 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

Oh, yes. And to get the 5, 59, 59, 1 to fail, you need to be over in "/usr/share/zoneinfo/Asia/Pyongyang" or "/usr/share/zoneinfo/Asia/Tokyo" or "/usr/share/zoneinfo/Australia/Sydney" or ... well, your favorite holiday destination that's at least GMT+6 on the first of January.

Okay, added to tests (fixed previous unintentional push). Can you check it locally without your change to pybind11's chrono to verify it partially fails there?

@henryiii
Copy link
Collaborator

henryiii commented Aug 27, 2020

Here's what I get before:

test_chrono.py .....FFF.....FFF..F...............

Looks like I pushed the change to chrono.h again (too much git add -u), so let's wait to revert that change until the tests fail.

@henryiii henryiii force-pushed the fix-chrono-eastern-timezone branch from 13d2275 to 5877469 Compare August 27, 2020 14:04
@henryiii henryiii force-pushed the fix-chrono-eastern-timezone branch from 5877469 to 8a301ec Compare August 27, 2020 14:07
@henryiii
Copy link
Collaborator

Okay, I'm done force pushing. I'm happy with the tests failing before but not after (at least locally). :) Great catch and fix!

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 3 files reviewed, 4 unresolved discussions (waiting on @bstaletic and @YannickJadoul)


include/pybind11/chrono.h, line 165 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

It would be nice, yes. Do you know how to convert a "calendar" time stamp to a chrono time_point (taking leap years and month lengths etc into account) ? I'd love to ditch the old stuff and go full modern C++ :-)

You mean in C++20? https://en.cppreference.com/w/cpp/chrono#Calendar

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 3 files reviewed, 4 unresolved discussions (waiting on @bstaletic and @YannickJadoul)

@YannickJadoul YannickJadoul requested a review from henryiii August 27, 2020 14:17
Copy link
Collaborator Author

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for strengthening these tests, @henryiii! Better to have this fail everywhere, rather than just on this side of the Atlantic ;-)

Another thing I'm still worried about: time still gets turned back/forward to be stored in the chrono types, depending on the time zone. Should we be doing that? What's the meaning of Python's datetime wrt. timezones? (might be a separate issue though; this bugfix is still valid, independently)

Reviewable status: 1 of 3 files reviewed, 3 unresolved discussions (waiting on @bstaletic, @henryiii, and @YannickJadoul)


include/pybind11/chrono.h, line 165 at r1 (raw file):

Previously, henryiii (Henry Schreiner) wrote…

You mean in C++20? https://en.cppreference.com/w/cpp/chrono#Calendar

Ah, another pybind20 feature ;-)


tests/test_chrono.py, line 83 at r1 (raw file):

Previously, henryiii (Henry Schreiner) wrote…

Okay, added to tests (fixed previous unintentional push). Can you check it locally without your change to pybind11's chrono to verify it partially fails there?

OK, will do!

@henryiii
Copy link
Collaborator

We need to fix that Windows time length...

@YannickJadoul
Copy link
Collaborator Author

We need to fix that Windows time length...

What do you mean?

@henryiii
Copy link
Collaborator

What do you mean?

The CI flake issue where windows sometimes reports a slightly larger time difference. Doesn't need to be in this PR, but we should fix it in the near future.

Copy link
Collaborator

@bstaletic bstaletic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly, :lgtm:. Only one comment regarding Windows/TZ

Reviewed 1 of 3 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @henryiii and @YannickJadoul)


include/pybind11/chrono.h, line 153 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

OK, I'll remove the comment, then!

The comment wasn't removed. Just commenting because Reviewable thinks this is resolved, since @YannickJadoul started the previous comment with "OK", which is a magic word in Reviewable.


include/pybind11/chrono.h, line 165 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

Ah, another pybind20 feature ;-)

Yeah, there's no std::chrono::year in C++11... I'm sure ditching the C library and going full C++11 Chrono is possible, but might not be very friendly to maintainers. Happy to leave that part for another time.


tests/test_chrono.py, line 88 at r3 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

Except for fooling us (and others) into thinking it ís tested, and running the same tests 3 times. But yes, OK.

We could add pytest.param(..., marks=pytest.mark.skipif("...", reason="TZ environment variable not supported on Windows")),?

That skipif is probably a good idea.

Copy link
Collaborator Author

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bstaletic and @YannickJadoul)


include/pybind11/chrono.h, line 153 at r1 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

The comment wasn't removed. Just commenting because Reviewable thinks this is resolved, since @YannickJadoul started the previous comment with "OK", which is a magic word in Reviewable.

HEEEEEY, can I decide on my own magic words, Reviewable? :-D

Copy link
Collaborator Author

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think I replied to everything :-)
What to do now, @bstaletic, with Reviewable? Am I waiting for another review of the final changes? Do I need to "LGTM" my own PR? Any other magic buttons to click or try? (e.g., what's up with this "Mark as reviewed'?)

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @bstaletic and @henryiii)


include/pybind11/chrono.h, line 153 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

HEEEEEY, can I decide on my own magic words, Reviewable? :-D

Done


tests/test_chrono.py, line 83 at r1 (raw file):

Previously, YannickJadoul (Yannick Jadoul) wrote…

OK, will do!

Done! Also fails for me (except in New York), without the changes, so thanks for turning these tests deterministic!


tests/test_chrono.py, line 88 at r3 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

That skipif is probably a good idea.

Done

@YannickJadoul YannickJadoul force-pushed the fix-chrono-eastern-timezone branch from 85e4632 to 83379f9 Compare August 27, 2020 22:09
Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 3 files reviewed, 2 unresolved discussions (waiting on @bstaletic and @henryiii)

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 3 files reviewed, 2 unresolved discussions (waiting on @bstaletic and @henryiii)

Copy link
Collaborator

@bstaletic bstaletic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely looks good. :lgtm_strong:

Two things about reviewable now.

  1. lgtm_strong isn't invalidated after a push.
  2. The one making a blocking comment needs to acknowledge that the change has been made before the discussion is resolved. In this case there's one unresolved discussion because @henryiii didn't ACK.

Reviewed 2 of 2 files at r5.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @henryiii)

@YannickJadoul YannickJadoul requested a review from henryiii August 28, 2020 13:13
Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@YannickJadoul YannickJadoul merged commit 6a19278 into pybind:master Aug 28, 2020
@YannickJadoul YannickJadoul deleted the fix-chrono-eastern-timezone branch August 28, 2020 13:21
@YannickJadoul
Copy link
Collaborator Author

Hurray, we managed working with Reviewable! :-D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Eastern hemisphere timezones break std::chrono::time_point<std::chrono::system_clock, Duration> caster
3 participants