Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing timezone-aware datetime handling #22

Closed
wants to merge 2 commits into from
Closed

Fixing timezone-aware datetime handling #22

wants to merge 2 commits into from

Conversation

catermelon
Copy link

Previously, passing the current time in UTC will generate incorrect deltas. For example, if I pass a recent time in UTC, it'll tell you '7 hours from now' when the answer really is '30 minutes ago'.

I've changed it to look for tz awareness in the date string and if it finds it, it will calculate the delta using the current date in UTC.

Added tests around the change - mostly just copypasting what was there, except with UTC dates. Let me know if you want any changes there. Thanks.

Previously, passing the current time in UTC will generate incorrect deltas - for pacific time,
it'll tell you '7 hours from now' when the answer really is '30 minutes ago'.

I've changed it to look for tzawareness in the date string and if it finds it, it will
calculate the delta using the current date in UTC.

Added tests around the change - mostly just copypasting what was there, except with UTC dates.
@jmoiron
Copy link
Owner

jmoiron commented Nov 13, 2014

Thanks for this great PR.

This is a question borne entirely of naivete, but is this possible to do with the builtin python timezone? It's not a deal breaker since the current behavior is not good, but I'd rather not add the pytz dependency if possible.

@catermelon
Copy link
Author

That's a good question - pytz is a pretty standard library when it comes to handling timezones, but let me see what I can do.

…e date math won't work across timezone boundaries if DST is in effect.
@catermelon
Copy link
Author

Man, I hate timezones. There's like one "right" way to do it, and every other way leads to spiders. 👾

TL;DR: No, because of spiders

I had to go look up my research on timezones again, which is good because I found a flaw in what I was doing.

Basically, the standard libraries are totally broken when it comes to handling daylight savings time. See http://legacy.python.org/dev/peps/pep-0431/, especially this bit about wanting to import pytz-like behavior into the standard lib:

The time zone support in Python has no concrete implementation in the standard library outside of a tzinfo baseclass that supports fixed offsets. To properly support time zones you need to include a database over all time zones, both current and historical, including daylight saving changes. But such information changes frequently, so even if we include the last information in a Python release, that information would be outdated just a few months later.

Time zone support has therefore only been available through two third-party modules, pytz and dateutil, both who include and wrap the "zoneinfo" database.

pytz abstracts that away for you. Reading this made me realize I was just doing date math on any incoming datetime objects, including ones in non-UTC timezones, and to do the datetime arithmetic safely, I'd have to convert them to UTC, which is broken if done through the standard library because of the former problem.

pytz does all this correctly, and it is the gold standard for handling timezones.

@jmoiron
Copy link
Owner

jmoiron commented Nov 13, 2014

Ah okay, so the actual databases of timezones to do the conversions with aren't in the stdlib. That makes total sense. I've used dateutil in the past and there was a period where it would install the python 3.x version on 2.x, so I'm inclined to prefer pytz.

That being said, I wonder, is it better to:

  1. always assume (or otherwise convert) to UTC to do the comparison or...
  2. require UTC, dodging the need to guess/convert

2 might break some of the simplest use cases and offload a bit of complexity on the caller, but it's a dead simple API both to document and implement. It also allows the user to control how timezones are approached. 1 might be more convenient, but there are also "naive" times which do not contain the relevant information to convert; I've encountered libraries which throw errors when they get these, is that the right approach?

Plenty of spiders. I'm open to suggestion.

@catermelon
Copy link
Author

That is a really good point, I didn't think about how to properly handle naive datetimes.

I'm in favor of making libraries do the hard work, so I think the most natural approach to assume all naive timezones are in the user's timezone. So if you got a naive datetime, you'd add that tzinfo and then convert it to UTC to do the date math. That way datetime.now() and datetime.today() would just work without people having to think about it..

However...it looks like pytz has no way of getting the current timezone, and the stdlib functions doesn't take into account datetime so you get wrong stuff.. I found this library: https://github.com/regebro/tzlocal which fills that hole.

It appears that dateutil (the other timezone library) may have this functionality, so I could look into using that, instead.

@posita posita mentioned this pull request May 12, 2015
@posita
Copy link

posita commented May 12, 2015

So if you got a naive datetime, you'd add that tzinfo and then convert it to UTC to do the date math.

I may be oversimplifying, but why not go the other way and treat offset-naive datetimes as they are treated now, but assume the caller has provided a decent tzinfo implementation for offset-aware datetimes? In other words, leave the notion of "now" as-is for offset-naive values, and only intervene if they are offset-aware. If that's the case, I'm pretty sure all you need is a functional tzinfo implementation for UTC.

Something like my #9 (comment)? First, you'd have to make or steal a UTC tzinfo implemenation:

# new class (plundered from dateutil.tz.tzutc)
class myutctzinfo(datetime.tzinfo):
    ZERO = timedelta(0)
    def utcoffset(self, dt):
        return myutctzinfo.ZERO
    def dst(self, dt):
        return myutctzinfo.ZERO
    def tzname(self, dt):
        return u"UTC"
    def __eq__(self, other):
        return (isinstance(other, tzutc) or
                (isinstance(other, tzoffset) and other._offset == myutctzinfo.ZERO))
    def __ne__(self, other):
        return not self.__eq__(other)
    def __repr__(self):
        return u"%s()" % self.__class__.__name__
    __reduce__ = object.__reduce__

_TZ_UTC = myutctzinfo() # new singleton

Then you could use your UTC tzinfo to make humanize.time.date_and_delta offset-tolerant this way:

def date_and_delta(value):
    ...
    now = _now()
    if isinstance(value, datetime):
        # ---- BEGIN NEW CODE ----
        if value.tzinfo is not None:
            now = datetime.utcnow().replace(tzinfo=_TZ_UTC)
        # ----- END NEW CODE -----
        date = value
        delta = now - value
    ...

The idea is to leave the complexity of dealing with local timezones (and library choice) up to the caller. That being said, @trustrachel, you sound much more "timezone aware" than I am. 😁 What am I missing with the above approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants