Higher performance / less allocating ZonedDateTimes & TimeZones #182
Comments
Thanks for the feedback! I think there is definitely room for some performance improvements around allocations with
Internally |
Thanks for the explanation - that makes sense, and I have probably been misattributing performance to the timezone logic when in fact it is more likely due to allocations (e.g. adding Are there any obvious things that can be done to reduce allocations? |
The benchmark you mentioned:
julia> using Dates, TimeZones, BenchmarkTools
julia> @btime (DateTime(2019):Day(1):DateTime(2020)) .+ Minute(1);
309.351 ns (1 allocation: 3.00 KiB)
julia> @btime (ZonedDateTime(2019,tz"UTC"):Day(1):ZonedDateTime(2020,tz"UTC")) .+ Minute(1);
2.300 μs (372 allocations: 14.59 KiB)
There are definitely some improvements to be made here. I'll mention first that if you're working primarily in UTC then the only part of that time zone which isn't a bits-type is the name which is a |
Nice, thank you. I did spend a little more time reading the code and it got me wondering if the current common Timezones (Fixed & Variable) could be entirely bitstypes as well. Given that there is a predefined set of timezones, these could be encoded as primitives (Enums or Ints), and have the name (+ any other heavyweight / derivative attributes) be stored or computed externally, e.g. lookup tables or function dispatches. This would all be code-generated, ideally only once when first loading the timezone database. A ZonedDateTime could then just be a DateTime (Int64) + a Timezone encoded as an Int primitive (Int16?) and would be super lightweight for most operations other than displaying & localising I think. If I understand correctly, this would still require |
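To make the "DateTime + Int16" idea above concrete, here is a minimal sketch using only the Dates standard library. The names `CompactZDT`, `ZONE_NAMES`, and `zone_name` are hypothetical and not part of TimeZones.jl, and the three-entry table is a hand-written stand-in for one that would be generated from the tz database.

```julia
using Dates

# Hypothetical lookup table: zone names live outside the datetime values.
# A real table would be generated from the tz database; these entries are illustrative.
const ZONE_NAMES = ["UTC", "Europe/London", "America/Winnipeg"]

# Hypothetical compact type: a UTC instant plus a 16-bit index into ZONE_NAMES.
struct CompactZDT
    utc_datetime::DateTime   # wraps a single Int64 of milliseconds
    zone_id::Int16           # index into ZONE_NAMES
end

# The heavyweight attribute (the name) is looked up only when it is needed.
zone_name(z::CompactZDT) = ZONE_NAMES[z.zone_id]

# Sub-day arithmetic only touches the UTC instant, so no zone lookup is needed.
# Periods of a Day or larger would still have to consult the zone's rules.
Base.:+(z::CompactZDT, p::TimePeriod) = CompactZDT(z.utc_datetime + p, z.zone_id)

z = CompactZDT(DateTime(2019), Int16(1))
z2 = z + Minute(1)
```

Because both fields are plain bits, `isbitstype(CompactZDT)` holds, so an array of these values can be stored inline without a heap object per element.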
By the way: do you know of any real scenarios where users of the library are defining their own |
An example would be:
julia> FixedTimeZone("CST", -6 * 3600) # Central Standard Time (North America)
CST (UTC-6)
I'll also note that internally |
Thanks, I hadn't realised that something like CST wasn't actually in the IANA database. It would be nice to have a majority of common timezones be entirely bitstype for performance, but if that isn't so easy, perhaps the timezone name could be optional (i.e. |
I really like the idea of having the timezone as a type parameter -- I was about to suggest the same thing before I saw this issue. I'm currently writing a database client for a time series DB where I have to deal with rather large arrays of datetime objects. While converting my |
Yeah, having the timezone strings duplicated everywhere is a bit of a pain. In my code, I prefer for every Some interesting stuff is happening in the C++ world wrt datetimes / timezones. https://github.com/HowardHinnant/date/tree/master/include/date is basically the implementation that has been accepted into the C++20 standard. I wonder if some inspiration could be taken from there around zero / low cost abstraction handling of this stuff. The C++ community is usually pretty solid in that respect. |
I also went with naive An example of a lib that already does time zones as type arguments would be chrono for Rust. |
Yes, that's a nice example - special casing UTC and FixedOffset. The tricky part is how to make this non-(performance)-breaking. If
operations on |
I'll note with TimeZones.jl this isn't true as the time zone information is relevant in arithmetic with DatePeriods (a Period of a Day or larger):
julia> using TimeZones, Dates
julia> zdt = ZonedDateTime(2015,11,1,tz"America/Winnipeg")
2015-11-01T00:00:00-05:00
julia> zdt + Day(1)
2015-11-02T00:00:00-06:00
julia> zdt + Hour(24)
2015-11-01T23:00:00-06:00
julia> astimezone(astimezone(zdt, tz"UTC") + Day(1), tz"America/Winnipeg")
2015-11-01T23:00:00-06:00 |
True, fair point.
|
@omus do you have any opinions on if / how to make a large breaking change like this? |
I would say just go for it and don't worry about how breaking the change is to start with. We first need a prototype that demonstrates, through benchmarks (BenchmarkTools.jl), a performance improvement over the original code. Then we can discuss how breaking the change is and possible ways to transition cleanly with deprecations. In the worst case scenario we can just use a new type. Unfortunately, I probably won't have time to experiment with this myself so if you want to do the legwork I'll do my best to support. |
Just reading more into the Rust chrono implementation. There is the main chrono linked above, which implements Then there is chrono-tz which adds the IANA database - basically doing the job of
(This code is all generated during build time obviously). This design is also very efficient: a I think this design would be great to mirror in Julia. I notice that I'm not sure where stuff like your CST example above fits into all of this. Given that CST just appears to be a well-known (but 'unofficial') alias for UTC-6, perhaps it's not really in the domain of a timezone library? Neither chrono-tz nor Python's pytz seems to know about it, for example. (Of course, people could still subtype @omus @athre0z do either of you have strong opinions on what I've described above? @athre0z also please correct me if I've misunderstood the Rust code - have not written Rust in about 5 years... |
Instead of the enum / bytes based switching, it would also be possible to do this using Julia’s method dispatch machinery - i.e. have a VariableTimeZone subtype for every IANA zone. This would be third-party extensible, and faster (zero logic) when specific timezones are known at compile time, but type based dynamic dispatch would be much more expensive than a switch statement otherwise. The nice thing, I suppose, is that these different TimeZone subtypes can be experimented with alongside the current ones, and compared. The issue of lighter weight TimeZone subtypes is pretty orthogonal to the issue of adding a type parameter to ZonedDateTime. |
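The two options compared above can be sketched side by side. This is only an illustration under simplifying assumptions: all names (`ZoneId`, `SketchZone`, `UTCZone`, `CSTZone`, `offset_seconds`, `to_local`) are hypothetical, and both zones are fixed offsets, whereas a real IANA zone would also need its transition table.

```julia
using Dates

# Approach 1 (hypothetical): zones as enum values, resolved with a runtime branch.
@enum ZoneId::Int16 UTC_ID=1 CST_ID=2

function offset_seconds(id::ZoneId)
    id == UTC_ID && return 0
    id == CST_ID && return -6 * 3600
    error("unknown zone")
end

# Approach 2 (hypothetical): one singleton type per zone, resolved by dispatch.
# A third party could add a zone by defining a new subtype and method.
abstract type SketchZone end
struct UTCZone <: SketchZone end
struct CSTZone <: SketchZone end

offset_seconds(::UTCZone) = 0
offset_seconds(::CSTZone) = -6 * 3600

# Works with either representation: shift a UTC instant by the zone's offset.
to_local(utc::DateTime, zone) = utc + Second(offset_seconds(zone))

noon_utc = DateTime(2019, 1, 1, 12)
via_enum = to_local(noon_utc, CST_ID)
via_type = to_local(noon_utc, CSTZone())
```

With the singleton types, a call such as `to_local(noon_utc, CSTZone())` can be specialized at compile time when the zone type is known; with a zone held in an abstractly typed field, the same call falls back to dynamic dispatch, which is the trade-off described above.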
It seems bad that the TimeZones.jl/src/types/zoneddatetime.jl Line 11 in 70a879c
|
I've been doing some experimentation around having Just wanted to share a very short update so people following this issue know this is being worked on. |
Thinking about this again. @omus one thing I'm confused about: why does ZonedDateTime have the zone field, if the utc_datetime field is meant to be UTC? EDIT: I see, it's not the zone of the |
I have a lot of code that deals with DateTimes. Occasionally I do stuff with funny timezones, but the vast majority of the time I represent things in UTC. Operations on ZonedDateTimes are far more expensive than the same operations on naive DateTimes (for good reasons: timezones have to be compared + ZonedDateTimes are not bits types, so operations require some indirection & allocation). For performance reasons, I would rather store and manipulate all my UTC ZonedDateTimes as naive DateTimes with the timezone implicitly understood by the programmer, but for clarity, safety, ease of reasoning & programming, etc. I would prefer to have the timezone be explicit.
This got me thinking: it would be neat to have some sort of ZonedDateTime whose UTC-ness is expressed and enforced at the type level, and that is stored as a DateTime, such that UTC-to-UTC comparisons & arithmetic can take a fast path and just do direct integer operations like DateTimes, with more general ZonedDateTime operations going via the slower ZonedDateTime paths.
This is clearly do-able, and I was more trying to work out how easily it would slot in to TimeZones.jl & where it could go in the type hierarchy etc. in order to avoid as much duplication of code as possible.
One possible means of achieving this (and possibly other interesting specializations) would be if a ZonedDateTime's timezone type were exposed as a type parameter. I'm writing this in a hurry so haven't given it full thought though.
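A rough sketch of what exposing the timezone type as a parameter could look like, using only the Dates standard library. Everything here (`SketchTimeZone`, `SketchUTC`, `SketchFixed`, `SketchZDT`) is a hypothetical stand-in and not the TimeZones.jl types; the point is only that a UTC-parameterised value ends up being a bits type while a zone carrying a String name does not.

```julia
using Dates

abstract type SketchTimeZone end

# Singleton type for UTC: carries no data at all.
struct SketchUTC <: SketchTimeZone end

# A fixed-offset zone with a String name, which prevents it from being a bits type.
struct SketchFixed <: SketchTimeZone
    name::String
    offset::Second
end

# Hypothetical parametric type: the zone's type is part of the datetime's type.
struct SketchZDT{T<:SketchTimeZone}
    utc_datetime::DateTime
    zone::T
end

# Sub-day arithmetic works on the UTC instant; for SketchZDT{SketchUTC} this is
# just the underlying DateTime addition.
Base.:+(z::SketchZDT, p::TimePeriod) = SketchZDT(z.utc_datetime + p, z.zone)

# UTC-to-UTC comparison compares the stored instants directly.
Base.isless(a::SketchZDT{SketchUTC}, b::SketchZDT{SketchUTC}) =
    a.utc_datetime < b.utc_datetime

a = SketchZDT(DateTime(2019), SketchUTC())
b = a + Hour(1)
```

Here `isbitstype(SketchZDT{SketchUTC})` is true while `isbitstype(SketchZDT{SketchFixed})` is false, so a `Vector{SketchZDT{SketchUTC}}` can store its elements inline. Comparisons between other zone types would need their own methods in this sketch.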