-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Timezone to Scalar::Time* types, and better timezone awareness to Datafusion's time types #1455
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maxburke -- this looks like a great improvement!
Would it be possible to add some basic sql end to end tests in sql.rs??
Perhaps taking advantage of make_timestamp_table
to test: https://github.com/apache/arrow-datafusion/blob/0052667afae33ba9e549256d0d5d47e2f45e6ffb/datafusion/tests/sql.rs#L4106-L4109
I am specifically thinking about coverage for the coercion logic you added (perhaps showing some basic comparison on timestamp types as wwell as min
and max
for timestamp with timezone types?)
Also, I am trying out this branch with IOx and so far it is going well 👍 : https://github.com/influxdata/influxdb_iox/pull/3407
datafusion/src/logical_plan/expr.rs
Outdated
@@ -974,7 +974,7 @@ impl std::fmt::Display for Expr { | |||
ref left, | |||
ref right, | |||
ref op, | |||
} => write!(f, "{} {} {}", left, op, right), | |||
} => write!(f, "{:?} {} {:?}", left, op, right), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the change to use Debug
formatting rather than Display
formatting ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this was part of my own debugging when trying to sort out why my queries were failing. I'll remove.
match (lhs_type, rhs_type) { | ||
(Utf8, Date32) => Some(Date32), | ||
(Date32, Utf8) => Some(Date32), | ||
(Utf8, Date64) => Some(Date64), | ||
(Date64, Utf8) => Some(Date64), | ||
(Timestamp(lhs_unit, lhs_tz), Timestamp(rhs_unit, rhs_tz)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
(l, r) => { | ||
assert_eq!(l, r); | ||
l.clone() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(l, r) => { | |
assert_eq!(l, r); | |
l.clone() | |
} | |
(l, r) if l == r => { | |
l.clone() | |
} | |
_ => unreachable!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this but we need an else
block in order to make the types align, so this becomes:
(l, r) => if l ==r { l.clone() } else { unreachable!() }
which is semantically identical, but the version with assert_eq!
will provide more contextual information when the condition doesn't hold
…h a timezone, fix a missed case in the binary ops applied to timestamp types with timezones
Done! |
it's a great fix. |
@@ -6615,3 +6622,357 @@ async fn csv_query_with_decimal_by_sql() -> Result<()> { | |||
assert_batches_eq!(expected, &actual); | |||
Ok(()) | |||
} | |||
|
|||
#[tokio::test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about add some cases with different TIME_ZONE, for example UTC-7
or UTC+8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change doesn't do any computation with the timezones, it just passes them through from the underlying data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, among other things I also don't think arrow-rs respects timezones fully (and it doesn't have a full set of arithmetic on time/date/timestamp types either)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great -- thank you @maxburke
#[tokio::test] | ||
async fn timestamp_coercion() -> Result<()> { | ||
{ | ||
let mut ctx = ExecutionContext::new(); | ||
let table_a = make_timestamp_table::<TimestampSecondType>()?; | ||
let table_b = make_timestamp_table::<TimestampMillisecondType>()?; | ||
let table_a = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Closes #1454