Use session TimeZone for timestamp_tz column #8680
Comments
Mapping Iceberg's timestamptz to the session zone would have representability consequences, as in #5488.
Although Iceberg's timestamptz format can physically represent only a single point in time, the Iceberg connector stipulates that this point is in the UTC time zone, so it already carries the concept of a time zone. We think there is no problem with processing timestamps with time zone according to the local time zone. After all, the value can be converted back to UTC if necessary. In the Hive connector, we found that the session time zone is used when processing columns stored in the Parquet int96 format. Therefore, we think the session time zone can also be used when processing Iceberg's timestamptz format.
Yes, because `timestamp with time zone` is the only Trino type that can be used to represent a point in time.
Depending on the local time zone and the actual data, there indeed might be no problem, or there may be one.
That looks like a bug in the Hive connector, because some values can be misrepresented or may hit the #5781 problem.
Although all problems could be resolved by converting back to UTC, problems related to DST are often ambiguous and not explicit. That is why you would like to use UTC as the default time zone. Is my understanding correct? As in #5488, if a user converts the time zone to 'America/Los_Angeles', they will still get the same result.
This conversion must be obvious, and should not be done by the connector.
Note that:
Two different points in time converted to `timestamp with time zone` in the PT zone are then returned to the user with an identical representation. Also, the fact that the engine distinguishes them (even though they look the same) is a bug tracked in #5781.
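To make the representability concern concrete, here is a minimal sketch in Python (not Trino code). It assumes the value is rendered as local wall-clock time plus the zone name, without the UTC offset; the helper `render_with_zone_name` and the chosen instants are hypothetical, picked for illustration on the 2020 fall-back date. It also assumes the IANA time zone database is available to `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PT = ZoneInfo("America/Los_Angeles")


def render_with_zone_name(instant: datetime, zone: ZoneInfo) -> str:
    """Render an instant as local wall-clock time plus the zone name (no offset).

    Hypothetical helper: it only mimics a "time + zone name" style of output.
    """
    local = instant.astimezone(zone)
    return f"{local:%Y-%m-%d %H:%M:%S} {zone.key}"


# Two distinct instants, one hour apart, around the end of DST on 2020-11-01.
first = datetime(2020, 11, 1, 8, 30, tzinfo=timezone.utc)   # 01:30 local, before the clock change
second = datetime(2020, 11, 1, 9, 30, tzinfo=timezone.utc)  # 01:30 local, after the clock change

print(first == second)                    # False: the instants differ
print(render_with_zone_name(first, PT))   # 2020-11-01 01:30:00 America/Los_Angeles
print(render_with_zone_name(second, PT))  # 2020-11-01 01:30:00 America/Los_Angeles
```

In UTC the two values stay visibly distinct (08:30 and 09:30), which is the argument for keeping UTC as the zone the connector returns.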
When querying timestamp_tz column data from an Iceberg table, I found that Trino uses the UTC time zone to process these columns, which adds overhead for users writing queries.
For example:
select ts from table;
specify time zone
Can we use the session time zone to process these timestamp-with-time-zone fields so that users do not need an additional time zone conversion when querying?
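For reference, the additional conversion in question only changes how the value is rendered, not the point in time it denotes. Below is a minimal Python sketch of that idea (in Trino the corresponding explicit step would be the `AT TIME ZONE` operator). The stored value and the `Asia/Shanghai` session zone are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Hypothetical stored value: a single instant, rendered in UTC.
stored = datetime(2021, 7, 26, 4, 0, tzinfo=timezone.utc)

# Hypothetical session zone, chosen only for illustration.
session_zone = ZoneInfo("Asia/Shanghai")

# Explicit, user-side conversion: same instant, different rendering.
in_session_zone = stored.astimezone(session_zone)

print(stored.isoformat())           # 2021-07-26T04:00:00+00:00
print(in_session_zone.isoformat())  # 2021-07-26T12:00:00+08:00
print(stored == in_session_zone)    # True: both denote the same point in time
```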
There are places in the Parquet/ORC file processing logic where the time zone can be modified. I don't know whether this is reasonable.
I hope to get better suggestions. Discussion is welcome.