From da3c74eccc4978bdaeca4760e98a77aff560e19b Mon Sep 17 00:00:00 2001
From: Arttu
Date: Mon, 29 Jul 2024 01:30:19 +0200
Subject: [PATCH] fix: use int64 instead of uint64 for PrecisionTimestamp(Tz) literal value (#668)

This allows timestamps to refer to time before epoch, and aligns with other systems (Spark, DataFusion/Arrow, DuckDB, Postgres, Parquet at least)

BREAKING CHANGE: PrecisionTimestamp(Tz) literal's value is now int64 instead of uint64
---
 proto/substrait/algebra.proto   | 2 +-
 site/docs/types/type_classes.md | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/proto/substrait/algebra.proto b/proto/substrait/algebra.proto
index d85b1ce65..b2d1a87d0 100644
--- a/proto/substrait/algebra.proto
+++ b/proto/substrait/algebra.proto
@@ -861,7 +861,7 @@ message Expression {
       // Sub-second precision, 0 means the value given is in seconds, 3 is milliseconds, 6 microseconds, 9 is nanoseconds
       int32 precision = 1;
       // Time passed since 1970-01-01 00:00:00.000000 in UTC for PrecisionTimestampTZ and unspecified timezone for PrecisionTimestamp
-      uint64 value = 2;
+      int64 value = 2;
     }

     message Map {
diff --git a/site/docs/types/type_classes.md b/site/docs/types/type_classes.md
index fda1cfade..26233493a 100644
--- a/site/docs/types/type_classes.md
+++ b/site/docs/types/type_classes.md
@@ -41,8 +41,8 @@ Compound type classes are type classes that need to be configured by means of a
 | NSTRUCT<N:T1,...,N:Tn> | **Pseudo-type**: A struct that maps unique names to value types. Each name is a UTF-8-encoded string. Each value can have a distinct type. Note that NSTRUCT is actually a pseudo-type, because Substrait's core type system is based entirely on ordinal positions, not named fields. Nonetheless, when working with systems outside Substrait, names are important. | n/a
 | LIST<T> | A list of values of type T. The list can be between [0..2,147,483,647] values in length. | `repeated Literal`, all types matching T
 | MAP<K, V> | An unordered list of type K keys with type V values. Keys may be repeated. While the key type could be nullable, keys may not be null. | `repeated KeyValue` (in turn two `Literal`s), all key types matching K and all value types matching V
-| PRECISIONTIMESTAMP<P> | A timestamp with fractional second precision (P, number of digits) 0 <= P <= 9. Does not include timezone information and can thus not be unambiguously mapped to a moment on the timeline without context. Similar to naive datetime in Python. | `uint64` microseconds or nanoseconds since 1970-01-01 00:00:00.000000000 (in an unspecified timezone)
-| PRECISIONTIMESTAMPTZ<P> | A timezone-aware timestamp, with fractional second precision (P, number of digits) 0 <= P <= 9. Similar to aware datetime in Python. | `uint64` microseconds or nanoseconds since 1970-01-01 00:00:00.000000000 UTC
+| PRECISIONTIMESTAMP<P> | A timestamp with fractional second precision (P, number of digits) 0 <= P <= 9. Does not include timezone information and can thus not be unambiguously mapped to a moment on the timeline without context. Similar to naive datetime in Python. | `int64` seconds, milliseconds, microseconds or nanoseconds since 1970-01-01 00:00:00.000000000 (in an unspecified timezone)
+| PRECISIONTIMESTAMPTZ<P> | A timezone-aware timestamp, with fractional second precision (P, number of digits) 0 <= P <= 9. Similar to aware datetime in Python. | `int64` seconds, milliseconds, microseconds or nanoseconds since 1970-01-01 00:00:00.000000000 UTC

 ## User-Defined Types
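
As an illustration of what the signed `value` field allows, the following is a minimal Python sketch of how a consumer might decode a PrecisionTimestampTZ literal. It is not code from this patch or from any Substrait library: the function name `decode_precision_timestamp_tz` is hypothetical, and the sketch assumes `value` counts units of 10^-precision seconds since the epoch, which is how the proto comment describes precisions 0, 3, 6 and 9. Python's `datetime` only resolves microseconds, so for precisions above 6 the sketch floors the value to whole microseconds.

```python
from datetime import datetime, timedelta, timezone

# Epoch used by PrecisionTimestampTZ: 1970-01-01 00:00:00 UTC.
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_precision_timestamp_tz(value: int, precision: int) -> datetime:
    """Hypothetical helper: turn (value, precision) into an aware datetime.

    Assumes `value` is a signed count of 10**-precision second units
    since the epoch, so negative values denote moments before 1970.
    """
    if not 0 <= precision <= 9:
        raise ValueError("precision must be between 0 and 9")
    if precision <= 6:
        # Scale up to microseconds exactly.
        micros = value * 10 ** (6 - precision)
    else:
        # Floor to microseconds; sub-microsecond digits are dropped
        # because datetime cannot represent them.
        micros = value // 10 ** (precision - 6)
    return EPOCH_UTC + timedelta(microseconds=micros)


# One second before the epoch, expressed at second precision.
before_epoch = decode_precision_timestamp_tz(-1, 0)
# 1.5 seconds before the epoch, expressed at millisecond precision.
before_epoch_ms = decode_precision_timestamp_tz(-1500, 3)
```

With the previous `uint64` field, negative inputs such as `-1` and `-1500` above could not be encoded at all, so pre-1970 timestamps had no representation.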