
Support for nano-second/microsecond timestamps #13063

Open
MertHoc opened this issue Jul 5, 2019 · 11 comments

@MertHoc

MertHoc commented Jul 5, 2019

Adding Nanosecond support to PrestoDB

Introduction

We wish to support nanosecond timestamps within Presto for companies that capture data at that granularity. One industry that deals with nanosecond granularity is finance.

Within this project, we will introduce fractional-second support for TIMESTAMP and TIMESTAMP WITH TIME ZONE with precision greater than 3 (milliseconds). For example:

CREATE EXTERNAL TABLE test (
	timestamp_microseconds TIMESTAMP(6),
	timestamp_nano_with_tz TIMESTAMP(9) WITH TIME ZONE
) STORED AS TEXT LOCATION 'XYZ';

Design Decisions:

Encoding

The current timestamp data types are encoded as a long at millisecond resolution[1][2][3] when packed into blocks during shuffling and data movement. The original idea for this project was to always encode the timestamp at nanosecond resolution within the existing long. With this method, we could store timestamps between the years 1678 and 2262 [4]. If, in the future, we needed a wider range, we would add a new int storing the nanoseconds since midnight, similar to how other implementations store timestamps. This approach would minimize the number of code changes while keeping the ability to extend the time range later if needed.

However, after some research, this approach will not work. For timestamps that carry time zone information, the time zone key is packed into the low 12 bits of the long, and the milliseconds value is shifted left by 12 bits and stored in the remaining bits[6]. At nanosecond resolution this reduces the representable range to roughly 26 days on either side of Jan 1, 1970 [5], which is not sufficient. Thus, we will be forced to pack the information needed into a buffer larger than 8 bytes. The components that we would need to store are:

  1. the timestamp in milliseconds (minimum of 50 bits to remain compatible with today's range [0-1,125,899,906,842,624])
  2. the sub-millisecond nanoseconds portion (minimum of 20 bits [0-999,999])
  3. the time zone key (12 bits if implemented in the same fashion as today)

I believe that the precision does not need to be stored alongside the other information, since we will treat everything at nanosecond resolution internally.
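To illustrate, here is a hypothetical sketch (the class and method names are mine, not Presto code) of how the extra int could hold both remaining components for TIMESTAMP WITH TIME ZONE: the 20-bit sub-millisecond nanoseconds and the 12-bit time zone key together fill the 32 bits exactly, while the long keeps the milliseconds.

```java
// Hypothetical 12-byte layout (not Presto's actual code): the long keeps
// millis-since-epoch; the extra int packs the sub-millisecond nanoseconds
// (20 bits) together with the time zone key (12 bits).
public final class TimestampLayout
{
    private static final int TIME_ZONE_MASK = 0xFFF;  // 12 bits, as in DateTimeEncoding
    private static final int NANO_OF_MILLI_SHIFT = 12;

    public static int packExtra(int nanoOfMilli, short timeZoneKey)
    {
        if (nanoOfMilli < 0 || nanoOfMilli >= 1_000_000) {
            throw new IllegalArgumentException("nanoOfMilli out of range: " + nanoOfMilli);
        }
        // The top bit of the shifted value may wrap into the sign bit;
        // unpacking below uses an unsigned shift, so no information is lost.
        return (nanoOfMilli << NANO_OF_MILLI_SHIFT) | (timeZoneKey & TIME_ZONE_MASK);
    }

    public static int unpackNanoOfMilli(int extra)
    {
        return extra >>> NANO_OF_MILLI_SHIFT;
    }

    public static short unpackTimeZoneKey(int extra)
    {
        return (short) (extra & TIME_ZONE_MASK);
    }

    public static void main(String[] args)
    {
        int extra = packExtra(999_999, (short) 2092);
        System.out.println(unpackNanoOfMilli(extra)); // 999999
        System.out.println(unpackTimeZoneKey(extra)); // 2092
    }
}
```

Note that 20 + 12 = 32, so the two fields fit the int with no bits to spare; any wider time zone key would force a different split.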

Thus, I am proposing the following:

  1. For time-only data types (TIME, TIME WITH TIME ZONE), we change the resolution to nanosecond and keep the underlying type as a long, which is sufficient to store the data.
  2. For TIMESTAMP and TIMESTAMP WITH TIME ZONE (data types that contain both date and time), we add an extra int (4 bytes) to store the nanoseconds portion.
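A minimal sketch of proposal (2), under the assumption that the extra int carries the sub-millisecond nanoseconds (the class name and helpers are hypothetical, not part of the Presto SPI):

```java
import java.time.Instant;

// Hypothetical value class illustrating the proposed 12-byte representation:
// a long for epoch milliseconds plus an int for the sub-millisecond nanoseconds.
public final class NanoTimestamp
{
    private final long epochMillis;  // same range as today's TIMESTAMP
    private final int nanosOfMilli;  // 0..999_999, the extra precision

    public NanoTimestamp(long epochMillis, int nanosOfMilli)
    {
        this.epochMillis = epochMillis;
        this.nanosOfMilli = nanosOfMilli;
    }

    public static NanoTimestamp fromInstant(Instant instant)
    {
        // getNano() is nanos-of-second; keep only the part below one millisecond
        return new NanoTimestamp(instant.toEpochMilli(), instant.getNano() % 1_000_000);
    }

    public Instant toInstant()
    {
        return Instant.ofEpochMilli(epochMillis).plusNanos(nanosOfMilli);
    }

    public static void main(String[] args)
    {
        Instant t = Instant.parse("2019-07-05T12:34:56.123456789Z");
        System.out.println(fromInstant(t).toInstant()); // 2019-07-05T12:34:56.123456789Z
    }
}
```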

Impact:
The impact of adding the extra 4 bytes (int) will be the following:

  1. When users upgrade, some queries may start to fail with out-of-memory (OOM) errors, since we are making the timestamp representation larger.
  2. Functions that currently return long will need a different return type. This is something that we still need to figure out. An example is currentTimestamp (
    public static long currentTimestamp(ConnectorSession session)
    ). However, these functions do not appear to be called from anywhere.

Mitigation:
There are two mitigation strategies we can employ:

  1. Add a config parameter that controls whether we pack and unpack the 4 nanosecond bytes. If we do not pack the extra 4 bytes, the behavior stays exactly the same as today.
  2. Pack a single flag bit that indicates whether a timestamp carries the extra 4 nanosecond bytes. If the bit is set, we unpack an additional int from the block.
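Mitigation (2) could look roughly like the following sketch (all names hypothetical): one bit of the packed long is sacrificed as a marker, and readers that see the bit set pull one more int from the buffer.

```java
import java.nio.ByteBuffer;

// Sketch of mitigation (2): reserve one bit of the packed long as a flag.
// When the flag is clear, the encoding is the legacy 8-byte form; when set,
// a 4-byte nanoseconds field follows. This costs one bit of millis range.
public final class FlaggedEncoding
{
    private static final long HAS_NANOS_FLAG = 1L;  // lowest bit as the marker

    public static void write(ByteBuffer buffer, long millis, int nanosOfMilli)
    {
        if (nanosOfMilli == 0) {
            buffer.putLong(millis << 1);                   // legacy-compatible 8-byte form
        }
        else {
            buffer.putLong((millis << 1) | HAS_NANOS_FLAG);
            buffer.putInt(nanosOfMilli);                   // extra 4 bytes
        }
    }

    // Returns {millis, nanosOfMilli}
    public static long[] read(ByteBuffer buffer)
    {
        long packed = buffer.getLong();
        long millis = packed >> 1;  // arithmetic shift keeps the sign
        long nanos = (packed & HAS_NANOS_FLAG) != 0 ? buffer.getInt() : 0;
        return new long[] {millis, nanos};
    }

    public static void main(String[] args)
    {
        ByteBuffer buffer = ByteBuffer.allocate(12);
        write(buffer, 1_562_329_496_123L, 456_789);
        buffer.flip();
        long[] decoded = read(buffer);
        System.out.println(decoded[0] + "." + decoded[1]); // 1562329496123.456789
    }
}
```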

Effects on Precision when comparing two timestamps with different precisions:

The result of any operation on two timestamps of different precisions is a timestamp of the higher precision. The fractional digits missing from the lower-precision timestamp are assumed to be 0. This is the behavior of DB2, and it appears to be specified in the SQL spec (see below for details).

Justification:
As per SQL Spec (https://standards.iso.org/ittf/PubliclyAvailableStandards/c060394_ISO_IEC_TR_19075-2_2015.zip)
"Year-month intervals are comparable only with other year-month intervals. If two year-month intervals have different interval precision, they are, for the purpose of any operations between them, converted to the same precision by appending new datetime fields to either one of the ends of one interval, or to both ends. New datetime fields are assigned a value of 0 (zero)."

Similarly with "Day-time intervals are comparable only with other day-time intervals. If two day-time intervals have different interval precision, they are, for the purpose of any operations between them, converted to the same precision by appending new datetime field to either one of the ends of one interval, or to both ends. New datetime fields are assigned a value of 0 (zero)."

From DB2’s documentation, ( https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/sqlref/src/tpc/db2z_datetimecomparisions.html)
"When comparing timestamp values with different precision, the higher precision is used for the comparison and any missing digits for fractional seconds are assumed to be zero."
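The rule above can be sketched as rescaling both fractional parts to the higher precision before comparing; here nanoseconds is the common scale, and the missing digits become trailing zeros (illustrative code, not Presto's):

```java
// Sketch of the comparison rule: rescale both values to the higher precision
// (nanoseconds here) and treat missing fractional digits as zero, as DB2 does.
public final class PrecisionComparison
{
    // value: the fractional-seconds digits at the given precision, as an integer
    public static long rescaleToNanos(long value, int precision)
    {
        long factor = 1;
        for (int i = precision; i < 9; i++) {
            factor *= 10;  // each missing digit is assumed to be 0
        }
        return value * factor;
    }

    public static void main(String[] args)
    {
        long fromMillis = rescaleToNanos(123, 3);          // TIMESTAMP(3): .123
        long fromNanos = rescaleToNanos(123_000_500L, 9);  // TIMESTAMP(9): .123000500
        System.out.println(fromMillis < fromNanos);        // true: .123000000 < .123000500
    }
}
```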

Displaying Timestamps with Nanosecond granularity:

Today, I believe we always display the timestamp in the "uuuu-MM-dd HH:mm:ss.SSS" format. I believe this should continue, and we should provide functions that can output different formats (date_format()).
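As an illustration of precision-aware output (using java.time directly rather than Presto's formatter), the fractional part of the pattern can simply be widened to match the declared precision:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustration (not Presto code): java.time can render the fractional part at
// whatever precision the type declares, e.g. "uuuu-MM-dd HH:mm:ss.SSSSSSSSS"
// for TIMESTAMP(9) while keeping today's ".SSS" for TIMESTAMP(3).
public final class TimestampFormatting
{
    public static String format(LocalDateTime value, int precision)
    {
        StringBuilder pattern = new StringBuilder("uuuu-MM-dd HH:mm:ss");
        if (precision > 0) {
            pattern.append('.');
            for (int i = 0; i < precision; i++) {
                pattern.append('S');  // 'S' = fraction-of-second, one digit per letter
            }
        }
        return DateTimeFormatter.ofPattern(pattern.toString()).format(value);
    }

    public static void main(String[] args)
    {
        LocalDateTime t = LocalDateTime.of(2019, 7, 5, 12, 0, 0, 123_456_789);
        System.out.println(format(t, 3)); // 2019-07-05 12:00:00.123
        System.out.println(format(t, 9)); // 2019-07-05 12:00:00.123456789
    }
}
```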

What changes are being made?

(THIS IS NOT EXHAUSTIVE AS OF YET)

Grammar Changes:
SqlBase.g4 -> Add specification for precision in grammar (

TIME_WITH_TIME_ZONE
: 'TIME' WS 'WITH' WS 'TIME' WS 'ZONE'
;
TIMESTAMP_WITH_TIME_ZONE
: 'TIMESTAMP' WS 'WITH' WS 'TIME' WS 'ZONE'
;
)

Change the SPI to replace longs with int128 for time/timestamp types.

https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/DateTimeEncoding.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/SqlTime.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/SqlTimestamp.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/SqlTimeWithTimeZone.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/SqlTimestampWithTimeZone.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/TimeWithTimeZoneType.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/TimeType.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/TimeZoneKey.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/TimestampType.java
https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/type/TimestampWithTimeZoneType.java

Functions:

JDBC:

Parquet Changes:

ORC Changes:

RCFile Changes:

Further changes depending on acceptance on Design.

Endnotes
[1] SqlTime -

private final long millis;
private final Optional<TimeZoneKey> sessionTimeZoneKey;

[2] SqlTimestamp -

private final long millis;
private final Optional<TimeZoneKey> sessionTimeZoneKey;

[3] SqlTimeWithTimeZone -

private final long millisUtc;
private final TimeZoneKey timeZoneKey;

[4] 9,223,372,036,854,775,807 (max long) / 1,000,000,000 (ns => s) / 60 (s/min) / 60 (min/hr) / 24 (hr/day) / 365 (days/year) ≈ 292 years. 1970 + 292 = 2262, 1970 - 292 = 1678

[5] 2^(64-12) (bits remaining after the time zone key) / 1,000,000,000 (ns => s) / 60 (s/min) / 60 (min/hr) / 24 (hr/day) ≈ 52 days in total, i.e. about ±26 days around the epoch.

[6] DateTimeEncoding.java -

private static final int TIME_ZONE_MASK = 0xFFF;
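The range arithmetic in endnotes [4] and [5] can be sanity-checked with a few lines (plain Java, nothing Presto-specific): a full 64-bit long at nanosecond resolution covers about ±292 years around 1970, while losing 12 bits to the time zone key leaves only about ±26 days.

```java
// Quick check of the endnote arithmetic: representable range at nanosecond
// resolution for a full signed 64-bit long versus a long that gives up
// 12 bits to the time zone key.
public final class RangeCheck
{
    public static void main(String[] args)
    {
        double nanosPerDay = 1_000_000_000d * 60 * 60 * 24;
        double nanosPerYear = nanosPerDay * 365;

        // Full long: Long.MAX_VALUE nanoseconds on each side of the epoch
        System.out.println(Long.MAX_VALUE / nanosPerYear);  // ~292 years per side

        // 52 usable bits, signed: 2^51 nanoseconds on each side of the epoch
        System.out.println(Math.pow(2, 51) / nanosPerDay);  // ~26 days per side
    }
}
```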

@hocanint-amzn

Sorry all, I created this issue with a very old account. I will be interacting with it through this account, including any PRs.

@aweisberg
Contributor

aweisberg commented Jul 8, 2019

I talked about this a bit with @oerling. It looks like the syntax for this is to have "TYPE ( p )" where p is the fractional seconds that is supported. This is how it's done in the ANSI SQL standard.

There are two parts to this. One is reusable no matter what we go with internally in terms of 64 bits or 96 bits: adding the syntax and storing the additional type information in the metadata store. Once you can specify precision, all the things that rely on precision, like formatting and parsing, or that supply metadata for columns, need to be updated to pass that information on.

The second part IMO which is maybe optional is adding support for more than 64-bits of precision. And my question is do we have to do that now? Is it worth just changing how we interpret the existing 64-bits since we need to do all that work anyways? What if we never come across a use case that needs that much precision outside that time range?

@wenleix
Contributor

wenleix commented Jul 9, 2019

@MertHoc : Thanks for the contribution. One unrelated tip: when you want to refer to a specific line in a file, you might want to get a permanent link to it: https://help.github.com/en/articles/getting-permanent-links-to-files . Otherwise, the link will point to different lines once the file changes :)

@MertHoc
Author

MertHoc commented Jul 30, 2019

@wenleix Thanks for the tip. I have updated my links to permalinks
@aweisberg I have updated the summary with the doc that I shared with you.
If we can please review, we would love to get started on implementation. Thanks!

@rschlussel
Contributor

@MertHoc thanks for the great write up!

Functions that originally returned long will need to be changed to something else. This is something that we still need to figure out. Example functions are currentTimestamp. However, these functions do not seem to be called from anywhere.

Those are documented user-facing functions. They return a long because that's the current representation for the TIMESTAMP_WITH_TIMEZONE type (they are annotated as returning that type). If the representation of timestamp_with_timezone changes, then the return type of those functions should be changed so that a user still gets back a timestamp with time zone ( TimestampWithTimeZoneType should still parse the result correctly).

@hocanint-amzn

@rschlussel Thanks for the information. That makes a lot of sense.
Everyone else: there were concerns about the potential performance impact of this change, since we are increasing the number of bytes needed to store timestamps in a block, and about breaking existing users, since timestamps will have higher memory requirements. How can we move forward?

@aweisberg
Contributor

You have made the case and given interested parties a chance to comment. I know @arhimondr @rschlussel, @mbasmanova and @oerling have considered this and think it's a reasonable thing to do. We'll have to measure the result, but I think it's going to be good enough.

@oerling

oerling commented Aug 13, 2019 via email

@mbasmanova
Contributor

@MertHoc Thanks for detailed design proposal. I have some questions.

  • Looks like you are suggesting to parameterize the timestamp type with precision (similar to decimal). What would be the implications, if any, for the Hive Metastore? How would you store this new type there?

  • You may want to take into account on-going changes to ORC readers and repartitioning - Add TimestampSelectiveStreamReader #13213 and Optimize PartitionedOutputOperator #13183

  • Currently a block for timestamp is just a block of longs. Are you envisioning a new block that stores data in two arrays: long[] seconds and int[] nanoSeconds?
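The two-array layout mentioned in the last bullet could look like this minimal sketch (purely illustrative; the class and whether the split is seconds/nanos or millis/nanos are both open questions):

```java
// Hypothetical sketch of a two-array block: parallel arrays, one entry per
// position, instead of today's single long[] block for timestamps.
public final class NanoTimestampBlock
{
    private final long[] millis;  // or seconds, depending on the chosen split
    private final int[] nanos;    // the sub-millisecond (or sub-second) part

    public NanoTimestampBlock(long[] millis, int[] nanos)
    {
        if (millis.length != nanos.length) {
            throw new IllegalArgumentException("parallel arrays must have equal length");
        }
        this.millis = millis;
        this.nanos = nanos;
    }

    public int getPositionCount()
    {
        return millis.length;
    }

    public long getMillis(int position)
    {
        return millis[position];
    }

    public int getNanos(int position)
    {
        return nanos[position];
    }
}
```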

@mbasmanova
Contributor

CC: @yingsu00 @tdcmeehan @bhhari @sayhar

@sdruzkin
Collaborator

We are also interested in the nanosecond precision for timestamps to fully support timestamps in the ORC file format.
