
Core: Add new nanosecond supporting field mapper #32601

Closed

spinscale opened this issue Aug 3, 2018 · 4 comments
Labels
blocker :Core/Infra/Core Core issues without another label v7.0.0-beta1

Comments

@spinscale
Contributor

spinscale commented Aug 3, 2018

This new field mapper should support nanosecond timestamps. There are two ideas for adding support. The first is to come up with a new data structure that supports any date at nanosecond resolution - which means you need a different data structure than the long value we currently use for dates. This also implies that indexing and querying will be more expensive.

The other alternative would be to use a long and store the nanoseconds since the epoch. This limits our dates to the range 1677 to 2262, meaning we cannot store the birthdays of many people in Wikipedia. However, when you need nanosecond resolution it is usually for log files, not birth dates, and those log files usually fit into the above-mentioned date range.
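As a sanity check on the claimed range, the endpoints of a signed 64-bit count of nanoseconds since the epoch can be computed directly with `java.time` (a standalone sketch, not Elasticsearch code):

```java
import java.time.Instant;

public class EpochNanosRange {
    // The representable range when epoch nanoseconds are stored in a signed long.
    public static Instant minDate() {
        return Instant.EPOCH.plusNanos(Long.MIN_VALUE); // 1677-09-21T00:12:43.145224192Z
    }

    public static Instant maxDate() {
        return Instant.EPOCH.plusNanos(Long.MAX_VALUE); // 2262-04-11T23:47:16.854775807Z
    }

    public static void main(String[] args) {
        System.out.println(minDate() + " .. " + maxDate());
    }
}
```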

This issue suggests implementing a timestamp field mapper (names are just suggestions here) that stores dates at nanosecond resolution as a long.

This mapper needs to reject, at index time, any date that is out of the above range (which also means queries on such dates can short-circuit).
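A hypothetical index-time conversion (illustrative names, not the actual mapper API) could rely on exact arithmetic so that any date outside the representable range is rejected rather than silently overflowing:

```java
import java.time.Instant;

public class EpochNanosCheck {
    /**
     * Converts an Instant to nanoseconds since the epoch, rejecting dates
     * outside the signed-long range (roughly 1677 to 2262).
     */
    public static long toEpochNanos(Instant date) {
        try {
            long seconds = date.getEpochSecond();
            int nanos = date.getNano();
            if (seconds < 0 && nanos > 0) {
                // Shift by one second to avoid intermediate overflow near Long.MIN_VALUE.
                return Math.addExact(
                    Math.multiplyExact(seconds + 1, 1_000_000_000L), nanos - 1_000_000_000);
            }
            return Math.addExact(Math.multiplyExact(seconds, 1_000_000_000L), nanos);
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException("date out of nanosecond range: " + date, e);
        }
    }
}
```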

Backwards compatibility

The most important part is to be able to search across shards where one field is a long in milliseconds and another field is a long in nanoseconds. Adrien came up with the idea of extending org.elasticsearch.common.lucene.Lucene.readSortValue(StreamInput in) and adding a special type that marks a sort value as a nanosecond timestamp; this way merging of results becomes possible by adapting the values before the merge.

Something to keep in mind here: when mixing indices that have dates in nanos and dates in millis, and we convert to nanos, we cannot handle dates outside the nanosecond range. So we have to return an error when such a query comes in, before doing a flawed conversion.
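The millis-to-nanos widening at merge time could look roughly like this (a sketch under the assumption that sort values arrive as epoch-millis longs; not the actual readSortValue code):

```java
public class SortValueWidening {
    // Millisecond bounds that still fit into an epoch-nanos long (~1677 to ~2262).
    static final long MAX_MILLIS = Long.MAX_VALUE / 1_000_000L;
    static final long MIN_MILLIS = Long.MIN_VALUE / 1_000_000L;

    /** Widens an epoch-millis sort value to epoch-nanos, erroring out instead of overflowing. */
    public static long millisToNanos(long epochMillis) {
        if (epochMillis > MAX_MILLIS || epochMillis < MIN_MILLIS) {
            throw new IllegalArgumentException(
                "date [" + epochMillis + "ms] lies outside the nanosecond range");
        }
        return epochMillis * 1_000_000L;
    }
}
```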

Note: if the long were treated as unsigned we could shift the range (at the cost of yet more conversions when mixing with millis).

Aggregations

Nanosecond-resolution buckets would result in a huge number of buckets, so I consider this a second step; it should not block adding the field mapper as first, preliminary support.

Relates #27330

@spinscale spinscale added :Core/Infra/Core Core issues without another label v7.0.0 labels Aug 3, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@jasontedor
Member

Relates #10005

@uschindler
Contributor

uschindler commented Aug 3, 2018

Another alternative is to use a double with "days since epoch" or similar. Dates around the current time/epoch have the highest precision, but datetimes far away from today lose precision. We use this approach for scientific data, as it is obvious that exact date times only make sense around now, not for times far away.
This approach allows easy sorting, and the conversion to java.time is easy:

  import java.time.Instant;
  import java.time.temporal.ChronoField;
  import java.time.temporal.ChronoUnit;
  import java.time.temporal.TemporalAccessor;

  /** Nanoseconds per day, as a double so the division below is not integer division. */
  private static final double NANOS_PER_DAY = 86_400L * 1_000_000_000L;

  /** Converts a TemporalAccessor to a double (days since epoch). Includes time, if available. */
  public static double temporalToDouble(TemporalAccessor accessor) {
    double r = accessor.getLong(ChronoField.EPOCH_DAY);
    if (accessor.isSupported(ChronoField.NANO_OF_DAY)) {
      r += accessor.getLong(ChronoField.NANO_OF_DAY) / NANOS_PER_DAY;
    }
    return r;
  }

  /** Converts a double with the days since epoch back to an Instant. */
  public static Instant doubleToInstant(double epochDouble) {
    final long epochDays = (long) epochDouble;
    return Instant.EPOCH.plus(epochDays, ChronoUnit.DAYS)
        .plusNanos(Math.round((epochDouble - epochDays) * NANOS_PER_DAY));
  }

Just ideas! (This code is untested; I just converted it from millis to nanos. Maybe there are some sign problems, but I think it also works for dates before the epoch.)

@uschindler
Contributor

As far as I remember, PostgreSQL internally uses the same data type for SQL timestamps.

jimczi added a commit to jimczi/elasticsearch that referenced this issue Jan 31, 2019
This change adds an option to the `FieldSortBuilder` that allows transforming the type
of a numeric field into another. Possible values for this option are `long`, which transforms
the source field into an integer, and `double`, which transforms the source field into a floating point.
This new option is useful for cross-index search when the sort field is mapped differently on some
indices. For instance, if a field is mapped as a floating point in one index and as an integer in another,
it is possible to align the type for both indices using the `numeric_type` option:

```
{
   "sort": {
    "field": "my_field",
    "numeric_type": "double" <1>
   }
}
```

<1> Ensure that values for this field are transformed to a floating point if needed.

Only `long` and `double` are supported at the moment but the goal is to also handle `date` and `date_nanos`
when elastic#32601 is merged.