Releases: cognitedata/cdp-spark-datasource

1.4.61

03 Feb 15:29

Enhancements

  • Add support for deleting by externalId for Assets, Events, and TimeSeries
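
    Assuming the option names documented in the project README (the
    cognite.spark.v1 format and the onconflict save-mode option), a delete by
    externalId might be sketched as follows; the credentials and source table
    here are hypothetical:

    ```scala
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().getOrCreate()

    // Hypothetical DataFrame holding one externalId per row to delete.
    val toDelete = spark.sql("SELECT externalId FROM assets_to_remove")

    toDelete.write
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "assets")        // "events" and "timeseries" work the same way
      .option("onconflict", "delete")  // interpret the write as a delete
      .save()
    ```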

Fixes

  • Handle empty strings for the Boolean, Double, Float, Byte, Short, Integer, and Long types in RawJsonConverter

1.4.18

04 Feb 13:44

Download the release from Maven Central.

Fixes

  • Stop reading from CDF immediately when a task completes or is cancelled.
    This lets Spark start processing other tasks more quickly, especially
    when tasks throw exceptions.

1.4.17

Fixes

  • Handle additional uncaught exceptions locally, instead of having them kill the executor.

1.4.16

30 Jan 14:30

Fixes

  • Handle some uncaught exceptions locally, instead of having them kill the executor.

1.4.15

16 Jan 22:50

Enhancements

  • The labels field is now available for assets on update and upsert operations.
  • The labels field is now available for the asset hierarchy builder on upsert operations.
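
    Assuming the same format and option names as elsewhere in these notes,
    attaching labels during an upsert might look like this sketch; the label
    externalId and the asset source are made up:

    ```scala
    import org.apache.spark.sql.functions.{array, lit}

    val assets = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "assets")
      .load()

    assets
      .withColumn("labels", array(lit("pump")))  // "pump" is a hypothetical label externalId
      .write
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "assets")
      .option("onconflict", "upsert")
      .save()
    ```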

1.4.14

13 Jan 15:28

Enhancements

  • Set max retry delay on requests to 30 seconds by default, configurable via
    maxRetryDelay option.
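
    If the option takes a value in seconds, as the 30-second default suggests,
    lowering the retry delay might look like this sketch:

    ```scala
    val events = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "events")
      .option("maxRetryDelay", "10")  // cap retry backoff at 10 seconds (assumed unit)
      .load()
    ```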

Fixes

  • Fix a potential deadlock in handling exceptions when reading and writing data from CDF.

1.4.13

Enhancements

  • relationships have been added as a new resource type. See relationships
    for more information.
  • The labels field is now available for assets on read and insert operations.
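
    Reading the new resource type presumably follows the same pattern as the
    other types; a sketch, with column names assumed from the CDF relationships
    resource:

    ```scala
    val relationships = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "relationships")
      .load()

    relationships.createOrReplaceTempView("relationships")
    spark.sql("SELECT externalId, sourceExternalId, targetExternalId FROM relationships").show()
    ```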

1.4.12

Enhancements

  • Spark 3 is now supported!
  • labels have been added as a new resource type. See Labels
    for more information.

1.4.11

Fixes

  • Fix a bug where certain operations would throw a MatchError instead of the intended exception type.

1.4.10

Enhancements

  • Improved error message when attempting to use the asset hierarchy builder to move an asset between different root assets.

1.4.9

Enhancements

  • Upgrade to Cognite Scala SDK 1.4.1
  • Throw a more helpful error message when attempting to use sequences that contain columns without an externalId.

1.4.8

Fixes

  • Attempting an update without specifying either id or externalId will now result in a CdfSparkException instead of an IllegalArgumentException.

1.4.7

Enhancements

  • The X-CDP-App and X-CDP-ClientTag headers can now be configured using the applicationName and clientTag options.
    See the Common Options section for more info.
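
    A sketch of setting both options on a read; the values are examples:

    ```scala
    val df = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "assets")
      .option("applicationName", "my-etl-job")   // sent as the X-CDP-App header
      .option("clientTag", "nightly-2021-02-03") // sent as the X-CDP-ClientTag header
      .load()
    ```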

Fixes

  • Nested rows/structs are now correctly encoded as plain JSON objects when writing to RAW tables.
    These were previously encoded according to the internal structure of org.apache.spark.sql.Row.

1.4.6

Fixes

  • Use the configured batch size also when writing with a save mode.

1.4.5

Fixes

  • All exceptions are now custom exceptions with a common base type.

1.4.4

Fixes

  • Excludes the netty-transport-native-epoll dependency, which isn't handled
    correctly by Spark's --packages support.

1.4.3

Fixes

  • This release still excluded too many dependencies. Please use 1.4.4 instead.

1.4.2

26 Jun 17:40

Enhancements

  • Clean up dependencies to avoid evictions.
    This resolves issues on Databricks where some evicted dependencies were loaded,
    which were incompatible with the versions of the dependencies that should have
    been used.

1.4.1

We excluded too many dependencies in this release. Please use 1.4.2 instead.

Enhancements

  • Clean up dependencies to avoid evictions.

1.4.0

Breaking changes

  • Metadata values are no longer silently truncated to 512 characters.

1.3.1

Enhancements

  • Deletes are now supported for datapoints. See README.md for examples.
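
    README.md is the authoritative reference; as a rough sketch, a delete is
    expressed as a write in the delete save mode against the datapoints type,
    with columns identifying the time series and the time range. The range
    column names below are assumptions, not confirmed here:

    ```scala
    import java.sql.Timestamp
    import spark.implicits._

    // Columns (externalId, inclusiveBegin, exclusiveEnd) are an assumption
    // about the delete schema; consult README.md for the authoritative names.
    val ranges = Seq(
      ("my-timeseries",
       Timestamp.valueOf("2020-01-01 00:00:00"),
       Timestamp.valueOf("2020-02-01 00:00:00"))
    ).toDF("externalId", "inclusiveBegin", "exclusiveEnd")

    ranges.write
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "datapoints")
      .option("onconflict", "delete")
      .save()
    ```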

Fixes

  • An incorrect version was used for one of the library dependencies.

1.3.0

Breaking changes

Although not breaking for most users, this release updates some core
dependencies to new major releases. In particular, it is therefore
not possible to load 1.3.x releases at the same time as 0.4.x releases.

Enhancements

  • Sequences are now supported, see README.md for examples using
    sequences and sequencerows.

  • Files now support upsert, delete, and several new fields like
    dataSetId have been added.

  • Files now support parallel retrieval.
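
    A sketch of reading sequences and the rows of one sequence; the
    sequence externalId is made up, and selecting the sequence via an
    externalId option is an assumption to be checked against README.md:

    ```scala
    // List the available sequences.
    val sequences = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "sequences")
      .load()

    // Read the rows of one sequence.
    val rows = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "sequencerows")
      .option("externalId", "my-sequence")  // assumed way to select the sequence
      .load()
    ```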

1.2.20

Enhancements

  • Improved error message when a column has an incorrect type

Fixes

  • Filter pushdown can now handle null values in cases like p in (NULL, 1, 2).
  • Asset hierarchy now handles duplicated root parentExternalId.
  • NULL fields in metadata are ignored for all resource types.
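
    For example, a predicate like the one from the pushdown fix can now be
    evaluated without error; dataSetId stands in for any filterable column:

    ```scala
    val events = spark.read
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "events")
      .load()

    // An IN-list containing NULL previously broke filter pushdown. The NULL
    // element matches no rows, so this behaves like dataSetId IN (1, 2).
    events.where("dataSetId in (NULL, 1, 2)").show()
    ```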

1.2.19

Enhancements

  • Improve data points read performance, concurrently reading different time
    ranges and streaming the results to Spark as the data is received.

1.2.18

31 Mar 15:43

Enhancements

  • GZip compression is enabled for all requests.

Fixes

  • "name" is now optional for upserts on assets when external id is
    specified and the asset already exists.

  • More efficient usage of threads.

1.2.17

17 Mar 14:37

Fixes

  • Reimplement draining the read queue on a separate thread pool.

1.2.14

17 Mar 07:59

Enhancements

  • dataSetId can now be set for asset hierarchies.

  • Metrics are now reported for deletes.

Fixes

  • Empty updates of assets, events, or time series no longer cause errors.

1.2.16

17 Mar 08:00

Breaking changes

  • Include the latest data point when reading aggregates. Please note that this is a breaking change
    and that updating to this version may change the result of reading aggregated data points.

Enhancements

  • Data points are now written in batches of 100,000 rather than 1,000.

  • The error messages thrown when one or more columns don't match will
    now say which columns have the wrong type.

  • Time series delete now supports the ignoreUnknownIds option.

  • Assets now include parentExternalId.
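
    A sketch of the ignoreUnknownIds option listed above on a time series
    delete; option names follow the earlier examples and the id source is
    hypothetical:

    ```scala
    val ids = spark.sql("SELECT id FROM timeseries_ids_to_delete")

    ids.write
      .format("cognite.spark.v1")
      .option("apiKey", sys.env("COGNITE_API_KEY"))
      .option("type", "timeseries")
      .option("onconflict", "delete")
      .option("ignoreUnknownIds", "true")  // don't fail if some ids no longer exist
      .save()
    ```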

Fixes

  • Schema for RAW tables will now correctly be inferred from the first 1,000 rows.

  • Release threads from the threadpool when they are no longer going to be used.