Add support for GENERATED ALWAYS AS IDENTITY in DeltaTableBuilder #1072
Comments
Is this still on the roadmap?
Any news on this issue status?
Any update on release date?
This is definitely still on the roadmap! However, at the moment all the focus is on completing Deletion Vectors, which is in high demand. We will only get to this item after that work is complete.
Since Delta Lake 3.1.0 (with deletion vectors) is out now, would you consider working on it for 3.2, @bart-samwel 😇
Thank you for the reminder! It is near the top of our list now. I can't make any hard guarantees, but I'm hopeful that we'll get to this pretty soon.
@bart-samwel
Just to make sure there's no confusion here: Delta Standalone is different from the Spark connector for Delta Lake. Standalone is a library that can be used to implement connectors for non-Spark systems, and it is not really getting new features anymore -- its design is not really suitable for supporting many of the new features easily. All of the new efforts are going into Delta Kernel, which is the new library for building connectors. It makes it a lot easier to keep up with new features, and we intend to keep it up to date. Identity columns is a feature where we have unfortunately dropped the ball even for support in the Spark connector. It's the exception though, not the rule!
Certainly not! Like I said, identity columns is an exception. Liquid clustering is actually released in Delta Lake 3.1, which came out last week! https://github.com/delta-io/delta/releases
Hi, at my company we don't use Spark SQL anywhere, so I'd like to use the DeltaTableBuilder API here. Has this been resolved? If not, when can we expect the update? Many thanks,
@SYOGESH045 The next release of Delta is going to be Delta 3.3. The identity column support seems to be in progress (#3044), so Delta 3.3 should have it. If I have to hazard a guess, Delta 3.3 should be released in 2-3 months.
The latest version of Databricks added support for identity columns in Delta tables.
It is now possible to specify GENERATED ALWAYS AS IDENTITY in a column definition.
It would be nice to be able to do the same with DeltaTableBuilder, for example:
from delta.tables import DeltaTable
from pyspark.sql.types import DateType

# Assumes an active SparkSession named `spark`.
(
    DeltaTable.create(spark)
    .tableName("default.people10m")
    # Proposed syntax: identity column via generatedAlwaysAs (not supported when this issue was filed)
    .addColumn("id", "BIGINT", generatedAlwaysAs="IDENTITY(START WITH 10 INCREMENT BY 10)")
    .addColumn("firstName", "STRING")
    .addColumn("middleName", "STRING")
    .addColumn("lastName", "STRING", comment="surname")
    .addColumn("gender", "STRING")
    .addColumn("birthDate", "TIMESTAMP")
    .addColumn("dateOfBirth", DateType(), generatedAlwaysAs="CAST(birthDate AS DATE)")
    .addColumn("ssn", "STRING")
    .addColumn("salary", "INT")
    .partitionedBy("gender")
    .execute()
)
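For comparison, the SQL route mentioned above (defining GENERATED ALWAYS AS IDENTITY in the column specification) could be sketched as follows. This is only an illustration, not part of the Delta API: the helper names `identity_column_ddl` and `create_table_ddl` are hypothetical, and it assumes a runtime whose SQL dialect accepts identity columns on BIGINT columns (Databricks, at the time this issue was filed). The snippet only builds the DDL string; the `spark.sql` call is left commented out because it needs an active SparkSession.

```python
def identity_column_ddl(name, start=1, step=1):
    """Build a BIGINT column definition using GENERATED ALWAYS AS IDENTITY.

    Hypothetical helper for illustration; not part of Delta Lake.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    return (
        f"{name} BIGINT GENERATED ALWAYS AS IDENTITY "
        f"(START WITH {start} INCREMENT BY {step})"
    )


def create_table_ddl(table_name, column_defs, partition_by=None):
    """Assemble a CREATE TABLE ... USING DELTA statement (hypothetical helper)."""
    cols = ",\n  ".join(column_defs)
    ddl = f"CREATE TABLE IF NOT EXISTS {table_name} (\n  {cols}\n) USING DELTA"
    if partition_by:
        ddl += f"\nPARTITIONED BY ({', '.join(partition_by)})"
    return ddl


ddl = create_table_ddl(
    "default.people10m",
    [
        identity_column_ddl("id", start=10, step=10),
        "firstName STRING",
        "lastName STRING COMMENT 'surname'",
        "gender STRING",
        "birthDate TIMESTAMP",
        "dateOfBirth DATE GENERATED ALWAYS AS (CAST(birthDate AS DATE))",
    ],
    partition_by=["gender"],
)
print(ddl)

# With an active SparkSession on a runtime that supports identity columns:
# spark.sql(ddl)
```

This keeps the identity definition in one place, but it still goes through Spark SQL, which is exactly what the DeltaTableBuilder request is trying to avoid.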