Support copy-on-write mode for Iceberg write #17272
Comments
Yes, I think we should support copy-on-write mode for UPDATE and DELETE for the case where we update or delete most of a file. Creating big delta files isn't helpful for anyone.
@findepi, there's a table setting for choosing between copy-on-write and merge-on-read:
For now these settings cannot be set during table creation :(
This feature is highly requested in my org,
One concern called out by @dain (on Slack) that I will persist here:
I don't think this should be a blocker to adding support, but it is something to keep in mind, e.g. when considering whether to make the MoR vs. CoW decision automatically or based on explicit user input, since the two modes differ somewhat in the functionality they support.
I had misunderstood: the table property only affects other engines. Trino still always writes using MoR.
Today, Iceberg writes in Trino only support merge-on-read mode. Copy-on-write mode is a frequent ask from users who want a better file layout without having to run compactions frequently.
Technically this could be achieved pretty easily. A CoW implementation is already available for Delta Lake:
https://github.com/trinodb/trino/blob/master/plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMergeSink.java
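To make the difference between the two modes concrete, here is a minimal toy sketch in Python. It is not Trino or Iceberg code: `DataFile`, `Table`, `delete_merge_on_read`, and `delete_copy_on_write` are hypothetical names, and data files are modeled as in-memory lists. It only illustrates the general idea that merge-on-read leaves data files untouched and records deleted positions separately, while copy-on-write rewrites the affected files without the deleted rows.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Toy stand-in for an immutable data file (hypothetical, not an Iceberg type)."""
    name: str
    rows: list


@dataclass
class Table:
    """Toy table: data files plus positional deletes keyed by data file name."""
    data_files: list = field(default_factory=list)
    delete_files: dict = field(default_factory=dict)

    def scan(self):
        """Return live rows, applying positional deletes at read time."""
        live = []
        for f in self.data_files:
            deleted = self.delete_files.get(f.name, set())
            live.extend(r for pos, r in enumerate(f.rows) if pos not in deleted)
        return live


def delete_merge_on_read(table, predicate):
    """MoR: keep data files as they are and record deleted row positions."""
    for f in table.data_files:
        positions = {pos for pos, r in enumerate(f.rows) if predicate(r)}
        if positions:
            table.delete_files.setdefault(f.name, set()).update(positions)


def delete_copy_on_write(table, predicate):
    """CoW: rewrite every affected data file without the deleted rows."""
    rewritten = []
    for f in table.data_files:
        # Fold any earlier positional deletes into the rewrite.
        already_deleted = table.delete_files.pop(f.name, set())
        kept = [
            r for pos, r in enumerate(f.rows)
            if pos not in already_deleted and not predicate(r)
        ]
        if len(kept) == len(f.rows):
            rewritten.append(f)  # nothing removed, reuse the file as-is
        elif kept:
            rewritten.append(DataFile(f.name + "-rewritten", kept))
        # If no rows remain, the file is dropped entirely.
    table.data_files = rewritten
```

In this toy model both functions produce the same `scan()` result. The difference is where the cost lands: MoR writes little but makes every later read merge the deletes, and CoW pays for the rewrite up front and leaves no delete files behind, which is the file-layout benefit the issue describes.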