Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support copy-on-write mode for Iceberg write #17272

Open
jackye1995 opened this issue Apr 27, 2023 · 6 comments
Open

Support copy-on-write mode for Iceberg write #17272

jackye1995 opened this issue Apr 27, 2023 · 6 comments
Labels
iceberg Iceberg connector

Comments

@jackye1995
Copy link
Member

Today Iceberg writes only support merge-on-read mode. Copy-on-write mode is a frequent ask for users that want better file layout without the need to run compactions frequently.

Technically this could be achieved pretty easily. The CoW implementation is already available for Delta:

https://github.com/trinodb/trino/blob/master/plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMergeSink.java

@jackye1995 jackye1995 added the iceberg Iceberg connector label Apr 27, 2023
@findepi
Copy link
Member

findepi commented May 4, 2023

Yes, I think we should support copy-on-write mode for UPDATE and DELETE for case when we update/delete most of the file. Creating big delta files isn't helpful for anyone.
If we support this strategy, we should automatically choose between copy-on-write and merge-on-read within each IcebergMergeSink independently.

@rdblue
Copy link
Contributor

rdblue commented May 4, 2023

@findepi, there's a table setting for choosing between copy-on-write and merge-on-read: write.(operation).mode where operation can be merge, update, or delete and mode can be copy-on-write or merge-on-read.

@C-h-e-r-r-y
Copy link

@rdblue

For now these settings can not be set during table creation :(

@MotiPoyastro
Copy link

This feature is highly requested in my org,
Trino 443 + Iceberg on S3 parquets.
frequent merge into queries becomes a problem on a large scale.
currently dealing with post optimize/compaction for many small files created on a daily basis.
but this is not a desired flow for us.

@xkrogen
Copy link
Member

xkrogen commented Oct 28, 2024

One concern called out by @dain (on Slack) that I will persist here:

CoW has it’s own problems because it breaks the change feed, so you cannot do downstream delta-style transforms (e.g., incremental materialized views)

I don't think this should be a blocker to adding support -- but something to keep in mind e.g. when considering doing a MoR vs CoW decision automatically vs based on explicit user input, since there is some difference in functionality supported.

@hashhar
Copy link
Member

hashhar commented Nov 18, 2024

With #24031 it should now be possible to specify the default operation mode via the table property as @rdblue mentioned above.

Is it still desirable to provide an "integrated" solution for this in Trino? for most use-cases I think the current default of MoR on Trino is good.

I had misunderstood, the table property only affects other engines. Trino still always writes using MoR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
iceberg Iceberg connector
Development

No branches or pull requests

7 participants