Add add_files procedure in Iceberg connector #11744

Closed
erikerlandson opened this issue Mar 31, 2022 · 7 comments · Fixed by #22751

Comments

@erikerlandson

like so:
https://github.com/RussellSpitzer/iceberg/blob/a4279fc5842046043f2afdc90f2428243958574d/spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAddFilesProcedure.java#L80

@osscm
Contributor

osscm commented Jun 1, 2022

+1

@findepi changed the title from "support iceberg style add_files" to "Add add_files procedure in Iceberg connector" on Jun 2, 2022
@erikerlandson
Author

One of the use cases I had in mind for this is that the files I want to add are sitting in some S3 bucket. So in this use case, there needs to be a way to supply add_files with S3 credentials as parameters.

@alexjo2144
Member

Similarly to the Hive connector's allow-register-partition-procedure, this should be disabled by default and opted into using a catalog property. The idea is that it should only be turned on if access control based on file system location is in place.

@blopezpi

Any updates on this? It would be great to have this procedure for importing a bunch of data directly, avoiding any INSERT command.

@anandsakhare

+1

@ebyhr
Member

ebyhr commented Jul 24, 2024

@martint Could you review the syntax of this procedure? The procedure name in Spark is add_files. The detailed information is documented at https://iceberg.apache.org/docs/latest/spark-procedures/#add_files

The arguments should follow Trino conventions (e.g. source_table should not be abused for locations), but the name looks good to me.
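
For context on the point about source_table: in the Spark procedure linked above, a path-based source is passed through the source_table argument as a backtick-quoted `format`.`path` pair, which is the "abuse for locations" mentioned here. A minimal sketch of how such a call is assembled, assuming a hypothetical helper name (spark_add_files_call) and a made-up bucket path; it only builds the statement text and does no escaping of its inputs:

```python
def spark_add_files_call(catalog: str, table: str, source_format: str, source_path: str) -> str:
    """Build the text of a Spark SQL CALL to Iceberg's add_files procedure.

    Hypothetical helper for illustration only; inputs are not escaped.
    """
    # Spark encodes a file location inside source_table as `format`.`path`
    source = f"`{source_format}`.`{source_path}`"
    return (
        f"CALL {catalog}.system.add_files("
        f"table => '{table}', "
        f"source_table => '{source}')"
    )


# Example with a made-up table name and bucket path
print(spark_add_files_call("spark_catalog", "db.tbl", "parquet", "s3://example-bucket/path"))
```

A Trino-style signature would presumably split the location and the file format into separate named arguments instead of packing both into one string, but the exact argument names are what is being asked for review here.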

@MichaelTiemannOSC

Awesome work. Can't wait to try it out in 461!
