Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FFI initial implementation #12920

Merged
merged 28 commits into from
Oct 30, 2024

Conversation

timsaucer
Copy link
Contributor

@timsaucer timsaucer commented Oct 14, 2024

Which issue does this PR close?

This is to address part of apache/datafusion-python#823 downstream but may have wider application than just python.

Rationale for this change

This PR allows for registering table providers via a stable FFI. With this change it enables breaking the requirement for python providers to include all of datafusion-python and re-export it. With this change we can allow providers with different underlying datafusion versions to interoperate.

What changes are included in this PR?

Adds support for TableProvider via FFI. In order to support this, it also includes ExecutionPlan, SessionConfig, PlanProperties, and TableType. As this gets used more, I expect we will want to expose other features but this gives an initial first implementation that solves an immediate need.

Are these changes tested?

Some unit tests are provided. Additionally I did the following test:

I created a separate crate with the contents of datafusion/ffi so that I can test it against different versions of DataFusion by modifying the dependencies in Cargo.toml. Then I used this crate to build a test implementation of datafusion-python against DataFusion 42.0.0. I adjusted the test crate and built a test implementation of delta-rs against DataFusion 41.0.0. Then I registered the delta table in python against the session context. I was able to query the table with push down filters via this FFI interface even though the underlying DataFusion versions were different.

Additionally I ran memory leak checks against the provided unit tests and against running in python.

Are there any user-facing changes?

This is not breaking, but a pure addition of a new datafusion-ffi library.

Remaining Issues

  • There is some inconsistency between the usage of ExportedXYZ and just using the raw FFI_XYZ. We should make the usage consistent across all struct types.
  • Add documentation to explain the reasoning behind creating the data the way we do in the private data and foreign structs.
  • Add documentation to explain more clearly the delineation between the ExportedXYZ and ForeignXYZ. It would probably be good to have a use case since which is "foreign" and which is "exported" can be complicated during some of the function calls.

@timsaucer timsaucer changed the title Feature/ffi initial implementation FFI initial implementation Oct 14, 2024
@timsaucer timsaucer marked this pull request as ready for review October 16, 2024 23:58
@timsaucer
Copy link
Contributor Author

In response to #12357 I believe this PR should go into the core DataFusion because it does allow for extension of core features and enabling of runtime loading of libraries. However the driving case is datafusion-python so I would be comfortable moving it either there or to a datafusion-contrib repository.

@alamb
Copy link
Contributor

alamb commented Oct 20, 2024

I see this PR and plan to review it, hopefully today but likely tomorrow (Mon)

Copy link
Contributor

@notfilippo notfilippo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work. I've left a few comments but the idea is great and I definitely support it. I've been working on some experimental FFI bindings for Go myself and I would love to swap them for this upstream implementation at some point.

I should also mention that I have ScalarUDFs working on my barebones FFI bindings that I would love to clean up and contribute upstream when this lands.

datafusion/ffi/src/record_batch_stream.rs Outdated Show resolved Hide resolved
datafusion/ffi/Cargo.toml Show resolved Hide resolved
datafusion/ffi/Cargo.toml Outdated Show resolved Hide resolved
datafusion/ffi/src/table_provider.rs Outdated Show resolved Hide resolved
datafusion/ffi/src/execution_plan.rs Show resolved Hide resolved
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really neat idea -- thank you @timsaucer

I am not super familiar with the FFI interfaces in general, but the idea makes sense to me. I don't review this entire PR for safety -- if you could reduce the API surface area and add comments, I will do so however.

My suggestions are to:

  1. Try and reduce the initial API surface as much as possible (e.g. do we need to expose plan properties?)
  2. Try and improve the comments (I left some suggestions)
  3. Mark the feature as "BETA" or something and hint that the interfaces may not be stable until we work through all the details
  4. File a ticke to track follow on work

Some obvious follow on work in my mind would be:

  1. Add an end to end test (perhaps in a similar vein to https://github.com/apache/datafusion/tree/main/datafusion/wasmtest) that shows thi working
  2. Add a section to the "library users guide" about using this API: https://datafusion.apache.org/library-user-guide/index.html

Comment on lines +22 to +23
This crate contains code to allow interoperability of Apache [DataFusion]
with functions from other languages using a stable interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may also be good to point out that the ffi interface allows different versions of datafusion to interact with each other over stable interfaces in the intro

Suggested change
This crate contains code to allow interoperability of Apache [DataFusion]
with functions from other languages using a stable interface.
This crate contains code to allow interoperability of Apache [DataFusion]
with functions from other languages and/or versions using a stable interface.

Comment on lines +27 to +29
We expect this crate may be used by both sides of the FFI. This allows users
to create modules that can interoperate with the necessity of using the same
version of DataFusion. The driving use case has been the `datafusion-python`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this text is backwards -- it is "without the necessity". Maybe we could rephrase like

Suggested change
We expect this crate may be used by both sides of the FFI. This allows users
to create modules that can interoperate with the necessity of using the same
version of DataFusion. The driving use case has been the `datafusion-python`
We expect this crate may be used by both sides of the FFI. This allows users
to create modules that can interoperate using different
versions of DataFusion. The driving use case has been the `datafusion-python`

2. Users may want to create a modular interface that allows runtime loading of
libraries.

## Struct Layout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we move this discussion of code layout / struct naming into the rust code so it stays closer to the code and has less chance to get out of date

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll probably move this over to TableProvider which is the most complex and have the other structs refer to it for more information.

#[derive(Debug, StableAbi)]
#[allow(missing_docs)]
#[allow(non_camel_case_types)]
pub struct FFI_PlanProperties {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to expose PlanProperties? This seems pretty low level and a large API surface area. Unless it is all needed I suggest we don't expose it at first and only do so when needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I think we need to. The other option I could see would be to add it to the protobuf message definition and use that to transfer it back and forth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb I've added a separate PR #13136 for adding these to the protobuf message definitions. If you support that, then I'll be able to remove this exposure here and convert it over to serialized data. Actually, I think this is a much nicer approach since these are static values for a plan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try and review #13136 later today

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok now I think I understand the intricies more I think the approach in this PR is the better approach for the reasons I describe in #13136 (review)

provider: &Self,
session_config: &FFI_SessionConfig,
projections: RVec<usize>,
filters_serialized: RVec<u8>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would probably help to document what these types are (e.g. that this is a protobuf serialized bytes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed updated documentation where I describe each of the protobuf byte parameters.

`DataFusion` for both libraries **and** the same compiler. It will not work
in general.

It is worth noting that which library is the `local` and which is `foreign`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should name this "DataFusion C API" or something instead of using the rust FFI term 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally pushing for a C style API but based on some feedback on the discord decided to go with this crate that focuses on rust-rust interface with a very nice interface. I'll see what it would be like to try using C with it as a small test.

@timsaucer
Copy link
Contributor Author

Thanks for the feedback! I'll try to address the remaining concerns in the next few days.

@notfilippo
Copy link
Contributor

After you are done I'll gladly give it another look! 🚀

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @timsaucer -- I think this is a great first step towards the FFI interface . I do think it will be more valuable with additional documentation and examples

One cool thing to do would be to make a delta-rs FFI binding (and maybe other providers in https://github.com/datafusion-contrib/datafusion-table-providers) -- FYI @phillipleblanc

The key benefit here would be that that the providers be locked into the same DataFusion version as used by the consumer

@timsaucer timsaucer mentioned this pull request Oct 29, 2024
3 tasks
@timsaucer
Copy link
Contributor Author

Adding issue #13175 to track additional documentation. I'd like to merge this in and work on that (I've assigned it to myself) so we can unblock datafusion-python and delta-rs

@timsaucer timsaucer force-pushed the feature/ffi-initial-implementation branch from 486bf06 to bf626f4 Compare October 30, 2024 00:21
@timsaucer
Copy link
Contributor Author

Thank you for the reviews. I just did a quick rebase so the merge can go through. I'll get started on that end to end example in a different PR.

@timsaucer timsaucer merged commit 538e0d1 into apache:main Oct 30, 2024
26 checks passed
@timsaucer timsaucer deleted the feature/ffi-initial-implementation branch October 31, 2024 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants