Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Rust programming language #3894

Closed
wants to merge 10 commits into from

Conversation

josephDunne
Copy link

This pull request include an implementation of the Flatbuffers runtime for the Rust programming language and Rust code generation via the flatc binary.

Code is well documented and currently passes a full suite of tests. I do however still need to add a sample and do more benchmark testing. In the meantime I want to get this on the projects radar to discuss.

Thanks!

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@josephDunne
Copy link
Author

I signed it!

@googlebot
Copy link

CLAs look good, thanks!

@Lakedaemon
Copy link
Contributor

Regarding the refactoring effort going on for the flatbuffers library (to make it simpler and better for you code generator writer and for the flatbuffers library maintainer), it would be great in my opinion, if you could remove the dependency on the TD macro (that forces you to touch the parser file, as well as the files of other code generators) and make the rust code generator self-contained in idl_gen_rust.cpp

You would mostly have to define your own GetXType methods
You can look how it is done in the javascript code generator (no TD in there) or in the Python, php, go code generators

@josephDunne
Copy link
Author

josephDunne commented Jun 2, 2016

Sure I will do that. If its going to be self contained should I just implement it entirely in Rust...in a separate repo perhaps?
I would much rather use Rust macros for this kind of thing then hand writing cpp

@Lakedaemon
Copy link
Contributor

Le 02/06/2016 08:52, Joseph Dunne a écrit :

Sure I will do that.

If its going to be self contained should I just implement it entirely
in Rust...in a separate repo perhaps?

no, your C++ code is fine. @gwvo will tell you if he wants changes (or
not) to merge your code in the library.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#3894 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAGTB97QrMpYN7gKeU5tsuwpDwTqrSjMks5qHn2SgaJpZM4Ir42e.

TD(UTYPE, "", uint8_t, byte, byte, byte, uint8, u8) /* begin scalar/int */ \
TD(BOOL, "bool", uint8_t, boolean,byte, bool, bool, bool) \
TD(CHAR, "byte", int8_t, byte, int8, sbyte, int8, i8) \
TD(UCHAR, "ubyte", uint8_t, byte, byte, byte, u8, u8) \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You appear to also be changing the Python type?
Also: align columns.

@ghost
Copy link

ghost commented Jun 2, 2016

Wow! Looks like great work.

I know @rw was working on a Rust port as well, maybe he can help review / suggest improvements or functionality from his port.

@Lakedaemon : for now, the TD macro is how all generators work, so that's what this PR should follow as well, until we somehow change it for al generators.

@josephDunne
Copy link
Author

Thanks for reviewing. I will do the cleanup and refactor and then @rw can take a look

@rw
Copy link
Collaborator

rw commented Jun 2, 2016

Great to see this! Let me know when you are ready for review.

@rw
Copy link
Collaborator

rw commented Jun 2, 2016

@josephDunne My biggest piece of feedback on this version is that I need to see fuzz testing. The C++, Go, and Python versions all have fuzzers, so there are lots of examples. Let me know if that makes sense to you.

Another piece of feedback is that instead of using unimplemented! for the conversion from integer to enum, it should return an option type.

Otherwise, a quick read of the runtime and generated code raises no red flags at this time, nice work!

@josephDunne
Copy link
Author

josephDunne commented Jun 2, 2016

Makes sense. I will add it. Do you mind if its quickcheck tests (assuming quickcheck works on rust stable)? Or do you prefer I copy the approach already in use?

Also how do you feel about replacing the generated boiler plate with macros...something like this:

table!(Monster, [{hp, 8, 150},
                 {mana, 10, 100},
                 {test,, 12, Vec3}
                 ...
                ]);

These macros are then expanded at compile time. In general I dont like adding extra layers of indirection but it would make the CPP generator a lot easier to maintain as most of the work will be done in the rust library.

@ghost
Copy link

ghost commented Jun 2, 2016

^ big fan of macros here ;)

@Lakedaemon
Copy link
Contributor

@gwvo I'm not surprised :p

Actually, idl_gen_js.cpp doesn't use the TD macro. They use GenType method that is really easy to understand (I'm a big fan) :
static std::string GenType(const Type &type) {
switch (type.base_type) {
case BASE_TYPE_BOOL:
case BASE_TYPE_CHAR: return "Int8";
case BASE_TYPE_UTYPE:
case BASE_TYPE_UCHAR: return "Uint8";
case BASE_TYPE_SHORT: return "Int16";
case BASE_TYPE_USHORT: return "Uint16";
case BASE_TYPE_INT: return "Int32";
case BASE_TYPE_UINT: return "Uint32";
case BASE_TYPE_LONG: return "Int64";
case BASE_TYPE_ULONG: return "Uint64";
case BASE_TYPE_FLOAT: return "Float32";
case BASE_TYPE_DOUBLE: return "Float64";
case BASE_TYPE_STRING: return "String";
case BASE_TYPE_VECTOR: return GenType(type.VectorType());
case BASE_TYPE_STRUCT: return type.struct_def->name;
default: return "Table";
}
}

Also, the go/python/... folks could also avoid the TD macro by defining static const char ctypename[] inside their idl_gen_**.cpp file, making their GenBasic method easier to understand

IMO, idl.h should not contain language specific stuff. I'll fight the TD macro or die in the process. :)

@rw
Copy link
Collaborator

rw commented Jun 3, 2016

@josephDunne I'd like to see 1-to-1 equivalency, first.

After that, though, QC is great. (My only concern is that QC's default Arbitrary impls probably can't provide the test cases we care about. Generator functions will have to be created.)

@josephDunne
Copy link
Author

josephDunne commented Jun 3, 2016

@rw 👍

@josephDunne
Copy link
Author

I have done some initial refactoring and cleanup. Fuzzing tests are still outstanding. I may only get to this next weekend. I am however using this code as part of a project so its getting some field testing in the mean time :-)

some discussion points (@rw?)

  • Using macros has made the cpp code a lot smaller and cleaner. However its a bit tricker to implement the builders as macros. Will revisits this when I do the Fuzz testing.
  • Switching to Trait extensions for the builders works nicely. Users will need a strategy to deal with name conflicts if the generated builder objects have similar function names - using Rusts scoped imports should work.
  • I dont like the iterator implementation. It does not implement object reuse so is allocating more pointers on the stack then it really needs to. Also for scalar types it should be possible to return slices rather than full Iterators; A slice of objects that deference to a scalar type (so the little endian conversion can happen in the deref operation) something like this:
fn get_scalar_vector<u32>(buff: &[u8], offset: UOffset) -> &[T] where T: Deref<Target=u32> { .. }

for x in table.get_scalar_vector<u32>(..) {
   println!("This is a u32 number: {}", *x)
   println!("This is a pointer into the buffer transmuted as some type T: {p}", x);
}

Not sure..

  • Table and Struct objects wrap and own a Table Object. The alternative is they only borrow a table. The owned implementation is simpler but allocates more on the stack when an object is created from a buffer. Since a table is just one pointer and one usize on the stack I feel that this is acceptable. Either way I need to revisit this to make sure that we can properly reuse existing objects.
  • need to implement more traits and utility functions. Suggestions welcome.
  • Are Union types only allowed to be tables? Or cant a Union have a struct, scalar or table option?

@rw
Copy link
Collaborator

rw commented Jun 5, 2016

@josephDunne great progress!

Regarding fuzzing, it seems it would be most straightforward to skip QuickCheck for now and use the code from the Go port: https://github.com/google/flatbuffers/blob/master/tests/go_test.go#L283-L426 That uses a simple LCG RNG to drive some serialization cases. Do you think it would be straightforward to port to Rust?

Also, do you know if any Rust code makes heap allocations besides the Vec that backs the Builder?

@rw
Copy link
Collaborator

rw commented Apr 5, 2018

Hi all!

I'm glad there's still interest in a Rust port. I know it's been a disappointing ride over the last few years for those of you wanting to use FlatBuffers in Rust, and seeing it not come together.

About a month ago, I dug in full-time for a few weeks and made another attempt at writing a solid Rust port. As some of you may recall, I've tried this before. This time around, though, my Rust-fu is much better, and I'm doing this out of personal necessity. I want to use FlatBuffers in Rust to develop other software where speed is critical: storing data in a write-ahead transaction log in a NoSQL datastore. Using Rust with FlatBuffers is the optimal choice for that project.

Here are some goals of this unpublished WIP port:

  • Extensive test coverage: this is non-negotiable and includes the types I listed above. Unit, fuzz, integration, etc.
  • Feature parity with the C++ code generator: This includes things like the builder convenience functions and the verification API. I might wait to implement the reflection API and flexbuffers support.
  • No user-facing macros: While macros are nice to write as a language designer, I find them to be a problem when asking new users to get familiar with a codebase. (I am assuming that it is good for our users to understand how the generated code really works.) I like how easy it is in Go, for example, to learn how something operates by reading its code. In my experience, macros make this a lot harder.
  • To be as fast as the C++ port: this includes the ability to unsafely interpret FlatBuffers memory as Rust objects.
  • Idiomatic Rust.

Here is the progress so far:

  • Cargo-compatible package format.
  • Code generation (based on the C++ generator) is ~complete.
  • Generated code for the test schemas compiles and imports successfully.
  • Tests are being ported one at a time from the C++ test suite. This is in the early stages.
  • The runtime library is being built in tandem with the test suite (TDD).
  • 181 git commits so far.

I intend to publish the WIP branch of this in the upcoming weeks so that I can start getting feedback on the user experience of the generated code. Thanks for your patience!

@josephDunne
Copy link
Author

I have added you as admin on crates.io for the flatbuffers crate. Let me know if you get it

@aardappel
Copy link
Collaborator

@rw this is awesome news! can't wait :)

@andygrove
Copy link

@rw This is great news! I am wanting to use flatbuffers for the Apache Arrow rust library (we need it for IPC). I would be happy to be an early adopter / tester!

@binary132
Copy link

@rw please don't hesitate to post your WIP branch, even if it's not clean yet. Weeks are precious :)

@andygrove
Copy link

@rw Is there any update on this? I'm keen to use this with the Rust version of Apache Arrow so that we can implement the IPC mechanism. I'm happy to be an early adopter/tester.

@rw
Copy link
Collaborator

rw commented Jun 15, 2018

Below is a link to an experimental, currently-broken WIP branch. I'm putting the code up here only to show progress for those of you who are curious. (Also, to force myself to develop in the open.)

https://github.com/rw/flatbuffers/tree/2018-02--rust

The generated code is at https://github.com/rw/flatbuffers/blob/2018-02--rust/tests/rust_usage_test/src/monster_test_generated.rs

Please bear with me as I get this into shape.

@rw
Copy link
Collaborator

rw commented Jun 19, 2018

Generated code typechecks and is importable as a Rust crate; some wire format tests are starting to pass.

@andygrove
Copy link

@rw What is a good way to report issues / discuss this? I'm hitting compilation errors building the project. Do you have a gitter address?

@rw
Copy link
Collaborator

rw commented Jun 20, 2018

@andygrove I use the following command:

make all test && ((cd tests && ./RustTest.sh ) ; cd ..)

But, I can't support your usage right now since the core code is in flux. I'm just giving status updates here because it seems a lot of people are waiting for this to land.

@rw
Copy link
Collaborator

rw commented Jun 20, 2018

Got 10 byte layout tests passing, and tomorrow I'll be implementing the start/end functions for creating Tables.

@rw
Copy link
Collaborator

rw commented Jul 1, 2018

Byte layout tests (from Go and Python) all pass.

@ChristopherRabotin
Copy link

@rw is there a repo where you're making this commits? It seems like the branch associated to this pull request hasn't been updated since August 2017. I'd be keen to help in the development of this feature, if my help is needed. Thanks

@abreis
Copy link

abreis commented Jul 3, 2018

@ChristopherRabotin I believe it's this branch:
https://github.com/rw/flatbuffers/commits/2018-02--rust

@rw
Copy link
Collaborator

rw commented Jul 4, 2018

Fuzz test of scalar serialization inside vtables passes.

@rw
Copy link
Collaborator

rw commented Jul 7, 2018

Tests pass that verify generated Rust code can correctly interpret the example serialized files created by the C++, Go, Java, and Python test suites. (Except for vector data, since I'm still working on the ergonomics of vectors.)

@ry ry mentioned this pull request Jul 7, 2018
@rw
Copy link
Collaborator

rw commented Jul 12, 2018

Creation of example serialized data (the gold "Monster") in Rust partially passes: library code and generated code can both create correct tables with scalar values, strings, and table unions.

@ry
Copy link

ry commented Jul 12, 2018

@rw Thanks for your work! I'm closely tracking your progress and have already made attempts to use it. I wasn't able to get it working last week, but I will try again soon.
https://github.com/ry/deno/blob/ad4f335847daffcc5556008aebebaf2d1eecac67/gclient_config.py#L31-L36

@rw
Copy link
Collaborator

rw commented Jul 24, 2018

Non-user-visible status update: almost done with a code generator refactor. This is a way to eliminate long-tail bugs, and I can do it now that I'm over most of the conceptual hurdles and can create better abstractions.

@rw
Copy link
Collaborator

rw commented Aug 9, 2018

I solved a type system problem: we now lift Flatbuffer Offset traversals into the typesystem. This gets the ergonomics closer to the C++ port (which is a good thing).

Proof of concept: https://play.rust-lang.org/?gist=20b4ab8ec5b82c409d431f3658eadd89&version=stable&mode=debug&edition=2015


Edit: improved version: https://play.rust-lang.org/?gist=648af791b28a91cd4c705897c5499eed&version=stable&mode=debug&edition=2015

@rw
Copy link
Collaborator

rw commented Aug 13, 2018

This code is still WIP (I very much need to clean up a lot of things, and rebase) but this branch has all (70+) tests passing. If you're feeling adventurous, try it out: https://github.com/rw/flatbuffers/tree/2018-08-12--all-tests-passing

Update: As of commit e6e634d, this code is up-to-date with master.

@rw
Copy link
Collaborator

rw commented Aug 16, 2018

Final checklist before PR:

  • Better union ergonomics in generated code.
  • More code comments.
  • Usage docs.
  • Final generator and runtime code cleanup.
  • VTable deduplication.
  • Basic benchmarks.

Edit: we aren't going to do this for this version: "Included/nested generated code imports (the code is generated, but the mod imports don't work yet)."

@rw
Copy link
Collaborator

rw commented Aug 29, 2018

Pull Request submitted: #4898

@rw
Copy link
Collaborator

rw commented Sep 3, 2018

@josephDunne Thank you for your hard work on this in 2016. You helped us start the long journey to getting a Rust port contributed to FlatBuffers. We really appreciate your efforts, and hope you stay involved in the project!

Closing in favor of #4898.

@rw rw closed this Sep 3, 2018
@rw
Copy link
Collaborator

rw commented Sep 3, 2018

@josephDunne Would you please send me an email about us getting control of the Cargo project? I don't see the notification you mentioned from April.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.