Serialization based models #1735

IamTheCarl · 2023-03-28T04:10:24Z

This idea started in #1569 and has generally been talked about in #1722 . I do not yet suggest closing #1722 as I think there will still be a a lot more to discuss on this idea. This is just the first proof of concept.

To simplify the implementation of #71, the communications between Fornjot and its clients has been converted to structs that are serialized with serde to the postcard format. This is done for both sending arguments to the client and for the client to send error messages or the generated shape structure to Fornjot.

This vastly simplifies the ABI for Fornjot, only requiring functions in the ABI to pass or accept pointers and integers.
I plan to make more breaking changes to the ABI later but once I have the vestigial code removed from this branch, I think we should merge it.

…need as big of changes.

hannobraun

Thank you for the PR, @IamTheCarl!

While I didn't find anything specifically wrong here (I left a comment for a small nitpick though), I have to admit, my eyes were glazing over quite a bit while reviewing this. Most commits contains a large amount of changed lines, a lot of that within unsafe functions or blocks.

Normally I'm not very strict about this kind of thing, but since we're dealing with unsafe (and thus, potentially unsound code) here, I'd like to be a bit more confident before merging this.

I think we have three options:

Just YOLO it, see how it works out. Hope that someone will find any soundness issues later, if there are any. I don't want to do this.
I could power through, carefully reviewing the end result line-by-line. This is an option, but I'd prefer option 3.
You could package your changes into smaller, more focused commits, each of which are self-contained and easy to review.

To give you some specific examples of what I'm looking for:

Your first commit is basically perfect: One small, focused change, explained in the commit message. I know I agree with the commit message, so the only thing I have to keep in mind while reviewing this, is to make sure the changes match that message. Pretty easy!
Your second commit on the other hand, contains multiple not-closely-related changes, and I have to juggle a lot of state in my head:
- It removes the redundant code that was replaced with a quote!. I happen to remember why this code is redundant and that removing it is good, so no big deal in itself, but it's one more thing to keep in mind.
- It removes some files outright, and I'm not sure: Have those already been useless? Are they replaced by the additions in this commit? Is it even correct to remove them? Again, no big deal in itself, but it adds to the difficulty.
- It changes the way parameters are handled. Do I like this approach? Is it flexible enough? That extra trait shouldn't be necessary, let's start writing a comment. Or is it used somewhere else? Oh, it's removed in the next commit. Guess I didn't have to think about any of that.
- It derives more traits for various existing structs. Again, not a big deal (actually a really tiny one), but in a commit that's already hard to get a hold of, this is one more source of visual noise, and something that could have easily been done in a separate commit.
Your fourth commit removes quite a bit of code that could probably have been removed in a follow-up commit, making each of the resulting smaller commits easier to review.

I don't mean to pick on you or your code here. I'm just trying to explain why this is difficult and time-consuming to review. On the other hand, learning how to create small and self-contained commits is a valuable skill, and one that becomes second nature once you're used to it.

If you want to see an example of how I do it, maybe #1711 and #1714 are decent examples. I'm far from perfect, but you can see there, I try to put each change into a self-contained commit. I sometimes commit refactorings that don't make much sense by themselves, but that are easy to understand; and that make a follow-on commit smaller, and thus also easier to understand.

If you're not experienced enough with Git to know how to do this, I encourage you to look into the following commands:

Precisely control what goes into the next commit: git add -p, git stash -p, git stash --keep-index
Roll back the last few commits, without changing the actual files: git reset --mixed HEAD~1 (replace 1 with the number of commits you want to roll back; very useful when combined with the previous item, to split a commit into smaller, more self-contained ones)
Instead of creating a new commit, add changes to the previous one: git commit --amend
Manipulate Git history (drop, merge, or re-order commits): git rebase -i

All of these are tools that I use pretty much daily. If you don't feel confident in your abilities yet, I recommend trying it in a separate branch, as manipulating Git history can get hairy (speaking of which, git cherry-pick is pretty useful when creating a new branch to re-organize what you did in another). If you truly mess up, git reflog can usually help you restore what you accidentally destroyed.

If you need any help with that, or don't feel up for the task, please let me know! I'm happy to help. And if you don't have the time or wherewithal to deal with this, also let me know. I can look into cleaning the pull request up myself, or just power through with the review 😄

once I have the vestigial code removed from this branch, I think we should merge it.

As I said earlier, I'm happy to merge this with vestigial code still in place. Please consider, the longer this pull request stays open, the higher the chance that other pull requests will cause merge conflicts here, causing you more work to get the pull request mergeable again.

hannobraun · 2023-03-28T09:01:50Z

crates/fj-proc/src/expand.rs

+                #[no_mangle]
+                unsafe extern "C" fn fj_model_construct(
+                    payload_pointer: *mut *mut u8,
+                    arguments_pointer: *const u8,
+                    arguments_length: usize,
+                ) -> usize {
+                    use fj::abi::SelfSerializing;
+
+                    fj::abi::initialize_panic_handling();
+
+                    let arguments = std::slice::from_raw_parts(arguments_pointer, arguments_length);
+                    let arguments = fj::abi::ArgumentTable::deserialize(arguments);
+
+                    let model_result = match arguments {
+                        Ok(arguments) => {
+                            match std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
+                                #model_name(#(#arguments),*)
+                            })) {
+                                Ok(shape) => fj::abi::ModelResult::Ok(
+                                    fj::abi::Model {
+                                        metadata: fj::models::Metadata::new(env!("CARGO_PKG_NAME"), env!("CARGO_PKG_VERSION"))
+                                            .with_short_description(env!("CARGO_PKG_DESCRIPTION"))
+                                            .with_homepage(env!("CARGO_PKG_HOMEPAGE"))
+                                            .with_repository(env!("CARGO_PKG_REPOSITORY"))
+                                            .with_license(env!("CARGO_PKG_LICENSE")),
+                                        shape,
+                                    }
+                                ),
+                                Err(payload) => {
+                                    fj::abi::ModelResult::Panic(fj::abi::on_panic(payload))
+                                }
+                            }
+                        },
+                        Err(error) => {
+                            fj::abi::ModelResult::Error("Failed to deserialize parameters from host.".to_string())
+                        }
+                    };
+
+                    match model_result.serialize() {
+                        Ok(model_result) => {
+                            let length = model_result.len();
+                            let boxed = model_result.into_boxed_slice();
+
+                            *payload_pointer = Box::into_raw(boxed) as *mut u8;
+                            length
+                        },
+                        Err(_error) => {
+                            *payload_pointer = std::ptr::null_mut();
+                            0
+                        }
+                    }
+                }


This is not a blocker for merging, but I don't like how much code we're generating here. Would it be possible, to move most of that code into functions, and only generate calls to those functions?

Yes, that can be done. I can also move a lot of this code into safe functions as the only things here that are actually unsafe is the fact that this function is possibly called from C, we construct a slice from a raw pointer, and we convert a box into a raw pointer to return to Fornjot.

IamTheCarl · 2023-03-28T14:16:59Z

It sounds like you would have an easier time committing this now rather than waiting for me to polish this off, so I'll remove the WIP status and let you do that whenever you feel ready.

I'll try to use a commit pattern more to your liking. These particular commits were a bit larger because larger changes were needed to get the project back into a compiling state.

hannobraun · 2023-03-29T07:43:04Z

It sounds like you would have an easier time committing this now rather than waiting for me to polish this off, so I'll remove the WIP status and let you do that whenever you feel ready.

Well, the code itself doesn't need more polish, but the commits could certainly use some (as described above). There are also merge conflicts with the main branch.

I can take care of that, but I'll have to find the time. Currently, I'm still working on #1713.

I'll try to use a commit pattern more to your liking. These particular commits were a bit larger because larger changes were needed to get the project back into a compiling state.

Yes, sometimes large commits (and thus a difficult review) can't be avoided. All the more reason to keep things as clean as possible, to not further add to the difficulty.

hannobraun · 2023-03-30T08:17:15Z

I tried to rebase this pull request on main (with the goal of cleaning it up and getting it merged), but that revealed a problem. PanicInfo contains a Backtrace now, which doesn't implement Serialize/Deserialize. So this pull request won't work as-is.

Reverting back to a draft for the time being. Please let me know how you want to proceed, @IamTheCarl. I don't currently have the bandwidth to pick this up.

IamTheCarl · 2023-03-30T14:23:19Z

I'll check it out tonight. If I remember correctly, that backtrace was removed and replaced with some strings and integers for this very problem. The backtrace probably got brought back in the rebase.

This should be a good learning experience for me.

IamTheCarl · 2023-03-30T22:09:28Z

Turns out I had forgotten to bring the changes merged into main into this branch. This branch was started on before the backtrace feature was merged into main.

Attempted to rebase and failed, tried a merge and that worked.
I certainly need to experiment more with this stuff. I'm wondering if it would be helpful to you to try and squash some of these commits. I'm going to try and clean up the commit history a little but if I haven't by the time you see this and you're happy with the state of the code, go ahead and merge it in anyway.

hannobraun · 2023-03-31T10:38:48Z

Attempted to rebase and failed, tried a merge and that worked.

Something's still not right. GitHub is showing various conflicts.

I certainly need to experiment more with this stuff. I'm wondering if it would be helpful to you to try and squash some of these commits. I'm going to try and clean up the commit history a little but if I haven't by the time you see this and you're happy with the state of the code, go ahead and merge it in anyway.

Any history cleanup would be very welcome! I don't think I can merge this very soon. Pretty sure I won't get to it today, and I'm on vacation starting Tuesday. (Being on vacation could mean that I'll look into this pull request, if the mood strikes me, or it could mean I have no time because I'm out hiking 😄 )

So there's definitely no time pressure from my side.

hannobraun · 2023-05-15T10:28:49Z

Hey @IamTheCarl! Sorry for not taking proper care of this pull request. It has been hard to review (for the reasons already mentioned above), and thus difficult to find enough time for, but it also happened to be a bit ill-timed. First, I was away for two weeks on vacation, then I was dealing with some more pressing problems and decisions regarding this project.

I've published a write-up about these decisions (see A New Direction). As a result, this feature is unfortunately no longer in scope for the project, and the code updated here has been removed.

Thanks again for your work, @IamTheCarl! I remain convinced that this is the right concept, but as of now, it will fall on another project to implement it.

IamTheCarl added 5 commits March 23, 2023 22:12

Serde is no longer optional for the fj crate.

ee7eb18

Wrote out the client side. Now I have to adapt the host to it.

ad9b2c4

Changed argument passing back to string parsing so the host wouldn't …

bf8417a

…need as big of changes.

The host has been adapted to the new client.

49c4517

Removed broken examples for register_model macro.

e1ca80c

IamTheCarl requested a review from hannobraun as a code owner March 28, 2023 04:10

IamTheCarl marked this pull request as draft March 28, 2023 04:11

hannobraun reviewed Mar 28, 2023

View reviewed changes

IamTheCarl marked this pull request as ready for review March 28, 2023 14:12

Merge branch 'main' into serialization-based-models

04b3729

Merge branch 'main' into serialization-based-models

f0a7052

hannobraun closed this May 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Serialization based models #1735

Serialization based models #1735

IamTheCarl commented Mar 28, 2023

hannobraun left a comment

hannobraun Mar 28, 2023

IamTheCarl Mar 28, 2023

IamTheCarl commented Mar 28, 2023

hannobraun commented Mar 29, 2023

hannobraun commented Mar 30, 2023

IamTheCarl commented Mar 30, 2023

IamTheCarl commented Mar 30, 2023

hannobraun commented Mar 31, 2023

hannobraun commented May 15, 2023

Serialization based models #1735

Serialization based models #1735

Conversation

IamTheCarl commented Mar 28, 2023

hannobraun left a comment

Choose a reason for hiding this comment

hannobraun Mar 28, 2023

Choose a reason for hiding this comment

IamTheCarl Mar 28, 2023

Choose a reason for hiding this comment

IamTheCarl commented Mar 28, 2023

hannobraun commented Mar 29, 2023

hannobraun commented Mar 30, 2023

IamTheCarl commented Mar 30, 2023

IamTheCarl commented Mar 30, 2023

hannobraun commented Mar 31, 2023

hannobraun commented May 15, 2023