Add a doc comparing UniFFI with diplomat #1146

Merged
merged 7 commits into from
Jan 6, 2022
40 changes: 9 additions & 31 deletions README.md
@@ -35,6 +35,15 @@ A portmanteau word that also puns with "unify", to signify the joining of one co
uni - [Latin ūni-, from ūnus, one]
FFI - [Abbreviation, Foreign Function Interface]

## Alternative tools

Other tools we know of which try and solve a similarly shaped problem are:

* [Diplomat](https://github.com/rust-diplomat/diplomat/) - see our [writeup of
the different approach taken by that tool](docs/diplomat-and-macros.md)

(Please open a PR if you think other tools should be listed!)

## Contributing

If this tool sounds interesting to you, please help us develop it! You can:
@@ -47,34 +56,3 @@ If this tool sounds interesting to you, please help us develop it! You can:
## Code of Conduct

This project is governed by Mozilla's [Community Participation Guidelines](./CODE_OF_CONDUCT.md).

---

(Versions `v0.9.0` through `v0.11.0` include a deprecation notice that links to this README. Once those versions have
sufficiently aged out, this section can be removed from the top-level README.)

### Thread Safety

It is your responsibility to ensure the structs you expose via UniFFI are
all `Send+Sync`. This will be enforced by the Rust compiler, likely with an
inscrutable error from somewhere in UniFFI's generated Rust code.

Early versions of this crate automatically wrapped rust structs in a mutex,
thus implicitly making the interfaces thread-safe and safe to be called
over the FFI by any thread. However, in practice we found this to be a
mis-feature, so version 0.7 first introduced the ability for the component
author to opt-out of this implicit wrapping and take care of thread-safety
themselves by adding a `[Threadsafe]` attribute to the interface.

Version 0.9.0 took this further, and interfaces not marked as `[Threadsafe]`
started issuing a deprecation warning. If you are seeing these deprecation warnings,
you should upgrade your component as soon as possible. For an example of what kind of
effort is required to make your interfaces thread-safe, you might like to see
[this commit](https://github.com/mozilla/uniffi-rs/commit/454dfff6aa560dffad980a9258853108a44d5985)
where we made one of the examples thread-safe.

As of version 0.11.0, all interfaces will be required to be `Send+Sync`, and the
`[Threadsafe]` attribute will be deprecated and ignored.

See also [adr-0004](https://github.com/mozilla/uniffi-rs/blob/main/docs/adr/0004-only-threadsafe-interfaces.md)
which outlines the reasoning behind this decision.
275 changes: 275 additions & 0 deletions docs/diplomat-and-macros.md
@@ -0,0 +1,275 @@
# Comparing UniFFI with Diplomat

[Diplomat](https://github.com/rust-diplomat/diplomat/) and [UniFFI](https://github.com/mozilla/uniffi-rs/)
are both tools which expose a Rust-implemented API over an FFI.
At face value, these tools are solving the exact same problem, but their approach
is significantly different.

This document attempts to describe these different approaches and discuss the pros and cons of each.
It's not going to try and declare one better than the other, but instead just note how they differ.
If you are reading this hoping to find an answer to "which one should I use?", then that's easy -
each tool currently supports a unique set of foreign language bindings, so the tool you should
use is the one that supports the languages you care about!

Disclaimer: This document was written by one of the UniFFI developers, who has never used
diplomat in anger. Please feel free to open PRs if anything here misrepresents diplomat.

See also: This document was discussed in [this PR](https://github.com/mozilla/uniffi-rs/pull/1146),
which has some very interesting discussion - indeed, some of the content in this document has
been copy-pasted from that discussion, but there's further detail there which might be
of interest.

# The type systems

The key difference between these 2 tools is the "type system". While both are exposing Rust
code (which obviously comes with its own type system), the foreign bindings need to know
lots of details about all the types expressed by the tool.

For the sake of this document, we will use the term "type universe" to define the set of
all types known by each of the tools. Both of these tools build their own "type universe" then
use that to generate both Rust code and foreign bindings.
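
To make the term concrete, here is a minimal sketch of what a "type universe" could look like in memory. It is not the data model of either tool: the names `Type`, `Record`, `TypeUniverse` and `example_universe` are hypothetical and heavily simplified, and the record it builds mirrors the `MyFFIType` struct used in this document's examples.

```rust
use std::collections::BTreeMap;

/// A type as a binding generator might see it (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Int32,
    Boolean,
    String,
    Optional(Box<Type>),
    /// A reference, by name, to a record defined elsewhere in the universe.
    Record(String),
}

/// A struct-like type: its field names and types must be known on both
/// sides of the FFI.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub name: String,
    pub fields: Vec<(String, Type)>,
}

/// The set of all types known to the tool.
#[derive(Debug, Default)]
pub struct TypeUniverse {
    records: BTreeMap<String, Record>,
}

impl TypeUniverse {
    pub fn add_record(&mut self, record: Record) {
        self.records.insert(record.name.clone(), record);
    }

    pub fn record(&self, name: &str) -> Option<&Record> {
        self.records.get(name)
    }

    /// True if every `Type::Record` reference resolves to a known record.
    pub fn is_closed(&self) -> bool {
        self.records
            .values()
            .all(|r| r.fields.iter().all(|(_, ty)| self.resolves(ty)))
    }

    fn resolves(&self, ty: &Type) -> bool {
        match ty {
            Type::Optional(inner) => self.resolves(inner),
            Type::Record(name) => self.records.contains_key(name),
            _ => true,
        }
    }
}

/// Builds a universe containing only the `MyFFIType` record.
pub fn example_universe() -> TypeUniverse {
    let mut universe = TypeUniverse::default();
    universe.add_record(Record {
        name: "MyFFIType".to_string(),
        fields: vec![
            ("a".to_string(), Type::Int32),
            ("b".to_string(), Type::Boolean),
        ],
    });
    universe
}
```

Both the Rust scaffolding generator and each foreign-binding generator would consume a structure like this; the tools differ mainly in how it gets populated.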

## UniFFI's type universe

UniFFI's model is to parse an external ffi description from a `.udl` file which describes the
entire "type universe". This type universe is then used to generate both the Rust scaffolding
(on disk as a `.rs` file) and the foreign bindings.

**What's good about this** is that the entire type system is known when generating both the Rust code
and the foreign binding, and is known without parsing any Rust code. This is important because
things like field names and types in structs must be known on both sides of the FFI.

**What's bad about this** is that the external UDL is very ugly and largely duplicates the
API already implemented in Rust.

## Diplomat's type universe

Diplomat defines its "type universe" (i.e., the external FFI) using macros.

**What's good about this** is that an "ffi module" (and there may be many) defines the canonical API
and it is defined in terms of Rust types - the redundant UDL is removed.
The Rust scaffolding can also be generated by the macros, meaning there are no generated `.rs`
files involved. Types can be shared among any of the ffi modules defined in the project -
for example, [this diplomat ffi module](https://github.com/unicode-org/icu4x/blob/7d9f89fcd7df4567e17ddd8c46810b0db287436a/ffi/diplomat/src/pluralrules.rs#L50-L51)
uses types from a [different ffi module](https://github.com/unicode-org/icu4x/blob/7d9f89fcd7df4567e17ddd8c46810b0db287436a/ffi/diplomat/src/locale.rs#L19).

Restricting the definition of the FFI to a single module instead of allowing that definition
to appear in any Rust code in the crate also offers better control over the stability of the API,
because where the FFI is defined is constrained. This is
[an explicit design decision](https://github.com/rust-diplomat/diplomat/blob/main/docs/design_doc.md#requirements) of diplomat.

While the process for defining the type universe is different, the actual in-memory
representation of that type universe isn't radically different from UniFFI - for example,
here's the [definition of a Rust struct](https://github.com/rust-diplomat/diplomat/blob/main/core/src/ast/structs.rs),
and while it is built from a `syn` struct, the final representation is independent of `syn`
and its ast representation.

## UniFFI's experience with the macro approach

Ryan tried this same macro approach for UniFFI in [#416](https://github.com/mozilla/uniffi-rs/pull/416),
but we hit a limitation for UniFFI's use-cases: the context in which the
macro runs doesn't know about types defined outside of that macro, and those are exactly the types
we need to expose.

### Example of this limitation

Let's look at diplomat's simple example:

```rust
#[diplomat::bridge]
mod ffi {
    pub struct MyFFIType {
        pub a: i32,
        pub b: bool,
    }

    impl MyFFIType {
        pub fn create() -> MyFFIType { ... }
        ...
    }
}
```

This works fine, but starts to come unstuck if you want the types defined somewhere else. In this trivial example, something like:

```Rust
pub struct MyFFIType {
    pub a: i32,
    pub b: bool,
}

#[diplomat::bridge]
mod ffi {
    impl MyFFIType {
        pub fn create() -> MyFFIType { ... }
        ...
    }
}
```

fails - diplomat can't handle this scenario - in the same way and for the same reasons that Ryan's
[#416](https://github.com/mozilla/uniffi-rs/pull/416) can't - the contents of the struct aren't known.

From the Rust side of the world, this is probably solvable by sprinkling more macros around - eg, something like:

```Rust
#[uniffi::magic]
pub struct MyFFIType {
    pub a: i32,
    pub b: bool,
}
```

might be enough for the generation of the Rust scaffolding - in UniFFI's case, all we really need
is an implementation of `uniffi::RustBufferViaFfi` which is easy to derive, and UniFFI can
generate code which assumes that exists much like it does now.
However, the problems are in the foreign bindings, because those foreign bindings do not know
the names and types of the struct elements without re-parsing every bit of Rust code with those
annotations. As discussed below, re-parsing this code might be an option if we help UniFFI to
find it, but asking UniFFI to parse this and all dependent crates to auto-discover them
probably is not going to be viable.
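
As a rough illustration of why the Rust side is the easy half, the sketch below uses a plain `macro_rules!` macro as a stand-in for an attribute such as `#[uniffi::magic]`. The macro name `ffi_record` and the `FIELDS` constant are hypothetical and exist only for this example; neither tool generates them. The macro sees the tokens of the struct it wraps, so it can emit Rust-side metadata for that struct and nothing else.

```rust
/// Hypothetical stand-in for an attribute like `#[uniffi::magic]`.
/// It re-emits the struct and records the field names and types it was given.
macro_rules! ffi_record {
    (pub struct $name:ident { $(pub $field:ident : $ty:ty),* $(,)? }) => {
        pub struct $name {
            $(pub $field: $ty,)*
        }

        impl $name {
            /// Field names and types, as seen by the macro at compile time.
            pub const FIELDS: &'static [(&'static str, &'static str)] =
                &[$((stringify!($field), stringify!($ty))),*];
        }
    };
}

ffi_record! {
    pub struct MyFFIType {
        pub a: i32,
        pub b: bool,
    }
}
```

The metadata in `FIELDS` only exists inside the compiled Rust crate. A foreign-binding generator runs as a separate process, so it would have to either re-parse the annotated source or read the information back out of a build artifact to learn the same field names and types.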

### Why is this considered a limitation for UniFFI but not diplomat?

As mentioned above, diplomat considers the limitation described above as an intentional design
feature. By limiting where FFI types can be described, there's no risk of changes made "far away"
from the FFI to change the FFI. This was born of experience in tools like `cbindgen`.

For UniFFI, none of the use-cases needed by Mozilla share this design goal, primarily because the
FFI is the primary consumer of the crate. The Rust API exists purely to service the FFI. It's not
really possible to accidentally change the API, because every API change made will be in service
of exposing that change over the FFI. The test suites written in the foreign languages are
considered canonical.

ah, this explains a lot of the design decisions :)

Collaborator

Ah, I think this helps answer my question about API stability in UniFFI vs Diplomat. I still feel like the .udl serves a similar purpose in the UniFFI world, but because of the way the .udl kind of maps directly onto the underlying Rust code, I can see an argument that changes in a UniFFI component are more likely to change the FFI surface than in Diplomat.

They won't change the FFI surface by stealth, because you'll have to update the .udl file...but changing the .udl file is likely to be the simplest way to accommodate a change in the underlying Rust code. In Diplomat you might instead write some adapter code in the FFI module in order to avoid changing the FFI surface, because there's an obvious place to do it.

Member Author

yeah, I think we agree that UniFFI's use of the .udl file and Diplomat's decision to restrict where types are exposed do serve the same purpose in that regard. The broader point I'm trying to make though is that application-services would prefer to not have those guards in place - ie, it would probably prefer a world where the UDL file didn't exist and nor did any limitation about where these types could be defined.


## How the type universe is constructed for the macro approach.

In both diplomat and [#416](https://github.com/mozilla/uniffi-rs/pull/416), the approach taken
is very similar - it takes a path to the Rust source file/tree, and uses `syn` to locate the special modules (i.e., ones annotated with `#[diplomat::bridge]` in the case of diplomat).

While some details differ, this is just a matter of implementation - #416 isn't quite as aggressive
about consuming the entire crate to find multiple FFI modules (and even then, diplomat doesn't
actually *process* the entire crate, just modules tagged as a bridge), but could easily be
extended to do so.

But in both cases, for our problematic example above, this process never sees the layout of the
`MyFFIType` struct because it's not inside the processed module, so that layout can't be
communicated to the foreign bindings.
As noted above, this is considered a feature for diplomat, but a limitation for UniFFI.
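
The effect can be shown with a toy scanner. This is only a sketch under loud assumptions: it matches braces in plain text instead of using `syn`, and the helper names `bridge_module_body` and `bridge_sees_struct` are invented for this example. It is not how diplomat or #416 is implemented, but it shows the same blind spot: anything outside the tagged module is never looked at.

```rust
/// Returns the body of the first `{ ... }` block that follows `marker`,
/// using naive brace matching. A real tool would use `syn`; this is a toy.
pub fn bridge_module_body<'a>(source: &'a str, marker: &str) -> Option<&'a str> {
    let after_marker = source.find(marker)? + marker.len();
    let open = after_marker + source[after_marker..].find('{')?;
    let mut depth = 0usize;
    for (offset, ch) in source[open..].char_indices() {
        match ch {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    // Slice between the opening and the matching closing brace.
                    return Some(&source[open + 1..open + offset]);
                }
            }
            _ => {}
        }
    }
    None // unbalanced braces
}

/// True if the definition of `struct <name>` is visible inside the bridge module.
pub fn bridge_sees_struct(source: &str, name: &str) -> bool {
    let needle = format!("struct {}", name);
    bridge_module_body(source, "#[diplomat::bridge]")
        .map_or(false, |body| body.contains(&needle))
}

/// The struct is defined inside the bridge module, so its layout is visible.
pub const STRUCT_INSIDE: &str = r#"
#[diplomat::bridge]
mod ffi {
    pub struct MyFFIType { pub a: i32, pub b: bool }
    impl MyFFIType {}
}
"#;

/// The struct is defined outside the bridge module, so only the `impl` is seen.
pub const STRUCT_OUTSIDE: &str = r#"
pub struct MyFFIType { pub a: i32, pub b: bool }

#[diplomat::bridge]
mod ffi {
    impl MyFFIType {}
}
"#;
```

With these inputs, `bridge_sees_struct(STRUCT_INSIDE, "MyFFIType")` is true and `bridge_sees_struct(STRUCT_OUTSIDE, "MyFFIType")` is false, even though both sources are valid Rust describing the same struct.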
Collaborator

How does this work in Diplomat, does the FFI allow you to pass it around as a pointer but not construct one for yourself on the foreign-language side?

Member Author

No - diplomat doesn't have that problem because the struct definition must appear inside the module, so the foreign bindings, which parse that module, do know the struct elements. This is the same as we discovered in #416 - that forcing all type definitions into the single ffi module would technically work, but the impact that would have on how our code is organized made it less appealing than the status quo.

I added a few words here to try and make that clearer.


This is the problem which caused us to decide to stop working on
[#416](https://github.com/mozilla/uniffi-rs/pull/416) - the current world where the type universe
is described externally doesn't have this limitation - only the UDL file needs to be parsed when
generating the foreign bindings. The application-services team has
concluded that none of our non-trivial use-cases for UniFFI could be described using macros,
so supporting both mechanisms is pain for no gain.

As noted in #416, `wasm-bindgen` has a similarly shaped problem, and solves it by having
the Rust macro arrange for the resulting library to have an extra data section with the
serialized "type universe" - foreign binding generation would then read this information from the
already built binary. This sounds more complex than the UniFFI team has appetite for at
the current time.

# Looking forward

## Adapting the diplomat/[#416](https://github.com/mozilla/uniffi-rs/pull/416) model to process the entire crate?

We noted that diplomat intentionally restricts where the ffi is generated,
whereas UniFFI considers that a limitation - but what if we can teach UniFFI to process
more of the Rust crate?

It might be reasonable for the foreign bindings to know the Rust "paths" to modules which should
be processed, and inside those modules find structs "marked up" as being used by the FFI.

In other words, borrowing the example above:

```Rust
#[uniffi::magic]
pub struct MyFFIType {
    pub a: i32,
    pub b: bool,
}
```

maybe can be made to work, so long as we are happy to help UniFFI discover where such annotations
may exist.

A complication here is that currently UniFFI allows types defined in external crates,
but that might still be workable - eg,
[diplomat has an issue open to support exactly this](https://github.com/rust-diplomat/diplomat/issues/34)

## Duplicating structs inside Rust

When reviewing the draft of this document, @rfk noted that we are already duplicating Rust structs
in UDL and in Rust. So instead of having:

```
// In a UDL file:
dictionary MyFFIType {
    i32 a;
    bool b;
};
// Then in Rust:
pub struct MyFFIType {
    pub a: i32,
    pub b: bool,
}
```

we could have:
```rust
// In the Rust implementation, in some other module.
pub struct MyFFIType {
    pub a: i32,
    pub b: bool,
}

// And to expose it over the FFI:
#[ffi::something]
mod ffi {

    #[ffi::magic_external_type_declaration]
    pub struct MyFFIType {
        pub a: i32,
        pub b: bool,
    }

    impl MyFFIType {
        pub fn create() -> MyFFIType { ... }
        ...
    }
}
```

So while we haven't exactly reduced the duplication, we have removed the UDL.
Collaborator

FWIW, I think the toy example is the worst-case for highlighting the duplication here because the two duplicate declarations are separated by just 7 lines of text. In a real-world crate I would expect the duplication to not feel quite so bad because the source struct and its redeclaration would be further apart.

(That doesn't help with some of the other contra points raised here though)

Member Author

@mhammond Jan 6, 2022

> I would expect the duplication to not feel quite so bad

This is a good point, but from my POV, that can almost make the duplication feel worse - eg, you get a rust compiler error, and it can be hard to work out whether the UDL needs to change or the rust duplicate of that UDL needs to change - eg, making something optional means adding a ? to the udl and a corresponding Option<> to the Rust - it can be difficult to work out which one you screwed up :)

We probably also haven't helped with documentation, because the natural location for
the documentation of `MyFFIType` is probably at the *actual* implementation.

Collaborator

You could imagine explicitly telling the macro about the path to the redeclared struct, like:

    #[uniffi::declare_imported_type(super::MyFFIType)]
    pub struct MyFFIType {
        pub a: i32,
        pub b: bool,
    }

Then if it wanted to, the code processing this declaration could go find the corresponding super::MyFFIType and pull out e.g. its docstring.

I think this would probably be more trouble than it's worth (we don't want to re-implement vast swathes of Rust's name lookup machinery, for example) but it's interesting to think about.

Contributor

@bendk Jan 4, 2022

This section is some serious food for thought.

For one, as @rfk points out, we're currently duplicating the struct/function definitions in the UDL. The way I see it: both projects have made a similar design decision. @Manishearth describes it as "we do not want changes far away to change the FFI API". I would maybe reword that to "the FFI should be fully defined in one place". For UniFFI, that place is the UDL. For diplomat, it's the ffi module. (Multiple ffi modules complicates this picture a bit, but doesn't fundamentally change things).

However, there is one difference: in some cases you only need code inside the ffi module. This has the potential to reduce duplication. For example, it would be very natural to move our Store definitions inside the ffi module. Also, since types defined inside the ffi module are visible to the rest of the Rust code, we could move types like Login, EncryptedLogin, CreditCard, Address, etc. into the ffi module. I think external data types would still need duplicate definitions, but maybe that's it.

One potential issue with this is that it couples the library with the FFI code. But this doesn't seem to be a problem for our components. As @MarkH points out, our Rust APIs exist purely to service the FFIs.

It really makes me wonder about switching from a UDL-based approach to a macro-based approach. At the start, maybe each consumer would simply refactor their current UDL file to a macro. This shouldn't be much work, maybe we could even automate it. After that, we have the ability to do small refactors that eliminate the unneeded duplication.

BTW, this also could solve some documentation issues since a) we can actually see docstrings when using syn and b) if there's only 1 place where a type is defined then it's clear where to put the docstrings.

Member Author

I added a brief note in a86e949 which doesn't capture all of this comment, but does briefly say why we should consider this more.


While it might not solve all our problems, it is worthy of serious consideration - fewer problems
is still a worthwhile goal, and needing a UDL file and parser seems like one worth removing.

## Try and share some definitions with diplomat

We note above that the type universe described by diplomat is somewhat "leaner" than that
described by UniFFI, but in general they are very similar. Thus, there might be a future where
merging or otherwise creating some interoperability between these type universes might make
sense.

It seems likely that this would start to add unwelcome constraints - eg, diplomat would not want
its ability to refactor type representations limited by what UniFFI needs.

However, what you could see happening in the future is UniFFI becoming a kind of higher-level
wrapper around Diplomat. You can imagine a Diplomat backend for UniFFI that converts a .udl file
into a bridge module and then uses the Diplomat toolchain to generate bindings from it,
keeping some of the additional affordances/conveniences UniFFI built for its specific use-cases.
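
As a purely speculative sketch of that idea, the code below renders a bridge module as Rust source text from a simplified record description, such as one that might be parsed from a `.udl` dictionary. No such backend exists: `Record`, `render_bridge_module` and `example_record` are hypothetical names, and real UDL-to-Rust type mapping would be far more involved than the pre-spelled type strings used here.

```rust
/// A record as it might be parsed from a `.udl` dictionary
/// (hypothetical, simplified).
pub struct Record {
    pub name: String,
    /// Pairs of (field name, Rust type spelled as text).
    pub fields: Vec<(String, String)>,
}

/// Renders Rust source text for a bridge module declaring the given records.
pub fn render_bridge_module(records: &[Record]) -> String {
    let mut out = String::from("#[diplomat::bridge]\nmod ffi {\n");
    for record in records {
        out.push_str(&format!("    pub struct {} {{\n", record.name));
        for (field, ty) in &record.fields {
            out.push_str(&format!("        pub {}: {},\n", field, ty));
        }
        out.push_str("    }\n");
    }
    out.push_str("}\n");
    out
}

/// The `MyFFIType` dictionary from this document, already mapped to Rust types.
pub fn example_record() -> Record {
    Record {
        name: "MyFFIType".to_string(),
        fields: vec![
            ("a".to_string(), "i32".to_string()),
            ("b".to_string(), "bool".to_string()),
        ],
    }
}
```

Calling `render_bridge_module(&[example_record()])` yields a module with the same shape as diplomat's simple example, minus the `impl` block, which a real backend would also have to generate.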

# Next steps for UniFFI

As much as some of the UniFFI team dislike the external UDL file, there's no clear path to
moving away from it. We could experiment with some of the options above and see if they are
both viable and worth the investment for the UniFFI use-cases. That sounds like a long-term
goal.

In the short term, the best we can probably do is to enumerate the perceived problems
with the UDL file and try to make it more ergonomic - for example, avoiding repetition of
`[Throws=SomeError]` would remove a lot of noise, and some strategy for generating
documentation might go a long way.

diplomat just treats documentation as "yet another backend", which works reasonably well, since the architecture of a diplomat backend is just "here's the type structure, you know what to expect on the FFI layer, do what you want".

Collaborator

By contrast, the only reason that we don't already have documentation as yet-another-backend in UniFFI, is that the off-the-shelf parser that we use for the IDL throws away comments by default :-(