WIP Allow passing object instances as args, fields and return values.
This is a significant re-working of how object instances are handled,
replacing our previous use of handlemaps with direct use of an `Arc<T>`
pointer. This both reduces overhead and allows us to much more easily
generate code for dealing with object references.

On the Rust side of the world, code that needs to deal with an
object reference will typically take an `Arc<T>`, in the spirit
of being as explicit as possible. The one exception is when passing
objects as arguments, where (like other types) you can annotate the
argument with `[ByRef]` and have UniFFI hand you an `&T`.

For an example of how this might be used in practice, the "todolist"
example has grown the ability to manage a global default list by
setting and retrieving a shared object reference.

Co-authored-by: Ryan Kelly <rfkelly@mozilla.com>
mhammond and rfk committed May 27, 2021
1 parent 52ef91a commit 5346681
Showing 49 changed files with 1,026 additions and 566 deletions.
4 changes: 2 additions & 2 deletions docs/manual/src/internals/lifting_and_lowering.md
@@ -66,7 +66,7 @@ Calling this function from foreign language code involves the following steps:
| `record<DOMString, T>` | `RustBuffer` struct pointing to serialized bytes |
| `enum` and `[Enum] interface` | `RustBuffer` struct pointing to serialized bytes |
| `dictionary` | `RustBuffer` struct pointing to serialized bytes |
| `interface` | `uint64_t` opaque integer handle |
| `interface` | `void*` opaque pointer to object on the heap |


## Serialization Format
@@ -88,7 +88,7 @@ The details of this format are internal only and may change between versions of
| `record<DOMString, T>` | Serialized `i32` item count followed by serialized items; each item is a serialized `string` followed by a serialized `T` |
| `enum` and `[Enum] interface` | Serialized `i32` indicating variant, numbered in declaration order starting from 1, followed by the serialized values of the variant's fields in declaration order |
| `dictionary` | The serialized value of each field, in declaration order |
| `interface` | *Cannot currently be serialized* |
| `interface` | Fixed-width 8-byte unsigned integer encoding a pointer to the object on the heap |
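
As a rough illustration of the `interface` row above, the sketch below writes the pointer held by an `Arc` into a byte buffer as a fixed-width 8-byte unsigned integer and reads it back. The helper names `write_pointer` and `read_pointer` and the big-endian byte order are assumptions made for this example; they are not taken from the UniFFI source.

```rust
use std::sync::Arc;

/// Hypothetical helper: append the pointer held by `obj` to `buf` as 8 bytes.
/// One strong reference is handed over to whoever later reads the buffer.
fn write_pointer<T>(obj: Arc<T>, buf: &mut Vec<u8>) {
    // Widen to 64 bits so the encoding has the same width on every target.
    let raw = Arc::into_raw(obj) as usize as u64;
    // Big-endian is an assumption of this sketch.
    buf.extend_from_slice(&raw.to_be_bytes());
}

/// Hypothetical helper: read 8 bytes back and rebuild the `Arc`.
///
/// # Safety
/// `bytes` must have been produced by `write_pointer::<T>` and must be
/// decoded exactly once, otherwise the reference count becomes wrong.
unsafe fn read_pointer<T>(bytes: &[u8]) -> Arc<T> {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[..8]);
    let ptr = u64::from_be_bytes(raw) as usize as *const T;
    unsafe { Arc::from_raw(ptr) }
}
```

Because the bytes carry an owned reference in this sketch, a buffer that is written but never read would leak the object.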

Note that length fields in this format are serialized as *signed* integers
despite the fact that they will always be non-negative. This is to help
123 changes: 83 additions & 40 deletions docs/manual/src/internals/object_references.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Managing Object References

UniFFI [interfaces](../udl/interfaces.md) represent instances of objects
that have methods and contain shared mutable state. One of Rust's core innovations
that have methods and contain state. One of Rust's core innovations
is its ability to provide compile-time guarantees about working with such instances,
including:

@@ -16,6 +16,8 @@ system. UniFFI itself tries to take a hands-off approach as much as possible and
depends on the Rust compiler itself to uphold safety guarantees, without assuming
that foreign-language callers will be "well behaved".

## Concurrency

UniFFI's hands-off approach means that all object instances exposed by UniFFI must be safe to
access concurrently. In Rust terminology, they must be `Send+Sync` and must be useable
without taking any `&mut` references.
@@ -27,18 +29,22 @@ of the component - as much as possible, UniFFI tries to stay out of your way, si
that the object implementation is `Send+Sync` and letting the Rust compiler ensure that
this is so.
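
To make the `Send+Sync` requirement concrete, here is a minimal sketch of an object that is only ever mutated through `&self`, together with a compile-time check of the thread-safety bounds. The `HitCounter` struct and the `assert_send_sync` helper are illustrative names, not part of UniFFI.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Illustrative object: every method takes `&self`, and the `Mutex`
/// provides the interior mutability.
pub struct HitCounter {
    hits: Mutex<u64>,
}

impl HitCounter {
    pub fn new() -> Self {
        HitCounter { hits: Mutex::new(0) }
    }

    pub fn record(&self) {
        *self.hits.lock().unwrap() += 1;
    }

    pub fn total(&self) -> u64 {
        *self.hits.lock().unwrap()
    }
}

/// Compiles only if `T` is `Send + Sync`.
fn assert_send_sync<T: Send + Sync>() {}

/// Call `record` once from each of `threads` threads and return the total.
pub fn record_from_threads(threads: usize) -> u64 {
    assert_send_sync::<HitCounter>();
    let counter = Arc::new(HitCounter::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || counter.record())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.total()
}
```

If `hits` were a plain `u64` mutated through `&mut self`, the object could not be shared between threads behind an `Arc` in this way.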

## Handle Maps
## Lifetimes

In order to allow for instances to be used as flexibly as possible from foreign-language code,
UniFFI wraps all object instances in an `Arc` and leverages their reference-count based lifetimes,
allowing UniFFI to largely stay out of lifetime management for these objects.

For additional typechecking safety, UniFFI indirects all object access through a
"handle map", a mapping from opaque integer handles to object instances. This indirection
imposes a small runtime cost but helps us guard against errors or oversights
in the generated bindings.
When constructing a new object, UniFFI is able to add the `Arc` automatically, because it
knows that the return type of the Rust constructor must be a new uniquely-owned struct of
the corresponding type.

For each interface declared in the UDL, the UniFFI-generated Rust scaffolding
will create a global handlemap that is responsible for owning all instances
of that interface, and handing out references to them when methods are called.
The handlemap requires that its contents be `Send+Sync`, helping enforce requirements
around thread-safety.
When you want to return object instances from functions or methods, or store object instances
as fields in records, the underlying Rust code will need to work with `Arc<T>` directly, to ensure
that the code behaves in the way that UniFFI expects.

When accepting instances as arguments, the underlying Rust code can choose to accept them as an `Arc<T>`
or as the underlying struct `T`, as there are different use-cases for each scenario.

For example, given an interface definition like this:

@@ -50,53 +56,90 @@ interface TodoList {
};
```

The Rust scaffolding would define a lazily-initialized global static like:
On the Rust side of the generated bindings, the instance constructor will create an instance of the
corresponding `TodoList` Rust struct, wrap it in an `Arc<>` and return the Arc's raw pointer to the
foreign language code:

```rust
lazy_static! {
    static ref UNIFFI_HANDLE_MAP_TODOLIST: ArcHandleMap<TodoList> = ArcHandleMap::new();
pub extern "C" fn todolist_12ba_TodoList_new(
    err: &mut uniffi::deps::ffi_support::ExternError,
) -> *const std::os::raw::c_void /* *const TodoList */ {
    uniffi::deps::ffi_support::call_with_output(err, || {
        let _new = TodoList::new();
        let _arc = std::sync::Arc::new(_new);
        <std::sync::Arc<TodoList> as uniffi::ViaFfi>::lower(_arc)
    })
}
```

On the Rust side of the generated bindings, the instance constructor will create an instance of the
corresponding `TodoList` Rust struct, insert it into the handlemap, and return the resulting integer
handle to the foreign language code:
The UniFFI runtime implements lowering for object instances using `Arc::into_raw`:

```rust
pub extern "C" fn todolist_TodoList_new(err: &mut ExternError) -> u64 {
    // Give ownership of the new instance to the handlemap.
    // We will only ever operate on borrowed references to it.
    UNIFFI_HANDLE_MAP_TODOLIST.insert_with_output(err, || TodoList::new())
unsafe impl<T: Sync + Send> ViaFfi for std::sync::Arc<T> {
    type FfiType = *const std::os::raw::c_void;
    fn lower(self) -> Self::FfiType {
        std::sync::Arc::into_raw(self) as Self::FfiType
    }
}
```

When invoking a method on the instance, the foreign-language code passes the integer handle back
to the Rust code, which borrows a reference to the instance from the handlemap for the duration
of the method call:
which does the "arc to pointer" dance for us. Note that this has "leaked" the
`Arc<>` reference out of Rust's ownership system and given it to the foreign-language code.
The foreign-language code must pass that pointer back into Rust in order to free it,
or our instance will leak.
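
The ownership hand-off described above can be sketched with the standard library alone. The `Tracked` type and the `lower`, `free` and `round_trip` functions are hypothetical stand-ins for the generated scaffolding; the sketch only shows that an object stays alive while nothing but a raw pointer refers to it, and is dropped once that pointer is handed back.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Illustrative object that records when it is dropped.
pub struct Tracked {
    dropped: Arc<AtomicBool>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Hand an instance to "foreign" code as a raw pointer.
pub fn lower(obj: Arc<Tracked>) -> *const Tracked {
    Arc::into_raw(obj)
}

/// Take the pointer back and release the reference it represents.
///
/// # Safety
/// `ptr` must come from `lower` and must not be used again afterwards.
pub unsafe fn free(ptr: *const Tracked) {
    drop(unsafe { Arc::from_raw(ptr) });
}

/// Returns (dropped while only the raw pointer exists, dropped after `free`).
pub fn round_trip() -> (bool, bool) {
    let flag = Arc::new(AtomicBool::new(false));
    let ptr = lower(Arc::new(Tracked { dropped: Arc::clone(&flag) }));
    // No `Arc<Tracked>` exists in Rust now, yet the object is still alive.
    let before = flag.load(Ordering::SeqCst);
    unsafe { free(ptr) };
    let after = flag.load(Ordering::SeqCst);
    (before, after)
}
```

If `free` were never called, the `Drop` implementation would never run, which is the leak the text warns about.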

When invoking a method on the instance, the foreign-language code passes the
raw pointer back to the Rust code, conceptually passing a "borrow" of the `Arc<>` to
the Rust scaffolding. The Rust side turns it back into a cloned `Arc<>` which
lives for the duration of the method call:

```rust
pub extern "C" fn todolist_TodoList_add_item(handle: u64, todo: RustBuffer, err: &mut ExternError) -> () {
    let todo = <String as uniffi::ViaFfi>::try_lift(todo).unwrap()
    // Borrow a reference to the instance so that we can call a method on it.
    UNIFFI_HANDLE_MAP_TODOLIST.call_with_result_mut(err, handle, |obj| -> Result<(), TodoError> {
        TodoList::add_item(obj, todo)
pub extern "C" fn todolist_12ba_TodoList_add_item(
    ptr: *const std::os::raw::c_void,
    todo: uniffi::RustBuffer,
    err: &mut uniffi::deps::ffi_support::ExternError,
) -> () {
    uniffi::deps::ffi_support::call_with_result(err, || -> Result<_, TodoError> {
        let _obj = <std::sync::Arc<TodoList> as uniffi::ViaFfi>::try_lift(ptr).unwrap();
        let _retval =
            TodoList::add_item(&_obj, <String as uniffi::ViaFfi>::try_lift(todo).unwrap())?;
        Ok(_retval)
    })
}
```

Finally, when the foreign-language code frees the instance, it passes the integer handle to
a special destructor function so that the Rust code can delete it from the handlemap:
The UniFFI runtime implements lifting for object instances using `Arc::from_raw`:

```rust
pub extern "C" fn ffi_todolist_TodoList_object_free(handle: u64) {
    UNIFFI_HANDLE_MAP_TODOLIST.delete_u64(handle);
}
unsafe impl<T: Sync + Send> ViaFfi for std::sync::Arc<T> {
    type FfiType = *const std::os::raw::c_void;
    fn try_lift(v: Self::FfiType) -> Result<Self> {
        let v = v as *const T;
        // We mustn't drop the `Arc<T>` that is owned by the foreign-language code.
        let foreign_arc = std::mem::ManuallyDrop::new(unsafe { Self::from_raw(v) });
        // Take a clone for our own use.
        Ok(std::sync::Arc::clone(&*foreign_arc))
    }
}
```

This indirection gives us some important safety properties:
Notice that we take care to ensure the reference that is owned by the foreign-language
code remains alive.
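
The effect on the reference count can be observed in a small standalone sketch. The `lift_borrowed` and `counts_around_call` functions are illustrative names rather than UniFFI APIs; the first mirrors the `ManuallyDrop` pattern shown above, and the second simulates a single method call from foreign code.

```rust
use std::mem::ManuallyDrop;
use std::sync::Arc;

/// Borrow-style lift: produce a new `Arc<T>` without consuming the
/// reference that the pointer's owner still holds.
///
/// # Safety
/// `ptr` must have come from `Arc::into_raw` and must not have been freed.
pub unsafe fn lift_borrowed<T>(ptr: *const T) -> Arc<T> {
    // Rebuild the owner's `Arc`, but never run its destructor.
    let foreign = ManuallyDrop::new(unsafe { Arc::from_raw(ptr) });
    // Take a clone for our own use.
    Arc::clone(&*foreign)
}

/// Returns the strong count during and after a simulated method call.
pub fn counts_around_call() -> (usize, usize) {
    let ptr = Arc::into_raw(Arc::new(String::from("buy milk")));
    let during = {
        let obj = unsafe { lift_borrowed(ptr) };
        // The foreign reference plus our temporary clone.
        Arc::strong_count(&obj)
    };
    // Our clone is gone. Reclaim the foreign reference to inspect it;
    // dropping it at the end of this function frees the string.
    let foreign = unsafe { Arc::from_raw(ptr) };
    let after = Arc::strong_count(&foreign);
    (during, after)
}
```

Without the `ManuallyDrop` wrapper, the rebuilt `Arc` would decrement the count when it went out of scope, and the foreign-language code would be left holding a dangling pointer.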

Finally, when the foreign-language code frees the instance, it
passes the raw pointer to a special destructor function so that the Rust code can
drop that initial reference (and if that happens to be the final reference,
the Rust object will be dropped).

```rust
pub extern "C" fn ffi_todolist_12ba_TodoList_object_free(ptr: *const std::os::raw::c_void) {
    if let Err(e) = std::panic::catch_unwind(|| {
        assert!(!ptr.is_null());
        unsafe { std::sync::Arc::from_raw(ptr as *const TodoList) };
    }) {
        uniffi::deps::log::error!("ffi_todolist_12ba_TodoList_object_free panicked: {:?}", e);
    }
}
```
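
The "final reference" caveat in the destructor description can be illustrated with a short sketch. Here `object_free` is a simplified, hypothetical stand-in for a generated destructor (without the panic handling shown above), and a `Weak` pointer is used only to observe whether the object is still alive.

```rust
use std::sync::{Arc, Weak};

/// Simplified stand-in for a generated `object_free` function.
///
/// # Safety
/// `ptr` must come from `Arc::into_raw` and must not be used again.
pub unsafe fn object_free<T>(ptr: *const T) {
    drop(unsafe { Arc::from_raw(ptr) });
}

/// Returns whether the object is alive (a) after the foreign handle is freed
/// while Rust still holds a clone, and (b) after that clone is dropped too.
pub fn alive_after_free() -> (bool, bool) {
    let original = Arc::new(vec![1, 2, 3]);
    let probe: Weak<Vec<i32>> = Arc::downgrade(&original);
    // A second reference kept on the Rust side, e.g. in a global or a record.
    let rust_side = Arc::clone(&original);
    // The first reference is handed to foreign code as a raw pointer.
    let ptr = Arc::into_raw(original);

    unsafe { object_free(ptr) };
    let alive_with_clone = probe.upgrade().is_some();

    drop(rust_side);
    let alive_at_end = probe.upgrade().is_some();
    (alive_with_clone, alive_at_end)
}
```

Freeing the foreign handle therefore releases one reference rather than destroying the object outright.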

* If the generated bindings incorrectly pass an invalid handle, or a handle for a different type of object,
then the handlemap will throw an error with high probability, providing some amount of run-time typechecking
for correctness of the generated bindings.
* The handlemap can ensure we uphold Rust's requirements around unique mutable references and thread safety,
by specifying that the contained type must be `Send+Sync`, and by refusing to hand out any mutable references.
Passing instances as arguments and returning them as values works similarly, except that
UniFFI does not automatically wrap/unwrap the containing `Arc`.
1 change: 1 addition & 0 deletions docs/manual/src/udl/functions.md
@@ -15,3 +15,4 @@ The UDL file will look like:
namespace Example {
    string hello_world();
}
```
102 changes: 100 additions & 2 deletions docs/manual/src/udl/interfaces.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,104 @@ For each alternate constructor, UniFFI will expose an appropriate static-method,
in the foreign language binding, and will connect it to the Rust method of the same name on the underlying
Rust struct.

## Managing Shared References

To the foreign-language consumer, UniFFI object instances are designed to behave as much like
regular language objects as possible. They can be freely passed as arguments or returned as values,
like this:

```idl
interface TodoList {
    ...
    // Copy the items from another TodoList into this one.
    void import_items(TodoList other);
    // Make a copy of this TodoList as a new instance.
    TodoList duplicate();
};
```

To ensure that this is safe, UniFFI allocates every object instance on the heap using
[`Arc`](https://doc.rust-lang.org/std/sync/struct.Arc.html), Rust's built-in smart pointer
type for managing shared references at runtime.

The use of `Arc` is transparent to the foreign-language code, but sometimes shows up
in the function signatures of the underlying Rust code. For example, the Rust code implementing
the `TodoList::duplicate` method would need to explicitly return an `Arc<TodoList>`, since UniFFI
doesn't know whether it will be returning a new object or an existing one:

```rust
impl TodoList {
    fn duplicate(&self) -> Arc<TodoList> {
        Arc::new(TodoList {
            items: RwLock::new(self.items.read().unwrap().clone())
        })
    }
}
```

By default, object instances passed as function arguments will also be passed as an `Arc<T>`, so the
Rust implementation of `TodoList::import_items` would also need to accept an `Arc<TodoList>`:

```rust
impl TodoList {
    fn import_items(&self, other: Arc<TodoList>) {
        self.items.write().unwrap().append(other.get_items());
    }
}
```

If the Rust code does not need an owned reference to the `Arc`, you can use the `[ByRef]` UDL attribute
to signal that a function accepts a borrowed reference:

```idl
interface TodoList {
    ...
    //                +-- indicate that we only need to borrow the other list
    //                V
    void import_items([ByRef] TodoList other);
    ...
};
```

```rust
impl TodoList {
    //                            +-- don't need to care about the `Arc` here
    //                            V
    fn import_items(&self, other: &TodoList) {
        self.items.write().unwrap().append(other.get_items());
    }
}
```

Conversely, if the Rust code explicitly *wants* to deal with an `Arc<T>` in the special case of
the `self` parameter, it can signal this using the `[Self=ByArc]` UDL attribute on the method:


```idl
interface TodoList {
    ...
    // +-- indicate that we want the `Arc` containing `self`
    // V
    [Self=ByArc]
    void import_items(TodoList other);
    ...
};
```

```rust
impl TodoList {
    // `Arc`s everywhere! --+-----------------+
    //                      V                 V
    fn import_items(self: Arc<Self>, other: Arc<TodoList>) {
        self.items.write().unwrap().append(other.get_items());
    }
}
```
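
For readers who want to experiment, the following self-contained sketch pulls the three calling styles together in one compilable file. It rests on assumptions that the excerpts above do not confirm: that `items` is an `RwLock<Vec<String>>`, that `get_items` returns an owned `Vec<String>`, and that `add_item` takes a `&str`. The method name `import_items_by_arc` is invented here only so that both variants can coexist in one `impl`.

```rust
use std::sync::{Arc, RwLock};

pub struct TodoList {
    items: RwLock<Vec<String>>,
}

impl TodoList {
    pub fn new() -> Self {
        TodoList { items: RwLock::new(Vec::new()) }
    }

    pub fn add_item(&self, item: &str) {
        self.items.write().unwrap().push(item.to_string());
    }

    /// Assumed here to return an owned copy of the items.
    pub fn get_items(&self) -> Vec<String> {
        self.items.read().unwrap().clone()
    }

    /// Returning an object: the `Arc` is explicit in the signature.
    pub fn duplicate(&self) -> Arc<TodoList> {
        Arc::new(TodoList { items: RwLock::new(self.get_items()) })
    }

    /// `[ByRef]` style argument: a plain borrow, no `Arc` in sight.
    pub fn import_items(&self, other: &TodoList) {
        // Copy first, so the write lock is never held while reading `other`.
        let incoming = other.get_items();
        self.items.write().unwrap().extend(incoming);
    }

    /// `[Self=ByArc]` style receiver together with an `Arc` argument.
    pub fn import_items_by_arc(self: Arc<Self>, other: Arc<TodoList>) {
        self.import_items(&other);
    }
}
```

Copying the other list's items before taking the write lock is a design choice of this sketch; it keeps a call such as `list.import_items(&list)` from deadlocking on the list's own lock.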

You can read more about the technical details in the docs on the
[internal details of managing object references](../internals/object_references.md).

## Concurrent Access

@@ -132,7 +230,7 @@ impl Counter {
        Self { value: 0 }
    }

    // No mutable references to self allowed in in UniFFI interfaces.
    // No mutable references to self allowed in UniFFI interfaces.
    fn increment(&mut self) {
        self.value = self.value + 1;
    }
@@ -194,4 +292,4 @@ impl Counter {
```

You can read more about the technical details in the docs on the
[internal details of managing object references](../internals/object_references.md).
[internal details of managing object references](../internals/object_references.md).
43 changes: 41 additions & 2 deletions docs/manual/src/udl/structs.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.

A Rust struct like this:

```rust
struct TodoEntry {
    done: bool,
@@ -10,15 +12,52 @@ struct TodoEntry {
}
```

can be converted in UDL to:
can be exposed via UniFFI using UDL like this:

```idl
dictionary TodoEntry {
    boolean done;
    u64 due_date;
    string text;
};
```

The fields in a dictionary can be of almost any type, including objects or other dictionaries.
The current limitations are:

* They cannot recursively contain another instance of the *same* dictionary.
* They cannot contain references to callback interfaces.

## Fields holding Object References

If a dictionary contains a field whose type is an [interface](./interfaces.md), then that
field will hold a *reference* to an underlying instance of a Rust struct. The Rust code for
working with such fields must store them as an `Arc` in order to help properly manage the
lifetime of the instance. So if the UDL interface looked like this:

```idl
interface User {
    // Some sort of "user" object that can own todo items
};
dictionary TodoEntry {
    User owner;
    string text;
};
```

Then the corresponding Rust code would need to look like this:

```rust
struct TodoEntry {
    owner: std::sync::Arc<User>,
    text: String,
}
```

Dictionaries can contain each other and every other data type available, except objects.
Depending on the language, the foreign-language bindings may also need to be aware of
these embedded references. For example in Kotlin, each Object instance must be explicitly
destroyed to avoid leaking the underlying memory, and this also applies to Objects stored
in record fields.
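
On the Rust side, the practical effect of an `Arc` field is that several records can share one underlying instance. The sketch below shows this with plain Rust types; the `name` field on `User` and the `entries_for` helper are assumptions made for illustration and do not appear in the todolist example.

```rust
use std::sync::Arc;

pub struct User {
    pub name: String,
}

pub struct TodoEntry {
    pub owner: Arc<User>,
    pub text: String,
}

/// Build two entries for one user; both hold a reference to the same `User`.
pub fn entries_for(owner: Arc<User>) -> (TodoEntry, TodoEntry) {
    let first = TodoEntry { owner: Arc::clone(&owner), text: "buy milk".into() };
    let second = TodoEntry { owner, text: "walk dog".into() };
    (first, second)
}
```

Each record holds its own strong reference, so the `User` is only dropped once every record that embeds it has been dropped as well.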

You can read more about managing object references in the section on [interfaces](./interfaces.md).
1 change: 1 addition & 0 deletions examples/todolist/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ name = "uniffi_todolist"
uniffi_macros = {path = "../../uniffi_macros"}
uniffi = {path = "../../uniffi", features=["builtin-bindgen"]}
thiserror = "1.0"
lazy_static = "1.4"

[build-dependencies]
uniffi_build = {path = "../../uniffi_build", features=["builtin-bindgen"]}