Allow passing object instances as args, fields and return values. #462

rfk · 2021-05-25T04:54:18Z

This implements the proposal from ADR-0005,
following discussion in #419. It depends on the removal on non-threadsafe interfaces
from #452.

This is a significant re-working of how object instances are handled,
replacing our previous use of handlemaps with direct use of an Arc<T>
pointer. This both reduces overhead and allows us to must more easily
generate code for dealing with object references.

On the Rust side of the world, code that needs to deal with an
object reference will typically take an Arc<T>, in the spirit
of being as explicit as possible. The one exception is when passing
objects as arguments, where (like other types) you can annotate the
argument with [ByRef] and have UniFFI hand you an &T.

For an example of how this might be used in practice, the "todolist"
example has grown the ability to manage a global default list by
setting and retreiving a shared object reference.

rfk

So, this is a big change, but I think it has come together quite nicely and is just about ready for review!

@mhammond, thanks for getting this off to such a great start with your earlier PR. Much of what's here should look familiar, although I've removed the identity-map stuff from the Python backend because I wasn't yet prepared to tackle that in all languages, and it seemed worth trying to be consistent.

@jhugman, I'm not sure how closely you've been following this discussion, but would love to get your take on this, especially from the consumer perspective. Some self-review notes below on things to watch out for.

@saks, could you please take a look at what I've done with the Ruby stuff here? This was certainly an interesting way for me to start learning more about Ruby! 😅

rfk · 2021-05-27T07:19:53Z

examples/todolist/src/lib.rs

+    // There is a single "default" TodoList that can be shared
+    // by all consumers of this component.
+    static ref DEFAULT_LIST: RwLock<Option<Arc<TodoList>>> = RwLock::new(None);
+}


I've updated to "todolist" example to demonstrate how I think we could use this in something like Glean or Nimbus in the future, managing a global singleton that's initialized by the application code but can be accessed by multiple consumers.

I wonder if using a Weak here might be even more informative? If this was a "real" app, I imagine the default would be a strong ref, but it's an example app. Maybe just a comment along the lines of "depending on your requirements, you could use a Weak here and avoid the possible circular reference."?

rfk · 2021-05-27T07:22:05Z

examples/todolist/tests/bindings/test_todolist.kts

+    getDefaultList()!!.use { default ->
+        assert(default.getLast() == "Test shared state through local vs default reference")
+    }
+}


This test shows a bit of the awkwardness of working with object references in Kotlin, where we have to make sure each instance is explicitly destroyed when we're finished with it. It's manageable, but easy to forget about. I'd like to see us take a fresh look at e.g. weak references for doing this more automatically, but that's well outside the scope of this PR.

It might be worth summarizing this comment in the example - a casual reader might not realize the importance of the pattern used.

rfk · 2021-05-27T07:25:25Z

examples/todolist/tests/bindings/test_todolist.py

+assert(todo.get_items() == get_default_list().get_items())
+# In the future, we'll maintain object identity over the FFI.
+#assert(get_default_list() is todo)
+#assert(get_default_list() is not todo2)


Currently, every object instance returned over the FFI is a fresh foreign-language object. Mark's prototype PR implemented a reverse-identity-map for the Python backend using weak references and I think we could do something similar for the other languages, but on balance I think it will be better for us to try out the basic functionality without blocking on that extra complexity. Happy to work on it in a followup PR, and I'm open to the suggestion that we shouldn't land without it because this "every object is a new one" behaviour could be confusing for consumers.

(There's also the question of how a reverse-identity-map would interact with the Kotlin destroy()/.use pattern, which I really don't have my head around just yet.)

SGTM - maybe even just take that "promise" and commented code out entirely? It sounds like a commitment that I don't think we need to make. Alternatively, water it down "we might..." and link to an issue?

You're right, we should just delete the promise; if it's important to us we'll actually do it, and if it isn't important we don't need to make any promises about it.

rfk · 2021-05-27T07:25:56Z

fixtures/coverall/src/coverall.udl

+[Enum]
+interface MaybeSimpleDict {
+    Yeah(SimpleDict d);
+    Nah();


This tests codegen for nested object references, I don't have any tests that actually exercise it yet.

rfk · 2021-05-27T07:27:12Z

fixtures/coverall/tests/bindings/test_coverall.kts

-assert(d.maybeFloat64!!.almostEquals(1.0))
+// Test some_dict().
+// N.B. we need to `use` here to clean up the contained `Coveralls` reference.
+createSomeDict().use { d ->


Because the SimpleDict type contains an object reference, it too provides helpers for manual management of the object lifecycle.

rfk · 2021-05-27T07:37:07Z

uniffi_bindgen/src/bindings/ruby/templates/ObjectTemplate.rb

-    @handle = handle
+
+  # A private helper for initializing instances of the class from a raw pointer,
+  # bypassing any initialization logic and ensuring they are GC'd properly.


@saks I pulled out some "private" helpers here to make codegen a bit easier. Is there a pattern we should use to discourage consumers from calling this, similar to e.g. the leading-underscore pattern from Python?

@rfk The best way to hide it would be to mark these methods as private_class_method, but this is not quite feasible in our case. Leading-underscore is also pretty common in ruby community, so employing it here is rather a good idea.

Great, I'll by some leading underscores in here to more clearly communicate intent.

rfk · 2021-05-27T07:39:44Z

uniffi_bindgen/src/bindings/swift/templates/ObjectTemplate.swift

+
+    public func hash(into hasher: inout Hasher) {
+        hasher.combine(self.pointer)
+    }


Here I've implemented Equatable and Hashable for Object classes, mostly so that they can appear as fields in Records, which are in turn Equatable and Hashable. An alternative would be to only implement Equatable and Hashable for Records that do not contain any Object instances.

This requires some thought: I expect the fully featured right thing to do is only implement == and hash for Rust structs that implementPartialEq, and/or to allow fields to be excluded from equals and hash implementations from within the WebIDL.

Both of which is beyond the scope of this PR.

Recommend we don't implement Equatable and Hashable until we have a better sense of demand for this.

Recommend we don't implement Equatable and Hashable until we have a better sense of demand for this.

Aye, thanks. I think I knew that this was most likely the right thing to do but just didn't want to deal with it in this PR. I'll update.

Specifically, what I've done is:

Do not implement Equatable/Hashable for Object types.

Update Record and Enum types so that they only implement Equatable/Hashable if they do not contain any Object type fields.

rfk · 2021-05-27T07:40:35Z

uniffi_bindgen/src/interface/function.rs

+
+    pub fn all_arguments(&self) -> Vec<Argument> {
+        self.arguments.to_vec()
+    }


This is a little trick to simplify codegen, more below...

rfk · 2021-05-27T07:41:22Z

uniffi_bindgen/src/interface/object.rs

+            .into_iter()
+            .chain(self.arguments.iter().cloned())
+            .collect()
+    }


So here for methods, we're making a distinction between the "visible to the user" arguments() list and the "what you see in the actual call" all_arguments() list...

Oh, I see.

I think I'd suggest grouping arguments, first_argument and all_arguments together, and putting a brief comment next to them.

Some further noticings:

derive_ffi_func can now use all_arguments().

all_arguments is used by the _arg_list_rs_call macro. Can the naming reflect that?

first_argument() could (not sure if "should") be changed to "self_argument()"?

In fact, first_argument can be removed completely, because we only ever used it in order to build up the list that is now returned by all_arguments.

rfk · 2021-05-27T07:43:08Z

uniffi_bindgen/src/templates/macros.rs

-    {{ func.name() }}(
-    {{- prefix }}{% if func.arguments().len() > 0 %}, {% call _arg_list_rs_call(func) -%}{% endif -%}
-)
-{%- endmacro -%}


...and now we can get rid of this "call with prefix thing" in the Rust codegen macros! Because the Object type now lifts and lowers in the same way as the other types, we don't need to do any specially-structured codegen for it, and can just use the plain to_rs_call macro as long as we pass it the full list of arguments.

mhammond

I'm glad it was you and not me who had to work through the Kotlin thorns! It's exciting to see this ready to land 🚀

mhammond · 2021-05-28T01:00:16Z

CHANGELOG.md

+- It is now possible to use Object instances as fields in Records or Enums, to pass them as arguments,
+  and to return them from function and method calls. They should for the most part behave just like
+  a host language object, and their lifecycle is managed transparently using Rust's `Arc<T>` type.
+    - Cross-language reference cycles will not be garbage collected.


The "cross language" part here makes it sound like some cycles will be collected - don't have a concrete suggestion here because we don't want to give the impression we will somehow break existing GC, but thought I'd mention it in case you have inspiration.

mhammond · 2021-05-28T01:06:22Z

docs/manual/src/udl/structs.md

+The fields in a dictionary can be of almost any type, including objects or other dictionaries.
+The current limitations are:
+
+* They cannot recursively contain another intance of the *same* dictionary.


typo - intance

mhammond · 2021-05-28T01:08:18Z

examples/todolist/src/lib.rs

+    // There is a single "default" TodoList that can be shared
+    // by all consumers of this component.
+    static ref DEFAULT_LIST: RwLock<Option<Arc<TodoList>>> = RwLock::new(None);
+}


I wonder if using a Weak here might be even more informative? If this was a "real" app, I imagine the default would be a strong ref, but it's an example app. Maybe just a comment along the lines of "depending on your requirements, you could use a Weak here and avoid the possible circular reference."?

mhammond · 2021-05-28T01:12:04Z

examples/todolist/tests/bindings/test_todolist.kts

+    getDefaultList()!!.use { default ->
+        assert(default.getLast() == "Test shared state through local vs default reference")
+    }
+}


It might be worth summarizing this comment in the example - a casual reader might not realize the importance of the pattern used.

mhammond · 2021-05-28T01:12:33Z

examples/todolist/tests/bindings/test_todolist.kts

+        assert(default.getLast() == "Test shared state through local vs default reference")
+    }
+}
+
 // Ensure the kotlin version of deinit doesn't crash, and is idempotent.
 todo.destroy()
 todo.destroy()


and I guess if you are touching this file you might as well add the trailing \n :)

mhammond · 2021-05-28T01:13:45Z

examples/todolist/tests/bindings/test_todolist.py

+assert(todo.get_items() == get_default_list().get_items())
+# In the future, we'll maintain object identity over the FFI.
+#assert(get_default_list() is todo)
+#assert(get_default_list() is not todo2)


SGTM - maybe even just take that "promise" and commented code out entirely? It sounds like a commitment that I don't think we need to make. Alternatively, water it down "we might..." and link to an issue?

mhammond · 2021-05-28T01:19:33Z

fixtures/coverall/tests/bindings/test_coverall.py

+        self.assertEqual(get_num_alive(), 1)
+        # and check it's the correct object.
+        self.assertEqual(coveralls.get_other().get_name(), "test_arcs")
+        # TODO: when we implement an identity map, it will even be the same Python object


ditto here re future promises.

mhammond · 2021-05-28T01:20:46Z

fixtures/coverall/tests/bindings/test_coverall.swift

+        assert(c2.strongCount() == 3)
+
+        assert(coveralls.getOther() == c2)
+        // TODO: in future get_other() should return the exact same instance.


ditto re promises

mhammond · 2021-05-28T01:21:33Z

fixtures/coverall/tests/bindings/test_coverall.swift

+        // It's okay!
+    }
+    // TODO: kinda hard to test this, as it triggers a fatal error.
+    // coveralls!.takeOtherPanic(message: "expected panic: with an arc!")


hrmph - that's surprising to me, although I guess it shouldn't be.

saks · 2021-05-28T06:23:46Z

I'd like give a shout-out to @rfk for spending time to make these changes work for ruby adapter. Big thank you! 👍

rfk · 2021-06-03T03:43:15Z

I've addressed the feedback here, rebased and squashed ready for merge.

jhugman

I am liking this a lot. I'm not sure I'm qualified to give you a Big Green Button, but I've left some general feedback and some kotlin/swift specific.

I am very excited that we have a clear path to being able to support static constructor methods and singletons.

jhugman · 2021-06-09T10:09:00Z

docs/manual/src/udl/interfaces.md

+The use of `Arc` is transparent to the foreign-language code, but sometimes shows up
+in the function signatures of the underlying Rust code. For example, the Rust code implementing
+the `TodoList::duplicate` method would need to explicitly return an `Arc<TodoList>`, since UniFFI
+doesn't know whether it will be returning a new object or an existing one:


jhugman · 2021-06-09T10:13:49Z

docs/manual/src/udl/structs.md

@@ -2,6 +2,8 @@

 Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.


Java nit: POJOs have methods too, not like the structs here. I sincerely regret to suggest:

Suggested change

Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.

Dictionaries can be compared to JavaBeans in the Java world: just a data structure holding some data.

Actually, I'm not even sure that's correct.

I think the inference we're going for is that it's typed and structured data with no methods on it. But this is not for this PR.

Oh dear 😅

Honestly, "compared to POJO" and "compared to JavaBeans" both do absolutely nothing to help my personal understanding of what's happening here, which is maybe a signal that we should just reword this to use a general description rather than a language-specific analogy. Thanks for pointing it out.

docs/manual/src/udl/structs.md

jhugman · 2021-06-09T10:51:04Z

uniffi_bindgen/src/bindings/kotlin/templates/Helpers.kt

+        block(this)
+    } finally {
+        try {
+            this?.destroy()


I am puzzled by the question mark here, and in L95.

It's because we implement it on Destroyable? in order to be able to use it on Option<T> as well as just T.

jhugman · 2021-06-09T11:15:09Z

uniffi_bindgen/src/bindings/kotlin/templates/Helpers.kt

+// helper method to execute a block and destroy the object at the end.
+interface Destroyable {
+    fun destroy()
+}


Something bothers me about this framing/naming. Perhaps writing this comment will rubber duck me into agreeing with it, or suggesting something new.

As I understand it:

destroy() is supposed to be called as we finish with the object on the Kotlin side.

This destructible/destroyable Objects is really about decreasing their Arc ref count on the Rust side.

Because it's about ref counting, destroying a data class once too many times will cause difficult to track down bugs, and not enough times will cause a difficult to find memory leak.

I think I'm persuading myself that

a. we should call this Disposable (it's up to the user to tidy up after themselves)
b. for objects we should consider implementing AutoCloseable (so we can piggy back on the try-with-resources pattern, rather than implementing our own).

we should call this Disposable

This name makes sense to me; should the corresponding method be called dispose() for symmetry rather than destroy()?

Noting for completeness, that in my local copy I have updated this to implement AutoCloseable for objects.

jhugman · 2021-06-09T13:29:37Z

uniffi_bindgen/src/bindings/swift/templates/ObjectTemplate.swift

+
+    public func hash(into hasher: inout Hasher) {
+        hasher.combine(self.pointer)
+    }


This requires some thought: I expect the fully featured right thing to do is only implement == and hash for Rust structs that implementPartialEq, and/or to allow fields to be excluded from equals and hash implementations from within the WebIDL.

Both of which is beyond the scope of this PR.

Recommend we don't implement Equatable and Hashable until we have a better sense of demand for this.

jhugman · 2021-06-09T13:32:12Z

uniffi_bindgen/src/bindings/swift/templates/ObjectTemplate.swift

+}
+
+extension {{ obj.name()|class_name_swift }} : ViaFfi, Serializable {}


Should this be/can this be private or internal?

I wasn't able to make it so, for reasons that I don't really understand to do with Swift's object model. Perhaps I'll take a fresh look at it now that I've been away from the code for a little while. At the very least, I can leave a comment here about why it can't be private.

jhugman · 2021-06-10T17:39:02Z

uniffi_bindgen/src/interface/object.rs

+            .into_iter()
+            .chain(self.arguments.iter().cloned())
+            .collect()
+    }


Oh, I see.

I think I'd suggest grouping arguments, first_argument and all_arguments together, and putting a brief comment next to them.

Some further noticings:

derive_ffi_func can now use all_arguments().

all_arguments is used by the _arg_list_rs_call macro. Can the naming reflect that?

first_argument() could (not sure if "should") be changed to "self_argument()"?

jhugman · 2021-06-10T17:40:17Z

uniffi_bindgen/src/templates/macros.rs

-    {{ func.name() }}(
-    {{- prefix }}{% if func.arguments().len() > 0 %}, {% call _arg_list_rs_call(func) -%}{% endif -%}
-)
-{%- endmacro -%}


jhugman · 2021-06-10T17:41:56Z

uniffi_bindgen/src/interface/mod.rs

+                .unwrap_or(false),
+            _ => false,
+        }
+    }


cough if only there were some non-enum-based abstraction over all these bits of a ComponentInterface that we could use to simplify this...

This is a significant re-working of how object instances are handled, replacing our previous use of handlemaps with direct use of an `Arc<T>` pointer. This both reduces overhead and allows us to must more easily generate code for dealing with object references. On the Rust side of the world, code that needs to deal with an object reference will typically take an `Arc<T>`, in the spirit of being as explicit as possible. The one exception is when passing objects as arguments, where (like other types) you can annotate the argument with `[ByRef]` and have UniFFI hand you an `&T`. For an example of how this might be used in practice, the "todolist" example has grown the ability to manage a global default list by setting and retreiving a shared object reference. Co-authored-by: Ryan Kelly <rfkelly@mozilla.com>

rfk · 2021-06-14T07:18:17Z

I've tweaked the Kotlin and Swift bindings to accommodate James's feedback, and I think this is good to go!

While this is not technically a breaking change, it feels significant enough that I'm going to give it its own minor version increment.

rfk mentioned this pull request May 25, 2021

POC for step 1 of #419 - replace Handlemaps with Arcs. #420

Closed

rfk force-pushed the implement-ADR-0005-arc-pointers branch from 6b50030 to dddaf7c Compare May 25, 2021 04:58

rfk force-pushed the i449-interfaces-always-threadsafe branch from 2a958c7 to fa7aa1f Compare May 26, 2021 03:10

rfk force-pushed the implement-ADR-0005-arc-pointers branch from dddaf7c to 8ca94b7 Compare May 26, 2021 03:17

rfk mentioned this pull request May 26, 2021

Provide default IntoFfi impl for *c_void. mozilla/ffi-support#4

Merged

rfk force-pushed the i449-interfaces-always-threadsafe branch from fa7aa1f to 52ef91a Compare May 26, 2021 04:36

rfk force-pushed the implement-ADR-0005-arc-pointers branch 4 times, most recently from 5346681 to 3372036 Compare May 27, 2021 03:25

Base automatically changed from i449-interfaces-always-threadsafe to main May 27, 2021 04:33

rfk changed the base branch from main to kt-handle-destroy-race May 27, 2021 04:45

rfk force-pushed the implement-ADR-0005-arc-pointers branch 4 times, most recently from eaf3750 to f71b0a2 Compare May 27, 2021 07:09

rfk marked this pull request as ready for review May 27, 2021 07:13

rfk commented May 27, 2021

View reviewed changes

rfk force-pushed the implement-ADR-0005-arc-pointers branch from f71b0a2 to 0c40461 Compare May 27, 2021 07:43

rfk requested review from mhammond and jhugman May 27, 2021 07:44

rfk mentioned this pull request May 27, 2021

Propagate @ExperimentalUnsignedTypes annotations for APIs that use unsigned types #444

Closed

mhammond approved these changes May 28, 2021

View reviewed changes

Base automatically changed from kt-handle-destroy-race to main June 3, 2021 03:28

rfk force-pushed the implement-ADR-0005-arc-pointers branch from 0c40461 to 780b802 Compare June 3, 2021 03:43

rfk force-pushed the implement-ADR-0005-arc-pointers branch 3 times, most recently from ecda34f to d78fe1d Compare June 3, 2021 03:56

jhugman reviewed Jun 10, 2021

View reviewed changes

rfk force-pushed the implement-ADR-0005-arc-pointers branch from 76c8585 to 4d89291 Compare June 14, 2021 07:08

rfk merged commit 2c15607 into main Jun 14, 2021

rfk deleted the implement-ADR-0005-arc-pointers branch June 14, 2021 07:18

mhammond mentioned this pull request Jun 14, 2021

Improve situation around sharing types between crates. #478

Closed

rfk mentioned this pull request Jun 14, 2021

[discuss] A proposal for passing around object instances #419

Closed

data-sync-user mentioned this pull request Jun 23, 2021

Propagate @ExperimentalUnsignedTypes annotations for APIs that use unsigned types mozilla/application-services#4239

Closed

		@@ -2,6 +2,8 @@

		Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.

	Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.
	Dictionaries can be compared to JavaBeans in the Java world: just a data structure holding some data.

		}

		extension {{ obj.name()\|class_name_swift }} : ViaFfi, Serializable {}

Allow passing object instances as args, fields and return values. #462

Allow passing object instances as args, fields and return values. #462

Conversation

rfk commented May 25, 2021

rfk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mhammond left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

saks commented May 28, 2021

rfk commented Jun 3, 2021

jhugman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rfk commented Jun 14, 2021