Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow passing object instances as args, fields and return values. #462

Merged
merged 1 commit into from
Jun 14, 2021

Conversation

rfk
Copy link
Collaborator

@rfk rfk commented May 25, 2021

This implements the proposal from ADR-0005,
following discussion in #419. It depends on the removal on non-threadsafe interfaces
from #452.

This is a significant re-working of how object instances are handled,
replacing our previous use of handlemaps with direct use of an Arc<T>
pointer. This both reduces overhead and allows us to must more easily
generate code for dealing with object references.

On the Rust side of the world, code that needs to deal with an
object reference will typically take an Arc<T>, in the spirit
of being as explicit as possible. The one exception is when passing
objects as arguments, where (like other types) you can annotate the
argument with [ByRef] and have UniFFI hand you an &T.

For an example of how this might be used in practice, the "todolist"
example has grown the ability to manage a global default list by
setting and retreiving a shared object reference.

@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch from 6b50030 to dddaf7c Compare May 25, 2021 04:58
@rfk rfk force-pushed the i449-interfaces-always-threadsafe branch from 2a958c7 to fa7aa1f Compare May 26, 2021 03:10
@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch from dddaf7c to 8ca94b7 Compare May 26, 2021 03:17
@rfk rfk force-pushed the i449-interfaces-always-threadsafe branch from fa7aa1f to 52ef91a Compare May 26, 2021 04:36
@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch 4 times, most recently from 5346681 to 3372036 Compare May 27, 2021 03:25
Base automatically changed from i449-interfaces-always-threadsafe to main May 27, 2021 04:33
@rfk rfk changed the base branch from main to kt-handle-destroy-race May 27, 2021 04:45
@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch 4 times, most recently from eaf3750 to f71b0a2 Compare May 27, 2021 07:09
@rfk rfk marked this pull request as ready for review May 27, 2021 07:13
Copy link
Collaborator Author

@rfk rfk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is a big change, but I think it has come together quite nicely and is just about ready for review!

@mhammond, thanks for getting this off to such a great start with your earlier PR. Much of what's here should look familiar, although I've removed the identity-map stuff from the Python backend because I wasn't yet prepared to tackle that in all languages, and it seemed worth trying to be consistent.

@jhugman, I'm not sure how closely you've been following this discussion, but would love to get your take on this, especially from the consumer perspective. Some self-review notes below on things to watch out for.

@saks, could you please take a look at what I've done with the Ruby stuff here? This was certainly an interesting way for me to start learning more about Ruby! 😅

// There is a single "default" TodoList that can be shared
// by all consumers of this component.
static ref DEFAULT_LIST: RwLock<Option<Arc<TodoList>>> = RwLock::new(None);
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated to "todolist" example to demonstrate how I think we could use this in something like Glean or Nimbus in the future, managing a global singleton that's initialized by the application code but can be accessed by multiple consumers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if using a Weak here might be even more informative? If this was a "real" app, I imagine the default would be a strong ref, but it's an example app. Maybe just a comment along the lines of "depending on your requirements, you could use a Weak here and avoid the possible circular reference."?

getDefaultList()!!.use { default ->
assert(default.getLast() == "Test shared state through local vs default reference")
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test shows a bit of the awkwardness of working with object references in Kotlin, where we have to make sure each instance is explicitly destroyed when we're finished with it. It's manageable, but easy to forget about. I'd like to see us take a fresh look at e.g. weak references for doing this more automatically, but that's well outside the scope of this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth summarizing this comment in the example - a casual reader might not realize the importance of the pattern used.

assert(todo.get_items() == get_default_list().get_items())
# In the future, we'll maintain object identity over the FFI.
#assert(get_default_list() is todo)
#assert(get_default_list() is not todo2)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, every object instance returned over the FFI is a fresh foreign-language object. Mark's prototype PR implemented a reverse-identity-map for the Python backend using weak references and I think we could do something similar for the other languages, but on balance I think it will be better for us to try out the basic functionality without blocking on that extra complexity. Happy to work on it in a followup PR, and I'm open to the suggestion that we shouldn't land without it because this "every object is a new one" behaviour could be confusing for consumers.

(There's also the question of how a reverse-identity-map would interact with the Kotlin destroy()/.use pattern, which I really don't have my head around just yet.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM - maybe even just take that "promise" and commented code out entirely? It sounds like a commitment that I don't think we need to make. Alternatively, water it down "we might..." and link to an issue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we should just delete the promise; if it's important to us we'll actually do it, and if it isn't important we don't need to make any promises about it.

[Enum]
interface MaybeSimpleDict {
Yeah(SimpleDict d);
Nah();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tests codegen for nested object references, I don't have any tests that actually exercise it yet.

assert(d.maybeFloat64!!.almostEquals(1.0))
// Test some_dict().
// N.B. we need to `use` here to clean up the contained `Coveralls` reference.
createSomeDict().use { d ->
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the SimpleDict type contains an object reference, it too provides helpers for manual management of the object lifecycle.

@handle = handle

# A private helper for initializing instances of the class from a raw pointer,
# bypassing any initialization logic and ensuring they are GC'd properly.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saks I pulled out some "private" helpers here to make codegen a bit easier. Is there a pattern we should use to discourage consumers from calling this, similar to e.g. the leading-underscore pattern from Python?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rfk The best way to hide it would be to mark these methods as private_class_method, but this is not quite feasible in our case. Leading-underscore is also pretty common in ruby community, so employing it here is rather a good idea.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I'll by some leading underscores in here to more clearly communicate intent.


public func hash(into hasher: inout Hasher) {
hasher.combine(self.pointer)
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I've implemented Equatable and Hashable for Object classes, mostly so that they can appear as fields in Records, which are in turn Equatable and Hashable. An alternative would be to only implement Equatable and Hashable for Records that do not contain any Object instances.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires some thought: I expect the fully featured right thing to do is only implement == and hash for Rust structs that implementPartialEq, and/or to allow fields to be excluded from equals and hash implementations from within the WebIDL.

Both of which is beyond the scope of this PR.

Recommend we don't implement Equatable and Hashable until we have a better sense of demand for this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend we don't implement Equatable and Hashable until we have a better sense of demand for this.

Aye, thanks. I think I knew that this was most likely the right thing to do but just didn't want to deal with it in this PR. I'll update.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, what I've done is:

  • Do not implement Equatable/Hashable for Object types.
  • Update Record and Enum types so that they only implement Equatable/Hashable if they do not contain any Object type fields.


pub fn all_arguments(&self) -> Vec<Argument> {
self.arguments.to_vec()
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little trick to simplify codegen, more below...

.into_iter()
.chain(self.arguments.iter().cloned())
.collect()
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here for methods, we're making a distinction between the "visible to the user" arguments() list and the "what you see in the actual call" all_arguments() list...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see.

I think I'd suggest grouping arguments, first_argument and all_arguments together, and putting a brief comment next to them.

Some further noticings:

  • derive_ffi_func can now use all_arguments().
  • all_arguments is used by the _arg_list_rs_call macro. Can the naming reflect that?
  • first_argument() could (not sure if "should") be changed to "self_argument()"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, first_argument can be removed completely, because we only ever used it in order to build up the list that is now returned by all_arguments.

{{ func.name() }}(
{{- prefix }}{% if func.arguments().len() > 0 %}, {% call _arg_list_rs_call(func) -%}{% endif -%}
)
{%- endmacro -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and now we can get rid of this "call with prefix thing" in the Rust codegen macros! Because the Object type now lifts and lowers in the same way as the other types, we don't need to do any specially-structured codegen for it, and can just use the plain to_rs_call macro as long as we pass it the full list of arguments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

Copy link
Member

@mhammond mhammond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad it was you and not me who had to work through the Kotlin thorns! It's exciting to see this ready to land 🚀

CHANGELOG.md Outdated
- It is now possible to use Object instances as fields in Records or Enums, to pass them as arguments,
and to return them from function and method calls. They should for the most part behave just like
a host language object, and their lifecycle is managed transparently using Rust's `Arc<T>` type.
- Cross-language reference cycles will not be garbage collected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "cross language" part here makes it sound like some cycles will be collected - don't have a concrete suggestion here because we don't want to give the impression we will somehow break existing GC, but thought I'd mention it in case you have inspiration.

The fields in a dictionary can be of almost any type, including objects or other dictionaries.
The current limitations are:

* They cannot recursively contain another intance of the *same* dictionary.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo - intance

// There is a single "default" TodoList that can be shared
// by all consumers of this component.
static ref DEFAULT_LIST: RwLock<Option<Arc<TodoList>>> = RwLock::new(None);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if using a Weak here might be even more informative? If this was a "real" app, I imagine the default would be a strong ref, but it's an example app. Maybe just a comment along the lines of "depending on your requirements, you could use a Weak here and avoid the possible circular reference."?

getDefaultList()!!.use { default ->
assert(default.getLast() == "Test shared state through local vs default reference")
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth summarizing this comment in the example - a casual reader might not realize the importance of the pattern used.

assert(default.getLast() == "Test shared state through local vs default reference")
}
}

// Ensure the kotlin version of deinit doesn't crash, and is idempotent.
todo.destroy()
todo.destroy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I guess if you are touching this file you might as well add the trailing \n :)

assert(todo.get_items() == get_default_list().get_items())
# In the future, we'll maintain object identity over the FFI.
#assert(get_default_list() is todo)
#assert(get_default_list() is not todo2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM - maybe even just take that "promise" and commented code out entirely? It sounds like a commitment that I don't think we need to make. Alternatively, water it down "we might..." and link to an issue?

self.assertEqual(get_num_alive(), 1)
# and check it's the correct object.
self.assertEqual(coveralls.get_other().get_name(), "test_arcs")
# TODO: when we implement an identity map, it will even be the same Python object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto here re future promises.

assert(c2.strongCount() == 3)

assert(coveralls.getOther() == c2)
// TODO: in future get_other() should return the exact same instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto re promises

// It's okay!
}
// TODO: kinda hard to test this, as it triggers a fatal error.
// coveralls!.takeOtherPanic(message: "expected panic: with an arc!")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrmph - that's surprising to me, although I guess it shouldn't be.

@saks
Copy link
Contributor

saks commented May 28, 2021

I'd like give a shout-out to @rfk for spending time to make these changes work for ruby adapter. Big thank you! 👍

Base automatically changed from kt-handle-destroy-race to main June 3, 2021 03:28
@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch from 0c40461 to 780b802 Compare June 3, 2021 03:43
@rfk
Copy link
Collaborator Author

rfk commented Jun 3, 2021

I've addressed the feedback here, rebased and squashed ready for merge.

@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch 3 times, most recently from ecda34f to d78fe1d Compare June 3, 2021 03:56
Copy link
Contributor

@jhugman jhugman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am liking this a lot. I'm not sure I'm qualified to give you a Big Green Button, but I've left some general feedback and some kotlin/swift specific.

I am very excited that we have a clear path to being able to support static constructor methods and singletons.

Comment on lines +118 to +121
The use of `Arc` is transparent to the foreign-language code, but sometimes shows up
in the function signatures of the underlying Rust code. For example, the Rust code implementing
the `TodoList::duplicate` method would need to explicitly return an `Arc<TodoList>`, since UniFFI
doesn't know whether it will be returning a new object or an existing one:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@@ -2,6 +2,8 @@

Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java nit: POJOs have methods too, not like the structs here. I sincerely regret to suggest:

Suggested change
Dictionaries can be compared to POJOs in the Java world: just a data structure holding some data.
Dictionaries can be compared to JavaBeans in the Java world: just a data structure holding some data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not even sure that's correct.

I think the inference we're going for is that it's typed and structured data with no methods on it. But this is not for this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear 😅

Honestly, "compared to POJO" and "compared to JavaBeans" both do absolutely nothing to help my personal understanding of what's happening here, which is maybe a signal that we should just reword this to use a general description rather than a language-specific analogy. Thanks for pointing it out.

docs/manual/src/udl/structs.md Outdated Show resolved Hide resolved
block(this)
} finally {
try {
this?.destroy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am puzzled by the question mark here, and in L95.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because we implement it on Destroyable? in order to be able to use it on Option<T> as well as just T.

// helper method to execute a block and destroy the object at the end.
interface Destroyable {
fun destroy()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something bothers me about this framing/naming. Perhaps writing this comment will rubber duck me into agreeing with it, or suggesting something new.

As I understand it:

  1. destroy() is supposed to be called as we finish with the object on the Kotlin side.
  2. This destructible/destroyable Objects is really about decreasing their Arc ref count on the Rust side.
  3. Because it's about ref counting, destroying a data class once too many times will cause difficult to track down bugs, and not enough times will cause a difficult to find memory leak.

I think I'm persuading myself that

a. we should call this Disposable (it's up to the user to tidy up after themselves)
b. for objects we should consider implementing AutoCloseable (so we can piggy back on the try-with-resources pattern, rather than implementing our own).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should call this Disposable

This name makes sense to me; should the corresponding method be called dispose() for symmetry rather than destroy()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting for completeness, that in my local copy I have updated this to implement AutoCloseable for objects.


public func hash(into hasher: inout Hasher) {
hasher.combine(self.pointer)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires some thought: I expect the fully featured right thing to do is only implement == and hash for Rust structs that implementPartialEq, and/or to allow fields to be excluded from equals and hash implementations from within the WebIDL.

Both of which is beyond the scope of this PR.

Recommend we don't implement Equatable and Hashable until we have a better sense of demand for this.

}

extension {{ obj.name()|class_name_swift }} : ViaFfi, Serializable {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be/can this be private or internal?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to make it so, for reasons that I don't really understand to do with Swift's object model. Perhaps I'll take a fresh look at it now that I've been away from the code for a little while. At the very least, I can leave a comment here about why it can't be private.

.into_iter()
.chain(self.arguments.iter().cloned())
.collect()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see.

I think I'd suggest grouping arguments, first_argument and all_arguments together, and putting a brief comment next to them.

Some further noticings:

  • derive_ffi_func can now use all_arguments().
  • all_arguments is used by the _arg_list_rs_call macro. Can the naming reflect that?
  • first_argument() could (not sure if "should") be changed to "self_argument()"?

{{ func.name() }}(
{{- prefix }}{% if func.arguments().len() > 0 %}, {% call _arg_list_rs_call(func) -%}{% endif -%}
)
{%- endmacro -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

.unwrap_or(false),
_ => false,
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cough if only there were some non-enum-based abstraction over all these bits of a ComponentInterface that we could use to simplify this...

This is a significant re-working of how object instances are handled,
replacing our previous use of handlemaps with direct use of an `Arc<T>`
pointer. This both reduces overhead and allows us to must more easily
generate code for dealing with object references.

On the Rust side of the world, code that needs to deal with an
object reference will typically take an `Arc<T>`, in the spirit
of being as explicit as possible. The one exception is when passing
objects as arguments, where (like other types) you can annotate the
argument with `[ByRef]` and have UniFFI hand you an `&T`.

For an example of how this might be used in practice, the "todolist"
example has grown the ability to manage a global default list by
setting and retreiving a shared object reference.

Co-authored-by: Ryan Kelly <rfkelly@mozilla.com>
@rfk rfk force-pushed the implement-ADR-0005-arc-pointers branch from 76c8585 to 4d89291 Compare June 14, 2021 07:08
@rfk
Copy link
Collaborator Author

rfk commented Jun 14, 2021

I've tweaked the Kotlin and Swift bindings to accommodate James's feedback, and I think this is good to go!

While this is not technically a breaking change, it feels significant enough that I'm going to give it its own minor version increment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants