Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for factory functions and static methods. #384

Closed
wants to merge 1 commit into from

Conversation

rfk
Copy link
Collaborator

@rfk rfk commented Feb 15, 2021

It's a fairly common pattern for object-oriented interfaces to have
named functions or static methods that act as named "factories" for
producing object instances in ways that differ from the default
constructor. This commit adds basic support for such factories
by:

  • Allowing owned object instances to be returned from functions
    and methods.
  • Allowing interface methods to be marked as static.

The "owned object instances" part here is really key, since it
allows us to bypass a lot of the general concerns about returning
references to object instances that are outlined in #197 (and
that this commit does not at all attempt to tackle).

@rfk rfk requested a review from jhugman February 15, 2021 01:54
@@ -14,6 +14,24 @@ def __del__(self):
)

{% for meth in obj.methods() -%}
{%- if meth.is_static() -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super jazzed about the way this creates four different method implementation blocks (static void, static non-void, non-static void, non-static non-void) but I also don't have a suggestion for how to do it in a neater way.

I was hoping that I'd be able to just do like {% if meth.is_static() %}@staticmethod{% endif %}, but we also need to deal with whether or not to pass the extra handle as first argument. Future work might be able to consolidate the macros here into something that's a bit more general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Macros and includes have got us a long way, but at some point, I expect we'll be embedding templates within templates.

@@ -64,6 +64,10 @@ already_AddRefed<{{ obj.name()|class_name_cpp(context) }}> {{ obj.name()|class_n
{%- endfor %}

{%- for meth in obj.methods() %}
{% if meth.is_static %}
% if e.has_associated_data() %}
MOZ_STATIC_ASSERT(false, "Sorry the gecko-js backend does not yet support static methods");
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not attempted to implement static or factory methods for the gecko_js backend.

Naively, I don't see why they wouldn't work, since we're hewing pretty closely to standard WebIDL here.

But it's not clear to me what is the best stategic use of our collective time at the moment - to keep the gecko_js backend up-to-date as new features are added to uniffi, or to allow it to go a bit stale and bring it up to speed in a burst when we come to use it for an integration in earnest.

@jhugman @dmose thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to step on Dan's toes here, but I support letting it go a bit stale in anticipation of that burst happening sooner rather than later.

@rfk
Copy link
Collaborator Author

rfk commented Feb 15, 2021

I should also say - I'm working on this in service of the fxa-client component and its FirefoxAccount::from_json static method, but I also think this could be useful for things we've talked about with Nimbus. It seems like a step in the direction of being able to manage a global Nimbus singleton that's shared between different foreign-language consumers (although it's not quite there yet).

It's a fairly common pattern for object-oriented interfaces to have
named functions or static methods that act as named "factories" for
producing object instances in ways that differ from the default
constructor. This commit adds basic support for such factories
by:

* Allowing owned object instances to be returned from functions
  and methods.
* Allowing interface methods to be marked as `static`.

The "owned object instances" part here is really key, since it
allows us to bypass a lot of the general concerns about returning
*references* to object instances that are outlined in #197 (and
that this commit does not at all attempt to tackle).
Copy link
Contributor

@jhugman jhugman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really excited to get static methods, and to be able to pass interface objects over the FFI, but I don't think this is quite what we need to fix the problem as described in the PR comment 0.

In #197, some amount of book keeping is required to keep track of objects that may have already gone over the FFI.

In this PR, you're not yet doing that bookkeeping, so we end up with a leaky abstraction:

val thing1 = getOrCreateThing(id = "id0")
val thing2 = Thing.getOrCreateThing(id = "id0")
val thing3 = thing2.getOrCreateThing(id = "id0")
val thing4 = thing2.getOrCreateThing(id = "id0")

assert(setOf(thing1,  thing2, thing3, thing4).length == 1) { "All things with the same id should be  the same" }

I wonder if we should be looking at #37 instead. It doesn't propose syntax for the UDL, but does cite the same example you're interested in solving.

[name=from_json]
constructor(string json_string)

FWIW: a bi-directional handlemap may be unavoidable at some point with this Prestige/The Disappearing Man style FFI. It's not necessary for #37, but if it's needed here, then it would unlock #40 for [Threadsafe] objects.

@@ -14,6 +14,24 @@ def __del__(self):
)

{% for meth in obj.methods() -%}
{%- if meth.is_static() -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Macros and includes have got us a long way, but at some point, I expect we'll be embedding templates within templates.

{% endfor %}
}

public class {{ obj.name() }}: {{ obj.name() }}Protocol {
private let handle: UInt64

init(fromRawHandle handle: UInt64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: reduce visibility, perhaps.

@@ -24,8 +30,27 @@ public class {{ obj.name() }}: {{ obj.name() }}Protocol {
}
}

// TODO: Maybe merge the two templates (i.e the one with a return type and the one without)
static func lift(_ handle: UInt64) throws -> {{ obj.name()|class_name_swift }} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: reduce visibility.

Comment on lines +73 to +76
fn lower(self) -> Self::FfiType {
// Note that this consumes `self`, transferring ownership to the handlemap.
{{ handle_map }}.insert(self).into_u64()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't developed the ownership mental model enough to know if this is true: my suspicion is that we need to do a bit more error reporting, or book keeping to make sure this doesn't get misused: either deliberately or not.

fn get_self(self) -> Self { self }
val thing1 = Thing()
val thing2 = thing1.getSelf()

assert thing1 == thing2 // fail!

@rfk
Copy link
Collaborator Author

rfk commented Feb 15, 2021

@jhugman two excellent potential counter-examples to the safety of this approach! I have responses for both, but will think about both (a) how to better document that it relies on these properties and (b) whether my own intuition about these properties being sufficient is correct.

First, the "get or create" pattern:

val thing1 = getOrCreateThing(id = "id0")
val thing2 = Thing.getOrCreateThing(id = "id0")
val thing3 = thing2.getOrCreateThing(id = "id0")
val thing4 = thing2.getOrCreateThing(id = "id0")

assert(setOf(thing1,  thing2, thing3, thing4).length == 1) { "All things with the same id should be  the same" }

Rust's popular once_cell crate is a good small example of what this pattern looks like in Rust. Notably, the ownership model prevents OnceCell<T>::get() from returning owned instances of T - if it could directly return the T contained in the cell, then there could be multiple things that think they own the T, which is not allowed. It must instead return a reference &T. (In fact, we saw a similar pattern when trying to make the less-locking handlemap, and we had to solve it there by handing out clones of T, then using Arc to make the clones cheap).

This PR only allows functions to return owned instances T, not references &T, so it doesn't yet allow the Rust code to produce an API that would violate expectations in the above way. If your Rust code tries to return a reference &T then the scaffolding will fail to compile.

Second, the "tunneled ownership" pattern (which I don't really have a ready name form but feels like it should have a name):

fn get_self(self) -> Self { self }
val thing1 = Thing()
val thing2 = thing1.getSelf()

assert thing1 == thing2 // fail!

It's true that one way to return an owned instance is to receive an owned instance as an argument, do something with it, then return that same owned instance back to the caller.

The UniFFI-generated bindings will never pass an owned instance of an object as an argument to a function or method:

  • In the general case we error out when parsing the UDL if you try to pass an Object.
  • In the special case of the self parameter, the scaffolding always passes &self rather than self, so the Rust code is never in a position to return it as an owned instance. If you try to write the fn get_self(self) -> Self { self } method on a UniFFI object, the scaffolding will fail to compile with a complaint about calling get_self with a reference.

So I believe the approach here to be safe in its current form. Does my reasoning make sense to you?

It's also clear that the safety could be fairly fragile, e.g. if we start allowing to receive an owned self at some point in the future, we might forget that we were depending on that as a safety invariant elsewhere. I propose taking the three places above where I've said "if you try to do this you will get an error" and turning them into concrete testcases, so if it ever stops being an error, we'll have some breadcrumbs to trace back to the safety properties. Specifically:

  • A test that the generated scaffolding fails if a Rust method returns a reference rather than an owned instance.
  • A test that passing Type::Object(_) as an argument to a function or method fails.
  • A test that the generated scaffolding fails to compile against a method like fn get_self(self).

Each of which with a note explaining how we rely on this to uphold a safety/soundness property.

What do you think @jhugman?

Prestige/Disappearing Map style FFI

BTW I really, really like this metaphor 👍

@rfk
Copy link
Collaborator Author

rfk commented Feb 15, 2021

(I'll also add that, if we ever do come to deal with returning references rather than owned instances, we've got a heck of a lot of lifetime-management stuff to reason through before we can do it safely).

@jhugman
Copy link
Contributor

jhugman commented Feb 16, 2021

I've thought about this today: thank you so much for you explanations.

I think I understand what you're saying: both fn get_self(self) -> Self and the get_or_create rely on passing owned references, that a lower(obj: T) function can pass into the handlemap.

A function like fn get_or_create(id: String) -> &T would be impossible to wrap because rustc would stop you passing a &T as a T.

I'm trying to work out why this feels wrong to me. My guesses:

  • we (uniffi) are generating code, and rustc will show a mismatch between the generated code and the user's code. The user feels like they should be able pass objects across the FFI, but only in very limited and extremely specific ways, but most of the time, it doesn't work in the way they might like. It makes uniffi feel broken.
  • there is already a mechanism which is semantically exactly what we're asking for, i.e. a class function passing exactly one owned object back to the foreign language: constructors.

If it were up to me, I think I'd propose you:

The one thing that I worry about with this plan is how to generate the WebIDL. I'd guess we could implement these are static factory methods too.

@rfk
Copy link
Collaborator Author

rfk commented Feb 16, 2021

Let me see if I can re-phrase your feedback to check if I've understood you correctly:

  • The way this is currently structured with static methods feels to the consumer like quite a general mechanism, but it has a bunch of limitations that may be non-obvious, and only works in very specific circumstances.
  • The specific circumstances in which it works are more-or-less exactly those of a "constructor".
  • So we should expose it to consumers via something that looks like a constructor, so they're more likely to understand and have a good time with this feature.

Right-ish?

make this about fixing #37. introduce named constructors, which generate bindings and scaffolding as static factory methods.

In an earlier version of this I was actually treating them as alternate constructors, but it got weird when generating the bindings because different languages have very different ideas about how to do multiple constructors. I think I'd be OK with having these be declared as constructors in the UDL but actually show up as static factory methods in the generated bindings though.

continue to disallow passing of objects across the FFI, and solve for &T in #197, without the fragility here.

FWIW, I think if we try to allow general-purpose API functions returning &T we're going to get into a lot of fights between Rust's strict single-ownership model and the foreign language's flexible GC model. Note entirely sure what a good solution is here yet, and I really don't want to have to teach the bindings generated about lifetimes :-/

Copy link
Member

@mhammond mhammond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to play a little more with this and try to understand how arbitrary object instances etc work in this patch and whether it introduces the possibility of footguns, but thought I'd publish my existing comments first.

@@ -64,6 +64,10 @@ already_AddRefed<{{ obj.name()|class_name_cpp(context) }}> {{ obj.name()|class_n
{%- endfor %}

{%- for meth in obj.methods() %}
{% if meth.is_static %}
% if e.has_associated_data() %}
MOZ_STATIC_ASSERT(false, "Sorry the gecko-js backend does not yet support static methods");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to step on Dan's toes here, but I support letting it go a bit stale in anticipation of that burst happening sooner rather than later.

obj = cls.__new__(cls)
obj._handle = handle
return obj
{% endif %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: trailing newline


{% if ci.iter_object_definitions().len() > 0 %}
def liftObject(cls, handle):
"""Helper to create an object instace from a handle.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 'instace'.

@@ -181,7 +181,7 @@ mod filters {
Type::Boolean => format!("(True if {} else False)", nm),
Type::String => format!("{}.consumeIntoString()", nm),
Type::Enum(name) => format!("{}({})", class_name_py(name)?, nm),
Type::Object(_) => panic!("No support for lifting objects, yet"),
Type::Object(name) => format!("liftObject({}, {})", class_name_py(name)?, nm),
Type::CallbackInterface(_) => panic!("No support for lifting callback interfaces, yet"),
Type::Error(_) => panic!("No support for lowering errors, yet"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated to this patch, but the "lowering" here looks like copy-pasta?

@@ -181,7 +181,7 @@ mod filters {
Type::Boolean => format!("(True if {} else False)", nm),
Type::String => format!("{}.consumeIntoString()", nm),
Type::Enum(name) => format!("{}({})", class_name_py(name)?, nm),
Type::Object(_) => panic!("No support for lifting objects, yet"),
Type::Object(name) => format!("liftObject({}, {})", class_name_py(name)?, nm),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mildly surprised that this is a name and not an already created instance, and also a bit confused about how this is "used to support factory functions and methods" as the comment in liftObject, but that says more about my lack of uniffi experience than anything else...

}
{% endfor %}

// We proide a `ViaFfi` impl for each object, but the only thing it can currently
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: 'proide'

// do is lower an owned instance into a handle. This is useful for returning new
// instances of the object from static methods, and we know it is safe and sound
// because Rust's ownership system ensures that:
// * the thing we're operating on is an owned instance (not a reference) and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to get my head around this - an alternative shorter phrasing that probably wouldn't have stopped me for quite as long is that "We are consuming the object to transfer ownership into the handle-map, so that means there can't be any outstanding references" or similar.

@rfk
Copy link
Collaborator Author

rfk commented Feb 17, 2021

Thanks @mhammond; just for visibility, I wanted to note that I'm going to timebox an experiment with the "alternate constructors" approach suggested by James and see how it feels. I will most likely push the result as a separate PR.

@mhammond
Copy link
Member

Let me see if I can re-phrase your feedback to check if I've understood you correctly:

Somehow I missed that comment when having a look, but I think that phrasing is exactly what I was struggling a bit with, and also that I didn't see how we guarded against the not-supported uses cases. I'll look forward to the new PR.

@rfk
Copy link
Collaborator Author

rfk commented Feb 17, 2021

I'm going to timebox an experiment with the "alternate constructors" approach suggested by James and see how it feels.
I will most likely push the result as a separate PR.

I've now pushed this here: #387

I think it came together pretty well!

@jhugman
Copy link
Contributor

jhugman commented Feb 17, 2021

@rfk I think your restatement of my feedback looks accurate.

FTR, I hope passing &T back and forth and #197 gets fixed one day, but I'm not optimistic!

Will review #387, but a quick read looks very promising. Thanks for helping me working through the issues here.

@rfk
Copy link
Collaborator Author

rfk commented Feb 17, 2021

Will review #387, but a quick read looks very promising.

Thanks! I think the fact that the changes to the templates are substantially cleaner in that PR than in this one, is a good sign that it's a better incremental change.

I'm going to close this in favour of #387

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants