Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add iterable/iterator support for wasm-bindgen. #1036

Closed
derekdreery opened this issue Nov 17, 2018 · 19 comments
Closed

Add iterable/iterator support for wasm-bindgen. #1036

derekdreery opened this issue Nov 17, 2018 · 19 comments

Comments

@derekdreery
Copy link
Contributor

derekdreery commented Nov 17, 2018

EDIT I'm going to write some notes for myself about iterators in ECMAScript and webidl. The original content of the issue follows below.

Iterators and iterables in ECMAScript

In ECMAScript, there are 2 special interfaces that objects can implement, iterator and iterable. A type is an iterator if it has a method called next that returns an object of the form { done: boolean, value: T } for any type T that ECMAScript supports.

A type is iterable if it has a property with the key Symbol.iterator, and the value of that property is an iterator.

These 2 types match nicely to Iterator and FromIterator in rust, and work in a similar way.

Iterables in webidl

In webidl there is also the concept of an iterable that is different to both iterators and iterables in ECMAScript. In webidl, an iterable is an array-type or a map-type. It has the Symbol.iterator key, so it is an ECMAScript iterable, but also 4 methods: entries, forEach, keys, and values.

The semantics of these functions are:

  • entries - An iterator over arrays of length 2, where the first value is the key (or index in the case of array-like iterables), and the second value is the value.
  • forEach - Run the supplied function on each value - in the case of map-like keys are discarded.
  • keys - An iterator over the keys or indices of the iterable, depending on whether it is array-like or map-like
  • values - An iterator over the values of the iterable, discarding keys.

If the iterable is array-type, the iterator at Symbol.iterator is the same as values, whereas if the iterable is map-type, then the iterator at Symbol.iterator is the same as entries.

Original issue

I've been playing with solutions to #514, but I think the neatest is to have support for iterables/iterators within wasm-bindgen. It would work something like the following:

  1. For now, the binding is for imports only (js -> rust). It would be possible the other way as well if that were desirable.

  2. An imported type can be marked as iterator, something like the following

    [wasm_bindgen]
    extern {
        #[wasm_bindgen(iterator(item = "OtherType"))]
        type MyIterator;
    }

    which would mean that the type MyIterator has a method next in js, returning an object { done: bool, value: OtherType }.

  3. We use this information to generate

    impl Iterator for MyIterator {
        type Item = OtherType;
    
        fn next(&mut self) -> Option<Self::Type> { .. }
    }

    There would be some work to do behind the scenes, things like checking there isn't already a method called next on the type, then adding one and generating/using a struct of the form

    struct IteratorNext<T> {
        done: bool
        value: T
    }
    
    impl<T> IteratorNext<T> {
        fn to_option(self) -> Option<T> {
            if self.done { Some(self.value) } else { None }
        }
    }

    that the next method will return.

Then for iterable, we need to either

  1. Generate an anonymous type implementing (iterator in js, IntoIterator::IntoIter in rust) as above and then implement IntoIterator for the iterable, or
  2. Require an annotation on the #[wasm_bindgen] attribute specifying which type IntoIter is.

For the iterable option, we need to generate the js code for indexing the object at Symbol.iterator (obj[Symbol.iterator]).

Ok so this is really an RFC, I don't think I should just implement this because it's making some design choices that should be agreed (or not). I'm also happy to implement it, but may need some mentoring, especially around the actual wasm-bindgen tool, as I've only worked with the compiler plugin so far.

EDIT There is an alternative approach to the fn next(&mut self) -> Option<Self::Item>: we could do the conversion in javascript - so during rust compilation we insert something that the cli tool picks up, and actually implements this method on the js side, thereby avoiding the need for the IteratorNext type.

@derekdreery
Copy link
Contributor Author

I'm not sure how this interacts with js_sys::Iterator and friends. Could we just generate code that uses those types? They seem to be dynamically typed, when at least sometimes you statically know the types (as in webidl).

@derekdreery
Copy link
Contributor Author

@Pauan
Copy link
Contributor

Pauan commented Nov 17, 2018

I don't think we need macro support for this, a wrapper like this should work (untested):

struct Iterable<A> {
    value: js_sys::Iterator,
    phantom: PhantomData<A>,
}

impl<A> for Iterable<A> {
    fn new(value: &JsValue) -> Result<Self, JsValue> {
        let value = js_sys::Reflect::get(value, js_sys::Symbol::iterator().as_ref())?;

        let value = js_sys::Reflect::apply(value.unchecked_ref(), &value, &js_sys::Array::new())?;

        Self {
            value: value.into(),
            phantom: PhantomData,
        }
    }
}

impl<A> Iterator for Iterable<A> {
    type Item = A;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        self.value.next().map(|x| {
            JsCast::unchecked_from_js(x.unwrap())
        })
    }
}

It can be made more efficient (e.g. by creating some bindings that lookup and call Symbol.iterator in one step, without using Reflect), but the functionality should be correct.

And now you can easily make other types Iterable:

impl NodeList {
    #[inline]
    fn iter(&self) -> Iterable<Node> {
        Iterable::new(self.as_ref()).unwrap()
    }
}

impl IntoIterator for NodeList {
    type Item = Node;
    type IntoIter = Iterable<Self::Item>;

    #[inline]
    fn into_iter(self) -> Self::IntoIter {
        self.iter()
    }
}

Notice that it returns a Node, not a Result<JsValue, JsValue>, that's because Iterable uses JsCast::unchecked_from_js to do the type conversions.

As its name implies, this is unchecked, so there are no runtime checks (it is zero-cost), but that means it's up to you to ensure that the static types are correct.

@Pauan
Copy link
Contributor

Pauan commented Nov 17, 2018

(To be clear, I think having Iterable, or something similar to it like TryFrom<JsValue> for Iterator would be good, and probably should live in js-sys, I just don't think we need macro/attribute support for it).

@alexcrichton
Copy link
Contributor

I think the best strategy might be to have an Iterator<T>-style type in the web-sys crate which is used for various iterators on types, and that way we'd only have one type to deal with and all types could share it!

@derekdreery
Copy link
Contributor Author

I think the best strategy might be to have an Iterator-style type in the web-sys crate which is used for various iterators on types, and that way we'd only have one type to deal with and all types could share it!

Do you mean to replace Iterator in js-sys, or to live alongside it?

@alexcrichton
Copy link
Contributor

I think it's probably worthwhile to have a new one in web-sys with a type parameter since we know the types, which means we can't use js_sys::Iterator for the same purpose

@Pauan
Copy link
Contributor

Pauan commented Nov 19, 2018

@alexcrichton It should be in js-sys: Iteration is defined in ES6 (which WebIDL uses), so it can work on any JS object.

Also, we can just name it something else to avoid conflict with js_sys::Iterator (like how I named it Iterable<T> in my above post).

@alexcrichton
Copy link
Contributor

Hm actually alternatively I think the best option here may be to generate methods that return impl Iterator<Item = ...>, it'd be great to not have to deal with naming concerns

@derekdreery
Copy link
Contributor Author

derekdreery commented Nov 19, 2018 via email

@Pauan
Copy link
Contributor

Pauan commented Nov 19, 2018

@alexcrichton Sure, that's fine for methods like iter, but:

  1. We still need to have something in js-sys (as a convenience for downstream users to create their own iterable types).

  2. It won't work for IntoIterator, which requires you to name the type.

So we still need to decide on a name. I think Iterable is fine, though maybe FromIterable is more accurate?

@derekdreery
Copy link
Contributor Author

derekdreery commented Nov 20, 2018

Here is my understanding of the existing iterator types in js-sys:

  1. IntoIter - A type that wraps the iterator and keeps track if we've seen done: true on the js side.

    pub struct IntoIter {
        js: Iterator,
        state: IterState,
    }

    In theory this shouldn't be necessary store state here (done: bool) since iterators should continue to return done: true from the next method, but it's probably nice to have the extra safety net.

  2. Iter - The same as IntoIter, but borrows rather than owns the external type.

    pub struct Iter<'a> {
        js: &'a Iterator,
        state: IterState,
    }
  3. Iterator - The external type from js.

  4. IteratorNext - The shape of a response from js, like {done: bool, value: JsValue}.

So we could introduce some more types here for where we know the type

pub struct TypedIterator<T> {
    js: Iterator,
    state: IterState,
    ty: PhantomData<T>,
}

that panics when the type is incorrect. Is this what you prefer? Or should these types live in web-sys, since in js you can never trust types except when using with webidl?

@derekdreery
Copy link
Contributor Author

derekdreery commented Nov 20, 2018

Also would you consider renaming these types to maybe

  • Iterator -> Iterator
  • Iter -> IntoIterBorrowed
  • IntoIter -> IntoIterOwned

And mention in the docs that you don't really need to look at these types at all - they are just there as things that impl Iterator.

There is also try_iter that kinda does the job of js iterable.

@Pauan
Copy link
Contributor

Pauan commented Nov 20, 2018

@derekdreery Here is my understanding of the existing iterator types in js-sys

That sounds correct.


So we could introduce some more types here for where we know the type [...] that panics when the type is incorrect.

In the general case I don't think we can panic when the type is incorrect (since JS doesn't have a general-purpose type-checking feature).

I think using JsCast::unchecked_from_js is fine. If the user wants runtime checking, they can use IntoIter and then pattern match on the Result and then do runtime checks on the JsValue.

I do like the name TypedIterator, though I'd call it TypedIter (for consistency with the others).

@alexcrichton Do you have a different opinion about this?


Or should these types live in web-sys, since in js you can never trust types except when using with webidl?

To reiterate: this should not live in web-sys, because any JS object can implement the Iterable protocol, it's not WebIDL-specific.

As for the types, that's not a problem, it can just have an Iterable::unchecked_new function, making it clear that it's up to the caller to ensure the correct types. It's no different from JsCast::unchecked_from_js (which already exists in wasm-bindgen).


There is also try_iter that kinda does the job of js iterable.

Oh! I didn't realize try_iter exists. That does simplify things a bit. In that case, it's possible to rewrite Iterable<T> like this:

struct Iterable<A> {
    value: js_sys::IntoIter,
    phantom: PhantomData<A>,
}

impl<A> for Iterable<A> {
    fn unchecked_new(value: &JsValue) -> Self {
        Self {
            value: js_sys::try_iter(value).unwrap().unwrap(),
            phantom: PhantomData,
        }
    }
}

impl<A> Iterator for Iterable<A> {
    type Item = A;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        self.value.next().map(|x| {
            JsCast::unchecked_from_js(x.unwrap())
        })
    }
}

Though TypedIter is good too. Either is fine, as long as they allow for zero-cost statically typed iteration.

@derekdreery
Copy link
Contributor Author

derekdreery commented Nov 20, 2018

@Pauan I feel like we're both moving towards consensus. I just have a few more questions:

Iterator can just impl Iterator (we don't need the other 3 types):

From MDN (todo check this is an authoritative source):

The next method always has to return an object with appropriate properties including done and value. If a non-object value gets returned (such as false or undefined), a TypeError ("iterator.next() returned a non-object value") will be thrown.

This seems to suggest we can just go right ahead and implement std::iter::Iterator for Iterator, without needing the IntoIterator impls or the types they return?

I've implemented what this would look like in #1043. If you want the old behavior (rust tracks whether we are done), I think you can just fuse the iterator.

I forgot what my other questions were - I'll put them on when I remember!

EDIT Looking at js-sys/src/lib.rs, it seems that the iterable webidl decl is similar to Array Iterator and Map Iterator in js land.

EDIT2 I've also added a demonstration of a typed iterator to the PR, basically copied from @Pauan's comment.

Also, all names for bikeshedding obviously if you like the semantics.

@alexcrichton
Copy link
Contributor

Sorry I've been a bit slow to catch up on this, but are there still unresolved questions for how we might expose this?

@derekdreery
Copy link
Contributor Author

They are both kinda supported already - there is a try_iter method for iterable and the Iterator type for iterator. In the attached PR I'm exploring how to make these constructs simpler, more discoverable, and what a typed version would look like.

@zimond
Copy link

zimond commented Nov 10, 2020

Friendly asking if there's any progress on this? Is the related PR still valid?

@daxpedda
Copy link
Collaborator

Fixed by #3962.
Though we decided to not encode the type in the iterator for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants