[WIP] Demonstrate simplification of Iterator in js-sys #1043

richard-uk1 · 2018-11-20T15:51:53Z

Simplify iterator implementation in js-sys.

This PR is currently for demonstration purposes, please do not merge.

Simplify iterator implementation in `js-sys`.

alexcrichton · 2018-11-26T04:11:02Z

Sorry I haven't quite caught up on the surrounding discussion here, and sorry for the delay! Can you expand a bit on how this simplification is possible though and why it works the same as before?

richard-uk1 · 2018-11-26T11:00:55Z

Essentially it is because JavaScript iterators have a next method that already matches the semantics of Iterator - it just has a different pattern (an object with done book rather than an option). This means we can just implement Iterator directly.

The current implementation keeps some state for when the iterator has ended, like the Fuse adapter does, so I mentioned that to recover current behaviour you can fuse the iterator.

richard-uk1 · 2018-11-26T11:05:33Z

I've also added a typed version based on the comments in the other thread.

It's not ready for merge because there are no tests - currently it's to demonstrate proof of concept.

alexcrichton · 2018-11-28T18:06:30Z

There's some previous discussion about why the fusing is implementing, maybe @fitzgen you have thoughts on this?

fitzgen · 2018-11-28T18:33:45Z

Essentially it is because JavaScript iterators have a next method that already matches the semantics of Iterator - it just has a different pattern (an object with done book rather than an option). This means we can just implement Iterator directly.

This isn't quite true becuase JS can always throw on each next invocation, where as the equivalent in rust would be if Iterator::Item is a Result or not.

Basically we have to assume that Iterator::Item is always a Result for JS which then can cause issues for combinators like collect. 99% of the time, one shouldn't continue to call next after it returns an error, but most combinators don't handle this correctly (hence the fallible_iterator crate's existence). This isn't a theoretical bug, I've run into this multiple times before.

fitzgen · 2018-11-28T18:36:53Z

crates/js-sys/src/lib.rs

+/// An `Iterator` wrapper for when we know the type that will be returned.
+///
+/// We also assume that errors from the use of this iterator are not possible.
+pub struct TypedIterator<T> {


I do like the idea of building typed iterators on top, though.

Unsure if it belongs in js-sys or a third party crate.

Maybe want to make Item = Result<T> to handle potential exceptions and also use checked casts that return Err on cast failures?

The primary purpose (but not the only purpose!) of the typed iterator is to provide for iteration on web-sys types.

For example, NodeList is an iterator of Node. And so we want to provide a TypedIterator<Node> for NodeList.

Since we know that the NodeList iterator doesn't error, we don't want it to return Result.

That's why it's called TypedIterator::from_iterator_unchecked, to emphasize the unchecked nature of it.

In practice, it's very rare for iterators to error (in general), and in the case of web-sys specifically I believe it's guaranteed that they don't error.

I think it should be in js-sys, because:

web-sys needs it, and...

...a lot of JS types outside of web-sys also define iterators, so we should have a convenient zero-cost way for users to create iterators for those types.

It fits in nicely with the existing iterator types.

1. web-sys needs it, and... 2. ...a lot of JS types outside of web-sys also define iterators, so we should have a convenient zero-cost way for users to create iterators for those types.

I think these two use cases call for different iterator types, though. For web-sys we can reasonably trust that the iterator will yield items that it says it will, and therefore it makes sense to use unchecked casts. For (2) I don't think that is the case at all, and we should be using checked casts by default.

For (2) I don't think that is the case at all, and we should be using checked casts by default.

I disagree. When somebody creates bindings for an existing JS iterator, they know whether the JS iterator will throw or not.

Requiring all users of the iterator to unwrap the Result is really bad for ergonomics, and it doesn't give any extra safety either (since they know that the iterator cannot error).

It's not any different from how JsCast provides both unchecked_into and dyn_into. This is the unchecked_into version of iterators.

There is prior precedent for it as well: the wasm_bindgen macro by default does not catch errors, you have to specifically use catch to enable catching errors.

Having a checked iterator type sounds reasonable, but there should definitely be an unchecked iterator type as well (which is what this type is).

So are you proposing to have both checked and unchecked typed iterators? Because I would be completely fine with that.

So are you proposing to have both checked and unchecked typed iterators? Because I would be completely fine with that.

Yes.

richard-uk1 · 2018-11-28T18:59:23Z

Thanks for the link to discussion - I hadn't seen that before.

I think this becomes a discussion about the exact semantics of javascript. Say I write my own iterator:

function repeat(value) {
  let iterator = {
    next: () => ({done: false, value})
  };
  let iterable = {};
  iterable[Symbol.iterator] = () => iterator;
  return iterable;
}

// Turn your computer into a room heater.
for(let val of repeat(5)) {
  console.log(val);
}

Then I can make it do any crazy thing I want. For example, I could make next return done: false the first time and then carry on working after, or it could return 42.

// iterator.next() returns 42
for(let val of repeat(1))
console.log(val)
// > TypeError: iterator.next() returned a non-object value

Isn't this the same situation as you would have in rust - an iterator only has to return None once, then it can return anything, so

struct DecRange(pos: i32);

impl Iterator for DecRange {
  type Item = i32;
  fn next(&mut self) -> Self::Item {
    let out = self.0;
    self.0 -= 1;
    if out == 0 {
      None
    } else {
      Some(out)
    }
  }
}

where you get garbage if you use it after it has returned None.

In rust, its up to the user to fuse the iterator if they need that behavior, why is it different here?

Is it because you're saying that if there is an exception (result is err), then you don't want to call the iterator again? I think if you want to use that logic you should return None on any exception, and then the Item can be the type (JsValue) rather than a Result. Or, alternatively, just return the error but allow next to be called again, so not fusing the iterator. I think the current solution is a bit 1/2 in 1/2 out.

richard-uk1 · 2018-11-28T19:03:01Z

I think I would argue that in this crate you are correct to use Result<JsValue, JsValue>, but that you should not fuse the iterator, leaving that to a higher level library. There may conceivably be some terrible use of the JavaScript iterator protocol that requires recovering from an error.

This also has the advantage of both simplifying the implementation and the API (there's no need for 3 types you would need otherwise), and more closely matching the semantics of javascript (an iterator is an Iterator, not an IntoIterator, if you like). Any user can recover the current behaviour by just fusing the iterator, or if they want to expose that behavior, wrapping it in a type that fuses it.

EDIT I may well have missed something obvious or misunderstood something, I'm just bouncing some ideas around.

Pauan · 2018-12-01T10:48:10Z

@derekdreery Or, alternatively, just return the error but allow next to be called again, so not fusing the iterator.

I think the goal should be to replicate the JS iteration protocol as closely as reasonable.

The JS iteration protocol is designed so that as soon as the iterator errors, it will end the iteration (it will never call next again after that).

So unfortunately allowing next to be called again isn't a good idea, especially because the iterator may be in an inconsistent state, so calling next again is especially bad.

Speaking of consistency with JS, currently js-sys isn't quite correct: the JS iteration protocol is designed to call the return method when the iteration is completed (steps 5j and 5m).

The spec is hard to read, but looking at the output of Babel is useful. This JS code...

for (let foo of bar()) {
    console.log(foo);
}

...gets translated into this:

var _iteratorNormalCompletion = true;
var _didIteratorError = false;
var _iteratorError = undefined;

try {
    for (var _iterator = bar()[Symbol.iterator](), _step; !(_iteratorNormalCompletion = (_step = _iterator.next()).done); _iteratorNormalCompletion = true) {
        var foo = _step.value;

        console.log(foo);
    }
} catch (err) {
    _didIteratorError = true;
    _iteratorError = err;
} finally {
    try {
        if (!_iteratorNormalCompletion && _iterator.return) {
            _iterator.return();
        }
    } finally {
        if (_didIteratorError) {
            throw _iteratorError;
        }
    }
}

Essentially, when the iteration is over (either because the iterator has finished, or because an error happened), it will call the return() method on the iterator (if it exists).

This is important for generators, which have a return method.

richard-uk1 · 2018-12-02T13:56:48Z

Continues in #1077

For demonstration purposes, simplify Iterator

df32c8f

Simplify iterator implementation in `js-sys`.

richard-uk1 mentioned this pull request Nov 20, 2018

Add iterable/iterator support for wasm-bindgen. #1036

Closed

Demonstrate typed iterator.

3cebb85

fitzgen reviewed Nov 28, 2018

View reviewed changes

richard-uk1 mentioned this pull request Dec 2, 2018

Rename iterator types and add some docs #1077

Closed

richard-uk1 closed this Dec 2, 2018

Pauan mentioned this pull request Jan 2, 2019

web-sys Missing method #1143

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Demonstrate simplification of Iterator in js-sys #1043

[WIP] Demonstrate simplification of Iterator in js-sys #1043

richard-uk1 commented Nov 20, 2018

alexcrichton commented Nov 26, 2018

richard-uk1 commented Nov 26, 2018 •

edited

Loading

richard-uk1 commented Nov 26, 2018

alexcrichton commented Nov 28, 2018

fitzgen commented Nov 28, 2018

fitzgen Nov 28, 2018

Pauan Nov 29, 2018 •

edited

Loading

fitzgen Nov 29, 2018

Pauan Nov 30, 2018 •

edited

Loading

fitzgen Nov 30, 2018

richard-uk1 commented Nov 28, 2018

richard-uk1 commented Nov 28, 2018 •

edited

Loading

Pauan commented Dec 1, 2018

richard-uk1 commented Dec 2, 2018

[WIP] Demonstrate simplification of Iterator in js-sys #1043

[WIP] Demonstrate simplification of Iterator in js-sys #1043

Conversation

richard-uk1 commented Nov 20, 2018

alexcrichton commented Nov 26, 2018

richard-uk1 commented Nov 26, 2018 • edited Loading

richard-uk1 commented Nov 26, 2018

alexcrichton commented Nov 28, 2018

fitzgen commented Nov 28, 2018

fitzgen Nov 28, 2018

Choose a reason for hiding this comment

Pauan Nov 29, 2018 • edited Loading

Choose a reason for hiding this comment

fitzgen Nov 29, 2018

Choose a reason for hiding this comment

Pauan Nov 30, 2018 • edited Loading

Choose a reason for hiding this comment

fitzgen Nov 30, 2018

Choose a reason for hiding this comment

richard-uk1 commented Nov 28, 2018

richard-uk1 commented Nov 28, 2018 • edited Loading

Pauan commented Dec 1, 2018

richard-uk1 commented Dec 2, 2018

richard-uk1 commented Nov 26, 2018 •

edited

Loading

Pauan Nov 29, 2018 •

edited

Loading

Pauan Nov 30, 2018 •

edited

Loading

richard-uk1 commented Nov 28, 2018 •

edited

Loading