-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Demonstrate simplification of Iterator in js-sys #1043
Conversation
Simplify iterator implementation in `js-sys`.
Sorry I haven't quite caught up on the surrounding discussion here, and sorry for the delay! Can you expand a bit on how this simplification is possible though and why it works the same as before? |
Essentially it is because JavaScript iterators have a The current implementation keeps some state for when the iterator has ended, like the |
I've also added a typed version based on the comments in the other thread. It's not ready for merge because there are no tests - currently it's to demonstrate proof of concept. |
There's some previous discussion about why the fusing is implementing, maybe @fitzgen you have thoughts on this? |
This isn't quite true becuase JS can always throw on each Basically we have to assume that |
/// An `Iterator` wrapper for when we know the type that will be returned. | ||
/// | ||
/// We also assume that errors from the use of this iterator are not possible. | ||
pub struct TypedIterator<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like the idea of building typed iterators on top, though.
Unsure if it belongs in js-sys or a third party crate.
Maybe want to make Item = Result<T>
to handle potential exceptions and also use checked casts that return Err
on cast failures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary purpose (but not the only purpose!) of the typed iterator is to provide for iteration on web-sys types.
For example, NodeList
is an iterator of Node
. And so we want to provide a TypedIterator<Node>
for NodeList
.
Since we know that the NodeList
iterator doesn't error, we don't want it to return Result
.
That's why it's called TypedIterator::from_iterator_unchecked
, to emphasize the unchecked nature of it.
In practice, it's very rare for iterators to error (in general), and in the case of web-sys specifically I believe it's guaranteed that they don't error.
I think it should be in js-sys, because:
-
web-sys needs it, and...
-
...a lot of JS types outside of web-sys also define iterators, so we should have a convenient zero-cost way for users to create iterators for those types.
-
It fits in nicely with the existing iterator types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1. web-sys needs it, and... 2. ...a lot of JS types outside of web-sys also define iterators, so we should have a convenient zero-cost way for users to create iterators for those types.
I think these two use cases call for different iterator types, though. For web-sys we can reasonably trust that the iterator will yield items that it says it will, and therefore it makes sense to use unchecked casts. For (2) I don't think that is the case at all, and we should be using checked casts by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For (2) I don't think that is the case at all, and we should be using checked casts by default.
I disagree. When somebody creates bindings for an existing JS iterator, they know whether the JS iterator will throw or not.
Requiring all users of the iterator to unwrap
the Result
is really bad for ergonomics, and it doesn't give any extra safety either (since they know that the iterator cannot error).
It's not any different from how JsCast
provides both unchecked_into
and dyn_into
. This is the unchecked_into
version of iterators.
There is prior precedent for it as well: the wasm_bindgen
macro by default does not catch errors, you have to specifically use catch
to enable catching errors.
Having a checked iterator type sounds reasonable, but there should definitely be an unchecked iterator type as well (which is what this type is).
So are you proposing to have both checked and unchecked typed iterators? Because I would be completely fine with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So are you proposing to have both checked and unchecked typed iterators? Because I would be completely fine with that.
Yes.
Thanks for the link to discussion - I hadn't seen that before. I think this becomes a discussion about the exact semantics of javascript. Say I write my own iterator: function repeat(value) {
let iterator = {
next: () => ({done: false, value})
};
let iterable = {};
iterable[Symbol.iterator] = () => iterator;
return iterable;
}
// Turn your computer into a room heater.
for(let val of repeat(5)) {
console.log(val);
} Then I can make it do any crazy thing I want. For example, I could make // iterator.next() returns 42
for(let val of repeat(1))
console.log(val)
// > TypeError: iterator.next() returned a non-object value Isn't this the same situation as you would have in rust - an iterator only has to return None once, then it can return anything, so struct DecRange(pos: i32);
impl Iterator for DecRange {
type Item = i32;
fn next(&mut self) -> Self::Item {
let out = self.0;
self.0 -= 1;
if out == 0 {
None
} else {
Some(out)
}
}
} where you get garbage if you use it after it has returned In rust, its up to the user to Is it because you're saying that if there is an exception (result is err), then you don't want to call the iterator again? I think if you want to use that logic you should return None on any exception, and then the Item can be the type (JsValue) rather than a Result. Or, alternatively, just return the error but allow next to be called again, so not fusing the iterator. I think the current solution is a bit 1/2 in 1/2 out. |
I think I would argue that in this crate you are correct to use This also has the advantage of both simplifying the implementation and the API (there's no need for 3 types you would need otherwise), and more closely matching the semantics of javascript (an iterator is an Iterator, not an IntoIterator, if you like). Any user can recover the current behaviour by just fusing the iterator, or if they want to expose that behavior, wrapping it in a type that fuses it. EDIT I may well have missed something obvious or misunderstood something, I'm just bouncing some ideas around. |
I think the goal should be to replicate the JS iteration protocol as closely as reasonable. The JS iteration protocol is designed so that as soon as the iterator errors, it will end the iteration (it will never call So unfortunately allowing Speaking of consistency with JS, currently js-sys isn't quite correct: the JS iteration protocol is designed to call the The spec is hard to read, but looking at the output of Babel is useful. This JS code... for (let foo of bar()) {
console.log(foo);
} ...gets translated into this: var _iteratorNormalCompletion = true;
var _didIteratorError = false;
var _iteratorError = undefined;
try {
for (var _iterator = bar()[Symbol.iterator](), _step; !(_iteratorNormalCompletion = (_step = _iterator.next()).done); _iteratorNormalCompletion = true) {
var foo = _step.value;
console.log(foo);
}
} catch (err) {
_didIteratorError = true;
_iteratorError = err;
} finally {
try {
if (!_iteratorNormalCompletion && _iterator.return) {
_iterator.return();
}
} finally {
if (_didIteratorError) {
throw _iteratorError;
}
}
} Essentially, when the iteration is over (either because the iterator has finished, or because an error happened), it will call the This is important for generators, which have a |
Continues in #1077 |
Simplify iterator implementation in
js-sys
.This PR is currently for demonstration purposes, please do not merge.