Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialize StepBy<Range(Inclusive)> #51601

Merged
merged 1 commit into from
Jun 21, 2018

Conversation

Emerentius
Copy link
Contributor

@Emerentius Emerentius commented Jun 16, 2018

Part of #51557, related to #43064, #31155

As discussed in the above issues, step_by optimizes very badly on ranges which is related to

  1. the special casing of the first StepBy::next() call
  2. the need to do 2 additions of n - 1 and 1 inside the range's next()

This PR eliminates both by overriding next() to always produce the current element and also step ahead by n elements in one go. The generated code is much better, even identical in the case of a Range with constant start and end where start+step can't overflow. Without constant bounds it's a bit longer than the manual loop. RangeInclusive doesn't optimize as nicely but is still much better than the original asm.
Unsigned integers optimize better than signed ones for some reason.

See the following two links for a comparison.

godbolt: specialization for ..
godbolt: specialization for ..=

RangeFrom, the only other range with an Iterator implementation can't be specialized like this without changing behaviour due to overflow. There is no way to save "finished-ness".

The approach can not be used in general, because it would produce side effects of the underlying iterator too early.

May obsolete #51435, haven't checked.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[01:05:39] travis_fold:start:test_stage1-core
travis_time:start:test_stage1-core
Testing core stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:05:39]    Compiling core v0.0.0 (file:///checkout/src/libcore)
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == isize`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1602 |     assert_eq!((0..20).step_by(5).collect::<Vec<isize>>(), [0, 5, 10, 15]);
[01:05:46]      |                                   ^^^^^^^ expected i32, found isize
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `isize`
[01:05:46] 
[01:05:46] 
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == u8`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1605 |     assert_eq!((200..255).step_by(50).collect::<Vec<u8>>(), [200, 250]);
[01:05:46]      |                                       ^^^^^^^ expected i32, found u8
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `u8`
[01:05:46] 
[01:05:46] 
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == isize`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1606 |     assert_eq!((200..-5).step_by(1).collect::<Vec<isize>>(), []);
[01:05:46]      |                                     ^^^^^^^ expected i32, found isize
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `isize`
[01:05:46] 
[01:05:46] 
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == isize`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1607 |     assert_eq!((200..200).step_by(1).collect::<Vec<isize>>(), []);
[01:05:46]      |                                      ^^^^^^^ expected i32, found isize
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `isize`
[01:05:46] 
[01:05:51] error: aborting due to 4 previous errors
[01:05:51] error: aborting due to 4 previous errors
[01:05:51] 
[01:05:51] For more information about this error, try `rustc --explain E0271`.
[01:05:51] error: Could not compile `core`.
[01:05:51] 
[01:05:51] To learn more, run the command again with --verbose.
[01:05:51] 
[01:05:51] 
[01:05:51] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:05:51] 
[01:05:51] 
[01:05:51] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:05:51] Build completed unsuccessfully in 0:25:25
[01:05:51] Build completed unsuccessfully in 0:25:25
[01:05:51] make: *** [check] Error 1
[01:05:51] Makefile:58: recipe for target 'check' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:13d40a90
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Emerentius
Copy link
Contributor Author

#36262 strikes again. And here I thought I'd be safe 'cause it's always clear which range type we have and all integers implement Step.

@leonardo-m
Copy link

Making Specialize StepBy<Range(Inclusive)> efficient has a high priority in system language design.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[01:17:42] 
[01:17:42] running 845 tests
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ..........................................................F.........................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:43] .............................................
[01:17:43] failures:
[01:17:43] 
[01:17:43] ---- iter::test_range_step stdout ----
[01:17:43] thread 'iter::test_range_step' panicked at 'assertion failed: `(left == right)`
[01:17:43]   left: `[200]`,
[01:17:43]  right: `[200, 250]`', libcore/../libcore/tests/iter.rs:1605:5
[01:17:43] 
[01:17:43] failures:
[01:17:43]     iter::test_range_step
[01:17:43] 
[01:17:43] 
[01:17:43] test result: FAILED. 842 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out
[01:17:43] 
[01:17:43] error: test failed, to rerun pass '--test coretests'
[01:17:43] 
[01:17:43] 
[01:17:43] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:17:43] 
[01:17:43] 
[01:17:43] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:17:43] Build completed unsuccessfully in 0:29:51
[01:17:43] Build completed unsuccessfully in 0:29:51
[01:17:43] Makefile:58: recipe for target 'check' failed
[01:17:43] make: *** [check] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2eb90abf
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Emerentius
Copy link
Contributor Author

Pro-tip: if your local tests don't run against your own code, they are much more likely to succeed.

#[inline]
fn next(&mut self) -> Option<Self::Item> {
self.first_take = false;
if self.iter.start >= self.iter.end {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Step only requires PartialOrd, so using >= is technically a behaviour change. It wouldn't change anything that's stable today, but I think the original way is better, with an NAN-like endpoint making the range empty. (See similar discussion for Range::contains in #49130 (comment))

@pietroalbini pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 18, 2018
@pietroalbini
Copy link
Member

Picking someone from the libs team to review this. r? @sfackler

@Emerentius
Copy link
Contributor Author

Emerentius commented Jun 18, 2018

I flipped the comparisons back around to keep the original behaviour for NaN-like values (for when they are possible) and squashed the commits together.
Updated and cleaned up the the godbolt code and exchanged the links in the OP. Signed integers don't optimize as well as unsigned ones, most likely because of the additional checks in Step::add_usize but it's still an improvement.

@@ -737,6 +733,76 @@ impl<I> Iterator for StepBy<I> where I: Iterator {
}
}

// hidden trait for specializing iterator methods
// current use is limited to next() on StepBy
trait SpecIter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this trait mention StepBy in its name just to make it a bit more clear that it's limited to those impls?

#[inline]
fn next(&mut self) -> Option<Self::Item> {
self.first_take = false;
if !(self.iter.start < self.iter.end) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comparison is structured weirdly to handle NaN properly?

Copy link
Contributor Author

@Emerentius Emerentius Jun 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's if start < end { return stuff } else { None } flipped around so the positive case has no indent. Should I change it?

@sfackler
Copy link
Member

Seems reasonable to me!

@pietroalbini pietroalbini added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 19, 2018
the originally generated code was highly suboptimal
this brings it close to the same code or even exactly the same as a
manual while-loop by eliminating a branch and the
double stepping of n-1 + 1 steps

The intermediate trait lets us circumvent the specialization
type inference bugs
@Emerentius Emerentius force-pushed the step_by_range_diet branch from 2879174 to 000aff6 Compare June 19, 2018 17:34
@Emerentius
Copy link
Contributor Author

@sfackler Renamed SpecIter to StepBySpecIterator and its next() method to spec_next() following SpecExtend.

@pietroalbini pietroalbini added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 19, 2018
@sfackler
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Jun 21, 2018

📌 Commit 000aff6 has been approved by sfackler

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 21, 2018
@bors
Copy link
Contributor

bors commented Jun 21, 2018

⌛ Testing commit 000aff6 with merge 95979dc...

bors added a commit that referenced this pull request Jun 21, 2018
Specialize StepBy<Range(Inclusive)>

Part of #51557, related to #43064, #31155

As discussed in the above issues, `step_by` optimizes very badly on ranges which is related to
1. the special casing of the first `StepBy::next()` call
2. the need to do 2 additions of `n - 1` and `1` inside the range's `next()`

This PR eliminates both by overriding `next()` to always produce the current element and also step ahead by `n` elements in one go. The generated code is much better, even identical in the case of a `Range` with constant `start` and `end` where `start+step` can't overflow. Without constant bounds it's a bit longer than the manual loop. `RangeInclusive` doesn't optimize as nicely but is still much better than the original asm.
Unsigned integers optimize better than signed ones for some reason.

See the following two links for a comparison.

[godbolt: specialization for ..](https://godbolt.org/g/haHLJr)
[godbolt: specialization for ..=](https://godbolt.org/g/ewyMu6)

`RangeFrom`, the only other range with an `Iterator` implementation can't be specialized like this without changing behaviour due to overflow. There is no way to save "finished-ness".

The approach can not be used in general, because it would produce side effects of the underlying iterator too early.

May obsolete #51435, haven't checked.
@bors
Copy link
Contributor

bors commented Jun 21, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: sfackler
Pushing 95979dc to master...

@bors bors merged commit 000aff6 into rust-lang:master Jun 21, 2018
@jethrogb
Copy link
Contributor

This is buggy: #55985

@newpavlov
Copy link
Contributor

So should we temporary revert this PR? Or should we leave it as-is and wait for the proper fix? I personally prefer the first option.

@sfackler
Copy link
Member

Let's revert until it's fixed, yeah.

@newpavlov newpavlov mentioned this pull request Nov 18, 2018
@leonardo-m
Copy link

The performance of this patch was not good enough :-(

bors added a commit that referenced this pull request Nov 20, 2018
Revert #51601

Closes: #55985

Specialization of `StepBy<Range(Inclusive)>` results in an incorrectly behaving code when `step_by` is combined with `skip` or `nth`.

If this will get merged we probably should reopen issues previously closed by #51601 (if there was any).
@nikomatsakis nikomatsakis mentioned this pull request Nov 20, 2018
20 tasks
bors added a commit that referenced this pull request Nov 20, 2018
beta backport rollup

Backports of some beta-approved PRs

- [x] #55385: NLL: cast causes failure to promote to static
- [x] #56043: remove "approx env bounds" if we already know from trait
- [x] #56003: do not propagate inferred bounds on trait objects if they involve `Self`
- [x] #55852: Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint
- [x] #55804: rustdoc: don't inline `pub use some_crate` unless directly asked to
- [x] #56059: Increase `Duration` approximate equal threshold to 1us
- [x]  Keep resolved defs in path prefixes and emit them in save-analysis #54145
- [x]  Adjust Ids of path segments in visibility modifiers #55487
- [x]  save-analysis: bug fix and optimisation. #55521
- [x]   save-analysis: be even more aggressive about ignorning macro-generated defs #55936
- [x]  save-analysis: fallback to using path id #56060
- [x]  save-analysis: Don't panic for macro-generated use globs #55879
- [x]  Add temporary renames to manifests for rustfmt/clippy #56081
- [x] Revert #51601 #56049
- [x]  Fix stability hole with `static _` #55983
- [x] #56077
- [x] Fix Rustdoc ICE when checking blanket impls #55258
- [x]  Updated RELEASES.md for 1.31.0 #55678
- [x] ~~#56061~~ #56111
- [x]  Stabilize `extern_crate_item_prelude` #56032

Still running tests locally, and I plan to backport @nrc's other PRs too

(cc @petrochenkov -- thanks for the advice)
Centril added a commit to Centril/rust that referenced this pull request Sep 9, 2019
… r=scottmcm

Override `StepBy::{try_fold, try_rfold}`

Previous PR: rust-lang#51435

The previous PR was closed in favor of rust-lang#51601, which was later reverted. I don't think these implementations will make it harder to specialize `StepBy<Range<_>>` later, so we should be able to land this without any consequences.

This should fix rust-lang#57517 – in my benchmarks `iter` and `iter.step_by(1)` now perform equally well, provided internal iteration is used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants