Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persistent data structures by im-rs #6336

Merged
merged 4 commits into from
Nov 21, 2018
Merged

Persistent data structures by im-rs #6336

merged 4 commits into from
Nov 21, 2018

Conversation

Eh2406
Copy link
Contributor

@Eh2406 Eh2406 commented Nov 19, 2018

There has been a long standing "TODO: We should switch to persistent hash maps if we can" in the resolver. This PR introduces a dependency on im-rs to provide persistent hash maps. It then uses im_rc::OrdSet to store the priority list of dependencies, instead of std::BinaryHeap, as cloning the Heap was one of the most expensive things we did. In Big O terms these changes are very big wins, in practice the improvement is small. This is do to a number of factors like, std::collections are very fast, N is usually only in the hundreds, we only clone when the borrow checker insists on it, and we use RC and other tricks to avoid cloning.

I would advocate that we accept this PR, as:

  • Things are only going to get more complicated in the future. It would be nice to have Big O on our side.
  • When developing a new feature finding each place to add RC should be an afterthought not absolutely required before merge. (cc Tracking issue for RFC 1977: public & private dependencies, as it relates to the resolver #6129 witch is stuck on this kind of per tick performance problem as well as other things).
  • As the code gets more complex, making sure we never break any of the tricks becomes harder. It would be easier to work on this if such mistakes are marginal instead of show stoppers. (cc Regression in resolver performance #5130 a bug report of one of these bing removed and affecting ./x.py build enough to trigger an investigation).
  • The resolver is already very complicated with a lot of moving parts to keep in your head at the same time, this will only get worse as we add features. There are a long list of tricks that are critical before and ~5% after, it would be nice to consider if each one is worth the code complexity. (I can list some of the ones I have tried to remove but are still small wins, if that would help the discussion).

Overall, I think it is worth doing now, but can see that if is not a slam dunk.

Copy link
Member

@dwijnand dwijnand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool work, @Eh2406!

src/cargo/core/resolver/types.rs Outdated Show resolved Hide resolved
@Eh2406
Copy link
Contributor Author

Eh2406 commented Nov 20, 2018

I added some comments, and some variable names. Does that help?

@dwijnand
Copy link
Member

Yeah, it does, thank you!

I had assumed that the increment was relevant to the ordering. If it's just a question of not making elements disappear, why not just have a wrapper with a custom Eq that's never true? Say we call it NeverEqDepsFrame that would re-simplify RemainingDeps to RemainingDeps(im_rc::OrdSet<NeverEqDepsFrame>).

@dwijnand
Copy link
Member

Outside of that, I'm convinced by your arguments so I'm happy to land this. 😄

@Eh2406
Copy link
Contributor Author

Eh2406 commented Nov 20, 2018

... let me try that. BRB.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Nov 20, 2018

That is what I remembered, lots of the test fail to pass. According to Ord "Implementations of PartialEq, PartialOrd, and Ord must agree with each other." So I changed PartialEq to false and added .then_with(|| Ordering::Less) to Ord. I don't know why that did not work, but it dose not. Not with im_rc::OrdSet nor with std::BTreeSet. I think it may brake the antisymmetry definition of PartialOrd.

@dwijnand
Copy link
Member

dwijnand commented Nov 20, 2018

Yeah im_rc::OrdSet requires a total Ord, so we can't implement my idea. I agree with you it's probably breaking the antisymmetry law that's breaking the behaviour.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Nov 20, 2018

It also needs to be deterministic, so we can't use the memory address to duplicate.

@alexcrichton
Copy link
Member

This looks pretty awesome to me, having drop-in replacements is indeed really nice!

Out of curiosity, have you taken a look at the perf as-is today? (just to make sure we're not accidentally having a regression)

@Eh2406
Copy link
Contributor Author

Eh2406 commented Nov 21, 2018

According to the discussion in bodil/im-rs#26 I have had a version of this branch since Jun 21. I have at various times done detailed analysis of micro benchmark performance and experiments with cargo performance. Overall both std and im-rs are very fast. Definitively saying witch is the better for a given situation is hard and very dependent on exactly what you are bench-marking. There is a large range where there performance are functionally identical. This PR only includes the changes where it was a measurable improvement, when I last seriously tested on the benchmarks I happened to be using. I seriously doubt there is a big change in perf as-is today

However I have not rerun the perf recently, if you have something you would like me to try let me know. (Servo is ridiculously slow on my computer, so not that.)

@alexcrichton
Copy link
Member

@bors: r+

Ok that sounds reasonable to me! I definitely looks like you've been far more thorough here than I would have been :)

@bors
Copy link
Collaborator

bors commented Nov 21, 2018

📌 Commit 1699a5b has been approved by alexcrichton

@bors
Copy link
Collaborator

bors commented Nov 21, 2018

⌛ Testing commit 1699a5b with merge c93e847...

bors added a commit that referenced this pull request Nov 21, 2018
Persistent data structures by im-rs

There has been a long standing "TODO: We should switch to persistent hash maps if we can" in the resolver. This PR introduces a dependency on [im-rs](bodil/im-rs#26) to provide persistent hash maps. It then uses `im_rc::OrdSet` to store the priority list of dependencies, instead of `std::BinaryHeap`, as cloning the `Heap` was one of the most expensive things we did. In Big O terms these changes are very big wins, in practice the improvement is small. This is do to a number of factors like, `std::collections` are very fast, N is usually only in the hundreds, we only clone when the borrow checker insists on it, and we use `RC` and other tricks to avoid cloning.

I would advocate that we accept this PR, as:
- Things are only going to get more complicated in the future. It would be nice to have Big O on our side.
- When developing a new feature finding each place to add `RC` should be an afterthought not absolutely required before merge. (cc #6129 witch is stuck on this kind of per tick performance problem as well as other things).
- As the code gets more complex, making sure we never break any of the tricks becomes harder. It would be easier to work on this if such mistakes are marginal instead of show stoppers. (cc #5130 a bug report of one of these bing removed and affecting `./x.py build` enough to trigger an investigation).
- The resolver is already very complicated with a lot of moving parts to keep in your head at the same time, this will only get worse as we add  features. There are a long list of tricks that are *critical* before and `~5%`  after, it would be nice to consider if each one is worth the code complexity. (I can list some of the ones I have tried to remove but are still small wins, if that would help the discussion).

Overall, I think it is worth doing now, but can see that if is not a slam dunk.
@bors
Copy link
Collaborator

bors commented Nov 21, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing c93e847 to master...

@bors bors merged commit 1699a5b into rust-lang:master Nov 21, 2018
@Eh2406 Eh2406 deleted the im-rs branch November 21, 2018 18:57
kennytm added a commit to kennytm/rust that referenced this pull request Nov 27, 2018
Update cargo

14 commits in b3d0b2e545b61d4cd08096911724b7d49d213f73..1ff5975b96b3d395bb962394596998dfb485f793
2018-11-15 19:13:04 +0000 to 2018-11-25 14:59:12 +0000
- Intern SourceId (rust-lang/cargo#6342)
- Tweak Layout to allow for non json file targets with internal "." (rust-lang/cargo#6255)
- Correct Target Directory command-line option (rust-lang/cargo#6343)
- Persistent data structures by im-rs (rust-lang/cargo#6336)
- Move command prelude into main library (rust-lang/cargo#6335)
- Distinguish custom build invocations (rust-lang/cargo#6331)
- Allow crate_type=bin examples to run (rust-lang/cargo#6330)
- Make verify-project honour unstable features (rust-lang/cargo#6326)
- Make autodiscovery disable inferred targets (rust-lang/cargo#6329)
- Add `c` alias for `check` (rust-lang/cargo#6218)
- Allow user aliases to override built-in aliases (rust-lang/cargo#6259)
- Fix renaming directory project using build scripts with cross-compiling. (rust-lang/cargo#6328)
- Fix add_plugin_deps-related tests. (rust-lang/cargo#6327)
- Add a glossary. (rust-lang/cargo#6321)
kennytm added a commit to kennytm/rust that referenced this pull request Nov 30, 2018
Update cargo

14 commits in b3d0b2e545b61d4cd08096911724b7d49d213f73..1ff5975b96b3d395bb962394596998dfb485f793
2018-11-15 19:13:04 +0000 to 2018-11-25 14:59:12 +0000
- Intern SourceId (rust-lang/cargo#6342)
- Tweak Layout to allow for non json file targets with internal "." (rust-lang/cargo#6255)
- Correct Target Directory command-line option (rust-lang/cargo#6343)
- Persistent data structures by im-rs (rust-lang/cargo#6336)
- Move command prelude into main library (rust-lang/cargo#6335)
- Distinguish custom build invocations (rust-lang/cargo#6331)
- Allow crate_type=bin examples to run (rust-lang/cargo#6330)
- Make verify-project honour unstable features (rust-lang/cargo#6326)
- Make autodiscovery disable inferred targets (rust-lang/cargo#6329)
- Add `c` alias for `check` (rust-lang/cargo#6218)
- Allow user aliases to override built-in aliases (rust-lang/cargo#6259)
- Fix renaming directory project using build scripts with cross-compiling. (rust-lang/cargo#6328)
- Fix add_plugin_deps-related tests. (rust-lang/cargo#6327)
- Add a glossary. (rust-lang/cargo#6321)
Centril added a commit to Centril/rust that referenced this pull request Dec 1, 2018
Update cargo

14 commits in b3d0b2e545b61d4cd08096911724b7d49d213f73..1ff5975b96b3d395bb962394596998dfb485f793
2018-11-15 19:13:04 +0000 to 2018-11-25 14:59:12 +0000
- Intern SourceId (rust-lang/cargo#6342)
- Tweak Layout to allow for non json file targets with internal "." (rust-lang/cargo#6255)
- Correct Target Directory command-line option (rust-lang/cargo#6343)
- Persistent data structures by im-rs (rust-lang/cargo#6336)
- Move command prelude into main library (rust-lang/cargo#6335)
- Distinguish custom build invocations (rust-lang/cargo#6331)
- Allow crate_type=bin examples to run (rust-lang/cargo#6330)
- Make verify-project honour unstable features (rust-lang/cargo#6326)
- Make autodiscovery disable inferred targets (rust-lang/cargo#6329)
- Add `c` alias for `check` (rust-lang/cargo#6218)
- Allow user aliases to override built-in aliases (rust-lang/cargo#6259)
- Fix renaming directory project using build scripts with cross-compiling. (rust-lang/cargo#6328)
- Fix add_plugin_deps-related tests. (rust-lang/cargo#6327)
- Add a glossary. (rust-lang/cargo#6321)
bors added a commit to rust-lang/rust that referenced this pull request Dec 4, 2018
Update cargo, rls

26 commits in b3d0b2e545b61d4cd08096911724b7d49d213f73..5e85ba14aaa20f8133863373404cb0af69eeef2c
2018-11-15 19:13:04 +0000 to 2018-12-02 14:37:25 +0000
- ConflictStoreTrie: Faster filtered search (rust-lang/cargo#6366)
- Remove `cmake` as a requirement (rust-lang/cargo#6368)
- progress: display "Downloading 1 crate" instead of "Downloading 1 crates" (rust-lang/cargo#6369)
- Use expect over unwrap, for panic-in-panic aborts (rust-lang/cargo#6364)
- Switch to pretty_env_logger, under --features pretty-env-logger (rust-lang/cargo#6362)
- use allow-dirty option in `cargo package` to skip vcs checks (rust-lang/cargo#6280)
- remove clones made redundant by Intern PackageId (rust-lang/cargo#6352)
- docs: correct profile usage of `cargo test --release` (rust-lang/cargo#6345)
- Improve doc for `cargo install` (rust-lang/cargo#6354)
- Intern PackageId (rust-lang/cargo#6332)
- Clean only release artifacts if --release option is set (rust-lang/cargo#6349)
- remove clones made redundant by Intern SourceId (rust-lang/cargo#6347)
- Intern SourceId (rust-lang/cargo#6342)
- Tweak Layout to allow for non json file targets with internal "." (rust-lang/cargo#6255)
- Correct Target Directory command-line option (rust-lang/cargo#6343)
- Persistent data structures by im-rs (rust-lang/cargo#6336)
- Move command prelude into main library (rust-lang/cargo#6335)
- Distinguish custom build invocations (rust-lang/cargo#6331)
- Allow crate_type=bin examples to run (rust-lang/cargo#6330)
- Make verify-project honour unstable features (rust-lang/cargo#6326)
- Make autodiscovery disable inferred targets (rust-lang/cargo#6329)
- Add `c` alias for `check` (rust-lang/cargo#6218)
- Allow user aliases to override built-in aliases (rust-lang/cargo#6259)
- Fix renaming directory project using build scripts with cross-compiling. (rust-lang/cargo#6328)
- Fix add_plugin_deps-related tests. (rust-lang/cargo#6327)
- Add a glossary. (rust-lang/cargo#6321)
@ehuss ehuss added this to the 1.32.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants