Factor query arena allocation out from query caches #107833

Merged
merged 5 commits into from
Feb 17, 2023

Zoxc (Contributor) commented Feb 9, 2023

This moves the logic for arena allocation out from the query caches into conditional code in the query system. The specialized arena caches are removed. A new QuerySystem type is added in rustc_middle which contains the arenas, providers and query caches.
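As a rough illustration of the idea (not the actual rustc code), the sketch below keeps a single general-purpose cache that knows nothing about arenas and moves the arena allocation into a separate conversion step that runs on the provider's result. `ToyArena`, `Cache`, `run_query` and the key and value types are all hypothetical stand-ins; only the name `provided_to_value` echoes the PR, and the arena here simply leaks its allocations.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an arena: it leaks each value so that it can
/// hand out a long-lived reference. A real arena would free its memory later.
struct ToyArena;

impl ToyArena {
    fn alloc<T: 'static>(&self, value: T) -> &'static T {
        Box::leak(Box::new(value))
    }
}

/// What the provider returns for an arena-backed query (assumed type).
type Provided = Vec<u32>;
/// What the cache stores and what callers receive (assumed type).
type Value = &'static Vec<u32>;

/// The conversion step that lives outside the cache: for an arena-backed
/// query it allocates the provided value and returns a reference to it.
fn provided_to_value(arena: &ToyArena, provided: Provided) -> Value {
    arena.alloc(provided)
}

/// A general-purpose cache with no arena-specific logic.
struct Cache<V: Copy> {
    map: HashMap<u32, V>,
}

impl<V: Copy> Cache<V> {
    fn new() -> Self {
        Cache { map: HashMap::new() }
    }

    fn lookup(&self, key: u32) -> Option<V> {
        self.map.get(&key).copied()
    }

    fn complete(&mut self, key: u32, value: V) {
        self.map.insert(key, value);
    }
}

/// Runs the provider on a cache miss, converts its result, then caches it.
fn run_query(
    arena: &ToyArena,
    cache: &mut Cache<Value>,
    key: u32,
    provider: fn(u32) -> Provided,
) -> Value {
    if let Some(value) = cache.lookup(key) {
        return value;
    }
    let value = provided_to_value(arena, provider(key));
    cache.complete(key, value);
    value
}
```

Calling `run_query` twice with the same key returns the same reference the second time, since the cache stores the already-converted value and the provider is not run again.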

Performance seems to be slightly regressed:

| Benchmark | Before (time) | After (time) | Change (%) |
|---|---|---|---|
| 🟣 clap:check | 1.8053s | 1.8109s | 0.31% |
| 🟣 hyper:check | 0.2600s | 0.2597s | -0.10% |
| 🟣 regex:check | 0.9973s | 1.0006s | 0.34% |
| 🟣 syn:check | 1.6048s | 1.6051s | 0.02% |
| 🟣 syntex_syntax:check | 6.2992s | 6.3159s | 0.26% |
| Total | 10.9664s | 10.9922s | 0.23% |
| Summary | 1.0000s | 1.0017s | 0.17% |

Incremental performance is a bit worse:

| Benchmark | Before (time) | After (time) | Change (%) |
|---|---|---|---|
| 🟣 clap:check:initial | 2.2103s | 2.2247s | 0.65% |
| 🟣 hyper:check:initial | 0.3335s | 0.3349s | 0.41% |
| 🟣 regex:check:initial | 1.2597s | 1.2650s | 0.42% |
| 🟣 syn:check:initial | 2.0521s | 2.0613s | 0.45% |
| 🟣 syntex_syntax:check:initial | 7.8275s | 7.8583s | 0.39% |
| Total | 13.6832s | 13.7442s | 0.45% |
| Summary | 1.0000s | 1.0046s | 0.46% |

It does seem like LLVM optimizers struggle a bit with the current state of the query system.

Based on top of #107782 and #107802.

r? @cjgillot

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 9, 2023
cjgillot (Contributor) left a comment

I'm not convinced by the direction of this last commit.
My proposal would be to:

  1. migrate all queries to arena-allocate manually and drop arena_cache for them,
  2. remove all the arena support.

@@ -67,6 +71,24 @@ use std::sync::Arc;
pub(crate) use rustc_query_system::query::QueryJobId;
use rustc_query_system::query::*;

pub struct QuerySystem<'tcx> {
pub local_providers: Box<Providers>,
pub extern_providers: Box<ExternProviders>,

Contributor

Why are the providers moved here?

Contributor Author

It was convenient to access them using a TyCtxt somewhere.

Contributor Author

I do want to try moving the rest of the rustc_query_impl::Queries state to rustc_middle though, as that slightly reduces the size of the query hot path.

Contributor

I'm sorry, I didn't see where you use them. Could you point it out?

This data structure has been put into rustc_query_impl in an effort to shrink rustc_middle and its compile time. I'd rather see stuff moved to rustc_query_impl.

Contributor

To conclude this discussion, could you keep the current setup of fields in GlobalCtxt? To be re-discussed in another PR that needs those fields moved.

Zoxc (Contributor Author) commented Feb 9, 2023

I kind of view this as an incremental step towards removing arena_cache, though it also goes a bit sideways. I think the provider move can be easily avoided. Is there anything else problematic?

@Zoxc Zoxc mentioned this pull request Feb 11, 2023
bors (Contributor) commented Feb 12, 2023

☔ The latest upstream changes (presumably #107643) made this pull request unmergeable. Please resolve the merge conflicts.

@@ -795,6 +793,8 @@ pub fn create_global_ctxt<'tcx>(
untracked,
dep_graph,
queries.on_disk_cache.as_ref().map(OnDiskCache::as_dyn),

Contributor

side note: is there a reason we don't make this a method on QueryEngine? seems weird to pass in a reference to the query engine and to a field of it.

Comment on lines 796 to 797
local_providers,
extern_providers,

Contributor

these could be exposed from the QueryEngine, or if that is a performance issue, we could make the current QueryEngine uses be a struct with a trailing dyn QueryEngine field to have quick access to the fields.
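The "struct with a trailing dyn QueryEngine field" suggestion could look roughly like the sketch below. It is only an illustration of the general Rust pattern: the trait body, `Engine`, `EngineWithFields`, `erase` and the `u32` stand-ins for the provider tables are all hypothetical. The point is that the sized fields sit at a fixed offset and can be read directly, while only calls through the last field go through dynamic dispatch.

```rust
/// Hypothetical, heavily simplified stand-in for the query engine trait.
trait QueryEngine {
    fn name(&self) -> &'static str;
}

/// Hypothetical concrete engine.
struct Engine;

impl QueryEngine for Engine {
    fn name(&self) -> &'static str {
        "engine"
    }
}

/// Sized fields come first; the possibly unsized engine must be the last
/// field so that the struct can be coerced to `EngineWithFields<dyn QueryEngine>`.
struct EngineWithFields<Q: ?Sized> {
    local_providers: u32,  // stand-in for the local provider table
    extern_providers: u32, // stand-in for the extern provider table
    engine: Q,
}

/// Builds the concrete struct and erases the engine type through an
/// unsized coercion at the return site.
fn erase(local: u32, ext: u32) -> Box<EngineWithFields<dyn QueryEngine>> {
    Box::new(EngineWithFields {
        local_providers: local,
        extern_providers: ext,
        engine: Engine,
    })
}
```

With this layout, `erased.local_providers` is a plain field load, whereas `erased.engine.name()` is a virtual call.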

cjgillot (Contributor) left a comment

I don't see why this PR needs to be based on top of #107782 and #107802.
Can it be landed separately, or separated from the WorkerLocal changes?

(job, result)
// Mark as complete before we remove the job from the active state
// so no other thread can re-execute this query.
cache.complete(key.clone(), result, dep_node_index);

Contributor

This changes behaviour for the parallel compiler. IIUC, this is a bugfix, so we should probably land this separately.
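To make the ordering concern concrete, here is a minimal sketch of why publishing the result before clearing the active entry matters when several threads may ask for the same query. It is a toy model under stated assumptions, not rustc's implementation: `QueryState`, `Lookup`, `try_start` and `finish` are hypothetical names, and the two mutex-protected maps stand in for the query cache and the active job table.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

/// Outcome of asking whether a query needs to be executed.
#[derive(Debug, PartialEq)]
enum Lookup {
    Cached(u64),
    InProgress,
    Started,
}

/// Toy model: a result cache plus the set of keys currently being computed.
struct QueryState {
    cache: Mutex<HashMap<u32, u64>>,
    active: Mutex<HashSet<u32>>,
}

impl QueryState {
    fn new() -> Self {
        QueryState {
            cache: Mutex::new(HashMap::new()),
            active: Mutex::new(HashSet::new()),
        }
    }

    /// Holds the active lock while consulting the cache. If the key is not
    /// active because its job already finished, the result is guaranteed to
    /// be in the cache, because `finish` publishes before it removes.
    fn try_start(&self, key: u32) -> Lookup {
        let mut active = self.active.lock().unwrap();
        let cached = self.cache.lock().unwrap().get(&key).copied();
        if let Some(value) = cached {
            return Lookup::Cached(value);
        }
        if active.insert(key) {
            Lookup::Started
        } else {
            Lookup::InProgress
        }
    }

    /// Marks the query complete first, then removes the job from the active
    /// set, so no other thread can observe "not active and not cached".
    fn finish(&self, key: u32, result: u64) {
        self.cache.lock().unwrap().insert(key, result);
        self.active.lock().unwrap().remove(&key);
    }
}
```

If `finish` removed the key from `active` before inserting into `cache`, a second thread calling `try_start` in between would see neither entry and start the same query again.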

value: query_provided::$name<'tcx>,
) -> query_values::$name<'tcx> {
query_if_arena!([$($modifiers)*]
(&*_tcx.query_system.arenas.$name.alloc(value))

Contributor

Should we use the global tcx.arena instead?

Contributor Author

I've changed this to use the dropless arena for types without destructors at least, which improves performance. We have to add all the query types back to tcx.arena before fully using that.
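One way to express "use the dropless arena for types without destructors" is to branch on `std::mem::needs_drop`, which is a real standard-library function. Whether the PR does exactly this is an assumption here, and `ArenaKind` and `arena_for` are hypothetical names used only to show the decision.

```rust
use std::mem;

/// Which kind of arena a value of a given type would be placed in
/// (hypothetical classification for illustration).
#[derive(Debug, PartialEq)]
enum ArenaKind {
    /// Never runs destructors, so it is only suitable for types that need none.
    Dropless,
    /// Remembers its values so their destructors run when the arena is dropped.
    Typed,
}

/// Picks an arena kind from the type alone. `needs_drop` is a constant for
/// each concrete `T`, so the branch can be resolved at compile time.
fn arena_for<T>() -> ArenaKind {
    if mem::needs_drop::<T>() {
        ArenaKind::Typed
    } else {
        ArenaKind::Dropless
    }
}
```

For example, `arena_for::<u32>()` selects the dropless path, while `arena_for::<String>()` selects the typed one because `String` owns a heap allocation that must be freed.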

#[allow(nonstandard_style, unused_lifetimes, unused_parens)]
pub mod query_storage {
#[allow(nonstandard_style, unused_lifetimes)]
pub mod query_provided {

Contributor

Could you add a doc comment on the module explaining what this type is?

)*
}
#[allow(nonstandard_style, unused_lifetimes)]
pub mod query_provided_to_value {

Contributor

Could you add a doc comment on the module explaining what this does?


$(
#[inline]

Contributor

inline always?

qcx,
dep_node
);
value.map(|value| query_provided_to_value::$name(qcx.tcx, value))

Contributor

Why does this need a call to query_provided_to_value? Could we work with only the query_value here?

Contributor Author

arena_cache decodes <V as Deref>::Target so we need to put it on an arena. We can't decode plain V anymore since #73851.

@Zoxc Zoxc force-pushed the arena-query-clean branch from 65f0ce0 to 2d24908 Compare February 14, 2023 13:43
oli-obk (Contributor) left a comment

some clones (even if usually no ops/copies) can be avoided now

@Zoxc Zoxc force-pushed the arena-query-clean branch from 2d24908 to afa9de7 Compare February 14, 2023 13:59
Zoxc (Contributor Author) commented Feb 14, 2023

I think I've addressed all comments now.

The use of the dropless arena was sufficient to offset the regression for non-incremental check:

| Benchmark | Before (time) | After (time) | Change (%) |
|---|---|---|---|
| 🟣 clap:check | 1.8226s | 1.8204s | -0.12% |
| 🟣 hyper:check | 0.2663s | 0.2654s | -0.30% |
| 🟣 regex:check | 1.0079s | 1.0062s | -0.17% |
| 🟣 syn:check | 1.6336s | 1.6352s | 0.10% |
| 🟣 syntex_syntax:check | 6.3726s | 6.3772s | 0.07% |
| Total | 11.1029s | 11.1045s | 0.01% |
| Summary | 1.0000s | 0.9992s | -0.08% |

The incremental case still regresses a bit:

| Benchmark | Before (time) | After (time) | Change (%) |
|---|---|---|---|
| 🟣 clap:check:initial | 2.2309s | 2.2401s | 0.41% |
| 🟣 hyper:check:initial | 0.3414s | 0.3413s | -0.01% |
| 🟣 regex:check:initial | 1.2786s | 1.2840s | 0.43% |
| 🟣 syn:check:initial | 2.0676s | 2.0772s | 0.47% |
| 🟣 syntex_syntax:check:initial | 7.9006s | 7.9394s | 0.49% |
| Total | 13.8191s | 13.8821s | 0.46% |
| Summary | 1.0000s | 1.0036s | 0.36% |

oli-obk (Contributor) commented Feb 14, 2023

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 14, 2023
bors (Contributor) commented Feb 14, 2023

⌛ Trying commit 257a5a4d57637401596ae8d047b0f576a0dda672 with merge 86f67440d16aa2f389da87483866d18ca64a9c3e...

bors (Contributor) commented Feb 14, 2023

☀️ Try build successful - checks-actions
Build commit: 86f67440d16aa2f389da87483866d18ca64a9c3e

rust-timer (Collaborator)

Finished benchmarking commit (86f67440d16aa2f389da87483866d18ca64a9c3e): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.3% | [0.3%, 1.8%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.1% | [0.4%, 1.8%] | 2 |
| Regressions ❌ (secondary) | 2.0% | [1.3%, 2.7%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -4.5% | [-4.5%, -4.5%] | 1 |
| All ❌✅ (primary) | 1.1% | [0.4%, 1.8%] | 2 |

Cycles

This benchmark run did not return any relevant results for this metric.

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 14, 2023
cjgillot (Contributor)

@bors r+

bors (Contributor) commented Feb 14, 2023

📌 Commit 257a5a4d57637401596ae8d047b0f576a0dda672 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 14, 2023
bors (Contributor) commented Feb 15, 2023

⌛ Testing commit 257a5a4d57637401596ae8d047b0f576a0dda672 with merge 76cde7c771243a8bd7e125bc4486fb484327f7a7...

bors (Contributor) commented Feb 15, 2023

💔 Test failed - checks-actions

bors (Contributor) commented Feb 16, 2023

☔ The latest upstream changes (presumably #108116) made this pull request unmergeable. Please resolve the merge conflicts.

@Zoxc Zoxc force-pushed the arena-query-clean branch from 257a5a4 to caf29b2 Compare February 16, 2023 13:58
cjgillot (Contributor)

@bors r+

bors (Contributor) commented Feb 16, 2023

📌 Commit caf29b2 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 16, 2023
@@ -271,6 +267,7 @@ where
QueryResult::Poisoned => panic!(),
}
};
cache.complete(key, result, dep_node_index);

Contributor

Why isn't this required any more?

Contributor Author

Do you mean the extra block that I removed? Not sure when it became redundant.

bors (Contributor) commented Feb 16, 2023

⌛ Testing commit caf29b2 with merge 947b696...

bors (Contributor) commented Feb 17, 2023

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 947b696 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 17, 2023
@bors bors merged commit 947b696 into rust-lang:master Feb 17, 2023
@rustbot rustbot added this to the 1.69.0 milestone Feb 17, 2023
@Zoxc Zoxc deleted the arena-query-clean branch February 17, 2023 01:15
rust-timer (Collaborator)

Finished benchmarking commit (947b696): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.0% | [0.2%, 1.9%] | 4 |
| Improvements ✅ (primary) | -0.3% | [-0.3%, -0.3%] | 1 |
| Improvements ✅ (secondary) | -0.3% | [-0.4%, -0.2%] | 3 |
| All ❌✅ (primary) | -0.3% | [-0.3%, -0.3%] | 1 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [1.0%, 1.0%] | 1 |
| Regressions ❌ (secondary) | 1.9% | [1.8%, 2.0%] | 2 |
| Improvements ✅ (primary) | -2.1% | [-2.9%, -1.3%] | 2 |
| Improvements ✅ (secondary) | -3.1% | [-4.4%, -1.9%] | 5 |
| All ❌✅ (primary) | -1.1% | [-2.9%, 1.0%] | 3 |

Cycles

This benchmark run did not return any relevant results for this metric.

@rustbot rustbot added the perf-regression Performance regression. label Feb 17, 2023
tautschnig added a commit to tautschnig/kani that referenced this pull request Apr 17, 2023
Upstream PRs that require local changes:

- Switch to EarlyBinder for type_of query rust-lang/rust#107753
- Factor query arena allocation out from query caches rust-lang/rust#107833

Co-authored-by: Qinheping Hu <qinhh@amazon.com>