Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use indexmap::Index{Map,Set} in the compiler instead of Vec+reverse map. #60608

Closed
eddyb opened this issue May 7, 2019 · 9 comments · Fixed by #75278
Closed

Use indexmap::Index{Map,Set} in the compiler instead of Vec+reverse map. #60608

eddyb opened this issue May 7, 2019 · 9 comments · Fixed by #75278
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@eddyb
Copy link
Member

eddyb commented May 7, 2019

Examples of pairs of fields we can replace:

rust/src/librustc/ty/mod.rs

Lines 905 to 909 in c6ac575

pub params: Vec<GenericParamDef>,
/// Reverse map to the `index` field of each `GenericParamDef`
#[stable_hasher(ignore)]
pub param_def_id_to_index: FxHashMap<DefId, u32>,

names: FxHashMap<&'static str, Symbol>,
strings: Vec<&'static str>,

spans: FxHashMap<SpanData, u32>,
span_data: Vec<SpanData>,

(any interner that relies on indices can be switched to IndexSet, really)

We could also wrap Index{Set,Map}, like IndexedVec wraps Vec, to use non-usize indices.
(Or we could make our own that stores indices smaller than usize, not sure how much work that is, or if we want to do it at all, cc @bluss)

@Centril Centril added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-cleanup Category: PRs that clean code up or issues documenting cleanup. labels May 7, 2019
@cuviper
Copy link
Member

cuviper commented May 8, 2019

FWIW, indexmap already does a little bit of size hacking with its Pos type:
https://github.com/bluss/indexmap/blob/0a06966af88c0f48f2d69d20dacfc89cebfbbf3f/src/map.rs#L76-L91

The API always presents usize though.

@eddyb
Copy link
Member Author

eddyb commented May 8, 2019

Huh, so this is quite different from the builtin HashMap.
cc @Amanieu I wonder how the novel parts of SwissTable/hashbrown would play with this.

@eddyb
Copy link
Member Author

eddyb commented May 8, 2019

Also, we never need more than 32-bit indices so we could skip the capacity-based logic.

@cuviper
Copy link
Member

cuviper commented May 8, 2019

I wonder how the novel parts of SwissTable/hashbrown would play with this.

I actually played with that too:
indexmap-rs/indexmap@master...cuviper:hashbrown

cargo benchcmp
$ cargo benchcmp master hashbrown
 name                                         master ns/iter  hashbrown ns/iter  diff ns/iter   diff %  speedup
 entry_hashmap_150                            2,873           2,927                        54    1.88%   x 0.98
 entry_orderedmap_150                         4,329           5,393                     1,064   24.58%   x 0.80
 few_retain_hashmap_100_000                   951,637         963,837                  12,200    1.28%   x 0.99
 few_retain_ordermap_100_000                  1,520,027       2,130,924               610,897   40.19%   x 0.71
 grow_fnv_hashmap_100_000                     1,857,807       1,849,671                -8,136   -0.44%   x 1.00
 grow_fnv_ordermap_100_000                    4,671,162       4,274,375              -396,787   -8.49%   x 1.09
 half_retain_hashmap_100_000                  1,033,810       1,033,141                  -669   -0.06%   x 1.00
 half_retain_ordermap_100_000                 1,590,546       2,125,692               535,146   33.65%   x 0.75
 hashmap_merge_shuffle                        589,718         593,282                   3,564    0.60%   x 0.99
 hashmap_merge_simple                         438,375         484,048                  45,673   10.42%   x 0.91
 insert_hashmap_100_000                       2,358,045       2,356,133                -1,912   -0.08%   x 1.00
 insert_hashmap_10_000                        204,200         203,430                    -770   -0.38%   x 1.00
 insert_hashmap_150                           3,048           3,026                       -22   -0.72%   x 1.01
 insert_hashmap_int_bigvalue_10_000           244,731         242,986                  -1,745   -0.71%   x 1.01
 insert_hashmap_str_10_000                    229,241         227,691                  -1,550   -0.68%   x 1.01
 insert_hashmap_string_10_000                 1,089,635       1,083,683                -5,952   -0.55%   x 1.01
 insert_hashmap_string_10_000                 1,170,007       1,068,880              -101,127   -8.64%   x 1.09
 insert_hashmap_string_oneshot_10_000         1,120,157       1,017,050              -103,107   -9.20%   x 1.10
 insert_orderedmap_100_000                    2,555,141       4,359,721             1,804,580   70.63%   x 0.59
 insert_orderedmap_10_000                     259,896         370,496                 110,600   42.56%   x 0.70
 insert_orderedmap_150                        3,788           5,477                     1,689   44.59%   x 0.69
 insert_orderedmap_int_bigvalue_10_000        339,440         353,765                  14,325    4.22%   x 0.96
 insert_orderedmap_str_10_000                 295,646         276,151                 -19,495   -6.59%   x 1.07
 insert_orderedmap_string_10_000              1,000,420       1,064,138                63,718    6.37%   x 0.94
 insert_orderedmap_string_10_000              1,019,860       1,073,530                53,670    5.26%   x 0.95
 iter_black_box_hashmap_10_000                26,777          26,172                     -605   -2.26%   x 1.02
 iter_black_box_orderedmap_10_000             3,258           3,539                       281    8.62%   x 0.92
 iter_sum_hashmap_10_000                      16,425          16,179                     -246   -1.50%   x 1.02
 iter_sum_orderedmap_10_000                   2,912           2,982                        70    2.40%   x 0.98
 lookup_hashmap_100_000_inorder_multi         100,554         99,595                     -959   -0.95%   x 1.01
 lookup_hashmap_100_000_multi                 99,856          99,167                     -689   -0.69%   x 1.01
 lookup_hashmap_100_000_single                21              21                            0    0.00%   x 1.00
 lookup_hashmap_10_000_exist                  92,344          92,134                     -210   -0.23%   x 1.00
 lookup_hashmap_10_000_exist_string           158,699         161,272                   2,573    1.62%   x 0.98
 lookup_hashmap_10_000_exist_string_oneshot   128,610         128,177                    -433   -0.34%   x 1.00
 lookup_hashmap_10_000_noexist                83,615          83,064                     -551   -0.66%   x 1.01
 lookup_orderedmap_10_000_exist               143,015         124,320                 -18,695  -13.07%   x 1.15
 lookup_orderedmap_10_000_noexist             143,917         98,109                  -45,808  -31.83%   x 1.47
 lookup_ordermap_100_000_inorder_multi        110,905         130,411                  19,506   17.59%   x 0.85
 lookup_ordermap_100_000_multi                154,486         182,357                  27,871   18.04%   x 0.85
 lookup_ordermap_100_000_single               33              37                            4   12.12%   x 0.89
 lookup_ordermap_10_000_exist_string          210,793         222,767                  11,974    5.68%   x 0.95
 lookup_ordermap_10_000_exist_string_oneshot  192,294         190,735                  -1,559   -0.81%   x 1.01
 many_retain_hashmap_100_000                  448,060         449,876                   1,816    0.41%   x 1.00
 many_retain_ordermap_100_000                 1,217,638       1,569,468               351,830   28.89%   x 0.78
 new_hashmap                                  6               6                             0    0.00%   x 1.00
 new_orderedmap                               5               27                           22  440.00%   x 0.19
 ordermap_clone_for_sort_s                    446,804         474,899                  28,095    6.29%   x 0.94
 ordermap_clone_for_sort_u32                  14,368          36,350                   21,982  152.99%   x 0.40
 ordermap_merge_shuffle                       642,188         638,453                  -3,735   -0.58%   x 1.01
 ordermap_merge_simple                        443,371         438,121                  -5,250   -1.18%   x 1.01
 ordermap_simple_sort_s                       2,840,960       2,953,374               112,414    3.96%   x 0.96
 ordermap_simple_sort_u32                     798,633         830,050                  31,417    3.93%   x 0.96
 ordermap_sort_s                              2,292,873       2,403,795               110,922    4.84%   x 0.95
 ordermap_sort_u32                            631,663         646,202                  14,539    2.30%   x 0.98
 pop_ordermap_100_000                         1,752,485       2,500,007               747,522   42.65%   x 0.70
 remove_ordermap_100_000                      4,484,051       6,166,361             1,682,310   37.52%   x 0.73
 with_capacity_10e5_hashmap                   933             894                         -39   -4.18%   x 1.04
 with_capacity_10e5_orderedmap                4,103           404                      -3,699  -90.15%  x 10.16

That was hacked together, so I'm sure there's room for improvement.

@Amanieu
Copy link
Member

Amanieu commented May 8, 2019

I wonder if the slowdown might be due to the use of usize indicies instead of u32. Could you try changing the index type to u32 to see if this improves performance?

Another thing to watch out for is to ensure that all methods are being properly inlined, since the implementation is more complex now. You can try running one of the lookup benchmarks under perf to see if something isn't getting inlined.

@bluss
Copy link
Member

bluss commented Oct 8, 2019

@cuviper Nice work on that experiment!
@eddyb I'm delighted you didn't know IndexMap is just a simple bespoke hashmap with zero unsafe code (everything on top of Vec). 😊 Of course, if there is are big enough wins, it can use unsafe.

@nnethercote
Copy link
Contributor

Examples of pairs of fields we can replace:

names: FxHashMap<&'static str, Symbol>,
strings: Vec<&'static str>,

I tried doing this one and it hurt performance on several benchmarks. Here are the check-clean instruction counts for benchmarks that saw a change.

unused-warnings-check
	avg: 21.4%	min: 21.4%	max: 21.4%
unicode_normalization-check
	avg: 4.6%	min: 4.6%	max: 4.6%
helloworld-check
	avg: 3.1%	min: 3.1%	max: 3.1%
issue-46449-check
	avg: 1.3%	min: 1.3%	max: 1.3%
token-stream-stress-check
	avg: 1.0%	min: 1.0%	max: 1.0%
unify-linearly-check
	avg: 0.8%	min: 0.8%	max: 0.8%
await-call-tree-check
	avg: 0.8%	min: 0.8%	max: 0.8%
deeply-nested-check
	avg: 0.7%	min: 0.7%	max: 0.7%
tokio-webpush-simple-check
	avg: 0.6%	min: 0.6%	max: 0.6%
regression-31157-check
	avg: 0.3%	min: 0.3%	max: 0.3%
issue-58319-check
	avg: 0.3%	min: 0.3%	max: 0.3%
ripgrep-check
	avg: 0.3%	min: 0.3%	max: 0.3%
many-assoc-items-check
	avg: 0.3%	min: 0.3%	max: 0.3%
encoding-check
	avg: 0.3%	min: 0.3%	max: 0.3%
webrender-wrench-check
	avg: 0.2%	min: 0.2%	max: 0.2%
coercions-check
	avg: 0.2%?	min: 0.2%?	max: 0.2%?
tuple-stress-check
	avg: 0.2%	min: 0.2%	max: 0.2%
syn-check
	avg: 0.2%	min: 0.2%	max: 0.2%
regex-check
	avg: 0.2%	min: 0.2%	max: 0.2%
deep-vector-check
	avg: 0.1%	min: 0.1%	max: 0.1%
piston-image-check
	avg: 0.1%	min: 0.1%	max: 0.1%
ucd-check
	avg: 0.1%	min: 0.1%	max: 0.1%
webrender-check
	avg: 0.1%	min: 0.1%	max: 0.1%
hyper-2-check
	avg: 0.1%	min: 0.1%	max: 0.1%
cranelift-codegen-check
	avg: 0.1%	min: 0.1%	max: 0.1%
futures-check
	avg: 0.1%	min: 0.1%	max: 0.1%
cargo-check
	avg: 0.1%	min: 0.1%	max: 0.1%
serde-serde_derive-check
	avg: 0.1%	min: 0.1%	max: 0.1%
clap-rs-check
	avg: 0.1%	min: 0.1%	max: 0.1%

Here is the top part of the Cachegrind diff for unused-warnings:

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
438,295,034 (100.0%)  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir                    file:function
--------------------------------------------------------------------------------
185,518,839 (42.33%)  /home/njn/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/indexmap-1.0.2/src/map.rs:indexmap::map::IndexMap<K,V,S>::get_full
 74,094,065 (16.91%)  /build/glibc-t7JzpG/glibc-2.30/string/../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:__memcmp_avx2_movbe
 64,470,873 (14.71%)  /home/njn/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/indexmap-1.0.2/src/map.rs:indexmap::map::IndexMap<K,V,S>::entry
 35,332,677 ( 8.06%)  /home/njn/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/indexmap-1.0.2/src/map.rs:indexmap::map::VacantEntry<K,V>::insert
 28,585,267 ( 6.52%)  /home/njn/moz/rustN/src/libcore/slice/mod.rs:indexmap::map::IndexMap<K,V,S>::get_full
 26,516,813 ( 6.05%)  /home/njn/moz/rustN/src/libcore/slice/mod.rs:indexmap::map::IndexMap<K,V,S>::entry
 19,993,527 ( 4.56%)  /home/njn/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/indexmap-1.0.2/src/map.rs:indexmap::map::OrderMapCore<K,V>::double_capacity
 18,393,298 ( 4.20%)  /home/njn/moz/rustN/src/libcore/num/mod.rs:indexmap::map::IndexMap<K,V,S>::get_full
  5,567,548 ( 1.27%)  /home/njn/moz/rustN/src/libcore/num/mod.rs:indexmap::map::IndexMap<K,V,S>::entry

So that one currently isn't viable.

@cuviper
Copy link
Member

cuviper commented Jun 10, 2020

FYI, I've ported indexmap to hashbrown's RawTable in indexmap-rs/indexmap#131, and so far it does appear to mitigate the loss @nnethercote saw. If we land that, and maybe the custom index support I'm playing with too, then I'll open a rust PR making use of it.

@cuviper
Copy link
Member

cuviper commented Aug 8, 2020

pub params: Vec<GenericParamDef>,
pub param_def_id_to_index: FxHashMap<DefId, u32>,

This one isn't actually mapping to Vec indexes, but rather param.index, which isn't always the same.

The other examples and more are converted in #75278.

@bors bors closed this as completed in 18f3be7 Aug 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants