Optimize ascii::escape_default #94776

Merged
merged 3 commits into from
Mar 11, 2022

martingms
Contributor

@martingms martingms commented Mar 9, 2022

ascii::escape_default showed up as a hot function when compiling deunicode-1.3.1 in @nnethercote's analysis of @lqd's rustc-benchmarking-data.
After taking a look at the generated assembly, it looked like a lookup-table (LUT) based approach could be faster for hexify()-ing ASCII characters, so that's what this PR implements.

The patch looks like it provides about a 1-2% improvement in instructions for that particular crate. This should definitely be verified with a perf run as I'm still getting used to the rustc-perf tooling and might easily have made an error!
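As a rough illustration of the idea, here is a minimal sketch contrasting a branch-based nibble-to-hex conversion with a table lookup. The function names (`hex_digit_branchy`, `hex_digit_lut`, `escape_hex`) and the `HEX_DIGITS` table are hypothetical and chosen for this example; the branch-based version is only an assumption about the general shape of such code, not a copy of the standard library source.

```rust
/// Branch-based conversion of a nibble (0..=15) to a lowercase hex digit.
/// Callers are expected to pass values below 16.
fn hex_digit_branchy(nibble: u8) -> u8 {
    match nibble {
        0..=9 => b'0' + nibble,
        _ => b'a' + (nibble - 10),
    }
}

/// Lookup table: index i holds the ASCII hex digit for the value i.
const HEX_DIGITS: [u8; 16] = *b"0123456789abcdef";

/// Table-based conversion: a single indexed load instead of a branch.
fn hex_digit_lut(nibble: u8) -> u8 {
    // Masking keeps the index within the 16-entry table.
    HEX_DIGITS[(nibble & 0xf) as usize]
}

/// Builds the four-byte `\xNN` escape for an arbitrary byte.
fn escape_hex(b: u8) -> [u8; 4] {
    [b'\\', b'x', hex_digit_lut(b >> 4), hex_digit_lut(b & 0xf)]
}
```

Both conversions agree on every nibble; whether the table version is actually faster depends on the surrounding code and the target, which is why a perf run is the right check.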

@rust-highfive
Collaborator

r? @scottmcm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 9, 2022
@lqd
Member

lqd commented Mar 9, 2022

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 9, 2022
@bors
Contributor

bors commented Mar 9, 2022

⌛ Trying commit 8761424 with merge cf6cd92ed676344f27c15baa799fa0d5092a60c5...

@bors
Contributor

bors commented Mar 9, 2022

☀️ Try build successful - checks-actions
Build commit: cf6cd92ed676344f27c15baa799fa0d5092a60c5 (cf6cd92ed676344f27c15baa799fa0d5092a60c5)

@rust-timer
Collaborator

Queued cf6cd92ed676344f27c15baa799fa0d5092a60c5 with parent 10dccdc, future comparison URL.

@nnethercote
Contributor

Nice work! Do you have measurements from rustc-perf? You can copy and paste the tables from the "compare" page into a GitHub comment and it should auto-format the markdown for you (example).

I think there may be scope for further improvements here. I ran Cachegrind, and the profile has this:

        .           impl Iterator for EscapeDefault {
        .               type Item = u8;
7,040,008 ( 2.39%)      fn next(&mut self) -> Option<u8> {
8,224,494 ( 2.80%)          self.range.next().map(|i| self.data[i as usize])
7,040,008 ( 2.39%)      }

for library/core/src/ascii.rs which suggests that inlining that method might help.
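To make the inlining suggestion concrete, here is a minimal sketch using a hypothetical stand-in type, `Escaped`, that mirrors the shape of the `next` method in the profile. It is not the standard library's definition, and the constructor is invented for the example.

```rust
use std::ops::Range;

/// Hypothetical stand-in for an escape iterator: a small buffer plus the
/// range of positions that still have to be yielded.
pub struct Escaped {
    data: [u8; 4],
    range: Range<u8>,
}

impl Escaped {
    /// `len` is the number of meaningful bytes at the start of `data`.
    pub fn new(data: [u8; 4], len: u8) -> Self {
        assert!(len <= 4);
        Escaped { data, range: 0..len }
    }
}

impl Iterator for Escaped {
    type Item = u8;

    // `#[inline]` is a hint that makes the body available for inlining in
    // callers, including callers in other crates. The compiler may still
    // decide not to inline it.
    #[inline]
    fn next(&mut self) -> Option<u8> {
        self.range.next().map(|i| self.data[i as usize])
    }
}
```

For a method this small that is called once per output byte, the call overhead can rival the work done, which is the kind of cost the per-line instruction counts above point at.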

Beyond that, the analysis identified that memory allocation rates are quite high for this benchmark. I ran DHAT to get more info. (I had to increase the --num-callers value from 4 to 8 here to get the stack traces this detailed.)

  │   │   │   ├── PP 1.2.1.1.1/2 {
  │   │   │   │     Total:     4,194,296 bytes (10.29%, 14,056.51/Minstr) in 19 blocks (0.04%, 0.06/Minstr), avg size 220,752.42 bytes, avg lifetime 3,771,447.68 instrs (1.26% of program duration)
  │   │   │   │     Max:       2,097,152 bytes in 1 blocks, avg size 2,097,152 bytes
  │   │   │   │     At t-gmax: 0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │   │   │   │     At t-end:  0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │   │   │   │     Reads:     4,838,642 bytes (6.95%, 16,215.93/Minstr), 1.15/byte
  │   │   │   │     Writes:    3,467,893 bytes (9.43%, 11,622.08/Minstr), 0.83/byte
  │   │   │   │     Allocated at {
  │   │   │   │       ^1: 0xCBA1CF4: alloc::raw_vec::finish_grow::<alloc::alloc::Global> (raw_vec.rs:0)
  │   │   │   │       ^2: 0xCBA18BB: grow_amortized<u8, alloc::alloc::Global> (raw_vec.rs:400)
  │   │   │   │       ^3: 0xCBA18BB: <alloc::raw_vec::RawVec<u8>>::reserve_for_push (raw_vec.rs:298)
  │   │   │   │       ^4: 0xCBE4813: push<u8, alloc::alloc::Global> (mod.rs:1729)
  │   │   │   │       ^5: 0xCBE4813: push (string.rs:1148)
  │   │   │   │       ^6: 0xCBE4813: {closure#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>> (string.rs:1955)
  │   │   │   │       ^7: 0xCBE4813: {closure#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>> (iterator.rs:770)
  │   │   │   │       ^8: 0xCBE4813: {closure#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>> (map.rs:84)
  │   │   │   │       ^9: 0xCBE4813: call_mut<((), u8), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (function.rs:269)
  │   │   │   │       ^10: 0xCBE4813: <core::ascii::EscapeDefault as core::iter::traits::iterator::Iterator>::fold::<(), &mut core::iter::adapters::map::map_fold<u8, char, (), <u8 as core::convert::Into<char>>::into, core::iter::traits::iterator::Iterator::for_each::call<char, <alloc::string::String as core::iter::traits::collect::Extend<char>>::extend<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, core::ascii::escape_default>, <u8 as core::convert::Into<char>>::into>>::{closure#0}>::{closure#0}>::{closure#0}> (iterator.rs:2285)
  │   │   │   │       ^11: 0xCBE6120: {closure#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (flatten.rs:379)
  │   │   │   │       ^12: 0xCBE6120: {closure#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>> (map.rs:84)
  │   │   │   │       ^13: 0xCBE6120: {closure#0}<&u8, u8, (), fn(&u8) -> u8, core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>> (map.rs:84)
  │   │   │   │       ^14: 0xCBE6120: fold<core::slice::iter::Iter<u8>, (), core::iter::adapters::map::map_fold::{closure_env#0}<&u8, u8, (), fn(&u8) -> u8, core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>>> (iterator.rs:2285)
  │   │   │   │       ^15: 0xCBE6120: fold<u8, core::slice::iter::Iter<u8>, fn(&u8) -> u8, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>> (map.rs:124)
  │   │   │   │       ^16: 0xCBE6120: fold<core::slice::iter::Iter<u8>, u8, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>> (cloned.rs:60)
  │   │   │   │       ^17: 0xCBE6120: fold<core::ascii::EscapeDefault, core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, fn(u8) -> core::ascii::EscapeDefault, (), core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>> (map.rs:124)
  │   │   │   │       ^18: 0xCBE6120: fold<core::iter::adapters::map::Map<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, fn(u8) -> core::ascii::EscapeDefault>, (), core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>> (fuse.rs:118)
  │   │   │   │       ^19: 0xCBE6120: fold<core::iter::adapters::map::Map<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, fn(u8) -> core::ascii::EscapeDefault>, core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (flatten.rs:386)
  │   │   │   │       ^20: 0xCBE6120: fold<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (flatten.rs:74)
  │   │   │   │       ^21: 0xCBE6120: fold<char, core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char, (), core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>> (map.rs:124)
  │   │   │   │       ^22: 0xCBE6120: for_each<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>> (iterator.rs:773)
  │   │   │   │       ^23: 0xCBE6120: extend<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>> (string.rs:1955)
  │   │   │   │       ^24: 0xCBE6120: <alloc::string::String as core::iter::traits::collect::FromIterator<char>>::from_iter::<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, core::ascii::escape_default>, <u8 as core::convert::Into<char>>::into>> (string.rs:1874)
  │   │   │   │       ^25: 0xCB8CFDE: collect<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>, alloc::string::String> (iterator.rs:1778)
  │   │   │   │       ^26: 0xCB8CFDE: <rustc_ast::ast::LitKind>::to_lit_token (literal.rs:167)
  │   │   │   │       #27: 0xCB8DE56: <rustc_ast::ast::Lit>::from_lit_kind (literal.rs:242)
  │   │   │   │       #28: 0xC08F991: <rustc_expand::base::ExtCtxt>::expr_lit (build.rs:297)
  │   │   │   │     }
  │   │   │   │   }
  │   │   │   └── PP 1.2.1.1.2/2 {
  │   │   │         Total:     4,194,296 bytes (10.29%, 14,056.51/Minstr) in 19 blocks (0.04%, 0.06/Minstr), avg size 220,752.42 bytes, avg lifetime 3,796,222.26 instrs (1.27% of program duration)
  │   │   │         Max:       2,097,152 bytes in 1 blocks, avg size 2,097,152 bytes
  │   │   │         At t-gmax: 0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │   │   │         At t-end:  0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │   │   │         Reads:     4,838,642 bytes (6.95%, 16,215.93/Minstr), 1.15/byte
  │   │   │         Writes:    3,467,893 bytes (9.43%, 11,622.08/Minstr), 0.83/byte
  │   │   │         Allocated at {
  │   │   │           ^1: 0xCBA1CF4: alloc::raw_vec::finish_grow::<alloc::alloc::Global> (raw_vec.rs:0)
  │   │   │           ^2: 0xCBA18BB: grow_amortized<u8, alloc::alloc::Global> (raw_vec.rs:400)
  │   │   │           ^3: 0xCBA18BB: <alloc::raw_vec::RawVec<u8>>::reserve_for_push (raw_vec.rs:298)
  │   │   │           ^4: 0xCBE4813: push<u8, alloc::alloc::Global> (mod.rs:1729)
  │   │   │           ^5: 0xCBE4813: push (string.rs:1148)
  │   │   │           ^6: 0xCBE4813: {closure#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>> (string.rs:1955)
  │   │   │           ^7: 0xCBE4813: {closure#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>> (iterator.rs:770)
  │   │   │           ^8: 0xCBE4813: {closure#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>> (map.rs:84)
  │   │   │           ^9: 0xCBE4813: call_mut<((), u8), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (function.rs:269)
  │   │   │           ^10: 0xCBE4813: <core::ascii::EscapeDefault as core::iter::traits::iterator::Iterator>::fold::<(), &mut core::iter::adapters::map::map_fold<u8, char, (), <u8 as core::convert::Into<char>>::into, core::iter::traits::iterator::Iterator::for_each::call<char, <alloc::string::String as core::iter::traits::collect::Extend<char>>::extend<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, core::ascii::escape_default>, <u8 as core::convert::Into<char>>::into>>::{closure#0}>::{closure#0}>::{closure#0}> (iterator.rs:2285)
  │   │   │           ^11: 0xCBE6120: {closure#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (flatten.rs:379)
  │   │   │           ^12: 0xCBE6120: {closure#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>> (map.rs:84)
  │   │   │           ^13: 0xCBE6120: {closure#0}<&u8, u8, (), fn(&u8) -> u8, core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>> (map.rs:84)
  │   │   │           ^14: 0xCBE6120: fold<core::slice::iter::Iter<u8>, (), core::iter::adapters::map::map_fold::{closure_env#0}<&u8, u8, (), fn(&u8) -> u8, core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>>> (iterator.rs:2285)
  │   │   │           ^15: 0xCBE6120: fold<u8, core::slice::iter::Iter<u8>, fn(&u8) -> u8, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>> (map.rs:124)
  │   │   │           ^16: 0xCBE6120: fold<core::slice::iter::Iter<u8>, u8, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, core::ascii::EscapeDefault, (), fn(u8) -> core::ascii::EscapeDefault, core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>>> (cloned.rs:60)
  │   │   │           ^17: 0xCBE6120: fold<core::ascii::EscapeDefault, core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, fn(u8) -> core::ascii::EscapeDefault, (), core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>> (map.rs:124)
  │   │   │           ^18: 0xCBE6120: fold<core::iter::adapters::map::Map<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, fn(u8) -> core::ascii::EscapeDefault>, (), core::iter::adapters::flatten::{impl#17}::fold::flatten::{closure_env#0}<core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>>> (fuse.rs:118)
  │   │   │           ^19: 0xCBE6120: fold<core::iter::adapters::map::Map<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, fn(u8) -> core::ascii::EscapeDefault>, core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (flatten.rs:386)
  │   │   │           ^20: 0xCBE6120: fold<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault, (), core::iter::adapters::map::map_fold::{closure_env#0}<u8, char, (), fn(u8) -> char, core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>>> (flatten.rs:74)
  │   │   │           ^21: 0xCBE6120: fold<char, core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char, (), core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<char, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>>> (map.rs:124)
  │   │   │           ^22: 0xCBE6120: for_each<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>, alloc::string::{impl#11}::extend::{closure_env#0}<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>>> (iterator.rs:773)
  │   │   │           ^23: 0xCBE6120: extend<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>> (string.rs:1955)
  │   │   │           ^24: 0xCBE6120: <alloc::string::String as core::iter::traits::collect::FromIterator<char>>::from_iter::<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, core::ascii::escape_default>, <u8 as core::convert::Into<char>>::into>> (string.rs:1874)
  │   │   │           ^25: 0xCB8CFDE: collect<core::iter::adapters::map::Map<core::iter::adapters::flatten::FlatMap<core::iter::adapters::cloned::Cloned<core::slice::iter::Iter<u8>>, core::ascii::EscapeDefault, fn(u8) -> core::ascii::EscapeDefault>, fn(u8) -> char>, alloc::string::String> (iterator.rs:1778)
  │   │   │           ^26: 0xCB8CFDE: <rustc_ast::ast::LitKind>::to_lit_token (literal.rs:167)
  │   │   │           #27: 0xBFC0A9F: print_literal (lib.rs:1288)
  │   │   │           #28: 0xBFC0A9F: <rustc_hir_pretty::State>::print_expr (lib.rs:1442)
  │   │   │           #29: 0xBE87BC4: {closure#0} (encoder.rs:1376)
  │   │   │           #30: 0xBE87BC4: rustc_hir_pretty::to_string::<<rustc_metadata::rmeta::encoder::EncodeContext>::encode_rendered_const_for_body::{closure#0}> (lib.rs:190)

Frame 26 is interesting in both PPs. The relevant code in to_lit_token looks like this:

            LitKind::ByteStr(ref bytes) => {
                let string = bytes
                    .iter()
                    .cloned()
                    .flat_map(ascii::escape_default)
                    .map(Into::<char>::into)
                    .collect::<String>();
                (token::ByteStr, Symbol::intern(&string), None)
            }

So it's building up a very large (4MB) string, and it does it twice. That's quite the iterator chain, I wonder if it can be improved. Alternatively, I wonder if looking at the stack frames beneath could be useful -- are both of these calls to to_lit_token truly necessary? It's related to the large pointers.bin file in deunicode that gets included via include_bytes!.
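One possible direction, sketched under assumptions: reserve capacity up front so the `String` does not have to grow repeatedly from empty. The helper name `escape_byte_str` is hypothetical, this is not the code in rustc, and whether it changes the profile would need to be measured.

```rust
use std::ascii;

/// Hypothetical helper: escapes a byte string into a `String`, reserving
/// space before pushing anything.
fn escape_byte_str(bytes: &[u8]) -> String {
    // Every input byte produces at least one output char, so the input
    // length is a lower bound on the output length. Inputs with many
    // escaped bytes can still trigger further growth.
    let mut out = String::with_capacity(bytes.len());
    for &b in bytes {
        // `escape_default` yields between one and four ASCII bytes per input byte.
        out.extend(ascii::escape_default(b).map(char::from));
    }
    out
}
```

The output is the same as the `flat_map` chain quoted above; only the allocation pattern differs.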

Comment on lines 113 to 118
    unsafe {
        (
            *hex_digits.get_unchecked((b >> 4) as usize),
            *hex_digits.get_unchecked((b & 0xf) as usize),
        )
    }
Contributor

@paolobarbolini paolobarbolini Mar 9, 2022

Drive-by review: this might work without the use of unsafe

https://rust.godbolt.org/z/5q3cW1Gox
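A sketch of the point, with hypothetical names (`HEX_DIGITS`, `hex_pair_unchecked`, `hex_pair`): both `b >> 4` and `b & 0xf` are at most 15 for any `u8`, so plain indexing into a 16-element array can never go out of bounds. That gives the optimizer the information it needs to drop the bounds checks, though whether it does so in a given build is best confirmed by inspecting the assembly.

```rust
const HEX_DIGITS: [u8; 16] = *b"0123456789abcdef";

/// Version with `unsafe`, in the spirit of the snippet under review.
fn hex_pair_unchecked(b: u8) -> (u8, u8) {
    // SAFETY: `b >> 4` and `b & 0xf` are both in 0..=15, so they are valid
    // indices into the 16-element table.
    unsafe {
        (
            *HEX_DIGITS.get_unchecked((b >> 4) as usize),
            *HEX_DIGITS.get_unchecked((b & 0xf) as usize),
        )
    }
}

/// Safe version: ordinary indexing with the same index expressions.
fn hex_pair(b: u8) -> (u8, u8) {
    (
        HEX_DIGITS[(b >> 4) as usize],
        HEX_DIGITS[(b & 0xf) as usize],
    )
}
```

The two functions return the same pair for every possible byte, so the safe form loses nothing in behavior.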

Contributor Author

Ah, thanks, I didn't even try that. I've yet to get a good feeling for what gets optimized and what doesn't. I'll do a local perf run with that and see if there are any changes.

Contributor Author

You were right of course, addressed in 7f4f4fc

Based on @paolobarbolini's tip that the unsafe block was unnecessary in
this case.

Not much left of `hexify()` after this, so seemed clearer to just inline
it.
@rust-timer
Collaborator

Finished benchmarking commit (cf6cd92ed676344f27c15baa799fa0d5092a60c5): comparison url.

Summary: This benchmark run shows 1 relevant improvement 🎉 to instruction counts.

  • Largest improvement in instruction counts: -3.0% on incr-patched: add static arr item builds of coercions debug

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 9, 2022
@nnethercote
Contributor

@martingms Your results are so much better than the CI results that I suspect they are too good to be true. I've never seen escape_default be hot for any benchmark other than deunicode-1.3.1.

Perhaps something went wrong with the measurements. Are you sure you are comparing two compilers that are identical other than this change? I typically have two rust repository clones checked out to the same revision. I leave the first one unchanged and then make my changes in the second one, and I make sure I compile them in the same way, with the same config.toml.

@martingms
Contributor Author

I do the same, but I might've changed config on one without doing a full rebuild. I'll do full rebuilds and try again 👍

@martingms
Contributor Author

Looks like rebuilding LLVM did it; this looks more realistic, perhaps :)

| Benchmark & Profile | Scenario | % Change | Significance Factor | Before | After |
|---|---|---|---|---|---|
| deep-vector debug | full | -2.63% | 13.13x | 11313258353.00 | 11016111491.00 |
| deunicode-1.3.1 check | incr-unchanged | -1.53% | 7.65x | 296984418.00 | 292443286.00 |
| deunicode-1.3.1 check | full | -1.37% | 6.83x | 332261934.00 | 327723858.00 |
| deunicode-1.3.1 opt | incr-unchanged | -1.19% | 5.96x | 380435762.00 | 375897578.00 |
| deunicode-1.3.1 debug | incr-unchanged | -1.18% | 5.88x | 388526958.00 | 383960627.00 |
| deunicode-1.3.1 check | incr-full | -1.08% | 5.38x | 416786964.00 | 412303369.00 |
| deunicode-1.3.1 debug | full | -0.49% | 2.45x | 900889393.00 | 896480493.00 |
| deep-vector debug | incr-full | -0.47% | 2.35x | 12086080265.00 | 12029283594.00 |
| unify-linearly debug | incr-patched: dummy fn | 0.46% | 2.30x | 627514481.00 | 630398114.00 |
| unify-linearly debug | incr-unchanged | 0.44% | 2.22x | 623247450.00 | 626012742.00 |
| deunicode-1.3.1 debug | incr-full | -0.37% | 1.83x | 1177005118.00 | 1172706963.00 |
| deep-vector debug | incr-patched: add vec item | -0.36% | 1.79x | 11855408909.00 | 11812968471.00 |
| unify-linearly debug | incr-full | 0.35% | 1.74x | 782488607.00 | 785215778.00 |
| deunicode-1.3.1 opt | full | -0.30% | 1.52x | 1631810834.00 | 1626836580.00 |
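
As a sanity check on how the table above reads, here is a small sketch that assumes the last two columns are the measured counts before and after the change. That reading is an inference from the numbers rather than something stated in the thread, and `percent_change` is a hypothetical helper. Under that assumption, the % Change column is the relative difference:

```rust
/// Relative change from `before` to `after`, in percent.
/// A negative result means the count went down.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

For the first row, `percent_change(11313258353.0, 11016111491.0)` comes out at roughly -2.63, which agrees with the table.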

@martingms
Contributor Author

You were right @nnethercote, inlining next() was very fruitful 👍

| Benchmark & Profile | Scenario | % Change | Significance Factor | Before | After |
|---|---|---|---|---|---|
| deunicode-1.3.1 check | incr-unchanged | -12.47% | 62.34x | 292443286.00 | 255979774.00 |
| deunicode-1.3.1 check | full | -11.13% | 55.66x | 327723858.00 | 291240653.00 |
| deunicode-1.3.1 opt | incr-unchanged | -9.70% | 48.51x | 375897578.00 | 339428840.00 |
| deunicode-1.3.1 debug | incr-unchanged | -9.49% | 47.47x | 383960627.00 | 347508475.00 |
| deunicode-1.3.1 check | incr-full | -8.85% | 44.25x | 412303369.00 | 375816166.00 |
| deunicode-1.3.1 debug | full | -4.05% | 20.26x | 896480493.00 | 860151895.00 |
| deunicode-1.3.1 debug | incr-full | -3.13% | 15.65x | 1172706963.00 | 1136008899.00 |
| deunicode-1.3.1 opt | full | -2.20% | 11.01x | 1626836580.00 | 1591021711.00 |
| deunicode-1.3.1 opt | incr-full | -2.01% | 10.07x | 1800748831.00 | 1764464608.00 |
| coercions debug | incr-patched: println | -0.31% | 1.54x | 3965591305.00 | 3953400814.00 |
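
To illustrate what inlining `next()` means here, a minimal sketch follows. It assumes an iterator shaped like a small fixed buffer plus a range; the type `EscapeSketch` and the function `escape_sketch` are hypothetical and are not the standard library's definitions. Marking `next` with `#[inline]` makes its body available for inlining into other crates, so a caller looping over the escaped bytes may avoid a function call per byte.

```rust
use std::ops::Range;

/// Hypothetical stand-in for an escape iterator: up to 4 bytes plus the live range.
struct EscapeSketch {
    data: [u8; 4],
    range: Range<usize>,
}

impl Iterator for EscapeSketch {
    type Item = u8;

    // `#[inline]` hints that callers, including those in other crates,
    // may inline this body instead of emitting a call.
    #[inline]
    fn next(&mut self) -> Option<u8> {
        self.range.next().map(|i| self.data[i])
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.range.size_hint()
    }
}

/// Sketch of the escaping rules documented for `std::ascii::escape_default`.
fn escape_sketch(c: u8) -> EscapeSketch {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let (data, len) = match c {
        b'\t' => ([b'\\', b't', 0, 0], 2),
        b'\r' => ([b'\\', b'r', 0, 0], 2),
        b'\n' => ([b'\\', b'n', 0, 0], 2),
        b'\\' => ([b'\\', b'\\', 0, 0], 2),
        b'\'' => ([b'\\', b'\'', 0, 0], 2),
        b'"' => ([b'\\', b'"', 0, 0], 2),
        // Remaining printable ASCII is emitted unchanged.
        b'\x20'..=b'\x7e' => ([c, 0, 0, 0], 1),
        // Everything else becomes `\xNN`, looked up with safe indexing.
        _ => ([b'\\', b'x', HEX[(c >> 4) as usize], HEX[(c & 0xf) as usize]], 4),
    };
    EscapeSketch { data, range: 0..len }
}
```

Whether the attribute pays off depends on the caller and the optimization level, so a change like this is worth measuring rather than assuming.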

@martingms changed the title from "Optimize ascii::escape_default by using a digit LUT" to "Optimize ascii::escape_default" on Mar 10, 2022
@lqd
Member

lqd commented Mar 10, 2022

Let's do this for peace of mind again, even if perf.rlo doesn't track deunicode

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot added the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) on Mar 10, 2022
@bors
Contributor

bors commented Mar 10, 2022

⌛ Trying commit c62ab42 with merge 41e02b6896c4e3953ae231dfc8553bab2b2b8e25...

@bors
Contributor

bors commented Mar 10, 2022

☀️ Try build successful - checks-actions
Build commit: 41e02b6896c4e3953ae231dfc8553bab2b2b8e25 (41e02b6896c4e3953ae231dfc8553bab2b2b8e25)

@rust-timer
Collaborator

Queued 41e02b6896c4e3953ae231dfc8553bab2b2b8e25 with parent ba14a83, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (41e02b6896c4e3953ae231dfc8553bab2b2b8e25): comparison url.

Summary: This benchmark run did not return any relevant results.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

@rustbot removed the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) on Mar 10, 2022
@nnethercote
Contributor

This looks good. If there are additional improvements (in or beneath to_lit_token) they can happen in a follow-up.

@bors r+ rollup

@bors
Contributor

bors commented Mar 11, 2022

📌 Commit c62ab42 has been approved by nnethercote

@bors added the S-waiting-on-bors label (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) on Mar 11, 2022
@martingms
Contributor Author

> If there are additional improvements (in or beneath to_lit_token) they can happen in a follow-up.

I'll take a look at that next week if no-one beats me to it :)

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2022
Rollup of 5 pull requests

Successful merges:

 - rust-lang#93283 (Fix for localized windows editions in testcase fn read_link() Issue#93211)
 - rust-lang#94592 (Fallback to top-level config.toml if not present in current directory, and remove fallback for env vars and CLI flags)
 - rust-lang#94776 (Optimize ascii::escape_default)
 - rust-lang#94840 (update `replace_bound_vars_with_placeholders` doc comment)
 - rust-lang#94842 (Remove unnecessary try_opt for operations that cannot fail)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors merged commit cdd6d39 into rust-lang:master on Mar 11, 2022
@rustbot added this to the 1.61.0 milestone on Mar 11, 2022