core/char: Speed up `to_digit()` for `radix <= 10` #55932

Turbo87 · 2018-11-13T17:37:34Z

I noticed that `char::to_digit()` seemed to do a bit of extra work for handling `[a-zA-Z]` characters. Since `to_digit(10)` seems to be the most common case (at least in the `rust` codebase), I thought it might be valuable to create a fast path for that case, and according to the benchmarks that I added in one of the commits it seems to pay off. I also created another fast path for the `radix < 10` case, which also seems to have a positive effect.

It is very well possible that I'm measuring something entirely unrelated though, so please verify these numbers and let me know if I missed something!
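For illustration, the fast-path idea described above could look roughly like the sketch below. It is written as a free function with a made-up name (`to_digit_sketch`) and is not copied from the PR's diff. It only assumes that radices up to 10 never need the letter ranges, so the `[a-zA-Z]` arms can be skipped on that branch.

```rust
/// Hypothetical stand-alone sketch of a `to_digit`-style conversion with a
/// separate branch for `radix <= 10`. Not the actual libcore implementation.
fn to_digit_sketch(c: char, radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit_sketch: radix is too high (maximum 36)");

    let val = if radix <= 10 {
        // Fast path: only ASCII digits can be valid, so letters are never checked.
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            _ => return None,
        }
    } else {
        // General path: ASCII digits plus case-insensitive letters.
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            'a'..='z' => c as u32 - 'a' as u32 + 10,
            'A'..='Z' => c as u32 - 'A' as u32 + 10,
            _ => return None,
        }
    };

    // A digit value is only valid if it is smaller than the radix.
    if val < radix { Some(val) } else { None }
}
```

When `radix` is a compile-time constant such as 10 and the call is inlined, the compiler can presumably discard the branch that is never taken, which would be consistent with the measured gain.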

Before

# Run 1
test char::methods::bench_to_digit_radix_10      ... bench:      16,265 ns/iter (+/- 1,774)
test char::methods::bench_to_digit_radix_16      ... bench:      13,938 ns/iter (+/- 2,479)
test char::methods::bench_to_digit_radix_2       ... bench:      13,090 ns/iter (+/- 524)
test char::methods::bench_to_digit_radix_36      ... bench:      14,236 ns/iter (+/- 1,949)

# Run 2
test char::methods::bench_to_digit_radix_10      ... bench:      16,176 ns/iter (+/- 1,589)
test char::methods::bench_to_digit_radix_16      ... bench:      13,896 ns/iter (+/- 3,140)
test char::methods::bench_to_digit_radix_2       ... bench:      13,158 ns/iter (+/- 1,112)
test char::methods::bench_to_digit_radix_36      ... bench:      14,206 ns/iter (+/- 1,312)

# Run 3
test char::methods::bench_to_digit_radix_10      ... bench:      16,221 ns/iter (+/- 2,423)
test char::methods::bench_to_digit_radix_16      ... bench:      14,361 ns/iter (+/- 3,926)
test char::methods::bench_to_digit_radix_2       ... bench:      13,097 ns/iter (+/- 671)
test char::methods::bench_to_digit_radix_36      ... bench:      14,388 ns/iter (+/- 1,068)

After

# Run 1
test char::methods::bench_to_digit_radix_10      ... bench:      11,521 ns/iter (+/- 552)
test char::methods::bench_to_digit_radix_16      ... bench:      12,926 ns/iter (+/- 684)
test char::methods::bench_to_digit_radix_2       ... bench:      11,266 ns/iter (+/- 1,085)
test char::methods::bench_to_digit_radix_36      ... bench:      14,213 ns/iter (+/- 614)

# Run 2
test char::methods::bench_to_digit_radix_10      ... bench:      11,424 ns/iter (+/- 1,042)
test char::methods::bench_to_digit_radix_16      ... bench:      12,854 ns/iter (+/- 1,193)
test char::methods::bench_to_digit_radix_2       ... bench:      11,193 ns/iter (+/- 716)
test char::methods::bench_to_digit_radix_36      ... bench:      14,249 ns/iter (+/- 3,514)

# Run 3
test char::methods::bench_to_digit_radix_10      ... bench:      11,469 ns/iter (+/- 685)
test char::methods::bench_to_digit_radix_16      ... bench:      12,852 ns/iter (+/- 568)
test char::methods::bench_to_digit_radix_2       ... bench:      11,275 ns/iter (+/- 1,356)
test char::methods::bench_to_digit_radix_36      ... bench:      14,188 ns/iter (+/- 1,501)

I ran the benchmark using:

python x.py bench src/libcore --stage 1 --keep-stage 0 --test-args "bench_to_digit"

rust-highfive · 2018-11-13T17:37:44Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

scottmcm · 2018-11-13T19:02:08Z

Consider whether there should also be a benchmark for non-constant radix. I assume this will make that path slower, which is probably a good trade-off, but sounds worth quantifying.

Turbo87 · 2018-11-13T19:21:44Z

@scottmcm yeah, I was wondering about that too. I'm not sure how to write such a benchmark though. I assume if I use the following code it would just inline the variable and use the constant radix path too:

let radix = 8u32;
c.to_digit(radix);

Any clues on how I can make sure that the compiler does not consider this a constant? Would `"32".parse::<u32>()` be an option?

Xanewok · 2018-11-13T20:43:14Z

@Turbo87 since we can use libtest here, maybe black_box is worth a shot?

Turbo87 · 2018-11-13T20:46:29Z

@Xanewok good idea. I assume that was just for blackboxing the output, but it seems it might also work for input. I'll try it out and report back 🤔
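A minimal sketch of how `black_box` might be applied on the input side is shown below. It assumes a toolchain where `std::hint::black_box` is available (the thread refers to the `black_box` from libtest), and the helper name `sum_digits_opaque_radix` is invented for illustration.

```rust
use std::hint::black_box;

/// Sums the digit values in `input`, passing the radix through `black_box`
/// so the optimizer is discouraged from treating it as a known constant.
fn sum_digits_opaque_radix(input: &str, radix: u32) -> u32 {
    input
        .chars()
        // Characters that are not valid digits in this radix are skipped.
        .filter_map(|c| c.to_digit(black_box(radix)))
        .sum()
}
```

`black_box` is a best-effort hint rather than a guarantee, so it is worth checking that the variable-radix numbers really differ from the constant-radix ones before trusting them.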


Turbo87 · 2018-11-13T21:08:21Z

@scottmcm @Xanewok these are the results including the new bench_to_digit_radix_var benchmark:

Before

# Run 1
test char::methods::bench_to_digit_radix_10        ... bench:      16,087 ns/iter (+/- 846)
test char::methods::bench_to_digit_radix_16        ... bench:      14,161 ns/iter (+/- 721)
test char::methods::bench_to_digit_radix_2         ... bench:      14,269 ns/iter (+/- 4,107)
test char::methods::bench_to_digit_radix_36        ... bench:      14,195 ns/iter (+/- 1,169)
test char::methods::bench_to_digit_radix_var       ... bench:      23,104 ns/iter (+/- 2,296)

# Run 2
test char::methods::bench_to_digit_radix_10        ... bench:      16,122 ns/iter (+/- 2,736)
test char::methods::bench_to_digit_radix_16        ... bench:      14,165 ns/iter (+/- 4,147)
test char::methods::bench_to_digit_radix_2         ... bench:      14,048 ns/iter (+/- 4,400)
test char::methods::bench_to_digit_radix_36        ... bench:      14,136 ns/iter (+/- 608)
test char::methods::bench_to_digit_radix_var       ... bench:      23,045 ns/iter (+/- 1,621)

# Run 3
test char::methods::bench_to_digit_radix_10        ... bench:      16,018 ns/iter (+/- 536)
test char::methods::bench_to_digit_radix_16        ... bench:      14,157 ns/iter (+/- 886)
test char::methods::bench_to_digit_radix_2         ... bench:      14,178 ns/iter (+/- 2,260)
test char::methods::bench_to_digit_radix_36        ... bench:      14,177 ns/iter (+/- 865)
test char::methods::bench_to_digit_radix_var       ... bench:      23,043 ns/iter (+/- 1,983)

After

# Run 1
test char::methods::bench_to_digit_radix_10        ... bench:      11,518 ns/iter (+/- 497)
test char::methods::bench_to_digit_radix_16        ... bench:      14,623 ns/iter (+/- 1,260)
test char::methods::bench_to_digit_radix_2         ... bench:      11,240 ns/iter (+/- 1,430)
test char::methods::bench_to_digit_radix_36        ... bench:      14,189 ns/iter (+/- 854)
test char::methods::bench_to_digit_radix_var       ... bench:      24,513 ns/iter (+/- 3,337)

# Run 2
test char::methods::bench_to_digit_radix_10        ... bench:      11,536 ns/iter (+/- 1,549)
test char::methods::bench_to_digit_radix_16        ... bench:      14,602 ns/iter (+/- 1,033)
test char::methods::bench_to_digit_radix_2         ... bench:      13,940 ns/iter (+/- 9,252)
test char::methods::bench_to_digit_radix_36        ... bench:      14,303 ns/iter (+/- 749)
test char::methods::bench_to_digit_radix_var       ... bench:      24,298 ns/iter (+/- 1,440)

# Run 3
test char::methods::bench_to_digit_radix_10        ... bench:      11,491 ns/iter (+/- 840)
test char::methods::bench_to_digit_radix_16        ... bench:      14,540 ns/iter (+/- 991)
test char::methods::bench_to_digit_radix_2         ... bench:      11,275 ns/iter (+/- 576)
test char::methods::bench_to_digit_radix_36        ... bench:      14,201 ns/iter (+/- 444)
test char::methods::bench_to_digit_radix_var       ... bench:      24,617 ns/iter (+/- 5,132)

As predicted, the variable-radix case is slightly slower than before (roughly 24.5k vs. 23.0k ns/iter), but that seems like a good tradeoff to me, as a variable radix is probably much rarer than a constant one.
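To put rough numbers on that tradeoff, the three runs above can be averaged and compared. The snippet is only a sketch: the helper and constant names are invented here, and three noisy runs are too few for a rigorous comparison.

```rust
/// Arithmetic mean of a slice of ns/iter samples.
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Relative change from `before` to `after`, in percent (positive = slower).
fn percent_change(before: &[f64], after: &[f64]) -> f64 {
    (mean(after) - mean(before)) / mean(before) * 100.0
}

// ns/iter values copied from the three runs listed above.
const VAR_BEFORE: [f64; 3] = [23_104.0, 23_045.0, 23_043.0];
const VAR_AFTER: [f64; 3] = [24_513.0, 24_298.0, 24_617.0];
const R10_BEFORE: [f64; 3] = [16_087.0, 16_122.0, 16_018.0];
const R10_AFTER: [f64; 3] = [11_518.0, 11_536.0, 11_491.0];
```

With these inputs, `percent_change(&VAR_BEFORE, &VAR_AFTER)` comes out at roughly +6%, while `percent_change(&R10_BEFORE, &R10_AFTER)` is roughly -28%.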


alexcrichton · 2018-11-14T21:15:04Z

@bors: r+

This looks like a great improvement, thanks @Turbo87!

bors · 2018-11-14T21:15:05Z

📌 Commit 7843e27 has been approved by alexcrichton


Rollup of 16 pull requests

Successful merges:

- #54906 (Reattach all grandchildren when constructing specialization graph.)
- #55182 (Redox: Update to new changes)
- #55211 (Add BufWriter::buffer method)
- #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
- #55530 (Speed up String::from_utf16)
- #55556 (Use `Mmap` to open the rmeta file.)
- #55622 (NetBSD: link libstd with librt in addition to libpthread)
- #55827 (A few tweaks to iterations/collecting)
- #55901 (fix various typos in doc comments)
- #55926 (Change sidebar selector to fix compatibility with docs.rs)
- #55930 (A handful of hir tweaks)
- #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
- #55935 (appveyor: Use VS2017 for all our images)
- #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)
- #55948 (submodules: update clippy from d8b4269 to 7e0ddef)
- #55956 (add tests for some fixed ICEs)


Rollup of 17 pull requests

Successful merges:

- #55182 (Redox: Update to new changes)
- #55211 (Add BufWriter::buffer method)
- #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
- #55530 (Speed up String::from_utf16)
- #55556 (Use `Mmap` to open the rmeta file.)
- #55622 (NetBSD: link libstd with librt in addition to libpthread)
- #55750 (Make `NodeId` and `HirLocalId` `newtype_index`)
- #55778 (Wrap some query results in `Lrc`.)
- #55781 (More precise spans for temps and their drops)
- #55785 (Add mem::forget_unsized() for forgetting unsized values)
- #55852 (Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint)
- #55865 (Unix RwLock: avoid racy access to write_locked)
- #55901 (fix various typos in doc comments)
- #55926 (Change sidebar selector to fix compatibility with docs.rs)
- #55930 (A handful of hir tweaks)
- #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
- #55956 (add tests for some fixed ICEs)

Failed merges:

r? @ghost

rust-highfive assigned alexcrichton Nov 13, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 13, 2018

Turbo87 force-pushed the to_digit branch from 50406df to c3ae8ed Compare November 13, 2018 17:38

Turbo87 added 3 commits November 13, 2018 22:02

core/benches: Add char::to_digit() benchmarks

98f61a3

core/char: Replace condition + panic!() with assert!()

04aade8

Turbo87 force-pushed the to_digit branch from c3ae8ed to 17f08fe Compare November 13, 2018 21:06

erikdesjardins reviewed Nov 13, 2018

src/libcore/char/methods.rs (outdated)

core/char: Drop radix == 10 special case

64a5172

This seems to perform equally well

varkor reviewed Nov 14, 2018

src/libcore/char/methods.rs

core/char: Add comment to to_digit()

7843e27

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2018

kennytm mentioned this pull request Nov 15, 2018

Rollup of 16 pull requests #55943

Closed

Turbo87 mentioned this pull request Nov 15, 2018

core/num: Speed up from_str_radix() method #55973

Closed

pietroalbini mentioned this pull request Nov 15, 2018

Rollup of 17 pull requests #55974

Merged

bors merged commit 7843e27 into rust-lang:master Nov 15, 2018

Turbo87 deleted the to_digit branch November 15, 2018 15:46

SimonSapin mentioned this pull request Nov 19, 2018

Use Mmap to open the rmeta file. #55556

Merged

andjo403 mentioned this pull request Aug 10, 2019

Optimization regression in 1.32+ #59352

Open
