[experiment] Benchmark incremental ThinLTO'd compiler. #56678

michaelwoerister · 2018-12-10T15:20:03Z

I figure we can hijack perf.rlo to get some numbers on the runtime performance of code compiled with incremental ThinLTO. To this end, we compile the compiler incrementally and then measure how much slower it is a building stuff.

michaelwoerister · 2018-12-10T15:20:12Z

@bors try

rust-highfive · 2018-12-10T15:20:16Z

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

bors · 2018-12-10T15:20:25Z

⌛ Trying commit 208b9a7 with merge 5caf635...

[experiment] Benchmark incremental ThinLTO'd compiler. I figure we can hijack perf.rlo to get some numbers on the runtime performance of code compiled with incremental ThinLTO. To this end, we compile the compiler incrementally and then measure how much slower it is a building stuff.

rust-highfive · 2018-12-10T18:21:23Z

The job dist-x86_64-linux of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

travis_fold:end:services

travis_fold:start:git.checkout
travis_time:start:004206c0
$ git clone --depth=2 --branch=try https://github.com/rust-lang/rust.git rust-lang/rust
---
[00:00:48] 
[00:00:48] Total download size: 4.9 M
[00:00:48] Downloading Packages:
[00:00:51] --------------------------------------------------------------------------------
[00:00:51] Total                                           1.8 MB/s | 4.9 MB     00:02     
[00:00:51] warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID e8562897
[00:00:51] Importing GPG key 0xE8562897 "CentOS-5 Key (CentOS 5 Official Signing Key) <centos-5-key@centos.org>" from /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
[00:00:51] Running Transaction Test
[00:00:51] Finished Transaction Test
[00:00:51] Transaction Test Succeeded
[00:00:51] Running Transaction
---
[00:03:07] + hide_output make install
[00:03:07] + set +x
[00:03:25] shared.sh: line 11:   351 Terminated              bash -c "while true; do sleep 30; echo \$(date) - building ...; done"
[00:03:25] + cd ..
[00:03:25] + rm -rf openssl-1.0.2k
[00:03:25] ./build-openssl.sh: line 25:  4114 Terminated              bash -c "while true; do sleep 30; echo \$(date) - building ...; done"  (wd: /tmp/openssl-1.0.2k)
[00:03:25] + ln -nsf /etc/pki/tls/cert.pem /rustroot/ssl/
[00:03:26] Removing intermediate container e0a6287fa86f
[00:03:26] Step 14/41 : COPY dist-x86_64-linux/build-curl.sh /tmp/
[00:03:26]  ---> f2f711068964
[00:03:26] Step 15/41 : RUN ./build-curl.sh
[00:03:26] Step 15/41 : RUN ./build-curl.sh
[00:03:26]  ---> Running in 24cce2d607f7
[00:03:26] + source shared.sh
[00:03:26] + VERSION=7.51.0
[00:03:26] + curl http://cool.haxx.se/download/curl-7.51.0.tar.bz2
[00:03:28]   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
[00:03:28]                                  Dload  Upload   Total   Spent    Left  Speed
[00:03:29] 
  0 2509k    0 14215    0     0   9439      0  0:04:32  0:00:01  0:04:31  9439
  0 2509k    0 14215    0     0   9439      0  0:04:32  0:00:01  0:04:31  9439
  2 2509k    2 59559    0     0  33949      0  0:01:15  0:00:01  0:01:14  177k
100 2509k  100 2509k    0     0   939k      0  0:00:02  0:00:02 --:--:-- 2140k
[00:03:29] + mkdir curl-build
[00:03:29] + cd curl-build
[00:03:29] + hide_output ../curl-7.51.0/configure --prefix=/rustroot --with-ssl=/rustroot --disable-sspi --disable-gopher --disable-smtp --disable-smb --disable-imap --disable-pop3 --disable-tftp --disable-telnet --disable-manual --disable-dict --disable-rtsp --disable-ldaps --disable-ldap
[00:03:49] + hide_output make -j10
[00:03:49] + set +x
[00:04:02] shared.sh: line 11:    12 Terminated              bash -c "while true; do sleep 30; echo \$(date) - building ...; done"
[00:04:02] + hide_output make install
---
 98 82.1M   98 80.9M    0     0  1004k      0  0:01:23  0:01:22  0:00:01 1101k
 99 82.1M   99 82.0M    0     0  1006k      0  0:01:23  0:01:23 --:--:-- 1079k
100 82.1M  100 82.1M    0     0  1006k      0  0:01:23  0:01:23 --:--:-- 1181k
[00:08:56] + cd gcc-4.8.5
[00:08:56] + sed -i 's|ftp://gcc\.gnu\.org/|http://gcc.gnu.org/|g' ./contrib/download_prerequisites
[00:08:56] --2018-12-10 15:30:50--  http://gcc.gnu.org/pub/gcc/infrastructure/mpfr-2.4.2.tar.bz2
[00:08:56] Resolving gcc.gnu.org... 209.132.180.131
[00:08:56] Connecting to gcc.gnu.org|209.132.180.131|:80... connected.
[00:08:56] HTTP request sent, awaiting response... 200 OK
---
 29 27.0M   29 8128k    0     0  6610k      0  0:00:04  0:00:01  0:00:03 6762k
 61 27.0M   61 16.4M    0     0  7572k      0  0:00:03  0:00:02  0:00:01 7665k
 89 27.0M   89 24.0M    0     0  7636k      0  0:00:03  0:00:03 --:--:-- 7701k
100 27.0M  100 27.0M    0     0  7873k      0  0:00:03  0:00:03 --:--:-- 7937k
[00:42:44] + cd llvm-7.0.0.src
[00:42:44] + mkdir -p tools/clang
[00:42:44] + curl https://releases.llvm.org/7.0.0/cfe-7.0.0.src.tar.xz
[00:42:44] + xz -d
[00:42:44] + tar xf - -C tools/clang --strip-components=1
[00:42:44]                                  Dload  Upload   Total   Spent    Left  Speed
[00:42:46] 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
 28 11.9M   28 3486k    0     0  4992k      0  0:00:02 --:--:--  0:00:02 5001k
---
[00:42:46]                                  Dload  Upload   Total   Spent    Left  Speed
[00:42:46] 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  894k  100  894k    0     0  1926k      0 --:--:-- --:--:-- --:--:-- 1935k
[00:42:46] + mkdir ../clang-build
[00:42:46] + cd ../clang-build
[00:42:46] + INC=/rustroot/include
[00:42:46] + INC=/rustroot/include:/rustroot/lib/gcc/x86_64-unknown-linux-gnu/4.8.5/include-fixed
[00:42:46] + INC=/rustroot/include:/rustroot/lib/gcc/x86_64-unknown-linux-gnu/4.8.5/include-fixed:/usr/include
[00:42:46] + hide_output cmake ../llvm-7.0.0.src -DCMAKE_C_COMPILER=/rustroot/bin/gcc -DCMAKE_CXX_COMPILER=/rustroot/bin/g++ -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/rustroot -DLLVM_TARGETS_TO_BUILD=X86 -DC_INCLUDE_DIRS=/rustroot/include:/rustroot/lib/gcc/x86_64-unknown-linux-gnu/4.8.5/include-fixed:/usr/include
[00:43:16] Mon Dec 10 16:05:11 UTC 2018 - building ...
[00:43:18] + hide_output make -j10
[00:43:18] + set +x
[00:43:48] Mon Dec 10 16:05:42 UTC 2018 - building ...
---
[01:41:18] + set +x
[01:41:19] + hide_output make INSTALL_HDR_PATH=dest headers_install
[01:41:19] + set +x
[01:41:22] shared.sh: line 11:    10 Terminated              bash -c "while true; do sleep 30; echo \$(date) - building ...; done"
[01:41:22] + find dest/include '(' -name .install -o -name ..install.cmd ')' -delete
[01:41:22] + yes
[01:41:22] + yes
[01:41:22] + cp -fr dest/include/asm dest/include/asm-generic dest/include/drm dest/include/linux dest/include/mtd dest/include/rdma dest/include/scsi dest/include/sound dest/include/video dest/include/xen /usr/include
[01:41:22] + rm -rf linux-3.2.84
[01:41:23]  ---> f7766546ed27
[01:41:23] Removing intermediate container beae4c2a93c2
[01:41:23] Step 31/41 : COPY dist-x86_64-linux/build-perl.sh /tmp/
[01:41:23] Step 31/41 : COPY dist-x86_64-linux/build-perl.sh /tmp/
[01:41:23]  ---> d6259b67dfa1
[01:41:23] Step 32/41 : RUN ./build-perl.sh
[01:41:23]  ---> Running in fb6f61c82495
[01:41:24] + source shared.sh
[01:41:24] + curl https://www.cpan.org/src/5.0/perl-5.28.0.tar.gz
[01:41:24]   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
[01:41:24]                                  Dload  Upload   Total   Spent    Left  Speed
[01:41:25] 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

bors · 2018-12-10T18:21:24Z

💔 Test failed - status-travis

alexcrichton · 2018-12-10T18:33:19Z

@bors: retry

bors · 2018-12-10T18:33:29Z

⌛ Trying commit 208b9a7 with merge 8f33a3d05703e0a9180173ae6bd15875fc94f17e...

bors · 2018-12-10T18:34:18Z

💔 Test failed - status-travis

alexcrichton · 2018-12-10T18:47:37Z

Hm I'm not entirely sure what's going on.... I couldn't easily tell if that's a normal timeout or a spurious timeout

michaelwoerister · 2018-12-11T10:36:52Z

@bors retry

bors · 2018-12-11T10:37:01Z

⌛ Trying commit 208b9a7 with merge 26f96e5...

[experiment] Benchmark incremental ThinLTO'd compiler. I figure we can hijack perf.rlo to get some numbers on the runtime performance of code compiled with incremental ThinLTO. To this end, we compile the compiler incrementally and then measure how much slower it is a building stuff.

bors · 2018-12-11T13:04:18Z

☀️ Test successful - status-travis
State: approved= try=True

michaelwoerister · 2018-12-11T13:32:56Z

@rust-timer build 26f96e5

rust-timer · 2018-12-11T13:32:57Z

Success: Queued 26f96e5 with parent 3a31213, comparison URL.

rust-timer · 2018-12-11T16:31:16Z

Finished benchmarking try commit 26f96e5

alexcrichton · 2018-12-11T18:09:57Z

All things considered that's actually a really impressive benchmark. The top 4 regressions are probably just one or two missing #[inline] annotations

michaelwoerister · 2018-12-11T20:05:33Z

Yes, the medium sized crates are hardly affected. Pretty cool!

michaelwoerister · 2018-12-12T14:28:02Z

OK, so this gives us one good, real-world test case where incremental+ThinLTO gives roughly the same result as regular ThinLTO as far as the runtime performance of the generated code goes. We'll need more examples (@nnethercote mentioned that he might take a look at Firefox as another big, real-world test case) but if things keep looking the way they are looking now, we should think about making incremental compilation the default for optimized builds too because the speedups to be had here are substantial.

The following table compares compile times with and without incremental compilation for optimized builds (numbers taken from a random, recent perf.rlo run)

	non-incr	incr-base	incr-small-change	one-time cost	speed up
piston-image	7.82s	8.18s	1.45s	+5%	5.4x
webrender	20.64s	23.34s	4.63s	+13%	4.5x
ripgrep	8.04s	8.5s	1.82s	+6%	4.4x
encoding	1.5s	1.73s	0.43s	+15%	3.5x
futures	0.95s	1.16s	0.31s	+22%	3.1x
regex	4.63s	5.08s	2.02s	+10%	2.3x
clap-rs	25.37s	26.47s	13.32s	+4%	1.9x
sentry-cli	19.1s	21.06s	11.43s	+10%	1.7x
cargo	38.95s	43.63s	24.5s	+12%	1.6x
tokio-webpush-simple	4.21s	4.77s	3.09s	+13%	1.4x
helloworld	0.15s	0.16s	0.15s	+7%	1.0x

Initial compilation still, as expected, is slower with incr. comp. but subsequent compilation sessions can be much faster! We should really try to get this into the hands of end-users somehow. I mean, how often do we get a chance to make compilation up to five times faster? :D

If we had a debug-opt Cargo profile, this would be a no-brainer. Since we only have release and debug, it's a bit more complicated. Actually released code should still be compiled non-incrementally, with one codegen unit, and possibly with whole-crate-graph LTO enabled.

cc @rust-lang/core @rust-lang/cargo @rust-lang/wg-compiler-performance

alexcrichton · 2018-12-12T15:54:04Z

Those indeed are some compelling numbers, and I could get behind making incremental release mode the default! I think we could even go further in release mode and turn on "infinite codegen units" by default as well, as that wouldn't have the one-time cost of incremental but would benefit compiling all crates.io crates in release mode which aren't compiled incrementally (and may help eat the one time cost of the local project being incremental!)

I would personally be in favor of pursuing an aggressive strategy of "simply turn on incremental by default in release mode". There's always going to be some configuration which eeks out a few percentage points of performance here and there (LTO, PGO, one CGU, implementing a mode in rustc that just reruns all LLVM passes, etc, etc). What I think this is very convincingly showing is that there's no major regressions with incrementally compiled code in release mode and it's still very competitive.

In terms of next steps I think we'd probably want to do a call on internals like we initially did for ThinLTO, basically having a very short set of instructions for how to test and ask folks to post various numbers for various projects. I suspect CARGO_INCREMENTAL=1 cargo build --release is enough to test this, but I haven't tested!

michaelwoerister · 2018-12-12T16:01:43Z

In terms of next steps I think we'd probably want to do a call on internals like we initially did for ThinLTO, basically having a very short set of instructions for how to test and ask folks to post various numbers for various projects.

Heh, you mean like this one? :)
Maybe the instructions were too complicated.

alexcrichton · 2018-12-12T16:04:07Z

Ha yes I mean exactly that one!

We may want to try messaging yeah with an easy-to-run thing at the top, and then perhaps also message simultaneously say that we're considering turning this on by default so it's important to test!

Mark-Simulacrum · 2018-12-12T16:44:31Z

FWIW, I've been running with CARGO_INCREMENTAL=1 on all my smaller Rust projects and have never noticed any slowdowns either -- including across all the Advent of Code benchmarks that I've run (so tight-loop code).

I think we may be able to just go ahead and do this without any call to action, and see if people come complaining. I believe the Cargo team is working long-term on a reworking of profiles which would help with a true "release" build here -- though at least in my experience incremental release builds today are more than fast enough.

Eh2406 · 2018-12-12T17:56:42Z

As I said in internals, we have lolbench now. We have a way to accurately measure the change on performance of the compiled output over a large number benchmarks from real world code. (I don't have access to a Computer I can dedicate to running this experiment or I would do it myself.) We don't intentionally make breaking changes without asking crater. We can do the same for the performance of compiled output!

michaelwoerister · 2018-12-12T18:08:49Z

A simple way to test this via lolbench would be to make incr. comp. the default. We are early in the cycle, after all, and can revert if performance takes too much of a hit. I don't know how much work a local lolbench run would be but it sounded like it wouldn't work out of the box.

XAMPPRocky · 2019-01-11T13:21:40Z

Triage; @michaelwoerister Hello, have you been able to get back to this PR?

michaelwoerister · 2019-01-14T13:28:42Z

This PR was just an experiment and we got some good numbers out of it. Once beta has branched off, we could look into making incr. comp. the default for release builds temporarily in order to get some numbers via lolbench.

@anp

…, r=alexcrichton Make incremental compilation the default for all profiles. This PR makes incremental compilation the default for all profiles, that is, also `release` and `bench`. `rustc` performs ThinLTO by default for incremental release builds for a while now and the [data we've gathered so far](rust-lang/rust#56678) indicates that the generated binaries exhibit roughly the same runtime performance as non-incrementally compiled ones. At the same time, incremental release builds can be 2-5 times as fast as non-incremental ones. Making incremental compilation (temporarily) the default in `cargo` would be a simple way of gathering more data about runtime performance via [lolbench.rs](https://lolbench.rs). If the results look acceptable, we can just leave it on and give a massive compile time reduction to everyone. If not, we can revert the change and think about a plan B. This strategy assumes that lolbench will actually use the nightly `cargo` version. Is that true, @anp? r? @alexcrichton

Build rustc with incr. ThinLTO so we can benchmark it.

208b9a7

rust-highfive assigned alexcrichton Dec 10, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 10, 2018

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 10, 2018

michaelwoerister closed this Jan 14, 2019

michaelwoerister mentioned this pull request Jan 18, 2019

Make incremental compilation the default for all profiles. rust-lang/cargo#6564

Merged

michaelwoerister mentioned this pull request Jan 29, 2019

Tracking Issue for making incremental compilation the default for Release Builds #57968

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[experiment] Benchmark incremental ThinLTO'd compiler. #56678

[experiment] Benchmark incremental ThinLTO'd compiler. #56678

michaelwoerister commented Dec 10, 2018

michaelwoerister commented Dec 10, 2018

rust-highfive commented Dec 10, 2018

bors commented Dec 10, 2018

rust-highfive commented Dec 10, 2018

bors commented Dec 10, 2018

alexcrichton commented Dec 10, 2018

bors commented Dec 10, 2018

bors commented Dec 10, 2018

alexcrichton commented Dec 10, 2018

michaelwoerister commented Dec 11, 2018

bors commented Dec 11, 2018

bors commented Dec 11, 2018

michaelwoerister commented Dec 11, 2018

rust-timer commented Dec 11, 2018

rust-timer commented Dec 11, 2018

alexcrichton commented Dec 11, 2018

michaelwoerister commented Dec 11, 2018

michaelwoerister commented Dec 12, 2018

alexcrichton commented Dec 12, 2018

michaelwoerister commented Dec 12, 2018

alexcrichton commented Dec 12, 2018

Mark-Simulacrum commented Dec 12, 2018

Eh2406 commented Dec 12, 2018

michaelwoerister commented Dec 12, 2018

XAMPPRocky commented Jan 11, 2019

michaelwoerister commented Jan 14, 2019

[experiment] Benchmark incremental ThinLTO'd compiler. #56678

[experiment] Benchmark incremental ThinLTO'd compiler. #56678

Conversation

michaelwoerister commented Dec 10, 2018

michaelwoerister commented Dec 10, 2018

rust-highfive commented Dec 10, 2018

bors commented Dec 10, 2018

rust-highfive commented Dec 10, 2018

bors commented Dec 10, 2018

alexcrichton commented Dec 10, 2018

bors commented Dec 10, 2018

bors commented Dec 10, 2018

alexcrichton commented Dec 10, 2018

michaelwoerister commented Dec 11, 2018

bors commented Dec 11, 2018

bors commented Dec 11, 2018

michaelwoerister commented Dec 11, 2018

rust-timer commented Dec 11, 2018

rust-timer commented Dec 11, 2018

alexcrichton commented Dec 11, 2018

michaelwoerister commented Dec 11, 2018

michaelwoerister commented Dec 12, 2018

alexcrichton commented Dec 12, 2018

michaelwoerister commented Dec 12, 2018

alexcrichton commented Dec 12, 2018

Mark-Simulacrum commented Dec 12, 2018

Eh2406 commented Dec 12, 2018

michaelwoerister commented Dec 12, 2018

XAMPPRocky commented Jan 11, 2019

michaelwoerister commented Jan 14, 2019