[perf experiment] Enable overflow checks for not-std #119440

Noratrieb · 2023-12-30T13:46:15Z

r? @ghost

Noratrieb · 2023-12-30T13:46:27Z

@bors try @rust-timer queue

bors · 2023-12-30T13:51:44Z

⌛ Trying commit e94ce10 with merge fe8f664...

[perf experiment] Enable overflow checks for not-std r? `@ghost`

bors · 2023-12-30T15:17:09Z

☀️ Try build successful - checks-actions
Build commit: fe8f664 (fe8f664b41f030f307cfeb6cb8c3a1419292aeed)

rust-timer · 2023-12-30T21:36:56Z

Finished benchmarking commit (fe8f664): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.1% | [0.4%, 4.6%] | 201 |
| Regressions ❌ (secondary) | 2.2% | [0.5%, 17.4%] | 188 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -4.0% | [-4.5%, -3.6%] | 6 |
| All ❌✅ (primary) | 2.1% | [0.4%, 4.6%] | 201 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.5% | [1.1%, 2.0%] | 3 |
| Regressions ❌ (secondary) | 1.6% | [0.7%, 3.4%] | 8 |
| Improvements ✅ (primary) | -3.2% | [-3.2%, -3.2%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [-3.2%, 2.0%] | 4 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.5%, 0.5%] | 1 |
| Regressions ❌ (secondary) | 3.1% | [2.1%, 4.9%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [0.5%, 0.5%] | 1 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 668.324s -> 670.218s (0.28%)
Artifact size: 311.76 MiB -> 315.42 MiB (1.18%)

Kobzol · 2023-12-30T21:52:53Z

While the instruction counts look scary, cycles are almost unmoved, and bootstrap time barely regressed at all. It does make the compiler library ~3 MiB larger, though, which is interesting.

Noratrieb · 2023-12-30T22:11:42Z

An overflow check is an extra instruction, but also an extremely well-predicted branch, so I think the gap between instruction counts and cycles makes sense. We haven't had many bugs caught by overflow checks (though to be fair, they've never been in dist, so we've only partially turned the lights on in matthias' fuzzing), but I don't think we should enable them with these results. If we could get the regressions down with a few well-placed `wrapping_*` calls, it might be worth it.
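To make the trade-off concrete, here is a minimal sketch (illustrative only, not code from rustc) of three ways to write the same increment. With overflow checks enabled, plain `+` carries a branch to a panic; `wrapping_add` never carries that branch; `checked_add` makes the check explicit regardless of build flags. The function names are hypothetical.

```rust
/// Plain `+`: panics on overflow when overflow checks are enabled,
/// wraps silently when they are disabled.
fn bump_plain(counter: u32) -> u32 {
    counter + 1
}

/// `wrapping_add`: never panics, independent of how the crate is compiled.
fn bump_wrapping(counter: u32) -> u32 {
    counter.wrapping_add(1)
}

/// `checked_add`: the check is explicit and always present.
fn bump_checked(counter: u32) -> Option<u32> {
    counter.checked_add(1)
}

fn main() {
    println!("{}", bump_plain(41));
    println!("{}", bump_wrapping(u32::MAX));
    println!("{:?}", bump_checked(u32::MAX));
}
```

Swapping `+` for `wrapping_add` is only appropriate where wrap-around is actually the intended behaviour; elsewhere it would hide the very bugs the checks are meant to find.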

Kobzol · 2023-12-30T22:13:44Z

We could also just enable them on some selected CI runners (assuming that we don't do that already).

Noratrieb · 2023-12-30T22:16:01Z

We already have debug assertions CI. @saethlin suggested a crater run here, which might reveal something interesting.
@craterbot build

craterbot · 2023-12-30T22:16:03Z

🚨 Error: failed to parse the command

🆘 If you have any trouble with Crater please ping @rust-lang/infra!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

Noratrieb · 2023-12-30T22:18:33Z

@craterbot run mode=build-only

craterbot · 2023-12-30T22:18:45Z

👌 Experiment pr-119440 created and queued.
🤖 Automatically detected try build fe8f664
🔍 You can check out the queue and this experiment's details.

craterbot · 2023-12-30T22:19:00Z

🚧 Experiment pr-119440 is now running

craterbot · 2024-01-01T16:38:56Z

🎉 Experiment pr-119440 is completed!
📊 55 regressed and 2 fixed (403820 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!

saethlin · 2024-01-01T16:47:30Z

The run found a bug: https://crater-reports.s3.amazonaws.com/pr-119440/try%23fe8f664b41f030f307cfeb6cb8c3a1419292aeed/reg/dialectic-macro-0.1.0/log.txt

saethlin · 2024-01-01T16:49:14Z

Ah, the problem is that Parser only uses a u16 for the number of angle brackets we've seen.
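A minimal sketch of that failure mode, using a hypothetical `AngleBracketCounter` rather than the real `Parser` field: a `u16` can hold at most 65,535, so the 65,536th increment overflows. The sketch uses `checked_add` so that it behaves the same with or without overflow checks.

```rust
/// Hypothetical stand-in for a parser that tracks angle brackets in a `u16`.
struct AngleBracketCounter {
    seen: u16,
}

impl AngleBracketCounter {
    fn new() -> Self {
        Self { seen: 0 }
    }

    /// Records one `<`. Returns `Err` instead of overflowing the counter.
    fn see_open(&mut self) -> Result<(), &'static str> {
        self.seen = self
            .seen
            .checked_add(1)
            .ok_or("angle bracket count exceeds u16::MAX")?;
        Ok(())
    }
}

/// Feeds `n` opening brackets and returns the final count, or the first error.
fn count_opens(n: u32) -> Result<u16, &'static str> {
    let mut counter = AngleBracketCounter::new();
    for _ in 0..n {
        counter.see_open()?;
    }
    Ok(counter.seen)
}

fn main() {
    println!("{:?}", count_opens(65_535));
    println!("{:?}", count_opens(65_536));
}
```

With a plain `self.seen += 1` instead, the last increment would wrap to 0 in a build without overflow checks and panic in a build with them, which is how a run like this one surfaces the bug.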

Noratrieb · 2024-01-01T18:58:39Z

@craterbot check p=1 crates=https://crater-reports.s3.amazonaws.com/pr-119440/retry-regressed-list.txt

craterbot · 2024-01-01T18:58:46Z

👌 Experiment pr-119440-1 created and queued.
🤖 Automatically detected try build fe8f664
🔍 You can check out the queue and this experiment's details.

Noratrieb · 2024-01-01T23:26:23Z

@bors try @rust-timer queue

bors · 2024-01-01T23:27:33Z

⌛ Trying commit 80ea420 with merge 5cbbfe2...

[perf experiment] Enable overflow checks for not-std r? `@ghost`

bors · 2024-01-01T23:34:11Z

💔 Test failed - checks-actions

Noratrieb · 2024-01-02T13:44:14Z

@bors try @rust-timer queue

bors · 2024-01-02T13:45:24Z

⌛ Trying commit c61310f with merge 69fcfab...

[perf experiment] Enable overflow checks for not-std r? `@ghost`

bors · 2024-01-02T15:11:20Z

☀️ Try build successful - checks-actions
Build commit: 69fcfab (69fcfab6febdbff1afb41a6417d0fac3a8e0abb1)

rust-timer · 2024-01-02T16:31:36Z

Finished benchmarking commit (69fcfab): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.8% | [0.2%, 1.9%] | 165 |
| Regressions ❌ (secondary) | 1.0% | [0.2%, 2.6%] | 138 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.8% | [-4.7%, -0.3%] | 7 |
| All ❌✅ (primary) | 0.8% | [0.2%, 1.9%] | 165 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.9% | [2.7%, 3.0%] | 2 |
| Regressions ❌ (secondary) | 2.7% | [0.8%, 5.9%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.9% | [2.7%, 3.0%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [1.0%, 1.0%] | 1 |
| Regressions ❌ (secondary) | 2.7% | [2.1%, 3.7%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.5% | [-4.3%, -2.6%] | 6 |
| All ❌✅ (primary) | 1.0% | [1.0%, 1.0%] | 1 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 666.95s -> 669.971s (0.45%)
Artifact size: 311.74 MiB -> 315.41 MiB (1.18%)

Rollup merge of rust-lang#119510 - saethlin:fatal-io-errors, r=WaffleLapkin,Nilstrieb

Report I/O errors from rmeta encoding with emit_fatal

rust-lang#119456 reminded me that I never did systematic testing to provoke the out-of-disk ICEs, so I grepped through a recent crater run (rust-lang#119440 (comment)) for more out-of-disk ICEs on current master, and yep, there's 2 in there.

So I finally cooked up a way to provoke these crashes. I wrote a little `cdylib` crate that has a `#[no_mangle] pub extern "C" fn write` which occasionally reports `ENOSPC`, and prints a backtrace when it does.

<details><summary>code for the dylib</summary>

```rust
// cargo add libc rand backtrace
use rand::Rng;

#[no_mangle]
pub extern "C" fn write(
    fd: libc::c_int,
    buf: *const libc::c_void,
    count: libc::size_t,
) -> libc::ssize_t {
    if fd > 2 && rand::thread_rng().gen::<u8>() == 0 {
        let mut count = 0;
        backtrace::trace(|frame| {
            backtrace::resolve_frame(frame, |symbol| {
                if let Some(name) = symbol.name() {
                    if count > 3 {
                        eprintln!("{}", name);
                    }
                }
                count += 1;
            });
            true
        });
        unsafe {
            *libc::__errno_location() = libc::ENOSPC;
        }
        return -1;
    } else {
        unsafe {
            let res =
                libc::syscall(libc::SYS_write, fd as usize, buf as usize, count as usize) as isize;
            if res < 0 {
                *libc::__errno_location() = -res as i32;
                -1
            } else {
                res
            }
        }
    }
}
```

</details>

Then `LD_PRELOAD` that dylib and repeatedly build a big project until it ICEs, such as with this:

```bash
while true; do
    cargo clean
    LD_PRELOAD=/home/ben/evil/target/release/libevil.so cargo +stage1 check 2> errors
    if grep "thread 'rustc' panicked" errors; then
        break
    fi
done
```

My "big project" for testing was an otherwise-empty project with `cargo add axum`.

Before this PR, the above procedure finds a crash in between 1 and 15 minutes. With this PR, I have not found a crash in 30 minutes, and I'll be leaving this to run overnight (starting now). (A night has now passed, no crashes were found)

I believe the problem is that even though since rust-lang#117301 we correctly check `FileEncoder` for errors on all paths, we use `emit_err`, so there is a window of time between the call to `emit_err` and the full error reporting where rustc believes it has emitted a valid rmeta file and will permit Cargo to launch a build for a dependent crate. Changing these calls to `emit_fatal` closes that window.

I think there are a number of other cases where `emit_err` has been used instead of the more-correct `emit_fatal`, such as https://github.com/rust-lang/rust/blob/e51e98dde6a60637b6a71b8105245b629ac3fe77/compiler/rustc_codegen_ssa/src/back/write.rs#L542, but unlike rmeta encoding I am not aware of those cases causing problems.

r? `@WaffleLapkin`
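As a rough aid for reading the fault-injection shim: `gen::<u8>() == 0` is true for one of 256 equally likely values, so each intercepted `write` on a descriptor above 2 fails with probability 1/256. The sketch below (the function name is hypothetical, and it assumes the draws are independent) computes the chance that at least one of `n` such writes is failed.

```rust
/// Probability that at least one of `n` intercepted writes is failed,
/// assuming each fails independently with probability 1/256.
fn prob_any_failure(n: u32) -> f64 {
    let p_ok = 255.0_f64 / 256.0;
    1.0 - p_ok.powi(n as i32)
}

fn main() {
    for n in [1_u32, 256, 1024] {
        println!("{n} writes: {:.3}", prob_any_failure(n));
    }
}
```

This only models how often the shim injects `ENOSPC`; whether an injected failure turns into an ICE still depends on the timing window described above.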

Noratrieb · 2024-02-13T19:26:08Z

hm, maybe i should propose merging this, lol. it's perf neutral in cycles

…esleywiser

Add add/sub methods that only panic with debug assertions to rustc

This mitigates the perf impact of enabling overflow checks on rustc. The change to use overflow checks will be done in a later PR.

For rust-lang/compiler-team#724, based on data gathered in rust-lang#119440.
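A minimal sketch of what "add/sub methods that only panic with debug assertions" could look like. The trait and method names below are hypothetical and are not taken from the referenced PR; the idea is that hot paths opt out of the release-mode overflow branch while still being checked in debug-assertion builds.

```rust
/// Hypothetical helper trait: overflow is detected only when debug
/// assertions are enabled; otherwise the operation wraps without a check.
trait DebugCheckedArith: Sized {
    fn debug_checked_add(self, rhs: Self) -> Self;
    fn debug_checked_sub(self, rhs: Self) -> Self;
}

impl DebugCheckedArith for u32 {
    #[inline]
    fn debug_checked_add(self, rhs: Self) -> Self {
        if cfg!(debug_assertions) {
            // Debug-assertion builds: panic on overflow.
            self.checked_add(rhs).expect("overflow in debug_checked_add")
        } else {
            // Other builds: no overflow branch, wrap on overflow.
            self.wrapping_add(rhs)
        }
    }

    #[inline]
    fn debug_checked_sub(self, rhs: Self) -> Self {
        if cfg!(debug_assertions) {
            self.checked_sub(rhs).expect("overflow in debug_checked_sub")
        } else {
            self.wrapping_sub(rhs)
        }
    }
}

fn main() {
    println!("{}", 2_u32.debug_checked_add(3));
    println!("{}", 5_u32.debug_checked_sub(3));
}
```

Because `cfg!(debug_assertions)` is a compile-time constant, the unused arm is expected to be optimized away, so the non-debug path should cost the same as a plain wrapping operation.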

Enable overflow checks for not-std

e94ce10

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-bootstrap (Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)) labels on Dec 30, 2023