Call FileEncoder::finish in rmeta encoding #117301

saethlin · 2023-10-28T01:35:24Z

The bug here was that rmeta encoding never called FileEncoder::finish. Now it does. Most of the changes here are needed to support that, since rmeta encoding wants to finish then access the File in the encoder, so finish can't move out.
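A minimal sketch of the "finish can't move out" point, assuming a hypothetical `BufferedEncoder` over any `Write + Seek` sink (this is not rustc's `FileEncoder`, and all names here are made up for illustration). Because `finish` takes `&mut self`, the caller can still reach the underlying sink afterwards to seek back and patch a header:

```rust
use std::io::{self, Cursor, Seek, SeekFrom, Write};

/// Hypothetical stand-in for a buffered encoder; not rustc's `FileEncoder`.
struct BufferedEncoder<W: Write + Seek> {
    sink: W,
    buf: Vec<u8>,
    flushed: usize,
}

impl<W: Write + Seek> BufferedEncoder<W> {
    fn new(sink: W) -> Self {
        Self { sink, buf: Vec::new(), flushed: 0 }
    }

    /// Buffers bytes in memory; nothing reaches the sink yet.
    fn emit(&mut self, bytes: &[u8]) {
        self.buf.extend_from_slice(bytes);
    }

    /// Logical position: bytes already flushed plus bytes still buffered.
    fn position(&self) -> usize {
        self.flushed + self.buf.len()
    }

    /// Flushes buffered bytes and reports any I/O error.
    /// Takes `&mut self`, so the encoder (and its sink) stay usable afterwards.
    fn finish(&mut self) -> io::Result<usize> {
        self.sink.write_all(&self.buf)?;
        self.flushed += self.buf.len();
        self.buf.clear();
        self.sink.flush()?;
        Ok(self.flushed)
    }

    fn sink_mut(&mut self) -> &mut W {
        &mut self.sink
    }
}

/// Writes a 4-byte placeholder, the payload, then patches the placeholder
/// with the payload's offset once everything has been flushed.
fn encode_with_header(payload: &[u8]) -> io::Result<Vec<u8>> {
    let mut enc = BufferedEncoder::new(Cursor::new(Vec::new()));
    enc.emit(&[0u8; 4]); // placeholder for the header
    let payload_pos = enc.position() as u32;
    enc.emit(payload);
    enc.finish()?; // must happen before touching the sink directly

    let sink = enc.sink_mut();
    sink.seek(SeekFrom::Start(0))?;
    sink.write_all(&payload_pos.to_be_bytes())?;
    Ok(enc.sink.into_inner())
}
```

If `finish` consumed `self` instead, the seek-and-patch step would need the sink handed back some other way, which is roughly the constraint described above.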

I tried adding a cfg(debug_assertions) exploding Drop impl to FileEncoder that checked for finish being called before dropping, but fatal errors cause unwinding so this isn't really possible. If we encounter a fatal error with a dirty FileEncoder, the Drop impl ICEs even though the implementation is correct. If we try to paper over that by wrapping FileEncoder in ManuallyDrop then that just erases the fact that Drop automatically checks that we call finish on all paths.

I also changed the name of DepGraph::encode to DepGraph::finish_encoding, because that's what it does and it makes the fact that it is the path to FileEncoder::finish less confusing.

r? @WaffleLapkin

rustbot · 2023-10-28T01:35:27Z

Could not assign reviewer from: WaffleLapkin.
User(s) WaffleLapkin are either the PR author, already assigned, or on vacation, and there are no other candidates.
Use r? to specify someone else to assign.

rustbot · 2023-10-28T01:35:32Z

r? @TaKO8Ki

(rustbot has picked a reviewer for you, use r? to override)

Mark-Simulacrum · 2023-10-28T02:10:01Z

> I tried adding a cfg(debug_assertions) exploding Drop impl to FileEncoder that checked for finish being called before dropping, but fatal errors cause unwinding so this isn't really possible. If we encounter a fatal error with a dirty FileEncoder, the Drop impl ICEs even though the implementation is correct.

Can't we guard the assertion in Drop with std::thread::panicking, so that a fatal error just skips over the explosion?
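A minimal sketch of that suggestion, using a hypothetical `CheckedEncoder` (not the real `FileEncoder`; the names and the "dirty means non-empty buffer" rule are assumptions for illustration). The `Drop` impl only asserts when the thread is not already unwinding, so a simulated fatal error does not turn into a second panic:

```rust
use std::thread;

/// Hypothetical encoder that must be finished before it is dropped.
struct CheckedEncoder {
    buffered: Vec<u8>,
}

impl CheckedEncoder {
    fn new() -> Self {
        Self { buffered: Vec::new() }
    }

    /// Buffers bytes; the encoder is "dirty" until `finish` is called.
    fn emit(&mut self, bytes: &[u8]) {
        self.buffered.extend_from_slice(bytes);
    }

    /// Pretends to flush and returns how many bytes were pending.
    fn finish(&mut self) -> usize {
        let n = self.buffered.len();
        self.buffered.clear();
        n
    }
}

impl Drop for CheckedEncoder {
    fn drop(&mut self) {
        // Skip the check while unwinding: a panic is already in flight,
        // and panicking again inside `drop` would abort the process.
        if !thread::panicking() {
            debug_assert!(
                self.buffered.is_empty(),
                "CheckedEncoder dropped without calling finish"
            );
        }
    }
}
```

With this guard, dropping a dirty encoder on a normal path still trips the assertion in debug builds, while dropping one during unwinding is silently allowed.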

saethlin · 2023-10-28T03:02:07Z

Yep I think that can be done. I'll push a change that does that in a bit.

rustbot · 2023-10-28T05:29:44Z

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

compiler/rustc_interface/src/queries.rs

bors · 2023-10-30T21:44:53Z

☔ The latest upstream changes (presumably #117405) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2023-11-17T19:32:00Z

☔ The latest upstream changes (presumably #117993) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2023-11-22T14:47:06Z

☔ The latest upstream changes (presumably #118086) made this pull request unmergeable. Please resolve the merge conflicts.

WaffleLapkin · 2023-11-26T14:26:57Z

compiler/rustc_metadata/src/rmeta/encoder.rs

-    file.write_all(&[(pos >> 24) as u8, (pos >> 16) as u8, (pos >> 8) as u8, (pos >> 0) as u8])
-        .unwrap_or_else(|err| tcx.sess.emit_fatal(FailWriteFile { err }));
+    file.seek(std::io::SeekFrom::Start(header as u64))?;
+    file.write_all(&[(pos >> 24) as u8, (pos >> 16) as u8, (pos >> 8) as u8, (pos >> 0) as u8])?;


Unrelated to the PR but why is this not (pos as u32).to_be_bytes()? D:

saethlin · 2023-11-26

This code is very old. It could probably do with being totally rewritten, and I didn't even notice that this is another u32 position, ouch.
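For illustration, a small sketch comparing the manual shifts from the diff with the `to_be_bytes` form suggested above. The function names are hypothetical, and `pos` is typed as `u64` here as an assumption so the example behaves the same on every target:

```rust
/// Big-endian low 32 bits via manual shifts, mirroring the shape of the diff.
/// Each `as u8` silently discards everything above the low byte.
fn root_pos_bytes_manual(pos: u64) -> [u8; 4] {
    [(pos >> 24) as u8, (pos >> 16) as u8, (pos >> 8) as u8, pos as u8]
}

/// The same bytes, with the truncation to 32 bits spelled out in one place.
fn root_pos_bytes_explicit(pos: u64) -> [u8; 4] {
    (pos as u32).to_be_bytes()
}
```

Both functions produce identical output, including for positions that do not fit in 32 bits; the second form just makes the `u32` truncation visible at a glance.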

WaffleLapkin · 2023-11-26T14:30:01Z

r? WaffleLapkin
@bors r+

bors · 2023-11-26T14:30:03Z

📌 Commit fbaa24e has been approved by WaffleLapkin

It is now in the queue for this repository.

bors · 2023-11-26T14:43:06Z

⌛ Testing commit fbaa24e with merge 3dbb4da...

bors · 2023-11-26T16:41:26Z

☀️ Test successful - checks-actions
Approved by: WaffleLapkin
Pushing 3dbb4da to master...

rust-timer · 2023-11-26T18:44:28Z

Finished benchmarking commit (3dbb4da): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.3%	[1.2%, 1.7%]	3
Improvements ✅ (primary)	-2.1%	[-2.1%, -2.1%]	1
Improvements ✅ (secondary)	-4.1%	[-4.7%, -3.7%]	3
All ❌✅ (primary)	-2.1%	[-2.1%, -2.1%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.5%	[-2.5%, -2.5%]	1
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 674.277s -> 674.133s (-0.02%)
Artifact size: 313.37 MiB -> 313.38 MiB (0.00%)

Use a u64 for the rmeta root position

Waffle noticed this in rust-lang/rust#117301 (comment)

We've upgraded the other file offsets to u64, and this one only costs 4 bytes per file. Also the way the truncation was being done before was extremely easy to miss, I sure missed it!

It's not clear to me if not having this change effectively made the other upgrades from u32 to u64 ineffective, but we can have it now.

r? `@WaffleLapkin`
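A rough sketch of why the header width matters, with hypothetical helper names and big-endian byte order assumed for both widths (the actual rmeta layout may differ). A position beyond `u32::MAX` does not survive a 4-byte header but round-trips through an 8-byte one:

```rust
/// 4-byte header: positions above u32::MAX are truncated on encode.
fn encode_root_pos_u32(pos: u64) -> [u8; 4] {
    (pos as u32).to_be_bytes()
}

fn decode_root_pos_u32(bytes: [u8; 4]) -> u64 {
    u32::from_be_bytes(bytes) as u64
}

/// 8-byte header: costs 4 more bytes per file, but keeps the full position.
fn encode_root_pos_u64(pos: u64) -> [u8; 8] {
    pos.to_be_bytes()
}

fn decode_root_pos_u64(bytes: [u8; 8]) -> u64 {
    u64::from_be_bytes(bytes)
}
```

For example, a position of 5,000,000,000 comes back as 705,032,704 through the 4-byte pair, and unchanged through the 8-byte pair.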

Rollup merge of rust-lang#119510 - saethlin:fatal-io-errors, r=WaffleLapkin,Nilstrieb

Report I/O errors from rmeta encoding with emit_fatal

rust-lang#119456 reminded me that I never did systematic testing to provoke the out-of-disk ICEs so I grepped through a recent crater run (rust-lang#119440 (comment)) for more out-of-disk ICEs on current master and yep there's 2 in there.

So I finally cooked up a way to provoke these crashes. I wrote a little `cdylib` crate that has a `#[no_mangle] pub extern "C" fn write` which occasionally reports `ENOSPC`, and prints a backtrace when it does.

<details><summary>code for the dylib</summary>

```rust
// cargo add libc rand backtrace
use rand::Rng;

#[no_mangle]
pub extern "C" fn write(
    fd: libc::c_int,
    buf: *const libc::c_void,
    count: libc::size_t,
) -> libc::ssize_t {
    if fd > 2 && rand::thread_rng().gen::<u8>() == 0 {
        let mut count = 0;
        backtrace::trace(|frame| {
            backtrace::resolve_frame(frame, |symbol| {
                if let Some(name) = symbol.name() {
                    if count > 3 {
                        eprintln!("{}", name);
                    }
                }
                count += 1;
            });
            true
        });
        unsafe {
            *libc::__errno_location() = libc::ENOSPC;
        }
        return -1;
    } else {
        unsafe {
            let res =
                libc::syscall(libc::SYS_write, fd as usize, buf as usize, count as usize) as isize;
            if res < 0 {
                *libc::__errno_location() = -res as i32;
                -1
            } else {
                res
            }
        }
    }
}
```

</details>

Then `LD_PRELOAD` that dylib and repeatedly build a big project until it ICEs, such as with this:

```bash
while true; do
    cargo clean
    LD_PRELOAD=/home/ben/evil/target/release/libevil.so cargo +stage1 check 2> errors
    if grep "thread 'rustc' panicked" errors; then
        break
    fi
done
```

My "big project" for testing was an otherwise-empty project with `cargo add axum`.

Before this PR, the above procedure finds a crash in between 1 and 15 minutes. With this PR, I have not found a crash in 30 minutes, and I'll be leaving this to run overnight (starting now). (A night has now passed, no crashes were found)

I believe the problem is that even though since rust-lang#117301 we correctly check `FileEncoder` for errors on all paths, we use `emit_err`, so there is a window of time between the call to `emit_err` and the full error reporting where rustc believes it has emitted a valid rmeta file and will permit Cargo to launch a build for a dependent crate. Changing these calls to `emit_fatal` closes that window.

I think there are a number of other cases where `emit_err` has been used instead of the more-correct `emit_fatal` such as https://github.com/rust-lang/rust/blob/e51e98dde6a60637b6a71b8105245b629ac3fe77/compiler/rustc_codegen_ssa/src/back/write.rs#L542 but unlike rmeta encoding I am not aware of those cases causing problems.

r? `@WaffleLapkin`
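To make the `emit_err` versus `emit_fatal` window concrete, here is a toy model. Everything in it is hypothetical (the `Diagnostics` type, `write_metadata`, and the `ready` flag are invented for illustration and are not rustc's diagnostics API); it only shows the general shape of "record and keep going" versus "record and unwind immediately":

```rust
use std::panic;

/// Hypothetical diagnostic sink; not rustc's real diagnostics API.
#[derive(Default)]
struct Diagnostics {
    errors: Vec<String>,
}

impl Diagnostics {
    /// Records the error and returns, so the caller keeps running.
    fn emit_err(&mut self, msg: &str) {
        self.errors.push(msg.to_string());
    }

    /// Records the error and unwinds; control never returns to the caller.
    fn emit_fatal(&mut self, msg: &str) -> ! {
        self.errors.push(msg.to_string());
        panic::resume_unwind(Box::new("fatal error"));
    }
}

/// Toy model of "write metadata, then signal that it is ready".
/// `ready` stands in for whatever lets a dependent build start.
fn write_metadata(diag: &mut Diagnostics, write_ok: bool, fatal: bool, ready: &mut bool) {
    if !write_ok {
        if fatal {
            diag.emit_fatal("failed to write metadata");
        }
        diag.emit_err("failed to write metadata");
    }
    // With the non-fatal path, this line still runs after a failed write.
    *ready = true;
}
```

In this model, a failed write reported through `emit_err` still sets `ready`, while the `emit_fatal` path unwinds before the flag is touched.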


rustbot assigned TaKO8Ki Oct 28, 2023

saethlin force-pushed the finish-rmeta-encoding branch from e33d313 to 80648b8 on October 28, 2023 05:29

bjorn3 reviewed Oct 28, 2023

compiler/rustc_interface/src/queries.rs

saethlin force-pushed the finish-rmeta-encoding branch from fa7023d to df1e12f on November 10, 2023 20:25

saethlin force-pushed the finish-rmeta-encoding branch from df1e12f to 9427d5c on November 18, 2023 04:17

Call FileEncoder::finish in rmeta encoding

fbaa24e

saethlin force-pushed the finish-rmeta-encoding branch from 9427d5c to fbaa24e on November 23, 2023 03:49

WaffleLapkin approved these changes Nov 26, 2023

rustbot assigned WaffleLapkin and unassigned TaKO8Ki Nov 26, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 26, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 26, 2023

bors merged commit 3dbb4da into rust-lang:master Nov 26, 2023
12 checks passed

rustbot added this to the 1.76.0 milestone Nov 26, 2023

bors mentioned this pull request Nov 26, 2023

Eagerly compute output_filenames #117584

Merged

saethlin deleted the finish-rmeta-encoding branch November 26, 2023 18:54

saethlin mentioned this pull request Nov 27, 2023

Use a u64 for the rmeta root position #118344

Merged

This was referenced Dec 31, 2023

range start index 351645 out of range for slice of length 16384 getting the resolver for lowering #119456

Closed

Report I/O errors from rmeta encoding with emit_fatal #119510

Merged

ICE: MemDecoder exhausted #119511

Open

This was referenced Feb 11, 2024

ICE: index out of bounds #120554

Closed

rustc panic: "range start index 274090 out of range for slice of length 266240" #120065

Closed
