Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not allocate or unwind after fork #81858

Merged
merged 14 commits into from
May 16, 2021
Merged

Conversation

ijackson
Copy link
Contributor

@ijackson ijackson commented Feb 7, 2021

Objective scenarios

  • Make (simple) panics safe in Command::pre_exec_hook, including most panic! calls, Option::unwrap, and array bounds check failures.
  • Make it possible to libc::fork and then safely panic in the child (needed for the above, but this requirement means exposing the new raw hook API which the Command implementation needs).
  • In singlethreaded programs, where panic in pre_exec_hook is already memory-safe, prevent the double-unwinding malfunction panic! in Command child forked from non-main thread results in exit status 0 #79740.

I think we want to make panic after fork safe even though the post-fork child environment is only experienced by users of unsafe, beause the subset of Rust in which any panic is UB is really far too hazardous and unnatural.

Approach

  • Provide a way for a program to, at runtime, switch to having panics abort. This makes it possible to panic without making any heap allocations, which is needed because on some platforms malloc is UB in a child forked from a multithreaded program (see std::process::unix: Command: Do not unwind past fork(), in child #80263 (comment), and maybe also the SuS spec).
  • Make that change in the child spawned by Command.
  • Document the rules comprehensively enough that a programmer has a fighting chance of writing correct code.
  • Test that this all works as expected (and in particular, that there aren't any heap allocations we missed)

Fixes #79740

Rejected (or previously attempted) approaches

  • Change the panic machinery to be able to unwind without allocating, at least when the payload and message are both 'static. This seems like it would be even more subtle. Also that is a potentially-hot path which I don't want to mess with.
  • Change the existing panic hook mechanism to not convert the message to a String before calling the hook. This would be a surprising change for existing code and would not be detected by the type system.
  • Provide a raw_panic_hook function to intercept panics in a way that doesn't allocate. (That was an earlier version of this MR.)

History

This MR could be considered a v2 of #80263. Thanks to everyone who commented there. In particular, thanks to @m-ou-se, @Mark-Simulacrum and @hyd-dev. (Tagging you since I think you might be interested in this new MR.) Compared to #80263, this MR has very substantial changes and additions.

Additionally, I have recently (2021-04-20) completely revised this series following very helpful comments from @m-ou-se.

r? @m-ou-se

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 7, 2021
Copy link
Member

@the8472 the8472 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this also work for vfork?

library/std/src/sys/unix/ext/process.rs Outdated Show resolved Hide resolved
library/std/src/panicking.rs Outdated Show resolved Hide resolved
library/std/src/sys/unix/ext/process.rs Outdated Show resolved Hide resolved
@ijackson
Copy link
Contributor Author

ijackson commented Feb 7, 2021

Would this also work for vfork?

vfork is more tricky, because anything you mutate in the child stays mutated (or may do - some people provide a vfork which is just fork...)

I think allocation is still potentially unsafe because you may deadlock with your suspended parent, and the thread ids or TLS etc. may make no sense to your allocator. So you still want to be able to avoid allocation and this MR achieves that.

There's the additional complication that with vfork, setting the panic hook affects the parent too. You'd have to Do Something about that. Maybe check the pid in your hook.

So I think this MR makes it easier to DTRT for vfork, but I haven't thought about it thoroughly. In particular I'm not sure if even after this MR, it is possible to arrange for it to be safe to panic after vfork.

@bors

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@ijackson
Copy link
Contributor Author

Rebased onto master and fixed the trivial merge conflict.

@crlf0710 crlf0710 added the O-unix Operating system: Unix-like label Mar 5, 2021
@bors

This comment has been minimized.

@ijackson ijackson force-pushed the fork-no-unwind branch 2 times, most recently from fca912b to 139801d Compare March 10, 2021 12:30
@ijackson
Copy link
Contributor Author

Rebased onto master and fixed the trivial merge conflict.

@bors

This comment has been minimized.

Copy link
Member

@m-ou-se m-ou-se left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds quite a large API surface area for a very niche issue. It'd be great to make panicking after fork safer, but I'm not sure if it's worth it to add so many new things for it. Maybe there's a simpler way that would be good enough?

A big part of the api (the raw_hook) is only there to avoid the allocation of a String after a panic!() with formatting arguments, right? The alternative would be to only have this work for panic!("literal without arguments"), which does not allocate before the hook is executed. That won't work on Result::unwrap() though. (But it would for Option::unwrap().)

Not sure how/if we could do this, but maybe we could instead make a way to fully disable the global allocator. Replace it with one that won't (de)allocate anything anymore. That would prevent a lot more misusages in pre_exec, and we could have the panic handler skip allocating the String before calling the hook, if allocation fails.

Comment on lines 412 to 413
// We wrap this so that we never return 0, which would cause
// std::io::write_all et al to box to create a WriteZero error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #83353 to fix this.

@m-ou-se
Copy link
Member

m-ou-se commented Mar 21, 2021

Another idea would be to be able to set some kind of flag (or set panic_count to a special value) that would cause rust_panic_with_hook to simply util::dump_print + abort, just like it already does for panic_count > 2. Then you don't need to do anything with the hook at all.

@ijackson
Copy link
Contributor Author

@m-ou-se Thanks for the review. You make many good observations and suggestions which I don't have time to look at right now in detail. I will come back to this. Thanks!

@rustbot modify labels -S-waitng-on-review +S-waiting-on-author

@rustbot rustbot added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 21, 2021
@m-ou-se m-ou-se removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 21, 2021
@bors

This comment has been minimized.

@crlf0710
Copy link
Member

@ijackson Ping from triage! Any updates on this? Also, there's merge conflict now.

@crlf0710 crlf0710 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 17, 2021
@ijackson
Copy link
Contributor Author

@ijackson Ping from triage! Any updates on this? Also, there's merge conflict now.

Hi. Thanks for the ping. I may be able to look at this today...

@ijackson
Copy link
Contributor Author

Not sure how/if we could do this, but maybe we could instead make a way to fully disable the global allocator. Replace it with one that won't (de)allocate anything anymore. That would prevent a lot more misusages in pre_exec, and we could have the panic handler skip allocating the String before calling the hook, if allocation fails.

I think this would be ideal. However it is really tough. The existing compile-time indirections boil each call to the allocator down to a call to malloc, without any runtime indirection or status checks or anything like that. Adding some kind of additional check to that path is, I think, out of the question. Ideally the libc would provide us with a way to cause malloc to abort; for any reasonable implementation there will be a slow path whose use could be forced and which could do the additional check. But that's not where we are. I will pursue your other ideas.

@ijackson
Copy link
Contributor Author

Another idea would be to be able to set some kind of flag (or set panic_count to a special value) that would cause rust_panic_with_hook to simply util::dump_print + abort, just like it already does for panic_count > 2. Then you don't need to do anything with the hook at all.

I tried this approach and it worked very well with very little code.

I chose to steal a bit out of the top of the thread-local panic_count. In principle the thing is now a struct containing this one bit, plus a (n-1)-bit counter, but that would be bigger than one word, so I chose to pick a flag bit. The flag can only be set, never cleared, and always forces panics to go via the slow path.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 14, 2021
@Aaron1011
Copy link
Member

@bors retry

@Mark-Simulacrum
Copy link
Member

@bors r- r=m-ou-se

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 15, 2021
@bors
Copy link
Contributor

bors commented May 15, 2021

📌 Commit 88ccaa7 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 15, 2021
@Mark-Simulacrum
Copy link
Member

@bors retry

@bors
Copy link
Contributor

bors commented May 15, 2021

⌛ Testing commit 88ccaa7 with merge 77fee3038ddf4e8e451975a88b57c0f1bd45cdf8...

@bors
Copy link
Contributor

bors commented May 15, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 15, 2021
@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@Mark-Simulacrum
Copy link
Member

@bors retry aarch64-gnu builder no logs

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 15, 2021
@bors
Copy link
Contributor

bors commented May 15, 2021

⌛ Testing commit 88ccaa7 with merge d565c74...

@bors
Copy link
Contributor

bors commented May 16, 2021

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing d565c74 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 16, 2021
@bors bors merged commit d565c74 into rust-lang:master May 16, 2021
@rustbot rustbot added this to the 1.54.0 milestone May 16, 2021
@ijackson
Copy link
Contributor Author

Hooray! Thanks everyone.

@m-ou-se
Copy link
Member

m-ou-se commented May 17, 2021

This is now available on nightly: https://doc.rust-lang.org/nightly/std/panic/fn.always_abort.html

Thanks for working on this, @ijackson!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. O-unix Operating system: Unix-like S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

panic! in Command child forked from non-main thread results in exit status 0