Better way of discussing memory safety than "segfaults" #30963

lambda · 2016-01-16T20:25:54Z

As pointed out in a thread on users.rust-lang.org, language in the documentation and website about preventing segfaults, or preventing segfaults outside of unsafe code, can be misleading. The Rust compiler uses guard pages to protect against stack overflow, and will hopefully eventually use stack probes to make this sound (#16012). However, this means that it's possible to get a segfault from a stack overflow using 100% safe code:

fn main() {
    let a = [0; 999999999];
}

This is not a theoretical concern; it has happened in practice using the musl target which has a small stack size, while using std::io::copy which copies a file using a 64k stack allocated buffer, in tailhook/vagga#116.
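As a rough illustration of how a small stack interacts with a large stack-allocated buffer, here is a minimal sketch that runs a function with a 64 KiB stack buffer on a thread whose stack size is requested explicitly through `std::thread::Builder::stack_size`. The helper names `use_large_stack_buffer` and `run_with_stack`, and the 1 MiB figure, are assumptions made for this example; this is not what `std::io::copy` or vagga actually do.

```rust
use std::thread;

/// Size of the stack-allocated buffer in this illustration (64 KiB, the
/// buffer size mentioned above).
const BUF_SIZE: usize = 64 * 1024;

/// Hypothetical helper: fills a stack-allocated buffer and sums it, standing
/// in for any function that needs a large stack frame.
fn use_large_stack_buffer() -> usize {
    let mut buf = [0u8; BUF_SIZE];
    for (i, b) in buf.iter_mut().enumerate() {
        *b = (i % 251) as u8;
    }
    buf.iter().map(|&b| b as usize).sum()
}

/// Hypothetical helper: runs the function above on a thread whose stack
/// size is requested explicitly instead of relying on the platform default.
fn run_with_stack(stack_size: usize) -> usize {
    thread::Builder::new()
        .stack_size(stack_size)
        .spawn(use_large_stack_buffer)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

fn main() {
    // 1 MiB is an arbitrary choice that leaves ample room for the 64 KiB buffer.
    let sum = run_with_stack(1024 * 1024);
    println!("sum = {}", sum);
}
```

Requesting a stack comfortably larger than the biggest expected frame only makes the overflow less likely; it does not change what happens when the guard page is eventually hit.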

It seems that in a lot of discussion of Rust, "no segfaults" is used as a proxy for memory safety. However, it's not a very good proxy. On one hand, segfaults are possible without any memory unsafety, as with a stack overflow. On the other, the bigger danger of memory unsafety is generally silent corruption of memory, not segfaults; segfaults are just an implementation-specific, visible symptom of memory unsafety.

The book says that if you see a segfault "you can be sure the cause is related to something marked unsafe" (recently changed in #30819, but still not accurate, as demonstrated by the above code, which segfaults without unsafe). The front page of the website says "prevents segfaults." That can be read as true, since it doesn't say it prevents all segfaults, but it could mislead someone into thinking that it does.

I don't necessarily have a better phrasing for either the book or the website; explaining what is and is not considered memory safe is somewhat nuanced, and hard to do in a single sentence. But I think that whatever the phrasing is, "segfault" should probably not be used as it is misleading.


steveklabnik · 2016-01-19T15:25:14Z

The book and docs describe Rust as it should be, not paying attention to bugs. The lack of stack probes is a bug, not an inherent limitation of the language.

lambda · 2016-01-19T20:38:28Z

But even once stack probes are added, overflowing the stack will still cause a segfault. All stack probes help is in ensuring that you actually do get that segfault before you start writing beyond the guard page. All a stack probe is, as far as I understand, is extra accesses to memory added in the function prologue to ensure that you actually do try to access that guard page.

Without a stack probe, if you have a stack frame that is larger than a page, some of the references in that frame may extend beyond the guard page. If those are written to before anything within the guard page is touched, it's possible to hit undefined behavior, scribbling over memory that you don't own.

The way a stack probe deals with this is by adding no-op accesses to memory if the stack frame of a function exceeds the page size of the architecture. These accesses normally do nothing, but if the stack frame overlaps the guard page, they cause a segfault before the body of the function has run and had a chance to overwrite memory it doesn't own. Note that this still produces a segfault. Also, any function whose stack frame is smaller than a page needs no stack probes, since it cannot skip past the guard page: any access to the part of its frame that overlaps the guard page already causes a segfault. You can just as easily produce a segfault with the following, which won't be affected by stack probes:

fn overflow(n: i64) {
    if n > 0 {
        overflow(n-1)
    }
}

fn main() {
    overflow(999999999);
}
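The probing scheme described in this comment can be modeled in a few lines. The following is only a sketch of the idea, not the compiler's actual code generation: `PAGE_SIZE`, `probe_offsets`, and `first_faulting_probe` are hypothetical names, and the 4096-byte page size is an assumption. Offsets are measured in bytes below the stack pointer at function entry, since the stack grows downward.

```rust
/// Assumed page size for this model; real page sizes depend on the platform.
const PAGE_SIZE: usize = 4096;

/// Offsets below the entry stack pointer that a hypothetical prologue would
/// touch. Frames that do not exceed one page get no probes; larger frames
/// get one probe per page so that no whole page is skipped.
fn probe_offsets(frame_size: usize, page_size: usize) -> Vec<usize> {
    if frame_size <= page_size {
        return Vec::new();
    }
    (1..=frame_size / page_size).map(|i| i * page_size).collect()
}

/// Returns the first probe that lands inside the guard region, if any.
/// `guard_distance` is the number of bytes of usable stack left below the
/// entry stack pointer; the guard region covers the next `guard_size` bytes.
fn first_faulting_probe(
    offsets: &[usize],
    guard_distance: usize,
    guard_size: usize,
) -> Option<usize> {
    offsets
        .iter()
        .copied()
        .find(|&o| o > guard_distance && o <= guard_distance + guard_size)
}

fn main() {
    // A three-page frame with only 5000 bytes of stack left above the guard page.
    let offsets = probe_offsets(3 * PAGE_SIZE, PAGE_SIZE);
    match first_faulting_probe(&offsets, 5000, PAGE_SIZE) {
        Some(o) => println!("probe at offset {} hits the guard page", o),
        None => println!("no probe hits the guard page"),
    }
}
```

In this model the probe faults in the prologue, before the function body can write to addresses beyond the guard region. A frame of one page or less gets an empty probe list, which matches the point that such frames cannot step over the guard page in the first place.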

Note that on Linux, overflowing the stack may die with SIGBUS rather than SIGSEGV (it looks like I get SIGBUS when overflowing the stack on the main thread, and SIGSEGV when overflowing the stack on another thread), and the Rust runtime sets up a signal handler to print a nicer error message when either of those occurs. Which particular signal kills the process is not all that relevant, though; it's still what's colloquially referred to as a "segfault": an access to memory outside of the set of valid mappings.

So, stack probes or not, if using guard pages to detect stack overflow (rather than explicit stack overflow checks on each function call), the result of a stack overflow that hits the guard page will be a segfault. Unless we want to require that the Rust ABI use stack overflow checks on each function call (like the old __morestack approach) rather than guard pages to detect stack overflows, it will always be possible to get a segfault in safe code just by overflowing the stack.
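For contrast, here is a loose analogy for what an explicit check on each call looks like, written in safe Rust. It tracks a recursion budget by hand and returns an error instead of ever reaching the guard page. This is only an illustration of the trade-off: it is not how `__morestack` worked (that compared the stack pointer against a limit in the function prologue), and `DepthExceeded`, `overflow_checked`, and the budget of 1000 are hypothetical names and values.

```rust
/// Hypothetical error type returned when the explicit budget runs out.
#[derive(Debug, PartialEq)]
struct DepthExceeded;

/// Same shape as the recursive `overflow` example, but every call first checks an
/// explicit budget, so running out is reported as a value, not a signal.
fn overflow_checked(n: i64, remaining: usize) -> Result<(), DepthExceeded> {
    if n > 0 {
        // The explicit per-call check: fail cleanly when no budget is left.
        let remaining = remaining.checked_sub(1).ok_or(DepthExceeded)?;
        overflow_checked(n - 1, remaining)?;
    }
    Ok(())
}

fn main() {
    match overflow_checked(999_999_999, 1_000) {
        Ok(()) => println!("finished within the budget"),
        Err(DepthExceeded) => println!("recursion budget exhausted"),
    }
}
```

The cost is a comparison on every call, which is the overhead that guard pages avoid.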

steveklabnik · 2016-01-19T20:56:00Z

Yeah, that makes sense. I guess that there's a distinction in my brain between this kind of segfault and the kind that I'm generally worried about when writing application code; but that distinction may not be worthwhile.

brson · 2016-01-20T01:25:03Z

Are there other places where Rust allows segfaults but recovers (even if recover means abort), or where it might in the future?

I'm not opposed to changing the website, but after re-reading the website blurb, the current phrasing just sounds so good - as in, it has a pleasing sound and cadence. That it isn't quite accurate is a very fine point. We rejected changing the term 'thread safety' previously because it's short and intuitive, and that phrase is perhaps even less accurate.

lambda · 2016-01-27T07:56:17Z

Hmm. Yeah, thread safety is a bit vague, and so could be seen to contain many guarantees that are not provided. On the other hand, its very vagueness gives a little bit of wiggle room, as there isn't a single well accepted definition of thread safety that readers could default to, so I think it would be more expected that they'd need to look up the precise definition in more detail in the docs or FAQ, at which point they can find the full explanation of data race freedom.

"Segfault," on the other hand, is very specific, and a trivial counterexample is so easy to construct, that the specificity of it feels a little off. Since it's just one symptom of the problem being solved, and can still occur in simple beginner programs, I feel like it's a lot more likely to confuse people.

On the other hand, I don't really have a better suggestion for the front page. It is hard to come up with a good, punchy phrasing. "Prevents memory corruption", "Prevents memory errors", "Guarantees memory safety" all come to mind, but don't sound quite so snappy and have the same vagueness issue as "thread safety" (though as mentioned, that may be an advantage as people would know they would need to look up the exact guarantees).

I suppose another option would be to just fix the wording in the book, and add a FAQ item about "I got a segfault, but the website says 'prevents segfaults', what gives?" now that we have a nice FAQ to put it in. That answer could cover stack overflow, unsafe, and guarantees relied upon by unsafe being violated, and maybe point to the nomicon or unsafe chapter in the book for more details.

lilianmoraru · 2016-01-27T08:09:37Z

I think it's okay to update just the documentation and FAQ. It would be the same way as it is for thread safety: in the documentation you can find issues that can still occur with safe Rust when working with threads (deadlocks, ...).

lambda · 2016-01-27T08:26:06Z

> Are there other places where Rust allows segfaults but recovers (even if recover means abort), or where it might in the future?

Forgot to address this. There are none that I know of. I suppose it would be possible to use allocation adjacent to a guard page on one end of an array (or both if the array is an integral number of pages) to elide bounds checks on that end of the array, but that sounds like a pretty far-fetched idea for an optimization, as I think it would thrash your TLB if you used it frequently. Any other use I can think of would have to be unsafe code, or would indicate a soundness issue. So I think the only one we have to worry about in safe code is the stack overflow, though I could be wrong if I haven't thought creatively enough about possible uses of memory map shenanigans.

lambda · 2016-02-06T23:07:30Z

This has been addressed by #31333, where we actually do treat the SIGSEGV on stack overflow as an implementation detail and abort the process like any other runtime abort. I think that this is sufficient to avoid the confusion described here, so closing this ticket out.

sfackler added the A-docs label Jan 16, 2016

This was referenced Feb 1, 2016

Stack overflow should abort the process normally, not segfault and dump core #31273

Closed

Abort on stack overflow instead of re-raising SIGSEGV #31333

Merged

lambda closed this as completed Feb 6, 2016
