First version of the Extending Safe Mutability proposal. #78

matthieu-m · 2014-05-16T18:14:46Z

Here comes an improved version of the discussion started on reddit An alternative take on memory safety (mutability/aliasing).

I went down the rabbit hole, trying to dig up the fundamentals of memory safety at the lowest level available (ie, the C Standard level), hopefully, this work can always be reused later on in the event this RFC does not pan out.

I still feel that segregating mutability from aliasing is a worthy goal in and out of itself. For an extreme example, Concurrent Containers are generally shared and yet you might want to restrict mutation to only some references.

This proposal is thus about achieving safe mutability in the presence of aliasing:

with the mutability clearly documented in the type system
and opening the way to compiler-verified safely mutable types (and their methods)

This may be seen as adding significant complexity and it certainly adds complexity to the compiler. From a user point of view, however, the segregation makes both mutability and aliasing explicit and thus more easily searchable; with support from the compiler diagnostics (reporting all useful terms) it might actually end up being both more approachable and more flexible.

Finally, the appeal of immutability may be a sufficient reason in itself: it helps drawing all those users to which immutability has been taught as The Silver Bullet(tm) of programming.

bgamari · 2014-05-16T19:34:33Z

active/0000-extending-safe-mutability.md

+
+*Note: why `mutable` and not `mut`? Just so we can easily identify snippets written in the current system from those written in the proposed system.*
+
+It is trivially provable that mutation through `&exclusive T` or `&exclusive mutable T` is safe, although we disable the former. The borrow-checker current implementation is sufficient to enforce the `exclusive` rule, and therefore we can easily convert today's code:


Minor nit: "disallow" might be more clear than "disable" here. Also "borrow-check" should probably be "borrow-checker's"

thestinger · 2014-05-17T02:03:16Z

I have a much simpler transitive mutability system to propose, although I don't really want the current system to change. The important thing to take into account is that mutability and aliasing are not orthogonal concepts. The useful definition of alias used by C and LLVM refers to a memory dependency. This means two pointers alias if one can observe a write through the other. If the pointers are both read-only, they do not alias. If they're guaranteed to point to immutability memory, they alias no other pointer.

There are 3 kinds of mutability in Rust:

external (observed) mutability (&mut)
internal (inherent) mutability (Cell)
inherited mutability

The mut keyword only refers to inherited mutability and the other two are always permitted. The proposal to introduce a third &uniq reference type simply takes &T and allows external mutability by borrowing the unique ownership of the value. Both would continue forbid inherited mutability and permit internal mutability. The only use case is avoiding some mut markers on locals, but with by-value captures for closures I think this will be so rare that it would be silly to special-case it.

In a system where Rust had no mutability markers for local variables (which is entirely a check of the programmer's intuition), it would only have the &T and &mut T pointer types. This was the alternative proposal by @nikomatsakis, although it was mixed with a renaming of the type. There would be no difference between &uniq T and the current &mut T type without immutable locals, so lets leave the naming issue aside for now.

If you want transitive immutability, then you don't want any changes to aliasing. The system you want is a distinction between three types of mutability:

inherited, external and internal mutability
external and internal mutability
transitive immutability

A fully backwards compatible example is the following:

&, x (internal mutability for both and external mutability for variables)
&mut, mut x (inherited, internal and external mutability)
&imm, imm x (transitive immutability) along with a reintroduction of Freeze

There are no real changes required to the type system or standard libraries to support this. The compiler is already fully aware of inherited (mut), internal (Unsafe<T>) and external mutability (&mut T). The feature would do nothing more than prevent coercion from &T to &imm T or &imm borrows when the type contains Unsafe<T> or &mut T. The Freeze trait would indicate a lack of internal Unsafe<T> and &mut T for use as a bound. I don't really see the value prospect, but that's all you need to extend inherited mutability to transitive immutability.

lilyball · 2014-05-17T03:01:04Z

I agree 100% with @thestinger. I don't think the current system should change, but if it should, adding &imm seems to be the only approach that actually does what all these various proposals are trying to do (which is, to provide a way to ensure deep immutability). I'd rather not have it, though, as I believe it's an unnecessary complication.

lilyball · 2014-05-17T03:06:36Z

active/0000-extending-safe-mutability.md

+
+Let us define a model to talk about how types are laid out in memory.
+
+For interoperability reasons, Rust lays its `struct`s out like C would. We will take the assumption that its `enum`s and arrays are laid out similarly.


I don't believe we make any guarantees about how enums are laid out. There's no FFI compatibility issue there (as our enum types don't exist in C). The only guarantee we make is for C-like enums, where we guarantee that the enum is represented by an integral value with the given discriminant (and the integral type can be controlled with the #[repr] attribute).

It might be unclear. The point I am trying to make is that much like struct, enum are (probably) laid out without concern for their names but only for their "shape". It might not hold for Option<~T> though, which is special cased (and I don't know if this special casing is done by name or shape).

lilyball · 2014-05-17T03:54:15Z

You didn't make it entirely clear as to whether you can take a &mutable reference to any type, or just to types of SafelyMutable (you defined SafelyMutable types as being mutable through &mutable references, but didn't say whether the reference itself was legal for non-SafelyMutable types). I assume it has to be legal to take a &mutable reference to any type, as that's the only way that type can mutate any contained SafelyMutable fields.

In any case, your proposal here makes it impossible to implement Rc. Specifically, Rc needs to mutate its contained Cells from the clone() method, but clone() is defined by a trait and must take &self. It cannot take &mutable self in the trait as it needs to be callable on non-mutable values. But if it takes &self then Rc can't update its reference count.

I don't see any solution to this. The only thing that makes Rc work is the ability for Cell to be mutated via a & reference. Even if you try and inline the behavior of Cell into Rc, it has the same problem (where mutation must be able to happen from within a &self method).

lilyball · 2014-05-17T04:01:50Z

active/0000-extending-safe-mutability.md

+ - no `unsafe` code in `get` and `set`
+ - `set` now requires a `&mutable self` parameter
+
+*Note: `noshare` should be unnecessary now, because the aliasing of this type is tracked properly.*


noshare cannot be inferred, because that would prevent Mutex from being written. Mutex needs to be Share, but it also needs to be SafelyMutable. It's safe to be Share because the implementation of Mutex is designed to enforce this safety.

lilyball · 2014-05-17T05:12:03Z

If this proposal were implemented, I could still write a Cell-like type that allows for mutation through & pointers. I think it's possible with TLS (if you're willing to accept that the Cell is non-Send), and I think it's also possible using a static mut (using a thread-safe data structure to actually contain the values). The Cell in this case would only contain an immutable key.

matthieu-m · 2014-05-17T12:34:43Z

The proposal does not seek to eliminate all uses of internal mutability (such as Rc), but instead to introduce mutable to document (in the type signature) visible mutability such as that of Cell.

It then goes on to try and define helpful rules to diminish the use of unsafe code, which is a somewhat orthogonal goal, but I seriously doubt that it will ever be possible to remove all instances of unsafe and therefore we need to draw the line... somewhere.

I think that as long as visible mutability is clearly documented, we have won something. If it takes unsafe to implement methods on &mutable self, so be it.

There were holes and distractions in the previous iteration, however perfection is unnecessary so let's just aim for a step forward instead. Besides, the language will continue to evolve, and it will be easier to add it than to remove from it. Notable changes: - removed the distraction on layouts, Rust is nominally typed so let us focus on nominal typing. - removed the vague "do not leak references", it is much too easy to accidentally introduce holes here. - removed the example of RefCell, and focus on POD and Cell for now. - reorganized the sections a bit, with more examples. - added some more unresolved questions to ponder.

lilyball · 2014-05-17T19:01:28Z

This still doesn't solve the Rc problem. The only way for Rc to work is if it uses unsafe to coerce a &self pointer to &mutable self within the clone() method. And as far as I can tell, the ability to do that contradicts the entire point of this proposal (which is to disallow "hidden" mutability).

matthieu-m · 2014-05-17T19:14:56Z

The point of the proposal is not actually to disable hidden mutable. Or not exactly.

The point of the proposal is to propose a framework for annotating operations that mutate user-visible state:

regardless of whether the reference is exclusive or not, which is an incident property
and with no regard for user-invisible state (such as the state of the actual mutex in Mutex or the reference count in Rc)
and with no intention of replacing fully unsafe contexts, though I do wish to make a dent here

With this proposal, MutexGuard would implement DerefExclusive (fn deref<'a>(&'a self) -> &'a exclusive T) and DerefExclusiveMut (fn deref<'a>(&'a mutable self) -> &'a exclusive mutable T). It always guarantee non-aliasing and the programmer may choose whether mutation of the internal object is allowed or not.

lilyball · 2014-05-17T19:31:24Z

If this proposal essentially allows opt-in mutability annotations, but doesn't actually prohibit internal mutability through a &self pointer, then it doesn't actually appear to solve any problem. I can already provide opt-in mutability annotations on methods by giving them more descriptive names. I thought the whole point was to require mutability annotations, because only then can you actually consider &T to be deeply immutable.

ftxqxd · 2014-05-17T21:27:19Z

With this proposal, would there be any downside to simply using &mutable T pointers anywhere where we’d use &T today? As far as I can tell, they only add functionality to &T (they can mutate SafelyMutable types) and don’t restrict any (they’re still aliasable). Presumably using &mutable pointers everywhere would also require making all variables mutable as well, so would it be possible to combine Niko’s proposal with this one, remove mutability and have 2 pointer types: &mutable T and &exclusive mutable T (obviously with different names; perhaps &T and &my T respectively)?

lilyball · 2014-05-17T21:42:56Z

@P1start That's functionally no different than just adopting Niko's proposal, because you've just gotten rid of &T and the only real purpose to this PR #78 is to establish the difference between &T and &mutable T (which of course requires &T to exist).

ftxqxd · 2014-05-17T21:48:11Z

@kballard How does that make it no different? It allows one to have multiple mutable pointers to the same location (if it’s SafelyMutable). The following code would not work under Niko’s proposal:

let x = 4;
let y = &x;
let z = &x;
*y = 5; // under Niko’s proposal this wouldn’t work
*z = 6; // … nor this

It also allows for a safe Cell implementation (which I don’t think Niko’s proposal does).

zwarich · 2014-05-17T21:51:15Z

One correction to @thestinger's comment: aliasing in C and LLVM is not just about mutability.

If I have a fragment of LLVM IR with two loads from addresses with distinct TBAA tags, followed by a comparison of the two addresses, an optimization pass is allowed to optimize the comparison even though there is nothing involved with mutability. A more realistic example would be scalar replacement. If an alloca is potentially read from by a load that may read from multiple allocas depending on control flow, then that would block a scalar replacement pass from converting the alloca to an SSA variable (unless that pass is sufficiently advanced to rewrite the code to push the load up the control flow), and this has nothing to do with mutability. Another example is points-to analysis for devirtualization, which is nontrivial even without any mutation.

Aliasing is a dynamic property of variables and expressions in a language with a semantics based on memory locations. This dynamic property can be approximated with simple static analyses (e.g. comparing offsets on loads with a common base, looking for escaping pointers, etc.) and augmented with type systems. These static approximations of aliasing can then be used to create static approximations of dependence between memory operations, but aliasing is not defined in terms of memory dependence.

Aliasing is actually more important for safety of deallocation than safety of mutation. If code iterating over a fixed-size vector of integers has multiple mutable references into the vector, then no violation of memory safety can occur unless the vector's storage is deallocated. Of course, since Rust pervasively uses destructors, mutation often goes hand-in-hand with deallocation, and the removal of const means that POD and non-POD types now share the same rules, even though it is not required for memory safety.

lilyball · 2014-05-17T21:51:47Z

@P1start Ok, yes, I forgot that this proposal defined primitives as SafelyMutable. So you're right it's not exactly the same thing.

But I think that making primitives mutable like this is actually a bad thing. Today, LLVM knows that any &T pointer, where T is not Unsafe (and the vast majority of types are not Unsafe), does not alias with any mutable pointer. This allows it to optimize away redundant reads, or to reorder reads with respect to other write operations. But once T is a type that may mutate (which is to say, that's Unsafe today, or SafelyMutable with this proposal), it loses that optimization opportunity.

zwarich · 2014-05-17T21:57:38Z

@kballard Today, LLVM does not know that &T does not alias with any mutable pointer. This property can not be expressed with LLVM's existing TBAA system, even disregarding Unsafe.

lilyball · 2014-05-17T21:59:09Z

@zwarich Really? Darn, I thought it did know that. That's rather unfortunate.

Still, there was talk about writing a custom LLVM pass to teach it about alias information that Rust has that it can't express to LLVM today. Presumably such a pass would then enable LLVM to make the optimizations I just described.

ftxqxd · 2014-05-17T22:13:43Z

@kballard Yes, I now see that it does seem somehow wrong to have almost no assertions about mutability anywhere. Although I’d probably personally find it easier to code with, all pointers being mutable seems like numerous optimisations could be made impossible, and just knowing that a reference or variable will not change can be useful when writing code.

thestinger · 2014-05-18T05:22:00Z

@matthieu-m: The mutation performed during clone for Rc is essentially user-visible. It can cause the destructor (fn drop(&mut self) { println!("side effect") }) of the contained value to be run later.

thestinger · 2014-05-18T05:25:13Z

@zwarich: That may apply to TBAA metadata, but I expect that we would be using a custom pass where NoAlias is documented as being entirely about memory dependency. The same set of rules is used for noalias parameters and return values, which we're already using.

zwarich · 2014-05-18T18:44:19Z

@thestinger There are other aliasing rules in the LLVM LangRef, e.g. noalias pointers can't alias pointers that are not derived from them, global variables and allocas don't alias other memory, etc. You are correct in asserting that despite the name, LLVM's AliasAnalysis interface isn't really an alias analysis interface; it's a memory dependence interface. We would probably need a custom system of TBAA metadata on memory operations, a replacement for the BasicAlias pass, and some modifications to the inliner (to ensure that the must-not-alias constraints of a &mut inlined twice don't overlap). Hopefully we can ride on support for the existing TBAA metadata so that existing passes that copy that metadata will just copy ours, and the inliner will be the only existing pass we need to modify.

@matthieu-m The compiler doesn't need to analyze T in depth in order to determine whether &T aliases another mutable reference. Any load or store that violates this aliasing rule will have to occur behind an unsafe block, and the compiler will just have to mark those memory operations as potentially aliasing anything.

There is a bit of a problem with this approach: since transitive unsafety is quite pervasive (so many basic data structures are implemented in terms of unsafe), many operations will ultimately expand to an inlined unsafe block. At the moment, the compiler has to basically assume that unsafe code can do anything, a raw pointer used in an unsafe block has to be assumed to alias any other pointer. The only way to improve this situation would be to have some way to distinguish between type-unsafe and lifetime-unsafe. A lifetime-unsafe use of a T* would still be seen as only aliasing other pointers to T, whereas a type-unsafe use of T* would be seen as potentially aliasing any other pointer.

@huon What is gained by making modifications through &T in an unsafe block be undefined behavior?

thestinger · 2014-05-18T18:47:38Z

active/0000-extending-safe-mutability.md

+
+## 1. Defining safe mutation
+
+Rust is nominally typed, and therefore it seems dangerous to allow a `&A` to point to an instance of anything else than a `A`. Not only methods could be called that would be unexpected, but it would possibly confuse the Type-Based Alias Analysis in LLVM.


Rust doesn't really make any guarantees about this right now.

thestinger · 2014-05-18T20:00:26Z

@zwarich: The current rules require that internal mutability happens through an Unsafe<T> wrapper type, so the compiler is aware of types where unsafe is used for that purpose.

zwarich · 2014-05-18T21:14:07Z

@thestinger: That rule is frequently violated in Rust and Servo code, so I consider it more wishful thinking at the moment. I'm also not entirely convinced it's sound, although it's difficult to say since it's never even been precisely stated.

huonw · 2014-05-18T21:57:20Z

The statement is exactly that: the only legal way to mutate aliased data is by using the Unsafe type to perform the mutation.

RefCell, Cell, Mutex, RWLock and the atomics all use Unsafe internally. (There's only a few remaining instances of non-Unsafe mutation in Rust; at least, I removed the obvious ones when I removed the old transmute_mut function and, iirc, the only ones I didn't migrate to Unsafe were those where the & references didn't alias with anything. I left fixmes on them too.)

The advantage of this rule is &T references are known to be NoAlias except if T contains Unsafe.

nrc · 2014-05-18T22:12:20Z

@huonw by 'legal', do you mean by convention or that the compiler enforces it?

huonw · 2014-05-18T22:21:54Z

I mean: not using Unsafe is undefined behaviour (in the technical sense, like dereferencing null or signed integer overflow in C). That is, the compiler is allowed to assume that no-one ever mutates behind & without using Unsafe and is allowed to optimise based on it never happening, which may break any program that does invoke the bad behaviour.

thestinger · 2014-05-19T02:10:28Z

@zwarich: The manual has stated that this is undefined behaviour for quite some time. The standard library types with inherited mutability used to rely on a #[no_freeze] marker, followed by NoFreeze and now Unsafe<T>. If Servo is relying on this, I think that's an issue for Servo to fix rather than one rustc has to be concerned with.

rkjnsn · 2014-05-21T00:17:20Z

I'm pretty new to Rust, but I thought I'd drop my two cents.

To me, uniqueness and mutability seem to be distinct, though related, things, and expressing them independently, as in this proposal, makes the most intuitive sense.

Uniqueness (or exclusiveness) is a property of the reference, is strongly enforced by the compiler, is required to provide compile-time memory safety, and is one of the features that makes Rust so exciting.

Mutability, on the other hand, I feel is more an expression of programmer intent, and helps with reading and reasoning about code. Having it in the type system is not required for safety, but including it allows the compiler to catch programmer mistakes, which is nice. (E.g., you tried to mutate a value you said you wouldn't (error), or you didn't mutate a value you said you would (warning/lint)).

Because mutability would be a statement of programmer intent, it makes sense to me that it would be concerned with "logical" mutability. In other words, any mutation to a non-mutable object would be limited to private members and would not be readily apparent to the user of the object. This would allow an implementation to be changed to, e.g., use caching for better performance without changing the interface.

While the additional possibilities for reference types introduced by this proposal can be seen as adding complexity, I feel separating exclusivity and mutability actually makes things conceptually simpler and easier to reason about. For example, changing the value in a Cell would require mutability but not exclusiveness, which I find easier to understand/explain than the current situation. I also know that a non-mutable (exclusive or not) reference shouldn't change in a way I have to worry about as a consumer of the type. I think all four combinations are useful and worth having.

All that said, I'm not as convinced about SafelyMutable. It seems like a fair amount of complexity, and I'm not sure what it actually gets you. It seems to have a lot of overlap with the capabilities of Cell.

brson · 2014-06-19T22:22:38Z

Discussed at https://github.com/rust-lang/rust/wiki/Meeting-RFC-triage-2014-06-19. This is a large and complex proposal, containing major changes to language semantics. At this stage in development we don't want to consider such a major overhaul. Closing. Thank you.

First version of the Extending Safe Mutability proposal.

c36700f

bgamari reviewed May 16, 2014
View reviewed changes

thestinger mentioned this pull request May 17, 2014

Rename &mut to &only #58

Closed

lilyball reviewed May 17, 2014
View reviewed changes

thestinger reviewed May 18, 2014
View reviewed changes

brson closed this Jun 19, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

First version of the Extending Safe Mutability proposal. #78

First version of the Extending Safe Mutability proposal. #78

matthieu-m commented May 16, 2014

bgamari May 16, 2014

thestinger commented May 17, 2014

lilyball commented May 17, 2014

lilyball May 17, 2014

matthieu-m May 17, 2014

lilyball commented May 17, 2014

lilyball May 17, 2014

lilyball commented May 17, 2014

matthieu-m commented May 17, 2014

lilyball commented May 17, 2014

matthieu-m commented May 17, 2014

lilyball commented May 17, 2014

ftxqxd commented May 17, 2014

lilyball commented May 17, 2014

ftxqxd commented May 17, 2014

zwarich commented May 17, 2014

lilyball commented May 17, 2014

zwarich commented May 17, 2014

lilyball commented May 17, 2014

ftxqxd commented May 17, 2014

thestinger commented May 18, 2014

thestinger commented May 18, 2014

zwarich commented May 18, 2014

thestinger May 18, 2014

thestinger commented May 18, 2014

zwarich commented May 18, 2014

huonw commented May 18, 2014

nrc commented May 18, 2014

huonw commented May 18, 2014

thestinger commented May 19, 2014

rkjnsn commented May 21, 2014

brson commented Jun 19, 2014


		Note: why `mutable` and not `mut`? Just so we can easily identify snippets written in the current system from those written in the proposed system.

		It is trivially provable that mutation through `&exclusive T` or `&exclusive mutable T` is safe, although we disable the former. The borrow-checker current implementation is sufficient to enforce the `exclusive` rule, and therefore we can easily convert today's code:


		Let us define a model to talk about how types are laid out in memory.

		For interoperability reasons, Rust lays its `struct`s out like C would. We will take the assumption that its `enum`s and arrays are laid out similarly.


		## 1. Defining safe mutation

		Rust is nominally typed, and therefore it seems dangerous to allow a `&A` to point to an instance of anything else than a `A`. Not only methods could be called that would be unexpected, but it would possibly confuse the Type-Based Alias Analysis in LLVM.

First version of the Extending Safe Mutability proposal. #78

First version of the Extending Safe Mutability proposal. #78

Conversation

matthieu-m commented May 16, 2014

bgamari May 16, 2014

Choose a reason for hiding this comment

thestinger commented May 17, 2014

lilyball commented May 17, 2014

lilyball May 17, 2014

Choose a reason for hiding this comment

matthieu-m May 17, 2014

Choose a reason for hiding this comment

lilyball commented May 17, 2014

lilyball May 17, 2014

Choose a reason for hiding this comment

lilyball commented May 17, 2014

matthieu-m commented May 17, 2014

lilyball commented May 17, 2014

matthieu-m commented May 17, 2014

lilyball commented May 17, 2014

ftxqxd commented May 17, 2014

lilyball commented May 17, 2014

ftxqxd commented May 17, 2014

zwarich commented May 17, 2014

lilyball commented May 17, 2014

zwarich commented May 17, 2014

lilyball commented May 17, 2014

ftxqxd commented May 17, 2014

thestinger commented May 18, 2014

thestinger commented May 18, 2014

zwarich commented May 18, 2014

thestinger May 18, 2014

Choose a reason for hiding this comment

thestinger commented May 18, 2014

zwarich commented May 18, 2014

huonw commented May 18, 2014

nrc commented May 18, 2014

huonw commented May 18, 2014

thestinger commented May 19, 2014

rkjnsn commented May 21, 2014

brson commented Jun 19, 2014