What are the semantics of `move` operands?
#416
I guess an alternative would be to combine aliasing tricks with ad-hoc non-determinism: For assignments, currently my main problem is that we don't even have a good idea of what |
I don't have any objections to removing move operands everywhere except for function calls. Maybe there are borrowck differences though? |
Here's an example that must be UB, where the contents of a moved-out-of-place are observed:

#![feature(custom_mir, core_intrinsics)]
use std::intrinsics::mir::*;

pub struct S(i32);

#[custom_mir(dialect = "runtime", phase = "optimized")]
fn main() {
    mir! {
        let unit: ();
        {
            let non_copy = S(42);
            // This could change `non_copy` in-place
            Call(unit, after_call, change_arg(Move(non_copy)))
        }
        after_call = {
            // So now we must not be allowed to observe non-copy again.
            let _observe = non_copy.0;
            Return()
        }
    }
}

pub fn change_arg(mut x: S) {
    x.0 = 0;
}

However, just making a moved-out place uninit is not sufficient. We have to also prevent writes to the moved-out-of-place. IOW, the following also must be UB:

#![feature(custom_mir, core_intrinsics)]
use std::intrinsics::mir::*;

pub struct S(i32);

#[custom_mir(dialect = "runtime", phase = "optimized")]
fn main() {
    mir! {
        let unit: ();
        {
            let non_copy = S(42);
            let ptr = std::ptr::addr_of_mut!(non_copy);
            // Inside `callee`, the first argument and `*ptr` are basically
            // aliasing places!
            Call(unit, after_call, callee(Move(*ptr), ptr))
        }
        after_call = {
            Return()
        }
    }
}

pub fn callee(x: S, ptr: *mut S) {
    // With the setup above, if `x` is indeed moved in
    // (i.e. we actually just get a pointer to the underlying storage),
    // then writing to `ptr` will change the value stored in `x`!
    unsafe { ptr.write(S(0)) };
    assert_eq!(x.0, 42);
}
rust-lang/rust#113569 implements a possible semantics in Miri that makes the above examples UB. This basically makes
I think this is a reasonably nice semantics and avoids extra non-determinism, but unfortunately it doesn't work entirely -- actually doing in-place argument passing changes the observations made through pointer comparisons in a way that this semantics does not explain. (Specifically, the address of the arguments and return place in the callee is still always distinct from previously created allocations.) |
I suppose we can have a pass before |
What I meant is removing them from the syntax. |
Watching this Carbon talk, I think that Carbon elegantly avoided the problems around address identity by making function arguments, by default, values rather than places. Values don't have an observable address so the callee cannot tell where they are living. Unfortunately I don't see a way for Rust to adopt a semantics like that in a backwards-compatible way; it relies on not being able to take the address of a function argument. |
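To make the backwards-compatibility point concrete, here is a minimal sketch in stable Rust showing that a by-value function argument is a place whose address the callee can observe. The names `Big`, `addr_in_callee`, and `demo` are hypothetical and chosen only for illustration. Whether the two addresses come out equal is deliberately left open: it depends on how the compiler passes the argument, which is exactly the question in this thread.

```rust
/// A type large enough that it is typically passed in memory rather than in
/// registers (an assumption about common ABIs, not a language guarantee).
pub struct Big(pub [u8; 256]);

/// Returns the address at which the callee sees its by-value argument.
/// Being able to write this function at all is what a Carbon-style
/// "arguments are values, not places" rule would have to forbid.
pub fn addr_in_callee(arg: Big) -> usize {
    std::ptr::addr_of!(arg) as usize
}

/// Returns (address of the caller's local, address of the callee's argument).
/// No claim is made about whether the two are equal.
pub fn demo() -> (usize, usize) {
    let big = Big([0; 256]);
    let caller_addr = std::ptr::addr_of!(big) as usize;
    // `big` is moved into the call here.
    let callee_addr = addr_in_callee(big);
    (caller_addr, callee_addr)
}
```

Calling `demo()` and printing both values lets you compare them on a given compiler and optimization level; the result should be treated as an observation, not as specified behavior.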
Link to relevant part of video

To me that sounds more appealing when it comes to borrowed values than owned ones. For large owned values that are passed on the caller’s stack, we know the value exists in memory and has a stable address as long as the function runs. Not letting the user take advantage of that (or in other words, forcing a copy to a new stack allocation if the user tries to do something that does require an address) seems unnecessarily limiting. Are there optimizations that a no-observable-address approach would allow that aren’t currently viable?

For borrowed arguments, not having address identity would have allowed the compiler to optimize borrows of small types, like

Of course, the ship has long since sailed on whether references carry address identity, and there were good reasons why they do. But it sounds like Carbon might be able to do better, by allowing arguments to be passed in 'borrowed' form without actually using pointer types, and having the compiler automatically determine whether to pass the data in a pointer or in registers.

The cost is that it doesn't seem to have a first-class type representing an immutable borrow at all, or if it does, it's not mentioned in the video and would represent additional language complexity compared to Rust, which has only one way to represent an immutable borrow. If it doesn't have one, then borrows become less composable. Is it possible in Carbon to have an argument of type equivalent to |
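As a small illustration of what "references carry address identity" means in today's Rust, the sketch below compares two references by value and by address. The function name `demo` is hypothetical. Because a callee may perform the address comparison, the compiler cannot in general replace a borrow of a small type with a by-value copy unless it can show the address is never observed.

```rust
/// Returns (values equal, addresses equal for `a` vs `b`, addresses equal for `a` vs `a`).
pub fn demo() -> (bool, bool, bool) {
    let a: u32 = 7;
    let b: u32 = 7;
    let (ra, rb) = (&a, &b);
    (
        // `==` on references compares the pointed-to values.
        ra == rb,
        // `std::ptr::eq` compares addresses; `a` and `b` are distinct live locals.
        std::ptr::eq(ra, rb),
        // Two references to the same local share an address.
        std::ptr::eq(ra, &a),
    )
}
```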
Having the semantics depend on "large" and "owned" seems really unappealing. It would be a different matter if we guaranteed that this was always done like that.
If you are passing an |
Well, I think it's fine for it to be nondeterministic. I'm more concerned about what optimizations might be ruled out by any particular definition.
I know, but I'm arguing the opposite of your last sentence. I think arguments having observable addresses is not a big problem, but It sounds like what Carbon is doing for its default argument passing mode is somewhere in between Rust's |
By the way, there’s another reason that address stability of references is relevant. Suppose that you could go back in time and change Rust 1.0 to disallow taking the address of function arguments, while keeping references as-is. There would be a usability problem, because what if a function that takes In Carbon this (apparently) isn’t a problem because Carbon’s equivalent of |
Ah, that's an interesting way to frame this, thanks. They do compare it with C++ references that also are not quite like ours (though I don't recall if C++ references have an observable address).
That's a good point. |
This question is tied up in several discussions:

However it's all somewhat messy and mixed with other questions, making it all a bit hard to follow. Part of the question is where `move` operands are even allowed and which places they may work on.

rust-lang/rust#112564 has some nice examples demonstrating that `move` is already meaningful today for function calls, and Miri fails to properly model that. I am not aware of similar examples for other uses of `move` (in regular MIR assignments). We should at least have a spec explaining today's codegen (and implement it in Miri); I don't know if there are plans for the future here that would require a tighter spec than that.

The function call example lends itself to a semantics like this. At a high level, `move` would mean "load the value from the given place, and then deallocate the place it came from, before allocating any of the new memory needed for this statement -- that way the old memory may actually be reused for the new allocation".

However this only makes sense for certain place expressions (`move(local.field)` would have to be disallowed). And what does it look like for MIR assignments?

could mean "load the MiniRust `Value` from `local`, then dealloc `local` like StorageDead, then StorageLive `local2`, then store the value in there". But when would that be useful? If `local2` was already live before, there's no advantage over just doing

so supposedly this should only be used for initializing a value?

Using custom MIR, can we have concrete examples of `move` in an assignment causing behavior that Miri cannot currently explain? (Specifically, having memory reused instead of doing a `memcpy`.) If no, could we in principle entirely get rid of `move` everywhere except for function calls? Cc @bjorn3
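The high-level reading of `move` for function calls ("load the value, deallocate the source place, then allocate the new memory") can be sketched as a toy abstract machine. This is only an illustration under stated assumptions: `Machine`, `AllocId`, `move_operand`, and `demo` are hypothetical names, not Miri or MiniRust APIs, and the toy allocator always reuses a freed address, whereas the proposed semantics would merely permit reuse.

```rust
use std::collections::HashMap;

/// Identity of an allocation; never reused, even when the address is.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct AllocId(u32);

/// Toy machine: every live allocation holds one `i32` at some address.
#[derive(Default)]
pub struct Machine {
    live: HashMap<AllocId, (usize, i32)>,
    free_addrs: Vec<usize>,
    next_id: u32,
    next_addr: usize,
}

impl Machine {
    /// Allocate a fresh allocation, reusing a freed address if one exists.
    pub fn alloc(&mut self, value: i32) -> AllocId {
        let addr = match self.free_addrs.pop() {
            Some(reused) => reused,
            None => {
                let fresh = self.next_addr;
                self.next_addr += 1;
                fresh
            }
        };
        let id = AllocId(self.next_id);
        self.next_id += 1;
        self.live.insert(id, (addr, value));
        id
    }

    /// Address of a live allocation; `None` if it was deallocated.
    pub fn addr_of(&self, id: AllocId) -> Option<usize> {
        self.live.get(&id).map(|&(addr, _)| addr)
    }

    /// Load from an allocation; `None` stands in for UB (access after dealloc).
    pub fn load(&self, id: AllocId) -> Option<i32> {
        self.live.get(&id).map(|&(_, value)| value)
    }

    /// `move` operand: load the value, then deallocate the source place.
    pub fn move_operand(&mut self, id: AllocId) -> Option<i32> {
        let (addr, value) = self.live.remove(&id)?;
        self.free_addrs.push(addr);
        Some(value)
    }
}

/// Models `callee(move(caller_local))` and reports
/// (caller address, callee address, load via caller, load via callee).
pub fn demo() -> (usize, usize, Option<i32>, Option<i32>) {
    let mut m = Machine::default();
    let caller_local = m.alloc(42);
    let caller_addr = m.addr_of(caller_local).expect("just allocated");
    // Evaluate the operand first: this frees the caller's storage...
    let arg_value = m.move_operand(caller_local).expect("source is live");
    // ...so the callee's argument allocation can land on the same address.
    let callee_arg = m.alloc(arg_value);
    let callee_addr = m.addr_of(callee_arg).expect("just allocated");
    (caller_addr, callee_addr, m.load(caller_local), m.load(callee_arg))
}
```

In this toy, `demo()` yields equal addresses for the caller's local and the callee's argument, while any later access through the caller's allocation fails. That matches the two requirements discussed in the comments: the moved-out-of place must not be observable afterwards, and in-place passing should be explainable in terms of address reuse.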