-
Notifications
You must be signed in to change notification settings - Fork 352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Environ shim #1147
Environ shim #1147
Conversation
if ecx.machine.communicate { | ||
// Put each environment variable pointer in `EnvVars`, collect pointers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Put each environment variable pointer in `EnvVars`, collect pointers. | |
// Put each environment variable pointer in `EnvVars`, collect pointers. |
src/machine.rs
Outdated
@@ -280,6 +287,9 @@ impl<'mir, 'tcx> Machine<'mir, 'tcx> for Evaluator<'tcx> { | |||
let data = vec![0; size.bytes() as usize]; | |||
Allocation::from_bytes(&data, tcx.data_layout.pointer_align.abi) | |||
} | |||
"environ" => { | |||
memory_extra.environ.as_ref().unwrap().clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this should not be called twice, you can use
memory_extra.environ.as_ref().unwrap().clone() | |
memory_extra.environ.take().unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm but then what should happen if you set another variable? I haven't thought about that part yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ that is handled by miri. You only need to provide the initial value here, afterwards you have a perfectly viable mutable allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean? shouldn't we need to update environ
every time (un)setenv
is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gotta run, so just an excercise and no explanation: go up the call stack from here (you will have to go up a few call frames into the miri engine in rustc) and check out what happens with your allocation and why this code is even triggered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also see https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/interpret/trait.Machine.html#tymethod.find_foreign_static: this function is only called the first time a foreign static is accessed.
@oli-obk I wonder... would it make sense to expand/change the find_foreign_static
type so that it can/must return an AllocId
? Having to actually return the allocation here is a bit annoying for environ
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would mean we'd end up with two AllocId
s for the same allocation. I guess that happens anyway for immutable statics and the initial value of mutable statics, unsure. Maybe we should rather pass in the AllocId
and require find_foreign_static
to put the allocation into Memory
at that alloc id. We'd need an allocate_with(AllocId, ...)
function, but then everything else could just work as intended without creating confusing situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... we could pull the same trick as with "normal" statics, and treat one AllocId
as "hidden"? We have two AllocId
there and it's working...
I don't think allocate_with(alloc_id)
will work; we cannot change what that ID maps to in the global tcx store as that's immutable -- right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All non interned allocations need to be in the memory's alloc_map, which is the first thing that is always queried. If there's an entry there, no further code is looked at. Theoretically we could proactively insert that allocation at memory creation time instead of the current "lazy" (it's just lazily inserting, the creation is eager) scheme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem with this (using Option::take
) is that we would need &mut MemoryExtra
as a parameter and that would require changing a lot of code. Maybe we can add interior mutability to that field instead?
let place = ecx.mplace_field(environ_place, idx as u64).unwrap(); | ||
ecx.write_scalar(var, place.into()).unwrap(); | ||
} | ||
ecx.memory.mark_immutable(environ_place.ptr.assert_ptr().alloc_id).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this prevent later uses of set_env
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the question is, do we try to be smart and recycle those lists, or do we just allocate a new one on each change?
But either way, we'll have to construct such a list multiple times -- so it probably should have its own function.
ecx.memory.mark_immutable(environ_place.ptr.assert_ptr().alloc_id).unwrap(); | ||
// A pointer to that place corresponds to the `environ` static. | ||
let environ_ptr = ecx.force_ptr(environ_place.ptr).unwrap(); | ||
let environ_alloc = ecx.memory.get_raw(environ_ptr.alloc_id).unwrap().clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is suboptimal as the allocation will be left in Memory
forever even if it's never going to get used. I'm not sure if this is feasible, but you may be able to create the alloc with Allocation::undef(size, align)
and then write to it directly instead of going through all the extra accessors above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the allocation last forever makes sense to me, it's a static after all.
We'll also need to update this allocation when the environment changes (whenever environ_place
above gets reallocated).
} | ||
|
||
impl MemoryExtra { | ||
pub fn new(rng: StdRng, validate: bool, tracked_pointer_tag: Option<PtrId>) -> Self { | ||
pub fn new( | ||
rng: StdRng, validate: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's some odd formatting... either all arguments should be on one line, or they should all have their own line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I have rustfmt disabled to avoid huge diffs. I'll do the formatting when I have a more concise PR
There are two allocations whose lifecycle we'll have to manage:
|
I can implement that quickly. I'll add an
I got a little lost so let me see if I understood correctly: you want to preserve the same On the other hand what's blocking this from compiling is that the resulting allocation will have tags and extra but |
That sounds very wrong.
The The annoying and tricky part is that with the current structure, we have a phase separation: there's "before @oli-obk is there any way we can determine the |
yes, you call |
oh wait, that is naming sorry I misread.. No this is something miri has to do on its own. |
I rewrote |
Note: when inserting long text, please make it foldable with something like
"long text" can be wrapped in
I don't follow, what is there to initialize? It's all NULL, isn't it? @oli-obk I don't think our current scheme can work unless we have a way to access an extern static by link_name (and all alternatvies I can come up with are awful). So it'd be worth finding this out for sure. To use Also what happens when there are multiple |
Yes but the static needs to be tagged properly. |
We need some sort of name to alloc ID map in the memory extradata, I see no way around that. find_foreign_static could fill that in lazily |
But even with something like that, Miri would treat two different imports of the same extern static as having different addresses... Okay so the plan then would be for |
Hmmm... ok now I see what you mean. I guess we could make While we could create a scheme similar to lang items, that feels a lot like overkill |
I don't see how that would even help. We need multiple entries in memory's Or we say that we just don't support programs that import the same extern static at multiple places. But even then we need to rework some things -- as @christianpoveda learned the hard way, the current setup means that the initial value of an "extern static" cannot contain pointers, as it is untagged and gets retrofitted with default tags outside of the Or we need an indirection that "canonicalizes" the
|
@oli-obk gave 👍, so I guess we can go ahead with this plan. @christianpoveda do you want to do that? If you have questions about the proposal just ask. :) |
I have a lot of questions but I'll do them when I have played with this a little bit. Sounds good to me :P |
@christianpoveda I am going to implement what I proposed above; I got some time this week-end. |
Once rust-lang/rust#69408 lands, this should be unblocked. :) |
Miri: let machine canonicalize AllocIDs This implements the rustc side of the plan I laid out [here](rust-lang/miri#1147 (comment)). Miri PR: rust-lang/miri#1190
Miri: let machine canonicalize AllocIDs This implements the rustc side of the plan I laid out [here](rust-lang/miri#1147 (comment)). Miri PR: rust-lang/miri#1190
@christianpoveda okay, the groundwork is laid. You should now be able to allocate |
Ok let me rebase this thing to get it working Edit: Actually it might be easier to do a new PR, I'll be back soon. |
Environ shim Remake of #1147. There are three main problems with this: 1. For some reason `update_environ` is not updating `environ` when `setenv` or `unsetenv` are called. Even then it works during initialization. 2. I am not deallocating the old array with the variables in `update_environ`. 3. I had to store the `environ` place into `MemoryExtra` as a field to update it. I was thinking about changing `extern_statics` to store places instead of `AllocID`s to avoid this. @RalfJung
The Rust side of this is at rust-lang/rust#68259
Makes progress towards #756
r? @RalfJung