New lint [`ptr_to_temporary`] #10962
r? @Alexendoo (rustbot has picked a reviewer for you, use r? to override)
☔ The latest upstream changes (presumably #10925) made this pull request unmergeable. Please resolve the merge conflicts.
I'm still not sure how the lint as written is useful. Blanket banning |
There's still a case of I don't recall exactly where we say what does and doesn't get promoted to statics, and whether that's a stable promise, so I'm not sure that this lint can be useful in that regard. If those rules ever change, this lint is either going to miss too much (preferable, but not great), or error on code that's perfectly sound. |
Huh, weird, I went with the assumption that giving a function a pointer to a temporary would cause immediate UB. Guess not.
If it's not, I see no reason why that shouldn't be linted against (as such code changing at will due to an implementation detail is likely not the intended behavior). Regardless, though, I've done some testing and I think that, even if it's UB, it's impossible to reproduce, as allocating stack memory always involves subtracting from the stack pointer; in the case of allocating a temporary, then allocating actual locals, it'll never be overwritten because the temporary will be at But returning a temporary, however, is UB and will always be reproducible, as it both frees the temporary and adds to the stack pointer, after which the caller can overwrite the pointer's contents if you're unlucky. This should be linted even if it's a static, for the reason stated above. So, I think we should isolate this to returning temporaries.
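A minimal sketch of the "returning a temporary" case described above, assuming a hypothetical helper `compute` that stands in for any non-promotable value. The names `dangling` and `owned` are illustrative only and do not come from the PR. The sketch only creates the dangling pointer and never dereferences it, since reading through it would be UB.

```rust
// Hypothetical helper standing in for any value that cannot be
// constant-promoted (a function call result is never promoted here).
fn compute() -> i32 {
    std::hint::black_box(110)
}

// The pattern under discussion: `&compute()` borrows a temporary that is
// gone once the function returns, so the returned pointer dangles.
// Creating the pointer is safe; dereferencing it would be UB.
fn dangling() -> *const i32 {
    &compute() as *const i32
}

// One sound alternative: give the value an owner that outlives the call.
fn owned() -> Box<i32> {
    Box::new(compute())
}

fn main() {
    let _p = dangling(); // deliberately never dereferenced
    let b = owned();
    let q: *const i32 = &*b; // valid for as long as `b` is alive
    assert_eq!(unsafe { *q }, 110);
}
```

Running the `dangling` variant under Miri with a dereference added is the way to confirm the UB, in line with the "whatever MIRI says" rule of thumb from the discussion.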
Can this also catch |
I'll look into that, that would be very useful. If |
Ignoring constant promotion and lifetime extension, temporaries usually live until the end of the statement, so they'll be dropped after the function has already returned
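To illustrate the statement-scoped lifetime of temporaries, here is a small sketch that records drop order in a thread-local log. The type `Noisy` and the functions `callee` and `events` are hypothetical names made up for this example. `Noisy` implements `Drop`, so it is not eligible for constant promotion.

```rust
use std::cell::RefCell;

thread_local! {
    // Event log used only to observe ordering.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Noisy(u8);

impl Drop for Noisy {
    fn drop(&mut self) {
        LOG.with(|l| l.borrow_mut().push("temporary dropped"));
    }
}

fn callee(p: *const Noisy) -> u8 {
    // The pointee is still alive here: the temporary outlives the call.
    let v = unsafe { (*p).0 };
    LOG.with(|l| l.borrow_mut().push("callee returned"));
    v
}

fn events() -> Vec<&'static str> {
    LOG.with(|l| l.borrow_mut().clear());
    // The `Noisy(7)` temporary lives until the end of this statement.
    let v = callee(&Noisy(7) as *const Noisy);
    assert_eq!(v, 7);
    LOG.with(|l| l.borrow().clone())
}

fn main() {
    println!("{:?}", events());
}
```

The log shows the callee finishing before the temporary is dropped, which is why passing such a pointer as an argument is sound while returning it is not.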
Constant promotion should be easy enough to ignore, we can throw in an One thing to be aware of if you're going for returns is block expressions like so https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9bc3a39542f9977e20653e34d50c9f64 In general let's go with whatever MIRI says as the source of truth, if it says it's UB we could lint it, if not we shouldn't |
Good point, we could probably only lint if the temporary implements
I don't think we should just ignore constant promotion, again, imo, if code depends on an implementation detail to be correct that should still be disallowed Also, this returns
I think for the case of returning pointers to temporaries, we should make it more general to "usage of this pointer may cause Undefined Behavior if it isn't promoted to a constant". Even if it isn't UB in some cases, it's definitely not something intended (nor wanted) by the programmer. (It's also worth mentioning constant promotion should not happen if it can change the behavior of the program, in this case it (understandably) can, as the program relies on it to be correct, so I don't think ignoring constant promotion is the right way to go about this.) |
Both constant promotion and lifetime extension are defined as part of the language. Even if they weren't in the reference, they would de facto be part of the language; far too much code would break because of it. |
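For reference, a short sketch of the two language features mentioned here. The function names are illustrative and not taken from the PR.

```rust
// Constant promotion: `&42` refers to a promoted constant with `'static`
// lifetime, so this compiles and the reference never dangles.
fn promoted() -> &'static i32 {
    &42
}

// Temporary lifetime extension: the `String` temporary lives as long as
// the binding `s`, not just until the end of the `let` statement.
fn extended_len() -> usize {
    let s = &String::from("temporary");
    s.len()
}

fn main() {
    assert_eq!(*promoted(), 42);
    assert_eq!(extended_len(), 9);
}
```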
So, if I understand correctly, the exact criteria for constant promotion are:
And this is entirely independent of If all of those are true, then I can see why we shouldn't lint it. It would be entirely pointless. But IMO, if even one of those isn't, it should still be linted. Also,
This was a bad way to put it. It's not an implementation detail. I more so meant it depends on something outside of the programmer's control to be correct; which, if it can change at will, is not something I would deem "correct". * Not 100% true. Have a look at what this code outputs with |
Looking at the optimised output of programs containing UB will not be that useful, because LLVM is free to optimise assuming it can't happen It can necessarily change the behaviour of the code, many things won't compile without it (which is why it can't be opt level dependent) and addresses of references can be observed at runtime |
Which is what makes it so weird since it should prove that it can happen, it's as if it optimizes out storing 110 (which is the only major difference between the two).
I meant code that would compile without it that's now promoted In this case, that would be something like
Depending on the exact address of something is a terrible idea :D Sure, it's observable at runtime, but the code has the same behavior, and if you're printing the address of something, it's gonna be practically random anyway; the OS is free to allocate the binary into memory wherever it pleases (address space layout randomization), any code that depends on that is incorrect (and honestly, should be linted in If there's an exhaustive list of everything required for constant promotion, then I'm happy to implement it that way but I still believe this should be linted, as a configuration option possibly. |
That's thinking about it from the wrong angle, by definition the compiler can assume UB does not happen. That's precisely the reason it can consider the 110 a dead store - it would be UB to read from a dangling pointer, therefore it isn't read from, therefore there's no reason to store 110 in it. Or something like that. I think a lint that suggests transforming cases where the promotion is being relied upon for safety to a separate binding would be fine, |
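One possible reading of the "separate binding" suggestion, sketched under the assumption that it means naming the storage explicitly instead of relying on promotion. The `static VALUE` and both function names are hypothetical.

```rust
// Relies on constant promotion: sound, but the `'static` storage is implicit.
fn implicit() -> *const i32 {
    &110 as *const i32
}

// The same result with the storage spelled out in a named `static`.
static VALUE: i32 = 110;

fn explicit() -> *const i32 {
    &VALUE as *const i32
}

fn main() {
    // Both pointers refer to `'static` data, so reading them is sound.
    assert_eq!(unsafe { *implicit() }, 110);
    assert_eq!(unsafe { *explicit() }, 110);
}
```

The explicit form keeps working even if the initializer later changes to something that is no longer promotable, at which point the implicit form would silently start returning a dangling pointer.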
I agree. For this lint in particular, let's go with only linting non-promoted returned pointers then have a separate restriction lint (or, of course a configuration option since they're kinda the same lint anyway). But in regards to the first paragraph, I meant that LLVM has to prove so, in this case the MIR generated is basically identical for that function with both |
I've made it only lint non-promoted temporaries now. It's a bit ugly, though. I tried first recreating I also moved it away from |
☔ The latest upstream changes (presumably #11003) made this pull request unmergeable. Please resolve the merge conflicts.
I've made it operate on the MIR now for The reason why MIR works better is mostly because we know when locals are dropped just by looking at the statements/terminator, if there's a ^ Actually, this can probably be simplified even further: If there's a drop, we can use the same logic but continue to the target basic block for MIR also has no concept of method calls, just function calls, so we can easily lint both The earlier |
Add `sym::iter_mut` + `sym::as_mut_ptr` for Clippy
We currently have `sym::iter` and `sym::iter_repeat`; this PR adds `sym::iter_mut` as it's useful for rust-lang/rust-clippy#11038 and another Clippy lint. It also adds `sym::as_mut_ptr` as it's useful for rust-lang/rust-clippy#10962.
if let ExprKind::Cast(cast_expr, _) = expr.kind
    && let ExprKind::AddrOf(BorrowKind::Ref, _, e) = cast_expr.kind
    && !is_promotable(cx, e)
We should operate on the MIR here, as after a few reborrows (? stuff like `_ = &(*_)`) there should always be a `const _` if it's promoted. Otherwise, it's temporary. This should be a good starting point.
    traverse_up_until_owned_inner(body, local_assignments, start.as_ref(), 0)
}

fn traverse_up_until_owned_inner<'tcx>(
We should use an analysis pass instead, and track where this owned data goes.
cc @Jarcho, do you have any ideas of what "owned" should be defined as? Basically, for `Drop` types, the place that is dropped. We could just use the place that is dropped for these types but a temporary that doesn't implement `Drop` won't have this, despite being freed. Checking the order of `StorageDead`s works, as the temporary cannot be dead before the pointer, but there may be some nuances to it I'm missing.
source_info.span,
body.source_scopes[source_info.scope]
    .local_data
    .clone()
Suggested change: replace `.clone()` with `.as_ref()`
// Get the final return statement if this is a return statement, or don't lint
let expr = if let ExprKind::Ret(Some(expr)) = expr.kind {
    expr
} else if let OwnerNode::Item(parent) = cx.tcx.hir().owner(cx.tcx.hir().get_parent_item(expr.hir_id))
    && let ItemKind::Fn(_, _, body) = parent.kind
    && let block = cx.tcx.hir().body(body).value
    && let ExprKind::Block(block, _) = block.kind
    && let Some(final_block_expr) = block.expr
    && final_block_expr.hir_id == expr.hir_id
{
    expr
} else {
    return false;
};
Could be a use for find_all_ret_expressions
r? @Jarcho for the MIR stuff
Linting uses of
    let x = some_vec.as_ptr();
    GLOBAL.lock().some_field = some_vec;
    return x;
This can be detected by noting
    let x = some_vec.as_ptr();
    swap(&mut GLOBAL.lock().some_field, &mut some_vec);
    return x;
This is basically the same, except
Cases where the local isn't used after the pointer is created can still be linted (e.g.
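A runnable sketch of the first snippet above, under the assumption that the point is that the pointer stays valid because the `Vec` is moved into longer-lived storage rather than dropped. The `GLOBAL` here is a simplified, hypothetical stand-in (a bare `Mutex<Vec<i32>>` instead of a struct with `some_field`), and `stash` is a made-up name.

```rust
use std::sync::Mutex;

// Hypothetical global standing in for `GLOBAL` in the comment above.
static GLOBAL: Mutex<Vec<i32>> = Mutex::new(Vec::new());

// Takes the buffer pointer, then moves the Vec into the global.
// Moving a Vec moves only its pointer/length/capacity; the heap buffer
// itself stays where it is, so `x` still points at live data.
fn stash(some_vec: Vec<i32>) -> *const i32 {
    let x = some_vec.as_ptr();
    *GLOBAL.lock().unwrap() = some_vec;
    x
}

fn main() {
    let p = stash(vec![1, 2, 3]);
    let guard = GLOBAL.lock().unwrap();
    // Same buffer address before and after the move.
    assert_eq!(p, guard.as_ptr());
    assert_eq!(unsafe { *p }, 1);
}
```

A lint that only asks "does the local go out of scope before the pointer?" would flag `stash`, which is why the move into another place has to be tracked.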
Ping @Centri3. I'd love very much to have this lint. Do you need help?
I don't have the time nor energy to work on this. I'd love to finish this but I can't remember what needed fixing or anything about this PR for that matter 😅 If you can take a look that'd be much appreciated, however I should be back soon-ish
I'm closing this according to the last comment. Thank you for the time you already put into this lint! @GuillaumeGomez, since you expressed interest in a previous comment, you're welcome to pick this PR up :) @rustbot label +S-inactive-closed -S-waiting-on-author -S-waiting-on-review
If anybody tries to continue this, it will need a major rewrite. The current way of traversing backwards is... terse, verbose, and it shouldn't be done this way. There's Since So yeah, we need a bad heuristic for this lint. Ideally there'd be more info, and maybe we can have smth like The way MIRI detects this ofc only works if it's actually used after being dropped (that meaning, an access after a |
This is a small one, but very nice as even `miri` doesn't catch this (it catches it if it's used incorrectly, but not created erroneously).
There were a couple cases of this in other tests, I checked them all and they were 100% valid so I think this is (hopefully) bug-free. Still likely needs more tests though, as always :D (I'm pretty bad at them)
Closes #10959
changelog: New lint [`ptr_to_temporary`]