Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The tail_expr_drop_order lint today gives a lot of false positives #130836

Open
dingxiangfei2009 opened this issue Sep 25, 2024 · 4 comments
Open
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@dingxiangfei2009
Copy link
Contributor

dingxiangfei2009 commented Sep 25, 2024

cc @traviscross

TL;DR

We have implemented a HIR-based lint against changes in drop order due to #![feature(shorter_tail_lifetimes)].
Through a crater run we learnt that this approach produces a lot of false positives. We need to account for control flow and track the use of places and partial moves. How can we achieve this?

Current status

The current approach to identify temporaries in tail expression whose drop order may change is to identify local variable declarations as well as (sub)expressions of the tail expression that are of types with significant drop implementation. The lint will then simply assume that all of them will observe a change in drop order and report as such.

For example, the following code is linted against.

#![deny(tail_expr_drop_order)]
#![feature(shorter_tail_lifetimes)]

struct LoudDropper;
impl Drop for LoudDropper {
    fn drop(&mut self) {
        // This destructor should be considered significant because it is a custom destructor
        // and we will assume that the destructor can generate side effects arbitrarily so that
        // a change in drop order is visible.
        println!("loud drop");
    }
}
impl LoudDropper {
    fn get(&self) -> i32 {
        0
    }
}

fn should_lint() -> i32 {
    let x = LoudDropper;
    //  ^ this has significant drop
    x.get() + LoudDropper.get()
    //~^ ERROR: this value of type `LoudDropper` has significant drop implementation that will have a different drop order from that of Edition 2021
    //~| WARN: this changes meaning in Rust 2024
}

MIR shows that the temporary value _6 is indeed dropped later that _1 aka. x.

MIR
fn should_lint() -> i32 {
    let mut _0: i32;
    let _1: LoudDropper;
    let mut _2: i32;
    let mut _3: &LoudDropper;
    let mut _4: i32;
    let mut _5: &LoudDropper;
    let _6: LoudDropper;
    let mut _7: (i32, bool);
    scope 1 {
        debug x => const LoudDropper;
    }

    bb0: {
        _3 = &_1;
        _2 = LoudDropper::get(move _3) -> [return: bb1, unwind: bb8];
    }

    bb1: {
        _5 = &_6;
        _4 = LoudDropper::get(move _5) -> [return: bb2, unwind: bb6];
    }

    bb2: {
        _7 = AddWithOverflow(_2, _4);
        assert(!move (_7.1: bool), "attempt to compute `{} + {}`, which would overflow", move _2, move _4) -> [success: bb3, unwind: bb6];
    }

    bb3: {
        _0 = move (_7.0: i32);
        drop(_1) -> [return: bb4, unwind: bb7];
    }

    bb4: {
        drop(_6) -> [return: bb5, unwind continue];
    }

    bb5: {
        return;
    }

    bb6 (cleanup): {
        drop(_1) -> [return: bb7, unwind terminate(cleanup)];
    }

    bb7 (cleanup): {
        drop(_6) -> [return: bb9, unwind terminate(cleanup)];
    }

    bb8 (cleanup): {
        drop(_1) -> [return: bb9, unwind terminate(cleanup)];
    }

    bb9 (cleanup): {
        resume;
    }
}

Drawbacks

This lint is now too naive. For instance, the following example clearly will not have the drop order changed. The analysis does not consider the fact that temporary values can be moved and the drop on the locals or places are effectively no-ops.

fn should_not_lint_when_consumed() -> (LoudDropper, i32) {
    let x = LoudDropper;
    // Should not lint
    (LoudDropper, x.get())
//   ^~~~~~~~~~~ this is moved
}

This is a fairly common case. From the last crater run we learnt it from the fact that serde::Deserialize macro keeps the partially deserialized data as locals and later moves them into the return value. The lint was triggered even though those values are not dropped at the function return point.

New attempts

Therefore, we need to improve the lint so that the analysis is more precise by factoring in the control flow and the kind of use of various MIR places, especially partial moves. Solving this issue perfectly requires one to inspect MIR instead, in conjunction with MaybeUninitializedPlaces data flow. Identifying which statements in MIR is today accurate enough, but we can't say the same for drop terminators. In the earliest attempts of #129864, span information in the terminators turned out to be pointing outside the tail expression span even though they properly fulfill the drop schedules for the tail expression. This gave us even more false positives and negatives.

As an example, we saw that the drop happened at the last closing bracket of this code from the MIR source_info on the drop terminators.

fn should_lint() -> i32 {
    let x = LoudDropper;
//  v~~~~~~~~~~~~~~~~~~~~~~~~~~ the tail expression span is this
    x.get() + LoudDropper.get()
} // <-- both `x` and `LoudDropper` are dropped here
// ... even though they belong to different scopes
// or they are dropped for different "reasons"

One possible way to address this incorrect information is to extend MIR syntax so that drop terminators carry information on which exactly the HIR region scope the drops are scheduled for. With it we can precisely differentiate drops for the proper tail expression, the drops for the locals in the body and, in particular, Edition 2021 temporary values that dropped not as part of the tail expression drop schedule. Those are the targets this lint is exactly looking for.

Drawbacks

Due to copying and retention of this scope information in MIRs, there is now regression on compiler performance.

Questions

  • Are there better ways to track the scope that a drop in a built MIR is scheduled for without so much overhead?
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 25, 2024
@dingxiangfei2009
Copy link
Contributor Author

@rustbot labels +T-lang +I-lang-nominated +C-discussion

@rustbot rustbot added C-discussion Category: Discussion or questions that doesn't represent real issues. I-lang-nominated Nominated for discussion during a lang team meeting. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Sep 25, 2024
@traviscross traviscross removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 25, 2024
@jieyouxu jieyouxu added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Sep 25, 2024
@dingxiangfei2009
Copy link
Contributor Author

dingxiangfei2009 commented Sep 26, 2024

Report from crater run #129604

From this crater run, we found these inprecise diagnoses.

We should not lint against upvars

In this example the Arc<Mutex<_>>s and channels are moved into the closure. Their lifetime is bound to the closure/coroutine bodies, and so their drop order is not affected by the edition change. We should ignore them.

We should not lint against values moved into return place or consumed

In this example, droppy values are moved into the return value. Their destructor would not even be called.

There exists types that we don't care

It turns out types like std::io::Error that is considered equipped with a destructor significant enough, but are they semantically important?

@dingxiangfei2009
Copy link
Contributor Author

I also devised a "recursive lock" (???) example to demonstrate the significance of drop order as the first try.

use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering::{Acquire, Release};

static SEMAPHORE: AtomicUsize = AtomicUsize::new(0);

struct S(usize);
impl S {
    fn new() -> Self {
        Self(SEMAPHORE.fetch_add(1, Acquire))
    }
    fn get(&self) -> usize {
        self.0
    }
}
impl Drop for S {
    fn drop(&mut self) {
        assert_eq!(self.0 + 1, SEMAPHORE.fetch_sub(1, Release));
    }
}

fn semaphore_test() -> usize {
    let s1 = S::new();
    let s2 = S::new();
    S::new().get()
}

fn main() {
    let x = semaphore_test();
    println!("{x}")
}

@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated

We talked about this last week, resulting in further discussions that resulted in an implementation with fewer false positives.

Please of course renominate if there are any further questions there.

@rustbot rustbot removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Oct 2, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 7, 2024
…t-2, r=<try>

Reduce false positives of tail-expr-drop-order from consumed values (attempt #2)

r? `@nikomatsakis`

Related to rust-lang#129864 but not replacing, yet.

Related to rust-lang#130836.

This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 7, 2024
…t-2, r=<try>

Reduce false positives of tail-expr-drop-order from consumed values (attempt #2)

r? `@nikomatsakis`

Related to rust-lang#129864 but not replacing, yet.

Related to rust-lang#130836.

This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 8, 2024
…t-2, r=<try>

Reduce false positives of tail-expr-drop-order from consumed values (attempt #2)

r? `@nikomatsakis`

Tracked by rust-lang#123739.

Related to rust-lang#129864 but not replacing, yet.

Related to rust-lang#130836.

This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 12, 2024
…t-2, r=<try>

Reduce false positives of tail-expr-drop-order from consumed values (attempt #2)

r? `@nikomatsakis`

Tracked by rust-lang#123739.

Related to rust-lang#129864 but not replacing, yet.

Related to rust-lang#130836.

This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants