The `tail_expr_drop_order` lint today gives a lot of false positives #130836

dingxiangfei2009 · 2024-09-25T13:54:17Z

TL;DR

We have implemented a HIR-based lint against changes in drop order due to #![feature(shorter_tail_lifetimes)].
Through a crater run we learnt that this approach produces a lot of false positives. We need to account for control flow and track the use of places and partial moves. How can we achieve this?

Current status

The current approach to identify temporaries in tail expression whose drop order may change is to identify local variable declarations as well as (sub)expressions of the tail expression that are of types with significant drop implementation. The lint will then simply assume that all of them will observe a change in drop order and report as such.

For example, the following code is linted against.

#![deny(tail_expr_drop_order)]
#![feature(shorter_tail_lifetimes)]

struct LoudDropper;
impl Drop for LoudDropper {
    fn drop(&mut self) {
        // This destructor should be considered significant because it is a custom destructor
        // and we will assume that the destructor can generate side effects arbitrarily so that
        // a change in drop order is visible.
        println!("loud drop");
    }
}
impl LoudDropper {
    fn get(&self) -> i32 {
        0
    }
}

fn should_lint() -> i32 {
    let x = LoudDropper;
    //  ^ this has significant drop
    x.get() + LoudDropper.get()
    //~^ ERROR: this value of type `LoudDropper` has significant drop implementation that will have a different drop order from that of Edition 2021
    //~| WARN: this changes meaning in Rust 2024
}

MIR shows that the temporary value _6 is indeed dropped later that _1 aka. x.

MIR

fn should_lint() -> i32 {
    let mut _0: i32;
    let _1: LoudDropper;
    let mut _2: i32;
    let mut _3: &LoudDropper;
    let mut _4: i32;
    let mut _5: &LoudDropper;
    let _6: LoudDropper;
    let mut _7: (i32, bool);
    scope 1 {
        debug x => const LoudDropper;
    }

    bb0: {
        _3 = &_1;
        _2 = LoudDropper::get(move _3) -> [return: bb1, unwind: bb8];
    }

    bb1: {
        _5 = &_6;
        _4 = LoudDropper::get(move _5) -> [return: bb2, unwind: bb6];
    }

    bb2: {
        _7 = AddWithOverflow(_2, _4);
        assert(!move (_7.1: bool), "attempt to compute `{} + {}`, which would overflow", move _2, move _4) -> [success: bb3, unwind: bb6];
    }

    bb3: {
        _0 = move (_7.0: i32);
        drop(_1) -> [return: bb4, unwind: bb7];
    }

    bb4: {
        drop(_6) -> [return: bb5, unwind continue];
    }

    bb5: {
        return;
    }

    bb6 (cleanup): {
        drop(_1) -> [return: bb7, unwind terminate(cleanup)];
    }

    bb7 (cleanup): {
        drop(_6) -> [return: bb9, unwind terminate(cleanup)];
    }

    bb8 (cleanup): {
        drop(_1) -> [return: bb9, unwind terminate(cleanup)];
    }

    bb9 (cleanup): {
        resume;
    }
}

Drawbacks

This lint is now too naive. For instance, the following example clearly will not have the drop order changed. The analysis does not consider the fact that temporary values can be moved and the drop on the locals or places are effectively no-ops.

fn should_not_lint_when_consumed() -> (LoudDropper, i32) {
    let x = LoudDropper;
    // Should not lint
    (LoudDropper, x.get())
//   ^~~~~~~~~~~ this is moved
}

This is a fairly common case. From the last crater run we learnt it from the fact that serde::Deserialize macro keeps the partially deserialized data as locals and later moves them into the return value. The lint was triggered even though those values are not dropped at the function return point.

New attempts

Therefore, we need to improve the lint so that the analysis is more precise by factoring in the control flow and the kind of use of various MIR places, especially partial moves. Solving this issue perfectly requires one to inspect MIR instead, in conjunction with MaybeUninitializedPlaces data flow. Identifying which statements in MIR is today accurate enough, but we can't say the same for drop terminators. In the earliest attempts of #129864, span information in the terminators turned out to be pointing outside the tail expression span even though they properly fulfill the drop schedules for the tail expression. This gave us even more false positives and negatives.

As an example, we saw that the drop happened at the last closing bracket of this code from the MIR source_info on the drop terminators.

fn should_lint() -> i32 {
    let x = LoudDropper;
//  v~~~~~~~~~~~~~~~~~~~~~~~~~~ the tail expression span is this
    x.get() + LoudDropper.get()
} // <-- both `x` and `LoudDropper` are dropped here
// ... even though they belong to different scopes
// or they are dropped for different "reasons"

One possible way to address this incorrect information is to extend MIR syntax so that drop terminators carry information on which exactly the HIR region scope the drops are scheduled for. With it we can precisely differentiate drops for the proper tail expression, the drops for the locals in the body and, in particular, Edition 2021 temporary values that dropped not as part of the tail expression drop schedule. Those are the targets this lint is exactly looking for.

Drawbacks

Due to copying and retention of this scope information in MIRs, there is now regression on compiler performance.

Questions

Are there better ways to track the scope that a drop in a built MIR is scheduled for without so much overhead?

The text was updated successfully, but these errors were encountered:

dingxiangfei2009 · 2024-09-25T13:54:32Z

@rustbot labels +T-lang +I-lang-nominated +C-discussion

dingxiangfei2009 · 2024-09-26T12:28:23Z

Report from crater run #129604

From this crater run, we found these inprecise diagnoses.

We should not lint against upvars

In this example the Arc<Mutex<_>>s and channels are moved into the closure. Their lifetime is bound to the closure/coroutine bodies, and so their drop order is not affected by the edition change. We should ignore them.

We should not lint against values moved into return place or consumed

In this example, droppy values are moved into the return value. Their destructor would not even be called.

There exists types that we don't care

It turns out types like std::io::Error that is considered equipped with a destructor significant enough, but are they semantically important?

dingxiangfei2009 · 2024-09-26T12:29:50Z

I also devised a "recursive lock" (???) example to demonstrate the significance of drop order as the first try.

use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering::{Acquire, Release};

static SEMAPHORE: AtomicUsize = AtomicUsize::new(0);

struct S(usize);
impl S {
    fn new() -> Self {
        Self(SEMAPHORE.fetch_add(1, Acquire))
    }
    fn get(&self) -> usize {
        self.0
    }
}
impl Drop for S {
    fn drop(&mut self) {
        assert_eq!(self.0 + 1, SEMAPHORE.fetch_sub(1, Release));
    }
}

fn semaphore_test() -> usize {
    let s1 = S::new();
    let s2 = S::new();
    S::new().get()
}

fn main() {
    let x = semaphore_test();
    println!("{x}")
}

traviscross · 2024-10-02T09:42:57Z

@rustbot labels -I-lang-nominated

We talked about this last week, resulting in further discussions that resulted in an implementation with fewer false positives.

Please of course renominate if there are any further questions there.

…t-2, r=<try> Reduce false positives of tail-expr-drop-order from consumed values (attempt #2) r? `@nikomatsakis` Related to rust-lang#129864 but not replacing, yet. Related to rust-lang#130836. This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.

…t-2, r=<try> Reduce false positives of tail-expr-drop-order from consumed values (attempt #2) r? `@nikomatsakis` Tracked by rust-lang#123739. Related to rust-lang#129864 but not replacing, yet. Related to rust-lang#130836. This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.

rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 25, 2024

rustbot added C-discussion Category: Discussion or questions that doesn't represent real issues. I-lang-nominated Nominated for discussion during a lang team meeting. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Sep 25, 2024

traviscross removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 25, 2024

jieyouxu added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Sep 25, 2024

rustbot removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Oct 2, 2024

dingxiangfei2009 mentioned this issue Oct 6, 2024

Reduce false positives of tail-expr-drop-order from consumed values (attempt #2) #131326

Open

dingxiangfei2009 mentioned this issue Oct 9, 2024

Stabilize shorter_tail_lifetime #131445

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The `tail_expr_drop_order` lint today gives a lot of false positives #130836

The `tail_expr_drop_order` lint today gives a lot of false positives #130836

dingxiangfei2009 commented Sep 25, 2024 •

edited

Loading

dingxiangfei2009 commented Sep 25, 2024

dingxiangfei2009 commented Sep 26, 2024 •

edited

Loading

dingxiangfei2009 commented Sep 26, 2024

traviscross commented Oct 2, 2024

The tail_expr_drop_order lint today gives a lot of false positives #130836

The tail_expr_drop_order lint today gives a lot of false positives #130836

Comments

dingxiangfei2009 commented Sep 25, 2024 • edited Loading

TL;DR

Current status

Drawbacks

New attempts

Drawbacks

Questions

dingxiangfei2009 commented Sep 25, 2024

dingxiangfei2009 commented Sep 26, 2024 • edited Loading

Report from crater run #129604

We should not lint against upvars

We should not lint against values moved into return place or consumed

There exists types that we don't care

dingxiangfei2009 commented Sep 26, 2024

traviscross commented Oct 2, 2024

The `tail_expr_drop_order` lint today gives a lot of false positives #130836

The `tail_expr_drop_order` lint today gives a lot of false positives #130836

dingxiangfei2009 commented Sep 25, 2024 •

edited

Loading

dingxiangfei2009 commented Sep 26, 2024 •

edited

Loading