Skip to content

Commit

Permalink
Auto merge of #12126 - teor2345:patch-1, r=llogiq
Browse files Browse the repository at this point in the history
Fix sign-handling bugs and false negatives in `cast_sign_loss`

**Note: anyone should feel free to move this PR forward, I might not see notifications from reviewers.**

changelog: [`cast_sign_loss`]: Fix sign-handling bugs and false negatives

This PR fixes some arithmetic bugs and false negatives in PR #11883 (and maybe earlier PRs).
Cc `@J-ZhengLi`

I haven't updated the tests yet. I was hoping for some initial feedback before adding tests to cover the cases listed below.

Here are the issues I've attempted to fix:

#### `abs()` can return a negative value in release builds

Example:
```rust
i32::MIN.abs()
```
https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=022d200f9ef6ee72f629c0c9c1af11b8

Docs: https://doc.rust-lang.org/std/primitive.i32.html#method.abs

Other overflows that produce negative values could cause false negatives (and underflows could produce false positives), but they're harder to detect.

#### Values with uncertain signs can be positive or negative

Any number of values with uncertain signs cause the whole expression to have an uncertain sign, because an uncertain sign can be positive or negative.

Example (from UI tests):
```rust
fn main() {
    foo(a: i32, b: i32, c: i32) -> u32 {
        (a * b * c * c) as u32
        //~^ ERROR: casting `i32` to `u32` may lose the sign of the value
    }

    println!("{}", foo(1, -1, 1));
}
```
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=165d2e2676ee8343b1b9fe60db32aadd

#### Handle `expect()` the same way as `unwrap()`

Since we're ignoring `unwrap()` we might as well do the same with `expect()`.

This doesn't seem to have tests but I'm happy to add some like `Some(existing_test).unwrap() as u32`.

#### A negative base to an odd exponent is guaranteed to be negative

An integer `pow()`'s sign is only uncertain when its operants are uncertain. (Ignoring overflow.)

Example:
```rust
((-2_i32).pow(3) * -2) as u32
```

This offsets some of the false positives created by one or more uncertain signs producing an uncertain sign. (Rather than just an odd number of uncertain signs.)

#### Both sides of a multiply or divide should be peeled recursively

I'm not sure why the lhs was peeled recursively, and the rhs was left intact. But the sign of any sequence of multiplies and divides is determined by the signs of its operands. (Ignoring overflow.)

I'm not sure what to use as an example here, because most expressions I want to use are const-evaluable.

But if `p()` is [a non-const function that returns a positive value](https://doc.rust-lang.org/std/primitive.i32.html#method.isqrt), and if the lint handles unary negation, these should all lint:
```rust
fn peel_all(x: i32) {
    (-p(x) * -p(x) * -p(x)) as u32;
    ((-p(x) * -p(x)) * -p(x)) as u32;
    (-p(x) * (-p(x) * -p(x))) as u32;
}
```

#### The right hand side of a Rem doesn't change the sign

Unlike Mul and Div,
> Given remainder = dividend % divisor, the remainder will have the same sign as the dividend.
https://doc.rust-lang.org/reference/expressions/operator-expr.html#arithmetic-and-logical-binary-operators

I'm not sure what to use as an example here, because most expressions I want to use are const-evaluable.

But if `p()` is [a non-const function that returns a positive value](https://doc.rust-lang.org/std/primitive.i32.html#method.isqrt), and if the lint handles unary negation, only the first six expressions should lint.

The expressions that start with a constant should lint (or not lint) regardless of whether the lint supports `p()` or unary negation, because only the dividend's sign matters.

Example:
```rust
fn rem_lhs(x: i32) {
    (-p(x) % -1) as u32;
    (-p(x) % 1) as u32;
    (-1 % -p(x)) as u32;
    (-1 % p(x)) as u32;
    (-1 % -x) as u32;
    (-1 % x) as u32;
    // These shouldn't lint:
    (p(x) % -1) as u32;
    (p(x) % 1) as u32;
    (1 % -p(x)) as u32;
    (1 % p(x)) as u32;
    (1 % -x) as u32;
    (1 % x) as u32;
}
```

#### There's no need to bail on other expressions

When peeling, any other operators or expressions can be left intact and sent to the constant evaluator.

If these expressions can be evaluated, this offsets some of the false positives created by one or more uncertain signs producing an uncertain sign. If not, they end up marked as having uncertain sign.
  • Loading branch information
bors committed Feb 27, 2024
2 parents fb06081 + 1e3c55e commit e33cba5
Show file tree
Hide file tree
Showing 3 changed files with 503 additions and 144 deletions.
310 changes: 246 additions & 64 deletions clippy_lints/src/casts/cast_sign_loss.rs
Original file line number Diff line number Diff line change
@@ -1,15 +1,47 @@
use std::convert::Infallible;
use std::ops::ControlFlow;

use clippy_utils::consts::{constant, Constant};
use clippy_utils::diagnostics::span_lint;
use clippy_utils::{clip, method_chain_args, sext};
use clippy_utils::visitors::{for_each_expr, Descend};
use clippy_utils::{method_chain_args, sext};
use rustc_hir::{BinOpKind, Expr, ExprKind};
use rustc_lint::LateContext;
use rustc_middle::ty::{self, Ty, UintTy};
use rustc_middle::ty::{self, Ty};

use super::CAST_SIGN_LOSS;

const METHODS_RET_POSITIVE: &[&str] = &["abs", "checked_abs", "rem_euclid", "checked_rem_euclid"];
/// A list of methods that can never return a negative value.
/// Includes methods that panic rather than returning a negative value.
///
/// Methods that can overflow and return a negative value must not be included in this list,
/// because casting their return values can still result in sign loss.
const METHODS_RET_POSITIVE: &[&str] = &[
"checked_abs",
"saturating_abs",
"isqrt",
"checked_isqrt",
"rem_euclid",
"checked_rem_euclid",
"wrapping_rem_euclid",
];

/// A list of methods that act like `pow()`. See `pow_call_result_sign()` for details.
///
/// Methods that can overflow and return a negative value must not be included in this list,
/// because casting their return values can still result in sign loss.
const METHODS_POW: &[&str] = &["pow", "saturating_pow", "checked_pow"];

pub(super) fn check(cx: &LateContext<'_>, expr: &Expr<'_>, cast_op: &Expr<'_>, cast_from: Ty<'_>, cast_to: Ty<'_>) {
/// A list of methods that act like `unwrap()`, and don't change the sign of the inner value.
const METHODS_UNWRAP: &[&str] = &["unwrap", "unwrap_unchecked", "expect", "into_ok"];

pub(super) fn check<'cx>(
cx: &LateContext<'cx>,
expr: &Expr<'_>,
cast_op: &Expr<'_>,
cast_from: Ty<'cx>,
cast_to: Ty<'_>,
) {
if should_lint(cx, cast_op, cast_from, cast_to) {
span_lint(
cx,
Expand All @@ -20,35 +52,27 @@ pub(super) fn check(cx: &LateContext<'_>, expr: &Expr<'_>, cast_op: &Expr<'_>, c
}
}

fn should_lint(cx: &LateContext<'_>, cast_op: &Expr<'_>, cast_from: Ty<'_>, cast_to: Ty<'_>) -> bool {
fn should_lint<'cx>(cx: &LateContext<'cx>, cast_op: &Expr<'_>, cast_from: Ty<'cx>, cast_to: Ty<'_>) -> bool {
match (cast_from.is_integral(), cast_to.is_integral()) {
(true, true) => {
if !cast_from.is_signed() || cast_to.is_signed() {
return false;
}

// Don't lint if `cast_op` is known to be positive.
// Don't lint if `cast_op` is known to be positive, ignoring overflow.
if let Sign::ZeroOrPositive = expr_sign(cx, cast_op, cast_from) {
return false;
}

let (mut uncertain_count, mut negative_count) = (0, 0);
// Peel off possible binary expressions, e.g. x * x * y => [x, x, y]
let Some(exprs) = exprs_with_selected_binop_peeled(cast_op) else {
// Assume cast sign lose if we cannot determine the sign of `cast_op`
return true;
};
for expr in exprs {
let ty = cx.typeck_results().expr_ty(expr);
match expr_sign(cx, expr, ty) {
Sign::Negative => negative_count += 1,
Sign::Uncertain => uncertain_count += 1,
Sign::ZeroOrPositive => (),
};
if let Sign::ZeroOrPositive = expr_muldiv_sign(cx, cast_op) {
return false;
}

if let Sign::ZeroOrPositive = expr_add_sign(cx, cast_op) {
return false;
}

// Lint if there are odd number of uncertain or negative results
uncertain_count % 2 == 1 || negative_count % 2 == 1
true
},

(false, true) => !cast_to.is_signed(),
Expand All @@ -57,7 +81,13 @@ fn should_lint(cx: &LateContext<'_>, cast_op: &Expr<'_>, cast_from: Ty<'_>, cast
}
}

fn get_const_int_eval(cx: &LateContext<'_>, expr: &Expr<'_>, ty: Ty<'_>) -> Option<i128> {
fn get_const_signed_int_eval<'cx>(
cx: &LateContext<'cx>,
expr: &Expr<'_>,
ty: impl Into<Option<Ty<'cx>>>,
) -> Option<i128> {
let ty = ty.into().unwrap_or_else(|| cx.typeck_results().expr_ty(expr));

if let Constant::Int(n) = constant(cx, cx.typeck_results(), expr)?
&& let ty::Int(ity) = *ty.kind()
{
Expand All @@ -66,29 +96,52 @@ fn get_const_int_eval(cx: &LateContext<'_>, expr: &Expr<'_>, ty: Ty<'_>) -> Opti
None
}

fn get_const_unsigned_int_eval<'cx>(
cx: &LateContext<'cx>,
expr: &Expr<'_>,
ty: impl Into<Option<Ty<'cx>>>,
) -> Option<u128> {
let ty = ty.into().unwrap_or_else(|| cx.typeck_results().expr_ty(expr));

if let Constant::Int(n) = constant(cx, cx.typeck_results(), expr)?
&& let ty::Uint(_ity) = *ty.kind()
{
return Some(n);
}
None
}

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum Sign {
ZeroOrPositive,
Negative,
Uncertain,
}

fn expr_sign(cx: &LateContext<'_>, expr: &Expr<'_>, ty: Ty<'_>) -> Sign {
fn expr_sign<'cx>(cx: &LateContext<'cx>, expr: &Expr<'_>, ty: impl Into<Option<Ty<'cx>>>) -> Sign {
// Try evaluate this expr first to see if it's positive
if let Some(val) = get_const_int_eval(cx, expr, ty) {
if let Some(val) = get_const_signed_int_eval(cx, expr, ty) {
return if val >= 0 { Sign::ZeroOrPositive } else { Sign::Negative };
}
if let Some(_val) = get_const_unsigned_int_eval(cx, expr, None) {
return Sign::ZeroOrPositive;
}

// Calling on methods that always return non-negative values.
if let ExprKind::MethodCall(path, caller, args, ..) = expr.kind {
let mut method_name = path.ident.name.as_str();

if method_name == "unwrap"
&& let Some(arglist) = method_chain_args(expr, &["unwrap"])
// Peel unwrap(), expect(), etc.
while let Some(&found_name) = METHODS_UNWRAP.iter().find(|&name| &method_name == name)
&& let Some(arglist) = method_chain_args(expr, &[found_name])
&& let ExprKind::MethodCall(inner_path, ..) = &arglist[0].0.kind
{
// The original type has changed, but we can't use `ty` here anyway, because it has been
// moved.
method_name = inner_path.ident.name.as_str();
}

if method_name == "pow"
if METHODS_POW.iter().any(|&name| method_name == name)
&& let [arg] = args
{
return pow_call_result_sign(cx, caller, arg);
Expand All @@ -100,53 +153,182 @@ fn expr_sign(cx: &LateContext<'_>, expr: &Expr<'_>, ty: Ty<'_>) -> Sign {
Sign::Uncertain
}

/// Return the sign of the `pow` call's result.
/// Return the sign of the `pow` call's result, ignoring overflow.
///
/// If the caller is a positive number, the result is always positive,
/// If the `power_of` is a even number, the result is always positive as well,
/// Otherwise a [`Sign::Uncertain`] will be returned.
fn pow_call_result_sign(cx: &LateContext<'_>, caller: &Expr<'_>, power_of: &Expr<'_>) -> Sign {
let caller_ty = cx.typeck_results().expr_ty(caller);
if let Some(caller_val) = get_const_int_eval(cx, caller, caller_ty)
&& caller_val >= 0
{
return Sign::ZeroOrPositive;
/// If the base is positive, the result is always positive.
/// If the exponent is a even number, the result is always positive,
/// Otherwise, if the base is negative, and the exponent is an odd number, the result is always
/// negative.
///
/// Otherwise, returns [`Sign::Uncertain`].
fn pow_call_result_sign(cx: &LateContext<'_>, base: &Expr<'_>, exponent: &Expr<'_>) -> Sign {
let base_sign = expr_sign(cx, base, None);

// Rust's integer pow() functions take an unsigned exponent.
let exponent_val = get_const_unsigned_int_eval(cx, exponent, None);
let exponent_is_even = exponent_val.map(|val| val % 2 == 0);

match (base_sign, exponent_is_even) {
// Non-negative bases always return non-negative results, ignoring overflow.
(Sign::ZeroOrPositive, _) |
// Any base raised to an even exponent is non-negative.
// These both hold even if we don't know the value of the base.
(_, Some(true))
=> Sign::ZeroOrPositive,

// A negative base raised to an odd exponent is non-negative.
(Sign::Negative, Some(false)) => Sign::Negative,

// Negative/unknown base to an unknown exponent, or unknown base to an odd exponent.
// Could be negative or positive depending on the actual values.
(Sign::Negative | Sign::Uncertain, None) |
(Sign::Uncertain, Some(false)) => Sign::Uncertain,
}
}

if let Some(Constant::Int(n)) = constant(cx, cx.typeck_results(), power_of)
&& clip(cx.tcx, n, UintTy::U32) % 2 == 0
{
return Sign::ZeroOrPositive;
/// Peels binary operators such as [`BinOpKind::Mul`] or [`BinOpKind::Rem`],
/// where the result could always be positive. See [`exprs_with_muldiv_binop_peeled()`] for details.
///
/// Returns the sign of the list of peeled expressions.
fn expr_muldiv_sign(cx: &LateContext<'_>, expr: &Expr<'_>) -> Sign {
let mut negative_count = 0;

// Peel off possible binary expressions, for example:
// x * x / y => [x, x, y]
// a % b => [a]
let exprs = exprs_with_muldiv_binop_peeled(expr);
for expr in exprs {
match expr_sign(cx, expr, None) {
Sign::Negative => negative_count += 1,
// A mul/div is:
// - uncertain if there are any uncertain values (because they could be negative or positive),
Sign::Uncertain => return Sign::Uncertain,
Sign::ZeroOrPositive => (),
};
}

Sign::Uncertain
// A mul/div is:
// - negative if there are an odd number of negative values,
// - positive or zero otherwise.
if negative_count % 2 == 1 {
Sign::Negative
} else {
Sign::ZeroOrPositive
}
}

/// Peels binary operators such as [`BinOpKind::Add`], where the result could always be positive.
/// See [`exprs_with_add_binop_peeled()`] for details.
///
/// Returns the sign of the list of peeled expressions.
fn expr_add_sign(cx: &LateContext<'_>, expr: &Expr<'_>) -> Sign {
let mut negative_count = 0;
let mut positive_count = 0;

// Peel off possible binary expressions, for example:
// a + b + c => [a, b, c]
let exprs = exprs_with_add_binop_peeled(expr);
for expr in exprs {
match expr_sign(cx, expr, None) {
Sign::Negative => negative_count += 1,
// A sum is:
// - uncertain if there are any uncertain values (because they could be negative or positive),
Sign::Uncertain => return Sign::Uncertain,
Sign::ZeroOrPositive => positive_count += 1,
};
}

// A sum is:
// - positive or zero if there are only positive (or zero) values,
// - negative if there are only negative (or zero) values, or
// - uncertain if there are both.
// We could split Zero out into its own variant, but we don't yet.
if negative_count == 0 {
Sign::ZeroOrPositive
} else if positive_count == 0 {
Sign::Negative
} else {
Sign::Uncertain
}
}

/// Peels binary operators such as [`BinOpKind::Mul`], [`BinOpKind::Div`] or [`BinOpKind::Rem`],
/// which the result could always be positive under certain condition.
/// where the result depends on:
/// - the number of negative values in the entire expression, or
/// - the number of negative values on the left hand side of the expression.
/// Ignores overflow.
///
/// Other operators such as `+`/`-` causing the result's sign hard to determine, which we will
/// return `None`
fn exprs_with_selected_binop_peeled<'a>(expr: &'a Expr<'_>) -> Option<Vec<&'a Expr<'a>>> {
#[inline]
fn collect_operands<'a>(expr: &'a Expr<'a>, operands: &mut Vec<&'a Expr<'a>>) -> Option<()> {
match expr.kind {
ExprKind::Binary(op, lhs, rhs) => {
if matches!(op.node, BinOpKind::Mul | BinOpKind::Div | BinOpKind::Rem) {
collect_operands(lhs, operands);
operands.push(rhs);
} else {
// Things are complicated when there are other binary ops exist,
// abort checking by returning `None` for now.
return None;
}
},
_ => operands.push(expr),
///
/// Expressions using other operators are preserved, so we can try to evaluate them later.
fn exprs_with_muldiv_binop_peeled<'e>(expr: &'e Expr<'_>) -> Vec<&'e Expr<'e>> {
let mut res = vec![];

for_each_expr(expr, |sub_expr| -> ControlFlow<Infallible, Descend> {
// We don't check for mul/div/rem methods here, but we could.
if let ExprKind::Binary(op, lhs, _rhs) = sub_expr.kind {
if matches!(op.node, BinOpKind::Mul | BinOpKind::Div) {
// For binary operators where both sides contribute to the sign of the result,
// collect all their operands, recursively. This ignores overflow.
ControlFlow::Continue(Descend::Yes)
} else if matches!(op.node, BinOpKind::Rem | BinOpKind::Shr) {
// For binary operators where the left hand side determines the sign of the result,
// only collect that side, recursively. Overflow panics, so this always holds.
//
// Large left shifts turn negatives into zeroes, so we can't use it here.
//
// > Given remainder = dividend % divisor, the remainder will have the same sign as the dividend
// > ...
// > Arithmetic right shift on signed integer types
// https://doc.rust-lang.org/reference/expressions/operator-expr.html#arithmetic-and-logical-binary-operators

// We want to descend into the lhs, but skip the rhs.
// That's tricky to do using for_each_expr(), so we just keep the lhs intact.
res.push(lhs);
ControlFlow::Continue(Descend::No)
} else {
// The sign of the result of other binary operators depends on the values of the operands,
// so try to evaluate the expression.
res.push(sub_expr);
ControlFlow::Continue(Descend::No)
}
} else {
// For other expressions, including unary operators and constants, try to evaluate the expression.
res.push(sub_expr);
ControlFlow::Continue(Descend::No)
}
Some(())
}
});

res
}

/// Peels binary operators such as [`BinOpKind::Add`], where the result depends on:
/// - all the expressions being positive, or
/// - all the expressions being negative.
/// Ignores overflow.
///
/// Expressions using other operators are preserved, so we can try to evaluate them later.
fn exprs_with_add_binop_peeled<'e>(expr: &'e Expr<'_>) -> Vec<&'e Expr<'e>> {
let mut res = vec![];
collect_operands(expr, &mut res)?;
Some(res)

for_each_expr(expr, |sub_expr| -> ControlFlow<Infallible, Descend> {
// We don't check for add methods here, but we could.
if let ExprKind::Binary(op, _lhs, _rhs) = sub_expr.kind {
if matches!(op.node, BinOpKind::Add) {
// For binary operators where both sides contribute to the sign of the result,
// collect all their operands, recursively. This ignores overflow.
ControlFlow::Continue(Descend::Yes)
} else {
// The sign of the result of other binary operators depends on the values of the operands,
// so try to evaluate the expression.
res.push(sub_expr);
ControlFlow::Continue(Descend::No)
}
} else {
// For other expressions, including unary operators and constants, try to evaluate the expression.
res.push(sub_expr);
ControlFlow::Continue(Descend::No)
}
});

res
}
Loading

0 comments on commit e33cba5

Please sign in to comment.