
Handle many more intrinsics in Bounds.cpp #7823

Merged: 11 commits, Dec 1, 2023
src/Bounds.cpp: 40 changes (31 additions, 9 deletions)
@@ -41,6 +41,17 @@ using std::string;
using std::vector;

namespace {

Expr widen(Expr a) {
Type result_type = a.type().widen();
return Cast::make(result_type, std::move(a));
}

Expr narrow(Expr a) {
Type result_type = a.type().narrow();
return Cast::make(result_type, std::move(a));
}

int static_sign(const Expr &x) {
if (is_positive_const(x)) {
return 1;
@@ -56,6 +67,7 @@ int static_sign(const Expr &x) {
}
return 0;
}

} // anonymous namespace

const FuncValueBounds &empty_func_value_bounds() {
@@ -1468,6 +1480,7 @@ class Bounds : public IRVisitor {
}
} else if (op->args.size() == 1 &&
(op->is_intrinsic(Call::round) ||
op->is_intrinsic(Call::strict_float) ||
Member: There's going to be a merge conflict here because Call::saturating_cast is in the same category. Probably should add it in this PR in case the other one doesn't go in and we revert the u32 -> i32 cast change.

Member: I take it back! saturating_cast doesn't belong here.

op->name == "ceil_f32" || op->name == "ceil_f64" ||
op->name == "floor_f32" || op->name == "floor_f64" ||
op->name == "exp_f32" || op->name == "exp_f64" ||
@@ -1517,15 +1530,24 @@ class Bounds : public IRVisitor {
result.include(arg_bounds.get(i));
}
interval = result;
} else if (op->is_intrinsic(Call::widen_right_add)) {
Expr add = Add::make(op->args[0], cast(op->args[0].type(), op->args[1]));
add.accept(this);
} else if (op->is_intrinsic(Call::widen_right_sub)) {
Expr sub = Sub::make(op->args[0], cast(op->args[0].type(), op->args[1]));
sub.accept(this);
} else if (op->is_intrinsic(Call::widen_right_mul)) {
Expr mul = Mul::make(op->args[0], cast(op->args[0].type(), op->args[1]));
mul.accept(this);
} else if (op->is_intrinsic(Call::halving_add)) {
// lower_halving_add() uses bitwise tricks that are hard to reason
// about; let's do this instead:
Expr e = narrow((widen(op->args[0]) + widen(op->args[1])) / 2);
Member: Looking at the bot failure, I suspect this is trying to widen a 64-bit input.

Contributor Author: Hm, well, OK, but this is literally the fallback implementation for it.

Contributor Author: (I don't know how to make the bitwise op handling robust enough to handle this correctly.)

Contributor Author: I guess the right way to handle this is to special-case 64-bit and use the min/max possible.

Member: I have a branch that (among other things) tacked on bounds inference for many of these intrinsics. I did have to special-case any intrinsic that semantically widens when the arguments are 64-bit, and a few would produce a double-widening, so they had to be even further special-cased. (I think rounding_mul_shift_right lowers to a widening mul followed by a rounding shift right, which in turn lowers to a widening add or something like that, so it would double-widen.)

Contributor Author:
> I have a branch

please share! These changes make for much better bounds inference in some cases (especially pipelines with fixed-point math); if your fixes are better than these, we should take yours.

Member: I need to severely clean it up; that's part of why I never opened a PR. I can try to clean it up and share, but it might take me a few days, unfortunately: I am about to be traveling for a funding thing.

Contributor Author: Even if it's ugly, feel free to put it somewhere I can look at it.


e.accept(this);
} else if (op->is_intrinsic(Call::rounding_halving_add)) {
// lower_rounding_halving_add() uses bitwise tricks that are hard to reason
// about; let's do this instead:
Expr e = narrow((widen(op->args[0]) + widen(op->args[1]) + 1) / 2);
e.accept(this);
} else if (op->call_type == Call::PureIntrinsic) {
Expr e = lower_intrinsic(op);
if (e.defined()) {
e.accept(this);
} else {
// Just use the bounds of the type
bounds_of_type(t);
}
} else if (op->call_type == Call::Halide) {
bounds_of_func(op->name, op->value_index, op->type);
} else {