Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stronger chain detection in LoopCarry pass #8016

Merged
merged 10 commits into from
Jan 9, 2024
Merged

Conversation

vksnk
Copy link
Member

@vksnk vksnk commented Jan 2, 2024

can_prove is stronger than graph_equal, because it doesn't require index expressions to be exactly the same, but evalutate to the same value. I kept the graph_equal check, because it's faster and should be executed before the more expensive check.

In one of the internal workloads, I see that with this change, what was previously split into three different chains of 4-, 2-, 3- values, is correctly combined into one long chain of lenght 9-.

@steven-johnson steven-johnson requested a review from abadams January 2, 2024 23:40
src/LoopCarry.cpp Outdated Show resolved Hide resolved
@vksnk
Copy link
Member Author

vksnk commented Jan 3, 2024

Also, fixed a bug when indices with different types are compared.

BTW, as far as I know, this pass is only used in Hexagon and Xtensa backends.

@vksnk
Copy link
Member Author

vksnk commented Jan 4, 2024

All tests are green now.

@abadams
Copy link
Member

abadams commented Jan 4, 2024

See the comment up at line 250. It's not safe to use can_prove on a boolean Expr after doing substitute_in_all_lets. To make it safe to call, you have to call common_subexpression_elimination on the Expr first.

Note that this gets called on every pair of indices, so it has quadratic complexity in the IR size. I worry that this will stall for very large unrolled stencils. It's worth writing a test of a very large case. If it does indeed stall, we might need a better algorithm. One could for example hash the expressions and look for hash collisions, where by "hash" I mean substitute in some arbitrary values for the variables and constant-fold, and then only do can_prove on exprs that have the same hash.

src/LoopCarry.cpp Outdated Show resolved Hide resolved
@vksnk
Copy link
Member Author

vksnk commented Jan 4, 2024

See the comment up at line 250. It's not safe to use can_prove on a boolean Expr after doing substitute_in_all_lets. To make it safe to call, you have to call common_subexpression_elimination on the Expr first.

Note that this gets called on every pair of indices, so it has quadratic complexity in the IR size. I worry that this will stall for very large unrolled stencils. It's worth writing a test of a very large case. If it does indeed stall, we might need a better algorithm. One could for example hash the expressions and look for hash collisions, where by "hash" I mean substitute in some arbitrary values for the variables and constant-fold, and then only do can_prove on exprs that have the same hash.

Thanks a lot, this is very helpful!

I changed it to apply CSE first and only then run can_prove. Also, added a test which triggers loop_carry on the loop with large number of indices and the compilation time seems to be fine.

@vksnk
Copy link
Member Author

vksnk commented Jan 8, 2024

I don't think test failures are related.

@vksnk vksnk merged commit 91b063d into main Jan 9, 2024
16 of 19 checks passed
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* Stronger chain detection in LoopCarry

* Make sure that types are the same

* Add a comment

* Run CSE before calling can_prove

* Test for loop carry

* clang-tidy

* Add missing override

* Update comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants