Stronger chain detection in LoopCarry pass #8016

vksnk · 2024-01-02T23:15:15Z

can_prove is stronger than graph_equal because it doesn't require the index expressions to be exactly the same, only to evaluate to the same value. I kept the graph_equal check because it's faster and should run before the more expensive check.
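To make the distinction concrete, here is a minimal standalone sketch. It does not use Halide: `Node`, `structurally_equal`, and `provably_equal` are hypothetical toy stand-ins for the roles that graph_equal and can_prove play in the pass, and the "prover" here only understands sums of variables and constants.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree: a constant, a variable, or the sum of two expressions.
struct Node;
using ExprPtr = std::shared_ptr<const Node>;
struct Node {
    enum Kind { Const, Var, Add } kind;
    int64_t value;     // used when kind == Const
    std::string name;  // used when kind == Var
    ExprPtr a, b;      // used when kind == Add
};

ExprPtr make_const(int64_t v) {
    return std::make_shared<Node>(Node{Node::Const, v, "", nullptr, nullptr});
}
ExprPtr var(const std::string &n) {
    return std::make_shared<Node>(Node{Node::Var, 0, n, nullptr, nullptr});
}
ExprPtr add(ExprPtr x, ExprPtr y) {
    return std::make_shared<Node>(Node{Node::Add, 0, "", std::move(x), std::move(y)});
}

// Cheap check: the two trees must have exactly the same shape and leaves.
bool structurally_equal(const ExprPtr &x, const ExprPtr &y) {
    if (x->kind != y->kind) return false;
    switch (x->kind) {
    case Node::Const: return x->value == y->value;
    case Node::Var: return x->name == y->name;
    case Node::Add:
        return structurally_equal(x->a, y->a) && structurally_equal(x->b, y->b);
    }
    return false;
}

// Canonical form of a sum: one coefficient per variable plus a constant.
struct Affine {
    std::map<std::string, int64_t> coeff;
    int64_t constant = 0;
    bool operator==(const Affine &o) const {
        return coeff == o.coeff && constant == o.constant;
    }
};

Affine normalize(const ExprPtr &e) {
    Affine r;
    switch (e->kind) {
    case Node::Const: r.constant = e->value; break;
    case Node::Var: r.coeff[e->name] = 1; break;
    case Node::Add: {
        r = normalize(e->a);
        const Affine rhs = normalize(e->b);
        r.constant += rhs.constant;
        for (const auto &kv : rhs.coeff) r.coeff[kv.first] += kv.second;
        break;
    }
    }
    return r;
}

// More expensive check: equal if both sides simplify to the same canonical form.
bool provably_equal(const ExprPtr &x, const ExprPtr &y) {
    return normalize(x) == normalize(y);
}
```

With `(x + 1) + 1` on one side and `x + 2` on the other, `structurally_equal` returns false while `provably_equal` returns true. Running the cheap check first and falling back to the expensive one only when it fails mirrors the ordering described above.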

In one of our internal workloads, what was previously split into three separate chains of 4, 2, and 3 values is now correctly combined into one long chain of length 9.
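The effect on chain length can be reproduced with a toy model. This is only a sketch under assumptions, not the pass itself: it assumes a chain links load i to load j when the index of load i, stepped forward one loop iteration, equals the index of load j. `Index`, `step_forward`, `same_form`, `same_value`, and `chain_lengths` are hypothetical names, and the offsets are made up to mirror the 4/2/3 split.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// A load index of the form x + a + b, kept unsimplified as the pair (a, b).
using Index = std::pair<int, int>;

// The same load one loop iteration later: x becomes x + 1, folded into `a`.
Index step_forward(const Index &i) { return {i.first + 1, i.second}; }

// Weak equality: both parts must match exactly (structural comparison).
bool same_form(const Index &p, const Index &q) { return p == q; }

// Strong equality: only the value of a + b has to match.
bool same_value(const Index &p, const Index &q) {
    return p.first + p.second == q.first + q.second;
}

// Link load i to load j when step_forward(loads[i]) equals loads[j] under
// `equal`, then return the length of each resulting chain, in head order.
std::vector<std::size_t> chain_lengths(
    const std::vector<Index> &loads,
    const std::function<bool(const Index &, const Index &)> &equal) {
    const std::size_t n = loads.size();
    const std::size_t none = n;  // sentinel meaning "no successor"
    std::vector<std::size_t> next(n, none);
    std::vector<bool> has_pred(n, false);
    for (std::size_t i = 0; i < n; i++) {
        for (std::size_t j = 0; j < n; j++) {
            if (i != j && !has_pred[j] && equal(step_forward(loads[i]), loads[j])) {
                next[i] = j;
                has_pred[j] = true;
                break;
            }
        }
    }
    std::vector<std::size_t> lengths;
    for (std::size_t i = 0; i < n; i++) {
        if (has_pred[i]) continue;  // not the head of a chain
        std::size_t len = 0;
        for (std::size_t k = i; k != none; k = next[k]) len++;
        lengths.push_back(len);
    }
    return lengths;
}
```

For the nine loads `(0,0) (1,0) (2,0) (3,0) (0,4) (1,4) (0,6) (1,6) (2,6)`, `same_form` yields chains of length 4, 2, and 3, because `(4,0)` and `(0,4)` do not match structurally. `same_value` sees the offsets 0 through 8 and yields a single chain of length 9.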

src/LoopCarry.cpp

vksnk · 2024-01-03T04:20:21Z

Also fixed a bug where indices with different types were compared.

BTW, as far as I know, this pass is only used in the Hexagon and Xtensa backends.

vksnk · 2024-01-04T07:39:44Z

All tests are green now.

abadams · 2024-01-04T10:50:41Z

See the comment up at line 250. It's not safe to use can_prove on a boolean Expr after doing substitute_in_all_lets. To make it safe to call, you have to call common_subexpression_elimination on the Expr first.

Note that this gets called on every pair of indices, so it has quadratic complexity in the IR size. I worry that this will stall for very large unrolled stencils. It's worth writing a test of a very large case. If it does indeed stall, we might need a better algorithm. One could for example hash the expressions and look for hash collisions, where by "hash" I mean substitute in some arbitrary values for the variables and constant-fold, and then only do can_prove on exprs that have the same hash.
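The hashing idea in the last paragraph could be sketched roughly as follows. This is a standalone toy, not Halide code: `Term`, `Index`, `probe_hash`, `expensive_equal`, and `equal_pairs` are hypothetical names, indices are modeled as unsimplified sums of terms, and `expensive_equal` stands in for the costly can_prove call.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// One term of an unsimplified sum; an empty `var` marks a constant term.
struct Term {
    std::string var;
    int64_t coeff;
};
using Index = std::vector<Term>;

// "Hash" an index by substituting arbitrary fixed values for the variables
// and folding to a single number. Equal indices always get equal hashes.
uint64_t probe_hash(const Index &e, const std::map<std::string, uint64_t> &probe) {
    uint64_t h = 0;
    for (const Term &t : e) {
        const uint64_t v = t.var.empty() ? 1 : probe.at(t.var);
        h += static_cast<uint64_t>(t.coeff) * v;  // unsigned wraparound is fine
    }
    return h;
}

// Stand-in for the expensive proof; counts how often it is invoked.
std::size_t expensive_calls = 0;
bool expensive_equal(const Index &a, const Index &b) {
    expensive_calls++;
    std::map<std::string, int64_t> diff;
    for (const Term &t : a) diff[t.var] += t.coeff;
    for (const Term &t : b) diff[t.var] -= t.coeff;
    for (const auto &kv : diff) {
        if (kv.second != 0) return false;
    }
    return true;
}

// Return all pairs (i, j), i < j, of equal indices, running the expensive
// check only on pairs whose probe hashes collide.
std::vector<std::pair<std::size_t, std::size_t>> equal_pairs(
    const std::vector<Index> &exprs,
    const std::map<std::string, uint64_t> &probe) {
    std::unordered_map<uint64_t, std::vector<std::size_t>> buckets;
    for (std::size_t i = 0; i < exprs.size(); i++) {
        buckets[probe_hash(exprs[i], probe)].push_back(i);
    }
    std::vector<std::pair<std::size_t, std::size_t>> result;
    for (const auto &kv : buckets) {
        const std::vector<std::size_t> &ids = kv.second;
        for (std::size_t a = 0; a < ids.size(); a++) {
            for (std::size_t b = a + 1; b < ids.size(); b++) {
                if (expensive_equal(exprs[ids[a]], exprs[ids[b]])) {
                    result.emplace_back(ids[a], ids[b]);
                }
            }
        }
    }
    std::sort(result.begin(), result.end());
    return result;
}
```

A hash mismatch rules a pair out for free, while a collision still needs the real check, so the result is the same as testing every pair. The worst case stays quadratic when many indices share a bucket, but for mostly distinct indices the number of expensive calls drops sharply.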

src/LoopCarry.cpp

vksnk · 2024-01-04T22:34:51Z

Thanks a lot, this is very helpful!

I changed it to apply CSE first and only then run can_prove. I also added a test that triggers loop_carry on a loop with a large number of indices, and the compilation time seems to be fine.

test/correctness/loop_carry.cpp

vksnk · 2024-01-08T22:31:56Z

I don't think the test failures are related.

* Stronger chain detection in LoopCarry
* Make sure that types are the same
* Add a comment
* Run CSE before calling can_prove
* Test for loop carry
* clang-tidy
* Add missing override
* Update comments

Stronger chain detection in LoopCarry

3055e2a

steven-johnson requested a review from abadams January 2, 2024 23:40

steven-johnson reviewed Jan 2, 2024

src/LoopCarry.cpp

vksnk added 2 commits January 2, 2024 20:12

Make sure that types are the same

7ed555b

Add a comment

b71d889

Merge branch 'main' into vksnk/better-loop-carry

7cfc0b0

abadams reviewed Jan 4, 2024

src/LoopCarry.cpp

vksnk added 2 commits January 4, 2024 14:26

Run CSE before calling can_prove

8540d8b

Test for loop carry

1fceaa3

vksnk added 2 commits January 4, 2024 15:29

clang-tidy

09e48f8

Add missing override

8ba85a2

abadams reviewed Jan 8, 2024

test/correctness/loop_carry.cpp

vksnk added 2 commits January 8, 2024 11:03

Update comments

89c74af

Merge branch 'main' into vksnk/better-loop-carry

20d6b0f

abadams approved these changes Jan 8, 2024

vksnk merged commit 91b063d into main Jan 9, 2024
16 of 19 checks passed

BrewTestBot mentioned this pull request Jul 17, 2024

halide 18.0.0 Homebrew/homebrew-core#177657
