[TIR, Schedule] Add schedule primitive PadEinsum #12750

vinx13 · 2022-09-09T23:24:17Z

Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>

This PR adds a schedule primitive PadEinsum. It is used specifically for computation in the Einsum pattern, which covers most cases for tensorization. Unlike the general padding case in https://github.com/apache/tvm-rfcs/blob/main/rfcs/0077-layout-transform-padding.md, this primitive pads the output blocks and the input blocks at once, which eliminates the need for extra arithmetic analysis to guarantee program correctness.

cc @Hzfengsy @wrongtest-intellif @spectrometerHBH @Lunderberg

wrongtest-intellif · 2022-09-10T02:54:55Z

tests/python/unittest/test_tir_schedule_pad_einsum.py

+
+
+@T.prim_func
+def matmul_expected(


Compare to #12720 cc @Lunderberg
Could I understand this as equivalent to a bundle of operations for a certain workload pattern? Something like:

    for buffer in [A_shared, B_shared, C_shared]:
        s.transpose_layout(buffer, (127, 127) -> (128, 128), pad_value=0)
    for block in [A, B, C_shared]:
        for axis in s.get_loops(block):
            s.fuse(*s.split(axis, [1, 128]))
    s.annotate(C_shared, "en_some_predicate_versus_overcomputation_selection", 1)

Yes. It pads the producers with the init value (zero) and over-computes the reduction block.
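
A minimal sketch of why this is safe for a matmul-style reduction, written in plain Python rather than TIR. The helper names (`matmul`, `pad_zeros`) and the 7-to-8 extents are illustrative assumptions, not part of the PR. Zero-padding the producers means the extra reduction iterations add zero to every original output element, so the over-computed result agrees with the unpadded one on the original region.

```python
def matmul(a, b):
    """Naive matmul over nested lists: c[i][j] = sum_k a[i][k] * b[k][j]."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)] for i in range(n)]


def pad_zeros(x, rows, cols):
    """Pad a 2-D nested list on the right/bottom with zeros up to (rows, cols)."""
    padded = [row + [0] * (cols - len(row)) for row in x]
    padded += [[0] * cols for _ in range(rows - len(x))]
    return padded


# Hypothetical extents: each of i, j, k is 7 and is padded by 1 to reach 8.
N, PADDED = 7, 8
a = [[i + k for k in range(N)] for i in range(N)]
b = [[k - j for j in range(N)] for k in range(N)]

reference = matmul(a, b)

# Pad both producers with the reduction's init value (zero), then over-compute.
over_computed = matmul(pad_zeros(a, PADDED, PADDED), pad_zeros(b, PADDED, PADDED))

# Cropping the padded output back to the original extents recovers the result.
cropped = [row[:N] for row in over_computed[:N]]
```

The padded row and column of the output come out as zeros here only because the padded inputs are zero; what matters for correctness is that the original region is unchanged.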

tests/python/unittest/test_tir_schedule_pad_einsum.py

src/tir/schedule/primitive/pad_einsum.cc

Hzfengsy

LGTM

Lunderberg

Overall, looks good, but just a few usability questions and potential improvements. I like seeing which assumptions are made here, which lead to a much simpler analysis than the more general case from the padding RFC.

I think the biggest question is how the padding is specified, and whether it could be given as both left and right padding rather than only padding on the right.

Lunderberg · 2022-09-14T15:05:38Z

include/tvm/tir/schedule/schedule.h

+   * the output buffer and the producer buffer to be allocated inside the PrimFunc.
+   *
+   * The padding is a list of non-negative integers, each element corresponds to the padding for
+   * each block iter in the order of block iters. The block and it's producer blocks should have


Nitpick: "it's" should be "its", without an apostrophe.

Lunderberg · 2022-09-14T15:08:27Z

include/tvm/tir/schedule/schedule.h

+   * The output buffer and the producer buffer is resized according to the padding size. It requires
+   * the output buffer and the producer buffer to be allocated inside the PrimFunc.
+   *
+   * The padding is a list of non-negative integers, each element corresponds to the padding for


It looks like the padding can only be applied to the end of an axis/iterator, and cannot be applied to the beginning. Could we specify two arrays of padding, one for the lower end of each block iter and one for the upper end?

src/tir/schedule/primitive/pad_einsum.cc

vinx13 · 2022-09-14T18:17:23Z

@Lunderberg The current assumption is to over-compute the reduction block and infer the padding of the producers. Since the padding is inferred from the buffer access pattern, I think we can't specify the padding as a tuple.

Lunderberg · 2022-09-14T18:34:36Z

@vinx13 Thank you, and that makes sense. So, one of the simplifying assumptions is that all padding will be on one side only. If padding were allowed on both sides, that wouldn't just add a free parameter for the final output, but also one for each producer.
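
The inference described above can be sketched in plain Python. This is a simplified illustration, not the logic in `pad_einsum.cc`: it assumes every buffer dimension is indexed by exactly one block iter, and the names `infer_padded_shapes`, `iter_extents`, and `accesses` are hypothetical. Because each buffer's padded shape follows from the padded iter extents, there is a single padding value per iter and no separate per-buffer choice.

```python
def infer_padded_shapes(iter_extents, padding, accesses):
    """Derive each buffer's padded shape from the padded block iters.

    iter_extents: dict of block iter name -> original extent
    padding:      dict of block iter name -> non-negative padding (upper end only)
    accesses:     dict of buffer name -> tuple of iter names indexing each dim
    """
    if any(p < 0 for p in padding.values()):
        raise ValueError("padding must be non-negative")
    # Pad each iter once; iters without an entry keep their original extent.
    padded = {name: extent + padding.get(name, 0) for name, extent in iter_extents.items()}
    # Every buffer dimension inherits the padded extent of the iter indexing it.
    return {buf: tuple(padded[it] for it in iters) for buf, iters in accesses.items()}


# Hypothetical matmul C[i, j] += A[i, k] * B[k, j] with 127-wide iters padded by 1.
shapes = infer_padded_shapes(
    iter_extents={"i": 127, "j": 127, "k": 127},
    padding={"i": 1, "j": 1, "k": 1},
    accesses={"A": ("i", "k"), "B": ("k", "j"), "C": ("i", "j")},
)
```

Allowing a lower-end padding as well would turn each entry of `padding` into a pair and require deciding, for every producer, how that offset shifts its indices, which is the extra free parameter mentioned in the comment.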

Lunderberg

LGTM!

junrushao · 2022-09-21T02:52:14Z

@vinx13 let's fix the following warnings:

/root/Projects/tvm-dev/src/tir/schedule/primitive/pad_einsum.cc:231:8: warning: 'tvm::tir::PadEinsumRewriter::VisitStmt_' hides overloaded virtual function [-Woverloaded-virtual]
  Stmt VisitStmt_(const ForNode* op) final {
       ^
/root/Projects/tvm-dev/src/tir/schedule/primitive/.././transform.h:134:8: note: hidden overloaded virtual function 'tvm::tir::ReplaceBufferMutator::VisitStmt_' declared here: type mismatch at 1st parameter ('const tvm::tir::BufferStoreNode *' vs 'const tvm::tir::ForNode *')
  Stmt VisitStmt_(const BufferStoreNode* op) final;
       ^
/root/Projects/tvm-dev/src/tir/schedule/primitive/pad_einsum.cc:374:47: warning: lambda capture 'buffer_remap' is not used [-Wunused-lambda-capture]
  auto f_pad_buffer = [&padded_iter_extents, &buffer_remap](Buffer buffer,
                                           ~~~^~~~~~~~~~~~

* [TIR, Schedule] Add schedule primitive PadEinsum
  Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
* lint
* [TIR] Fix producer indices check in PadEinsum
* address comments
* simplify lambda expr
* fix

Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>

github-actions bot requested review from Hzfengsy, Lunderberg, spectrometerHBH and wrongtest-intellif September 9, 2022 23:24

vinx13 force-pushed the feat/tir-pad-einsum branch from af3c595 to ad826eb Compare September 9, 2022 23:26

wrongtest-intellif reviewed Sep 10, 2022

[TIR, Schedule] Add schedule primitive PadEinsum

b2e3ad1

Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>

vinx13 force-pushed the feat/tir-pad-einsum branch 2 times, most recently from db1b3e5 to 9a0a81c Compare September 12, 2022 21:31

lint

09ed7ab

vinx13 force-pushed the feat/tir-pad-einsum branch from 9a0a81c to 09ed7ab Compare September 12, 2022 23:08

[TIR] Fix producer indices check in PadEinsum

c430110

Hzfengsy approved these changes Sep 14, 2022

Lunderberg reviewed Sep 14, 2022

vinx13 added 3 commits September 14, 2022 11:59

address comments

4b940f8

simplify lambda expr

48ac825

fix

0becfce

vinx13 force-pushed the feat/tir-pad-einsum branch from f7726ee to 0becfce Compare September 15, 2022 00:33

Lunderberg approved these changes Sep 15, 2022

vinx13 merged commit 1f8b5de into apache:main Sep 15, 2022

AndrewZhaoLuo mentioned this pull request Oct 4, 2022

TVM v0.10.0.rc0 Release Candidate Notes #12979

Closed

masahi mentioned this pull request Dec 13, 2022

[DietCode] Local Padding #11793

Open
