
[MetaSchedule] Refactor ScheduleRule Attributes #13195

Merged

Conversation

@junrushao junrushao (Member) commented Oct 25, 2022

This PR refactors the existing `schedule_rule` block annotation-based dispatch into a more organized folder structure, which follows the exact same mechanism as AutoTVM.

In the example below, when the target is `Target("llvm -keys=cuda,gpu")` and the block annotation is `conv2d_nchw_winograd_inverse`:

```python
with T.block("some_block"):
    T.block_attr({
        "schedule_rule": "conv2d_nchw_winograd_inverse",
    })
```

the system will find the following global packed functions in order:

- `meta_schedule.cuda.conv2d_nchw_winograd_inverse`
- `meta_schedule.gpu.conv2d_nchw_winograd_inverse`

whose function signatures are:
```python
def schedule_rule(
    sch: tvm.tir.Schedule,
    block: tvm.tir.schedule.BlockRV,
) -> List[tvm.tir.Schedule]:
```
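
For instance, a minimal sketch of hooking a function into this dispatch — the registry name and signature follow the convention above, while the body is only a placeholder:

```python
from typing import List

import tvm
from tvm import tir


# Registered under "meta_schedule.<target key>.<block annotation>"; the
# dispatcher calls this for blocks annotated accordingly. override=True is
# used only because the real function may already be registered.
@tvm.register_func("meta_schedule.cuda.conv2d_nchw_winograd_inverse", override=True)
def _winograd_inverse(sch: tir.Schedule, block: tir.schedule.BlockRV) -> List[tir.Schedule]:
    # Apply target-specific scheduling to `block` here; returning multiple
    # schedules lets the search explore several candidates.
    return [sch]
```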

In terms of code organization, for example, for target key `cuda`, the schedule functions are supposed to be defined in:

- `include/tvm/meta_schedule/schedule/cuda` for public methods
- `src/meta_schedule/schedule/cuda` for private methods
- `python/tvm/meta_schedule/schedule/cuda` for direct Python-side definitions


tvm-bot commented Oct 25, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@junrushao force-pushed the feature/2022-10-10/autotvm-debug branch 9 times, most recently from 3d002b9 to 25335d1 (November 1, 2022)
@junrushao force-pushed the feature/2022-10-10/autotvm-debug branch 13 times, most recently from 061af8f to 10e2d45 (November 5, 2022)
@junrushao changed the title from [WIP][MetaSchedule] Refactor ScheduleRule Attributes to [MetaSchedule] Refactor ScheduleRule Attributes (November 5, 2022)
@junrushao force-pushed the feature/2022-10-10/autotvm-debug branch from 10e2d45 to 0e16e61 (November 7, 2022)
@junrushao marked this pull request as ready for review (November 7, 2022)
@spectrometerHBH merged commit b16a64d into apache:main (November 7, 2022)
@zxybazh zxybazh (Member) left a comment

LGTM. I find the introduction of ApplyCustomRule pretty helpful -- no need to couple it with PostOrderApply. I only have two nits. And thanks for fixing the typos.

```python
(2 < kh < 8 and 2 < kw < 8 and kh == kw)
N, _, H, W = get_const_tuple(data.shape)
CO, CI, KH, KW = get_const_tuple(kernel.shape)
(_, _, judge_winograd_auto_scheduler) = judge_winograd(
```
Member

Since we are reusing the `judge_winograd` policy for MetaSchedule, it would probably be good to call it `judge_winograd_decision`.

Member Author

Certainly -- it definitely sounds better.

```python
output : tvm.te.Tensor
    4-D with shape [batch, out_height, out_width, out_channel]
"""
tile_size = 4
```
Member

Just to confirm: is `tile_size` fixed to 4 here for simplicity?

Member Author

Not for simplicity, but for consistency with the auto-scheduler's NHWC case a couple of lines above. Note that this configuration is also overridden by the CUDA dispatch, so we don't have to worry too much about it.

```diff
@@ -58,7 +58,7 @@ class EvolutionarySearch(SearchStrategy):
     def __init__(
         self,
         *,
-        population_size: int = 2048,
+        population_size: int = 512,
```
@masahi masahi (Member) commented Nov 8, 2022

cc @zxybazh Is this change intended to go with this PR, or is it a left-over from development?

This change is small and looks unrelated to this PR, but it has huge implications. Tuning time will become much shorter, which I like, but could there be a concern about perf regression due to this change?

I'm asking because I have been working on perf and tuning-time improvements for int8 TC, and after rebasing I'm directly affected by this change.

@junrushao junrushao (Member Author) commented Nov 8, 2022

@masahi Thanks for asking; it's definitely a valid concern. The reason is that we found the value had somehow been set to an unreasonably large number (we were consistently using 512 downstream), and we now want to scale it back, because it negatively affects tuning time. Also note that the population calculation differs slightly from the auto-scheduler's, which includes invalid candidates in the population, while ours always prunes them ahead of time.

We made sure we had proper numbers in hand before merging; please refer to the table below for details:

| Model | Mainline (ms) | This PR (ms) | Difference |
| --- | --- | --- | --- |
| resnet_50 | 1.829128008 | 1.73111331 | 5.66% |
| mobilenet_v2 | 0.4773168361 | 0.4806015715 | -0.68% |
| resnet_18 | 0.678600832 | 0.6301909627 | 7.68% |
| mobilenet_v3 | 0.6649458484 | 0.668666894 | -0.56% |
| wide_resnet_50 | 3.753781691 | 3.077549256 | 21.97% |
| densenet_121 | 2.361973117 | 2.288411393 | 3.21% |
| inception_v3 | 3.53431975 | 3.478368823 | 1.61% |
| resnet3d_18 | 7.791658449 | 7.465777971 | 4.36% |
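
For users who prefer the previous behavior, the old value can still be passed explicitly. A usage sketch, assuming the public constructor path `tvm.meta_schedule.search_strategy.EvolutionarySearch` matching the diff above:

```python
# Sketch: explicitly restoring the previous population size. The keyword-only
# argument matches the constructor shown in the diff; everything else keeps
# its default.
from tvm.meta_schedule.search_strategy import EvolutionarySearch

strategy = EvolutionarySearch(population_size=2048)  # new default is 512
```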

```cpp
}

TVM_REGISTER_NODE_TYPE(ApplyCustomRuleNode);
TVM_REGISTER_GLOBAL("meta_schedule.ScheduleApplyCustomRule")
```
Member

Is this a typo? Shouldn't it be `ScheduleRuleApplyCustomRule`?

`schedule_rule/apply_custom_rule.py` refers to `ScheduleRuleApplyCustomRule`, which doesn't exist.

Member

I think this is a typo, given that it's part of a ScheduleRule and the FFI call here on the Python side has no corresponding registration.
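
The mismatch can be observed from Python as well; a sketch using the global-function registry (with `allow_missing=True`, the lookup returns None instead of raising):

```python
import tvm

# The C++ side registers the (misspelled) name, so this lookup succeeds...
print(tvm.get_global_func("meta_schedule.ScheduleApplyCustomRule", allow_missing=True))
# ...while the name the Python binding expects is not registered -> None.
print(tvm.get_global_func("meta_schedule.ScheduleRuleApplyCustomRule", allow_missing=True))
```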

Member Author

Yep, this is a typo. Feel free to send a quick patch :-)

Member

Just sent a patch #13353

```cpp
applied = (*custom_schedule_fn)(sch, block_rv);
if (!ScheduleRule::IsApplyCustomRule(sch_rule)) {
if (tir::GetAnn<String>(sch->GetSRef(block_rv), "schedule_rule").defined()) {
stack.emplace_back(sch, blocks);
```
Member

It seems this change broke auto-tensorization for VNNI and Hexagon vrmpy. They target TE computes that happen to be annotated with `schedule_rule` (specifically, NCHWc int8 conv2d: https://github.com/apache/tvm/blob/main/python/tvm/topi/nn/conv2d.py#L534-L552). After this change, `MultiLevelTilingWithIntrin` will never be applied to the block corresponding to this TE compute.

So for auto-tensorization, we just want to ignore this annotation.

@junrushao junrushao (Member Author) commented Nov 10, 2022

Ah, I see! I was not aware of the use case where `schedule_rule` is hardcoded but intentionally ignored. In this particular case, how about we add the line below in ApplyCustomRule after the warning is emitted:

```cpp
sch->Unannotate(block, "schedule_rule");
```
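
For illustration, a self-contained Python sketch of the same workaround; the toy workload and names are hypothetical, and only the `schedule_rule` annotation key and the unannotate step come from this thread:

```python
import tvm
from tvm.script import tir as T


@T.prim_func
def annotated(a: T.handle, b: T.handle) -> None:
    A = T.match_buffer(a, (4,), "float32")
    B = T.match_buffer(b, (4,), "float32")
    for i in range(4):
        with T.block("some_block"):
            vi = T.axis.spatial(4, i)
            # Hardcoded annotation, as in the NCHWc int8 conv2d TE compute.
            T.block_attr({"schedule_rule": "conv2d_nchw_winograd_inverse"})
            B[vi] = A[vi] + 1.0


sch = tvm.tir.Schedule(annotated)
block = sch.get_block("some_block")
# Python counterpart of sch->Unannotate: drop the annotation so that generic
# rules (e.g. MultiLevelTilingWithIntrin) can still apply to this block.
sch.unannotate(block, "schedule_rule")
```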

Member

Thanks, `Unannotate` fixed the issue.

```cpp
}
}
block.CopyOnWrite()->annotations.Set(topi_attr, new_buffers);
```
@masahi masahi (Member) commented Nov 15, 2022

Is this change intentional? I just found that it broke RewriteLayout when link-params = True. It reverts the change I made in #12991. We need to preserve the `layout_free_placeholders` annotation.

Member Author

The problem is that the Buffer objects in the annotation are not updated during transformations, which is why I removed this line. I was not aware it is needed for the link-params-specific hack.

Feel free to add it back if it's needed. Shall we add a flag in the pass config that defaults to off, so Hexagon-specific users can turn it on when needed?

@masahi masahi (Member) commented Nov 15, 2022

I see. For now I reverted this change in #13384 to unblock tuning with link-params. I don't have a good solution at the moment, other than requiring transformation passes to take the buffers in `layout_free_placeholders` into account.

I'd say the same problem applies to the current approach, which specifies the layout-free buffers by their indices in the parameter list: it is also technically possible (although unlikely) that some transformation changes the signature of the PrimFunc, which would invalidate the parameter indices.
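
For context, a sketch of the annotation pattern under discussion: a TE compute marking one of its placeholders as layout-free via `attrs` (the matmul workload and names here are illustrative, not from the PR):

```python
from tvm import te

A = te.placeholder((128, 128), name="A")
B = te.placeholder((128, 128), name="B")
k = te.reduce_axis((0, 128), name="k")
C = te.compute(
    (128, 128),
    lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
    name="C",
    # Mark the weight B as layout-free, so RewriteLayout may transform it.
    attrs={"layout_free_placeholders": [B]},
)
```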
