[SME] Add scalable fp16->fp32 dense schedule #16981

Merged
merged 5 commits into from
May 28, 2024

@lhutton1 lhutton1 commented May 8, 2024

This commit extends the functionality of the SME dense and matmul schedules to support operations with fp16 inputs and an fp32 output, where transpose_a=False and transpose_b=True.
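To make the supported case concrete, here is a minimal pure-Python reference for a dense operation with fp16 inputs and an fp32 output where `transpose_a=False` and `transpose_b=True`, so the second operand is stored as N x K and the result is M x N. This is only a sketch of the semantics, not the SME schedule itself: the helper names (`to_fp16`, `to_fp32`, `dense_fp16_fp32`) are hypothetical, and the accumulation order in a real kernel may differ.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dense_fp16_fp32(a, b):
    """Reference dense: a is M x K, b is N x K (transpose_b=True layout).

    Inputs are rounded to fp16, products are accumulated in fp32,
    and the result is an M x N list of lists.
    """
    k = len(a[0])
    if any(len(row) != k for row in a + b):
        raise ValueError("all rows of a and b must have the same length K")
    out = []
    for a_row in a:
        out_row = []
        for b_row in b:
            acc = 0.0
            for x, y in zip(a_row, b_row):
                # Each partial sum is rounded to fp32 to mimic an fp32 accumulator.
                acc = to_fp32(acc + to_fp16(x) * to_fp16(y))
            out_row.append(acc)
        out.append(out_row)
    return out


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]              # M=2, K=2
    b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # N=3, K=2
    print(dense_fp16_fp32(a, b))
```

Because each row of `b` is contiguous along K, every output element is a dot product of one row of `a` with one row of `b`, which is the layout this PR targets.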

For convenience, it also adds a utility called get_vscale_factor which creates the correct multiplier for vscale given a data type, reflecting ideas from an early design of the SVE RFC.
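As an illustration of the idea behind such a utility, the sketch below computes the multiplier for `vscale` from a data type string in plain Python. It assumes the minimum scalable vector length is 128 bits, so the multiplier is the number of elements that fit in 128 bits. The names `dtype_bits`, `vscale_multiplier`, and `scalable_lanes` are hypothetical and are not the TVM API; the actual helper added in this PR may differ in name, signature, and return type.

```python
import re


def dtype_bits(dtype: str) -> int:
    """Extract the bit width from a dtype string such as 'float16' or 'int8'."""
    match = re.fullmatch(r"[a-z]+(\d+)", dtype)
    if match is None:
        raise ValueError(f"unsupported dtype string: {dtype!r}")
    return int(match.group(1))


def vscale_multiplier(dtype: str, min_vector_bits: int = 128) -> int:
    """Number of dtype elements in the minimum vector (assumed 128 bits)."""
    bits = dtype_bits(dtype)
    if min_vector_bits % bits != 0:
        raise ValueError(f"{dtype!r} does not evenly divide {min_vector_bits} bits")
    return min_vector_bits // bits


def scalable_lanes(dtype: str, vscale: int) -> int:
    """Total lanes for a concrete vscale value: multiplier * vscale."""
    return vscale_multiplier(dtype) * vscale


if __name__ == "__main__":
    for name in ("float16", "float32", "int8"):
        print(name, vscale_multiplier(name))
```

For example, a hardware vector length of 512 bits corresponds to `vscale = 4`, so `scalable_lanes("float32", 4)` gives 16 lanes.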

Note: this commit depends on #16921 so also contains the contents of #16921.

lhutton1 added 2 commits May 15, 2024 10:32
This commit extends the functionality of the SME dense and matmul
schedules to support operations with fp16 inputs and an fp32 output,
where `transpose_a=False` and `transpose_b=True`.

For convenience, it also adds a utility called `get_vscale_factor`
which creates the correct multiplier for `vscale` given a data type,
reflecting ideas from an early design of the
[SVE](apache/tvm-rfcs#104) RFC.

Change-Id: I8c00bc6baf2df6015fa41200a238781126c73589
Change-Id: Ie7fb7a0a76119aa5c82e03ea0b2cc10de9f15f5e
@lhutton1 lhutton1 force-pushed the sme-fp16-fp32-dense-schedule branch from d2a164c to 1fe9bac May 15, 2024 12:11
lhutton1 added 2 commits May 15, 2024 12:36
Change-Id: I0e9e45b285082b42676e53e74158e11d7e08608b
Change-Id: I32273241ae7569b65e082759e4f2ca4355ac6933
@lhutton1 lhutton1 marked this pull request as ready for review May 16, 2024 07:58

lhutton1 commented May 16, 2024

cc @ekalda @Anndrey24 @leandron


@ekalda ekalda left a comment

Thanks @lhutton1, really cool stuff! I only have nits.

tests/python/relay/strategy/arm_cpu/test_dense.py (outdated, resolved)
python/tvm/tir/op.py (outdated, resolved)
python/tvm/tir/op.py (outdated, resolved)
python/tvm/tir/op.py (outdated, resolved)
python/tvm/topi/arm_cpu/dense_alter_op.py (resolved)
python/tvm/tir/tensor_intrin/arm_cpu.py (resolved)
Change-Id: I237b4c5cb5ca22e33529d98cbd75177b94904857
@lhutton1 lhutton1 force-pushed the sme-fp16-fp32-dense-schedule branch from 0d2be71 to bc02e47 May 22, 2024 16:37
@lhutton1

@tvm-bot rerun


@ekalda ekalda left a comment

Thanks @lhutton1, LGTM!

@ekalda ekalda merged commit 430e02f into apache:main May 28, 2024
19 checks passed

ekalda commented May 28, 2024

Thanks @lhutton1 this is merged now!

lhutton1 added a commit to lhutton1/tvm that referenced this pull request May 29, 2024
Fixes a merge conflict between apache#16981 and apache#17003.

Change-Id: Ifcc983ef0b8c00250568a048fd682933adfdcde4
tqchen pushed a commit that referenced this pull request May 29, 2024
Fixes a merge conflict between #16981 and #17003.

Change-Id: Ifcc983ef0b8c00250568a048fd682933adfdcde4
Anndrey24 added a commit to Anndrey24/tvm that referenced this pull request May 30, 2024
This commit extends the SME conv2d NHWC schedule to support convolutions with float16 inputs (data and kernel) and a float32 output using the tensor intrinsics added in apache#16981.
ekalda pushed a commit that referenced this pull request Jun 5, 2024
This commit extends the SME conv2d NHWC schedule to support convolutions with float16 inputs (data and kernel) and a float32 output using the tensor intrinsics added in #16981.
@lhutton1 lhutton1 deleted the sme-fp16-fp32-dense-schedule branch June 6, 2024 10:41