[Hexagon] Add hexagon user DMA intrins for tensorization #13719

nverke · 2023-01-06T23:13:05Z

Added some intrinsics for user DMA on Hexagon. Currently these seem to perform worse than all the other options used in the test.

@adstraw

tvm-bot · 2023-01-06T23:13:07Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @Icemist, @ibsidorenko, @mehrdadh, @quic-sanirudh (see #10317 for details)

Generated by tvm-bot

adstraw · 2023-01-13T19:28:17Z

python/tvm/tir/tensor_intrin/hexagon.py

+            T.evaluate(
+                T.tvm_call_packed(
+                    "device_api.hexagon.dma_copy",
+                    0,


Synchronous DMA uses queue ID -1 so that it does not interfere with the async DMA flow, which uses queue IDs starting at 0. Please use queue -1 and add some comments here.

adstraw · 2023-01-13T19:29:24Z

python/tvm/tir/tensor_intrin/hexagon.py

+    """Generator of dma_load intrins"""
+
+    @T.prim_func
+    def dma_load_desc(a: T.handle, c: T.handle) -> None:


I would like this to be called "sync_dma_load_desc", with some comments to distinguish the async flow from the sync flow (a copy followed by an immediate wait).

adstraw · 2023-01-13T19:29:34Z

python/tvm/tir/tensor_intrin/hexagon.py

+                    C[vii] = A[vii]
+
+    @T.prim_func
+    def dma_load_impl(a: T.handle, c: T.handle) -> None:


Likewise, please rename this to sync_dma_load_impl.

adstraw · 2023-01-13T19:34:54Z

python/tvm/tir/tensor_intrin/hexagon.py

+                    T.address_of(C[0], dtype="handle"),
+                    T.address_of(A[0], dtype="handle"),
+                    size,
+                    0,


This needs comments, at least to indicate that this argument controls cache bypass. Better still would be to tie the setting of this bit to the tir.experimental_dma_bypass_cache annotation.

nverke: I'm going to just add a comment about this for now. These intrins don't currently have any planned use, so if we find one we can add more functionality then.
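One way to make the two bare integers in the dma_copy call self-documenting is to name them. The sketch below is illustrative only: SYNC_DMA_QUEUE, CACHE_BYPASS_OFF, CACHE_BYPASS_ON and sync_dma_copy_args are hypothetical names that do not appear in the PR, and the value 1 for "bypass on" is an assumption. Only the packed-function name and the argument order (queue, destination, source, size, bypass flag) are taken from the diff.

```python
# Hypothetical constants; the PR itself passes bare integers.
SYNC_DMA_QUEUE = -1    # queue reserved for synchronous DMA, per the review
CACHE_BYPASS_OFF = 0   # value of the last dma_copy argument in the diff
CACHE_BYPASS_ON = 1    # assumption: a non-zero flag enables cache bypass


def sync_dma_copy_args(dst, src, size, bypass_cache=False):
    """Return the positional arguments for a synchronous dma_copy call.

    The order mirrors the diff: function name, queue ID, destination,
    source, size in bytes, cache-bypass flag.
    """
    return (
        "device_api.hexagon.dma_copy",
        SYNC_DMA_QUEUE,
        dst,
        src,
        size,
        CACHE_BYPASS_ON if bypass_cache else CACHE_BYPASS_OFF,
    )
```

With names like these, the queue choice and the bypass setting are visible at the call site, and a later change that derives bypass_cache from an annotation would only touch one argument.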

adstraw · 2023-01-13T19:35:50Z

python/tvm/tir/tensor_intrin/hexagon.py

+                    dtype="int32",
+                )
+            )
+            T.evaluate(T.tvm_call_packed("device_api.hexagon.dma_wait", 0, 0, dtype="int32"))


Use queue -1 here as well. Please add a comment explaining that Wait(queue, 0) means to wait for the queue to drain, which here amounts to waiting for the preceding dma_copy to complete.
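The queue semantics described in this review can be pictured with a small toy model. This is not the Hexagon runtime: ToyDmaQueues and its methods are invented for illustration, and the model assumes only what the comments state, namely that a copy is queued per queue ID and that waiting with 0 blocks until that queue has drained.

```python
from collections import defaultdict, deque


class ToyDmaQueues:
    """Toy stand-in for per-queue DMA bookkeeping (not the real runtime)."""

    def __init__(self):
        # Maps queue ID -> transfers issued but not yet retired.
        self.in_flight = defaultdict(deque)

    def copy(self, queue_id, dst, src, size):
        # Record the transfer as in flight instead of completing it now.
        self.in_flight[queue_id].append((dst, src, size))

    def wait(self, queue_id, max_in_flight):
        # Retire the oldest transfers until at most max_in_flight remain.
        pending = self.in_flight[queue_id]
        while len(pending) > max_in_flight:
            dst, src, size = pending.popleft()
            dst[:size] = src[:size]


SYNC_QUEUE = -1  # synchronous DMA queue, per the review
dma = ToyDmaQueues()
src = bytearray(b"abcdefgh")
dst = bytearray(8)

dma.copy(0, bytearray(4), bytearray(b"wxyz"), 4)  # unrelated async traffic
dma.copy(SYNC_QUEUE, dst, src, 8)                 # the synchronous copy
dma.wait(SYNC_QUEUE, 0)                           # drain only queue -1
```

In this model the wait on queue -1 completes the synchronous copy while the transfer on queue 0 stays in flight, which is the separation the reviewer is asking for.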

adstraw · 2023-01-13T19:36:49Z

tests/python/contrib/test_hexagon/test_vtcm_bandwidth.py

        # 20 * KB,
        # 40 * KB,
        # 80 * KB,
        # 160 * KB,
        # 320 * KB,
        640 * KB,
        # MB,
-        # 2 * MB,


Did you mean to uncomment this? It makes the test run longer in CI.

nverke: Added a check for whether the test is running in CI.

adstraw · 2023-01-13T19:37:04Z

tests/python/contrib/test_hexagon/test_vtcm_bandwidth.py

-    number = 1
-    repeat = 1
+    number = 10
+    repeat = 10


Did you mean to change this? It makes the test run longer in CI.

nverke: Same as above. I should have added a check for CI a while ago.
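A CI check of the kind mentioned in these replies could look roughly like the sketch below. It is not the code that landed in the PR: running_in_ci and benchmark_settings are hypothetical helpers, the CI=true environment variable is an assumed convention, and the size lists are illustrative values based on the diff fragments above.

```python
import os

KB = 1024
MB = 1024 * KB


def running_in_ci(environ=os.environ):
    # Assumption: the CI system exports CI=true in the environment.
    return environ.get("CI", "").lower() == "true"


def benchmark_settings(environ=os.environ):
    """Return (sizes, number, repeat) for the bandwidth benchmark."""
    if running_in_ci(environ):
        # Keep CI fast: one size, a single timed run.
        return [640 * KB], 1, 1
    # Local runs can afford larger buffers and more repetitions.
    return [640 * KB, 2 * MB], 10, 10
```

Passing the environment in as a parameter keeps the helper easy to exercise in a unit test without modifying the real process environment.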

adstraw · 2023-01-13T19:39:14Z

python/tvm/tir/tensor_intrin/hexagon.py

@@ -163,3 +204,27 @@ def dot_product_32x2_i16i16i32_vdmpy(a: T.handle, b: T.handle, c: T.handle) -> N

 VRMPY_u8i8i32_VTCM_INTRIN = "dot_32x4_u8i8i32_vtcm_vrmpy"
 TensorIntrin.register(VRMPY_u8i8i32_VTCM_INTRIN, *generate_dot_product_32x4_u8i8i32("global.vtcm"))
+
+DMA_READ_1_u8 = "dma_read_1_u8"


I don't see users for most of these. It might be better to delete them and let users create what they need based on the test case or schedule.

nverke: Good point, removed ✅

adstraw · 2023-01-13T19:44:17Z

Nice work. Just wanted to note that we will need to (and have plans to) put our heads together on a unified approach to DMA lowering. Right now, the async DMA flow stems from TIR annotations added by the InjectSoftwarePipeline pass. This PR suggests a tensorization approach for synchronous DMA. I feel we could benefit from a unified approach. I recently deleted an old (TE-based) tensorization approach for sync DMA scheduling. I'm glad to have this new TIR-based version, and it may in fact be the path forward for a unified approach to DMA lowering. Let's work out the right design together.

adstraw suggested changes Jan 13, 2023

nverke force-pushed the dma_intrins branch from dcc2b1a to 4edb627 Compare January 14, 2023 00:18

nverke added 4 commits January 13, 2023 16:18

[Hexagon] Add hexagon user DMA intrins for tensorization.

baee5d9

revert changes not needed

fb97e62

Add comments, change queue id to -1, and remove tests from CI

a04de7c

lint changes

51641c6

nverke force-pushed the dma_intrins branch from 4edb627 to 51641c6 Compare January 14, 2023 00:19

adstraw approved these changes Jan 17, 2023

mehrdadh approved these changes Jan 17, 2023

mehrdadh merged commit e25e618 into apache:main Jan 17, 2023

ysh329 mentioned this pull request Apr 17, 2023

[Release] v0.12.0 Release Candidate Notes #14645

Closed
