[backend][amd] Support device print using hostcall #3476

antiagainst · 2024-03-27T03:21:42Z

This commit adds device printf support to the AMD backend.
It moves the existing NVIDIA lowering logic to the common
conversion library and adds a new method in TargetInfo
for target-specific code generation.

Right now this only supports the hostcall mode, which
requires PCIe atomics. There is also a buffered mode; see
https://rocm.docs.amd.com/en/docs-5.7.0/release.html#non-hostcall-hip-printf
for details.

This follows the implementation in https://reviews.llvm.org/D110448.

antiagainst · 2024-04-01T17:54:56Z

Note that the commits are structured to make reviewing easier. You can pretty much look at the commits one by one; some of them just shuffle code around and carry an NFC marker. The meaty parts are the last two commits.

ThomasRaoux

Awesome work! Thanks Lei!

third_party/nvidia/lib/TritonNVIDIAGPUToLLVM/TargetInfo.cpp

.github/workflows/integration-tests.yml

lib/Conversion/TritonGPUToLLVM/PrintOpToLLVM.cpp

jlebar · 2024-04-01T18:35:15Z

third_party/amd/lib/TritonAMDGPUToLLVM/TargetInfo.cpp

+  message = call(printStrFn, arguments).getResult();
+
+  // Emit the intrinsic function call to handle arguments iteratively.
+  // We can only handle at most 7 values each time.


Is this going to be a problem? For example if I have a 4D tensor, we will print out 3 program_id's, plus 4 tensor indices, plus one value. So the value will not be printed in the same printf statement as the 7 other values. But then will it be interleaved with other threads' printfs? If so that will make it basically useless...

Good question. I haven't looked into how print with hostcall is implemented in the driver stack. But these function calls have an isLast parameter chaining them together: only the last print function sets isLast to 1. That makes me think they are "atomic" in a sense. @scxiao do you know if that's indeed the case? It would be good to stress test and see how it behaves with more arguments.
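To make the chunking concrete, here is a minimal sketch of how a list of print arguments could be split into chained calls of at most 7 values, with isLast set only on the final call. `PrintCall` and `chunkPrintArgs` are hypothetical names used for illustration; they are not the functions in this PR or the real hostcall ABI, and the sketch says nothing about whether the driver treats a chain atomically.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one chained print call (illustration only).
struct PrintCall {
  std::vector<std::uint64_t> values; // at most kMaxValuesPerCall entries
  bool isLast;                       // true only for the final call in the chain
};

// The limit quoted in the code comment under review.
constexpr std::size_t kMaxValuesPerCall = 7;

// Split the arguments into consecutive groups of at most 7 values.
// An empty argument list yields no calls in this sketch.
std::vector<PrintCall> chunkPrintArgs(const std::vector<std::uint64_t> &args) {
  std::vector<PrintCall> calls;
  for (std::size_t i = 0; i < args.size(); i += kMaxValuesPerCall) {
    std::size_t end = std::min(i + kMaxValuesPerCall, args.size());
    PrintCall call;
    call.values.assign(args.begin() + i, args.begin() + end);
    assert(call.values.size() <= kMaxValuesPerCall);
    call.isLast = (end == args.size()); // only the final group closes the chain
    calls.push_back(call);
  }
  return calls;
}
```

For the 4D tensor case raised above (3 program ids, 4 indices, 1 value, so 8 values in total), this produces two calls: one with 7 values and isLast false, and one with the remaining value and isLast true.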

antiagainst force-pushed the amd-print branch 5 times, most recently from 40a194d to 8cb5489 Compare March 30, 2024 17:58

antiagainst added 4 commits March 30, 2024 19:05

NFC: move PrintOpToLLVM to the common lib/Conversion directory

1f45f40

Rewire up PrintOp in third_party/nvidia

496046e

NFC: do early return to avoid one indent level

4289462

NFC: Create new virtual method for target-specific printf

03d92bb

antiagainst force-pushed the amd-print branch 3 times, most recently from 04ac6d1 to 28fabee Compare March 31, 2024 18:24

Wire up AMD support for print

1b5de85

antiagainst force-pushed the amd-print branch 2 times, most recently from ba7c48a to 8241010 Compare March 31, 2024 22:47

antiagainst changed the title from "[backend] Add printf support to AMD backend" to "[backend][amd] Support device print using hostcall" Mar 31, 2024

antiagainst marked this pull request as ready for review March 31, 2024 22:52

antiagainst requested review from goostavz, Superjomn and ptillet as code owners March 31, 2024 22:52

Adjust print test to support both cuda and hip

2a7d97f

antiagainst force-pushed the amd-print branch from 8241010 to 2a7d97f Compare March 31, 2024 23:00

ThomasRaoux approved these changes Apr 1, 2024

jlebar reviewed Apr 1, 2024

third_party/nvidia/lib/TritonNVIDIAGPUToLLVM/TargetInfo.cpp (outdated, resolved)

jlebar reviewed Apr 1, 2024

.github/workflows/integration-tests.yml (outdated, resolved)

jlebar reviewed Apr 1, 2024

lib/Conversion/TritonGPUToLLVM/PrintOpToLLVM.cpp (outdated, resolved)

jlebar reviewed Apr 1, 2024

antiagainst added 2 commits April 1, 2024 18:50

Fix naming and variable usage

8cf9ce1

Use one pytest invocation

4cc882e

zahimoud merged commit 38cd5ab into triton-lang:main Apr 1, 2024
5 checks passed

antiagainst deleted the amd-print branch April 1, 2024 20:37
