[FRONTEND][BACKEND] Cleanup and re-enable optimization with fp8e4b15 #3521

ThomasRaoux · 2024-04-01T08:17:09Z

Multiple fixes to allow pipelining to happen when generating a matmul with fp8e4b15 inputs. Also clean up useless code in the frontend.

pawelszczerbuk

Looks good to me, just minor nits from my side!

pawelszczerbuk · 2024-04-01T17:59:19Z

lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp

@@ -149,6 +149,24 @@ SmallVector<Value> packI32(const SmallVector<Value> &inValues, Type srcTy,
  }
  return outValues;
 }
+
+int getNumElmenetPerThreads(Type type, const LLVMTypeConverter *typeConverter) {


Typo: Elmenet -> Element

oops.. fixed

jlebar · 2024-04-01T18:13:50Z

I have a few review comments, please hold on for a sec till I'm out of the interview...

ThomasRaoux · 2024-04-01T18:30:55Z

I have a few review comments, please hold on for a sec till I'm out of the interview...

Thanks Justin, I'll push that for now to unblock my wheel update, but please add your comments and I'll send a follow-up PR.

jlebar · 2024-04-01T17:26:17Z

lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp

@@ -149,6 +149,24 @@ SmallVector<Value> packI32(const SmallVector<Value> &inValues, Type srcTy,
  }
  return outValues;
 }
+
+int getNumElmenetPerThreads(Type type, const LLVMTypeConverter *typeConverter) {


getNumElementsPerThread?

already addressed

jlebar · 2024-04-01T17:28:30Z

lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp

@@ -149,6 +149,24 @@ SmallVector<Value> packI32(const SmallVector<Value> &inValues, Type srcTy,
  }
  return outValues;
 }
+
+int getNumElmenetPerThreads(Type type, const LLVMTypeConverter *typeConverter) {


This function is only correct for inline asm, right? If so, can we change the name to indicate that? Also can we add a comment indicating why we do 32/size? (It's because inline asm implicitly packs elements in this way, I believe.)

No, I think the function should work for any op.

Also can we add a comment indicating why we do 32/size?

sure
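
For readers following the 32/size question, here is a minimal standalone sketch of the arithmetic. It is not the PR's code. It assumes that 32/size is the number of sub-32-bit elements packed into one 32-bit register, as suggested in the comment above. The names `packFactor` and `numPackedValues` are hypothetical and do not exist in Triton.

```cpp
#include <cassert>

// Hypothetical helper (not from the PR): how many elements of the given
// bit width fit into a single 32-bit register.
inline int packFactor(int elemBitWidth) {
  // Only widths that evenly divide 32 are handled in this sketch.
  assert(elemBitWidth > 0 && elemBitWidth <= 32 && 32 % elemBitWidth == 0);
  return 32 / elemBitWidth;
}

// Hypothetical helper (not from the PR): number of 32-bit values a thread
// holds once numElems elements of the given width have been packed.
// Assumes numElems is a multiple of the pack factor.
inline int numPackedValues(int numElems, int elemBitWidth) {
  const int factor = packFactor(elemBitWidth);
  assert(numElems % factor == 0);
  return numElems / factor;
}
```

Under these assumptions, an 8-bit type such as fp8e4b15 has a pack factor of 4, so 16 elements per thread would be counted as 4 packed 32-bit values.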

lib/Dialect/TritonGPU/Transforms/Utility.cpp

jlebar · 2024-04-01T19:06:02Z

lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp

-    // need to reorder them so we iterate over the operands' elements in the
-    // same logical order.
-    for (unsigned i = 0; i < unpackedOperands.size(); ++i) {
-      unpackedOperands[i] = reorderValues(


Hm, where did this call to reorderValues go?

it was not correct, removed it

address some comments from #3521

ThomasRaoux force-pushed the fp8_bugfix branch from 2111309 to b8c08c1 Compare April 1, 2024 08:33

ThomasRaoux requested review from zahimoud, pawelszczerbuk and jlebar April 1, 2024 08:33

ThomasRaoux marked this pull request as ready for review April 1, 2024 08:33

ThomasRaoux requested a review from ptillet as a code owner April 1, 2024 08:33

ThomasRaoux force-pushed the fp8_bugfix branch from b8c08c1 to 1606905 Compare April 1, 2024 08:38

[FRONTEND][BACKEND] Cleanup and re-enable optimization when using fp8…

3c0e241

…e4b15 Multiple fixes to allow pipelining to happen when generating a matmul with fp8e4b15 inputs. Also clean useless code in the frontend.

ThomasRaoux force-pushed the fp8_bugfix branch from 1606905 to 3c0e241 Compare April 1, 2024 08:44

pawelszczerbuk reviewed Apr 1, 2024

pawelszczerbuk approved these changes Apr 1, 2024

Address review comments

5b624fc

ThomasRaoux merged commit ea40df4 into triton-lang:main Apr 1, 2024
5 checks passed

jlebar reviewed Apr 1, 2024

ThomasRaoux added a commit to ThomasRaoux/triton that referenced this pull request Apr 1, 2024

[NFC] Address review comments from triton-lang#3521

2cd1e17

jlebar reviewed Apr 1, 2024

ThomasRaoux mentioned this pull request Apr 1, 2024

[NFC] Address post-commit review comments #3526

Merged

ThomasRaoux added a commit that referenced this pull request Apr 1, 2024

[NFC] Address post-commit review comments (#3526)

b5115a2
