-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FRONTEND][BACKEND] Cleanup and re-enable optimization with fp8e4b15 #3521
Conversation
…e4b15 Multiple fixes to allow pipelining to happen when generating a matmul with fp8e4b15 inputs. Also clean useless code in the frontend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just minor nits from my side!
@@ -149,6 +149,24 @@ SmallVector<Value> packI32(const SmallVector<Value> &inValues, Type srcTy, | |||
} | |||
return outValues; | |||
} | |||
|
|||
int getNumElmenetPerThreads(Type type, const LLVMTypeConverter *typeConverter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: Elmenet -> Element
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops.. fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just minor nits from my side!
I have a few review comments, please hold on for a sec till I'm out of the interview... |
Thanks Justin, I'll push that for now to unblock my wheel update but please add your comments and I'll send a follow up PR |
@@ -149,6 +149,24 @@ SmallVector<Value> packI32(const SmallVector<Value> &inValues, Type srcTy, | |||
} | |||
return outValues; | |||
} | |||
|
|||
int getNumElmenetPerThreads(Type type, const LLVMTypeConverter *typeConverter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getNumElementsPerThread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already addressed
@@ -149,6 +149,24 @@ SmallVector<Value> packI32(const SmallVector<Value> &inValues, Type srcTy, | |||
} | |||
return outValues; | |||
} | |||
|
|||
int getNumElmenetPerThreads(Type type, const LLVMTypeConverter *typeConverter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is only correct for inline asm, right? If so, can we change the name to indicate that? Also can we add a comment indicating why we do 32/size? (It's because inline asm implicitly packs elements in this way, I believe.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no I think the function should work for any op.
Also can we add a comment indicating why we do 32/size?
sure
// need to reorder them so we iterate over the operands' elements in the | ||
// same logical order. | ||
for (unsigned i = 0; i < unpackedOperands.size(); ++i) { | ||
unpackedOperands[i] = reorderValues( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, where did this call to reorderValues go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was not correct, removed it
address some comments from #3521
Multiple fixes to allow pipelining to happen when generating a matmul with fp8e4b15 inputs. Also clean useless code in the frontend.