
[AMD backend] Fix unit test test_dot_without_load #3338

Merged: 8 commits from fix_dot_without_load into triton-lang:main on Mar 12, 2024

Conversation

@scxiao (Contributor) commented Mar 11, 2024:

This PR fixes the unit test `test_dot_without_load` by changing the dot op representation from `vector<vector<type>>` to `vector<type>`, so the `constantOp` can be converted to the mfma `dot_op` layout with the existing lowering code.

@scxiao requested a review from ptillet as a code owner on March 11, 2024 18:29
```diff
@@ -87,7 +87,7 @@ Type TritonGPUToLLVMTypeConverter::getElementTypeForStruct(
     return elemTy;
   if (auto mfmaParent =
           dotOpLayout.getParent().dyn_cast<AMDMfmaEncodingAttr>()) {
-    return vec_ty(elemTy, dotOpLayout.getKWidth());
+    return elemTy;
```
Collaborator:

Since this is not a special case anymore, we can just remove it.

scxiao (Author):

removed.
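
For readers skimming the thread, here is a small standalone sketch of the before/after behaviour this hunk changes. The name `elementTypeForStruct`, the `kWidth` parameter, and the string-based "types" are stand-ins invented for this illustration; they are not Triton's actual API.

```cpp
#include <string>

// Illustrative sketch only (hypothetical names, not Triton code):
// before the fix, a dot operand with an MFMA parent packed kWidth scalars
// into a vector per struct field, giving the vector<vector<type>> shape
// mentioned in the PR description; after the fix each field is a scalar.
std::string elementTypeForStruct(const std::string &elemTy, unsigned kWidth,
                                 bool isMfmaDotOperand, bool beforeFix) {
  if (isMfmaDotOperand && beforeFix) {
    // Old behaviour: e.g. "vector<4 x f16>" per struct field.
    return "vector<" + std::to_string(kWidth) + " x " + elemTy + ">";
  }
  // New behaviour: plain element type, e.g. "f16", so a splat constant can
  // be lowered into the mfma dot_op layout by the existing code path.
  return elemTy;
}
```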

```diff
@@ -43,7 +43,15 @@ namespace gpu {
 unsigned getTotalElemsPerThread(Attribute layout, ArrayRef<int64_t> shape,
                                 Type eltTy) {
   if (auto tritonGPUAttr = layout.dyn_cast<TritonGPU_AttrTrait>()) {
-    return tritonGPUAttr.getTotalElemsPerThread(shape, eltTy);
+    unsigned elemNum = tritonGPUAttr.getTotalElemsPerThread(shape, eltTy);
```
@zhanglx13 (Collaborator) commented Mar 11, 2024:

I think it's better not to touch this "interface" function. The changes are for a dotOp with mfma as its parent anyway. Can we move the changes into `DotOperandEncodingAttr::getTotalElemsPerThread`?

scxiao (Author):

Thanks. Changed accordingly.
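
The exchange above is about where the per-thread element count gets adjusted now that dot-operand values are no longer packed into kWidth-wide vectors. As a rough standalone illustration, assuming a simple kWidth scaling rule (the function name and the rule itself are assumptions made for this sketch, not the code that landed in the PR):

```cpp
#include <iostream>

// Hypothetical helper mirroring the idea discussed above: keep the generic
// interface untouched and do the MFMA-specific adjustment with the
// dot-operand layout. The kWidth scaling is an assumption based on the
// vector<vector<type>> -> vector<type> change described in this PR.
unsigned dotOperandTotalElemsPerThread(unsigned packedElemsPerThread,
                                       unsigned kWidth, bool parentIsMfma) {
  if (parentIsMfma) {
    // Each former kWidth-wide vector element now counts as kWidth scalars.
    return packedElemsPerThread * kWidth;
  }
  return packedElemsPerThread;
}

int main() {
  // Example: 8 packed elements per thread, kWidth = 4 -> 32 scalar elements.
  std::cout << dotOperandTotalElemsPerThread(8, 4, /*parentIsMfma=*/true)
            << "\n";
}
```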

@scxiao force-pushed the fix_dot_without_load branch from 694cd67 to eda0398 on March 11, 2024 19:30
@zhanglx13 requested a review from zahimoud on March 11, 2024 19:39
@scxiao (Author) commented Mar 12, 2024:

Hi @ThomasRaoux, could you please take a look at this PR to see if you have any comments? Thanks.

@ThomasRaoux (Collaborator) left a comment:

LGTM

@ThomasRaoux merged commit e0e5a36 into triton-lang:main on Mar 12, 2024
4 checks passed
@scxiao (Author) commented Mar 12, 2024:

> looks like this fails CI: https://github.com/openai/triton/actions/runs/8250959347/job/22566858055

OK, let me check that. Sorry about that.

zhanglx13 added a commit that referenced this pull request Mar 12, 2024
zhanglx13 pushed a commit that referenced this pull request Mar 13, 2024
This PR fixes the regression introduced by the reverted PR #3338, which broke the test `test_masked_load_shared_memory`. The cause is the type used when packing the dot_op for bfloat16: we should use `i16` for `bf16` when packing the dot_op for mfma.

This time I ran all the tests in `test_core.py` locally, and they all pass.
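
As a rough illustration of the packing-type point in this commit message: bf16 values are carried as their 16-bit integer payload (`i16`) when packed for the mfma path, instead of being treated as a float type. The helper below is hypothetical, written only to make the rule concrete; it is not the follow-up PR's code.

```cpp
#include <string>

// Hypothetical helper (not the follow-up PR's code): choose the storage
// type used when packing a dot-operand element for the mfma path.
std::string mfmaPackingType(const std::string &elemTy) {
  if (elemTy == "bf16")
    return "i16"; // pack bfloat16 as its 16-bit integer payload
  return elemTy;  // other element types are packed with their own type
}
```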
htyu pushed a commit to htyu/triton that referenced this pull request Mar 20, 2024
htyu pushed a commit to htyu/triton that referenced this pull request Mar 20, 2024
htyu pushed a commit to htyu/triton that referenced this pull request Mar 20, 2024
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024