Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backend] hoist ttg.local_alloc only if its valid #3449

Merged
merged 1 commit into from
Mar 25, 2024

Conversation

chenyang78
Copy link
Contributor

We cannot hoist a triton_gpu.local_alloc op if its init argument is declared in the same scope as local_alloc's containing for loop.

@chenyang78 chenyang78 requested a review from ptillet as a code owner March 24, 2024 22:50
Copy link
Collaborator

@ThomasRaoux ThomasRaoux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. Can you change the condition as suggested.

Comment on lines 58 to 64
if (auto allocOp = dyn_cast<ttg::LocalAllocOp>(op)) {
// We cannot hoist the allocOp if any of its operands
// is declared in the forOp's body.
if (llvm::any_of(allocOp->getOperands(),
[&](Value operand) {
return operand.getParentBlock() == forOp.getBody();
}))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually can we skip any allocation that has an init value? I thought I did that but looks like I missed it.
Basically:

if (!allocOp.getInit())
  continue;

Copy link
Contributor Author

@chenyang78 chenyang78 Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick review, Thomas! Would it be possible that the init argument of the allocOp comes from the parent scope of the forOp? If that was the case, we might want something like the following?

      auto init = allocOp.getInit();
      if (!init || init.getParentBlock() != forOp.getBody())
        toHoist.push_back(&op);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would work but I don't think that's what we want to do. The idea here is to hoist the alloc created by the inner loop pipelining, we shouldn't try to move the other alloc ops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Make sense. Updated the PR. Thanks!

We should only hoist triton_gpu.local_alloc created by the inner
loop pipelining.

This PR fixed issue triton-lang#3431
Copy link
Collaborator

@ThomasRaoux ThomasRaoux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ThomasRaoux ThomasRaoux merged commit 67eca4a into triton-lang:main Mar 25, 2024
5 checks passed
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024
We cannot hoist a triton_gpu.local_alloc op if its init argument is
declared in the same scope as local_alloc's containing for loop.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants