Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FRONTEND] refactor compiler submodule #2701

Merged
merged 67 commits into from
Nov 30, 2023
Merged

[FRONTEND] refactor compiler submodule #2701

merged 67 commits into from
Nov 30, 2023

Conversation

ptillet
Copy link
Collaborator

@ptillet ptillet commented Nov 25, 2023

No description provided.

@ptillet ptillet closed this Nov 25, 2023
@ptillet ptillet reopened this Nov 26, 2023
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 19, 2023
…iton compile interface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 19, 2023
…erface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 19, 2023
…iton compile interface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 19, 2023
…erface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 20, 2023
…iton compile interface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 20, 2023
…erface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 20, 2023
…iton compile interface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 20, 2023
…erface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 20, 2023
…iton compile interface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 20, 2023
…erface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Dec 21, 2023
…5878)

Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.

Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

Pull Request resolved: #115878
Approved by: https://github.com/jansel
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 21, 2023
…iton compile interface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
shunting314 added a commit to pytorch/pytorch that referenced this pull request Dec 21, 2023
…erface"


Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.


Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Dec 22, 2023
)

Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.

Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

Pull Request resolved: #115878
Approved by: https://github.com/jansel
facebook-github-bot pushed a commit to pytorch/benchmark that referenced this pull request Jan 2, 2024
Summary:
Recent 2 triton PRs (triton-lang/triton#2701, triton-lang/triton#2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API.

Also there is some simplification between compilation call in subprocess and the one in main process
- previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that
- previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process.

Updated:
There are more interface change from triton side. E.g.
- tl.math.{min, max} now requires a propagate_nan argument
- JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton.
- triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly.

X-link: pytorch/pytorch#115878
Approved by: https://github.com/jansel

Reviewed By: jeanschmidt

Differential Revision: D52390214

Pulled By: shunting314

fbshipit-source-id: aca5d42e5977373869719564dc570774d5db1642
feihugis pushed a commit to feihugis/triton that referenced this pull request Feb 13, 2024
oraluben added a commit to oraluben/triton that referenced this pull request Mar 12, 2024
jlebar pushed a commit that referenced this pull request Mar 12, 2024
shunting314 pushed a commit to shunting314/triton that referenced this pull request Mar 14, 2024
shunting314 pushed a commit to shunting314/triton that referenced this pull request Mar 14, 2024
shunting314 pushed a commit to shunting314/triton that referenced this pull request Mar 19, 2024
ptillet added a commit that referenced this pull request Mar 19, 2024
Cherry pick a few PRs to make PyTorch 2.3 work with Triton 2.2.x . Also
this makes sure some python bindings needed by user defined triton
kernel exists in the triton branch.

---------

Co-authored-by: Philippe Tillet <phil@openai.com>
Co-authored-by: Manman Ren <manman.ren@gmail.com>
Co-authored-by: Manman Ren <mren@meta.com>
htyu pushed a commit to htyu/triton that referenced this pull request Mar 20, 2024
ptillet added a commit that referenced this pull request Apr 1, 2024
Cherry pick a few PRs to make PyTorch 2.3 work with Triton 2.2.x . Also
this makes sure some python bindings needed by user defined triton
kernel exists in the triton branch.

---------

Co-authored-by: Philippe Tillet <phil@openai.com>
Co-authored-by: Manman Ren <manman.ren@gmail.com>
Co-authored-by: Manman Ren <mren@meta.com>
binarman pushed a commit to binarman/triton that referenced this pull request Apr 2, 2024
ptillet added a commit that referenced this pull request Apr 3, 2024
Cherry pick a few PRs to make PyTorch 2.3 work with Triton 2.2.x . Also
this makes sure some python bindings needed by user defined triton
kernel exists in the triton branch.

---------

Co-authored-by: Philippe Tillet <phil@openai.com>
Co-authored-by: Manman Ren <manman.ren@gmail.com>
Co-authored-by: Manman Ren <mren@meta.com>
karupayun pushed a commit to openxla/triton that referenced this pull request Apr 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant