Merge pull request triton-lang#9 from openai/keren/todo
[TODO] Add a bunch of TODOs
Jokeren authored Jul 18, 2023
2 parents 1ada046 + 5af7bec commit 9db22d9

## cleanups

* Hard-coded alignment; it is unclear why it exists and why it has that particular value
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Analysis/Allocation.cpp#L165
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Analysis/Allocation.cpp#L176
* Shared memory buffer sizes should be determined before graph coloring
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Analysis/Allocation.cpp#L493
* Don't use any LLVM dialects in TritonGPU conversions
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Analysis/Membar.cpp#L110
* We should have a more general barrier op
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Analysis/Membar.cpp#L180
* tt.load's verifier has the `allowTensorPointerType` flag, which skips the encoding check when the pointer is a tensor pointer. This is a hack that should be improved: the flag's purpose is unclear, and violating the encoding is not a good idea.
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Dialect/Triton/IR/Traits.cpp#L10
* Don't call the arith-to-LLVM conversion from MLIR
* The linearize/delinearize helpers have been duplicated (most likely due to layering problems). They should be merged.
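For reference, the duplicated linearize/delinearize helpers implement the standard row-major mapping between a multi-dimensional index and a flat offset. A minimal Python sketch of the expected semantics (the names and signatures here are illustrative, not Triton's actual API):

```python
def linearize(indices, shape):
    """Fold a multi-dimensional index into a flat row-major offset."""
    offset = 0
    for i, n in zip(indices, shape):
        offset = offset * n + i
    return offset

def delinearize(offset, shape):
    """Inverse of linearize: recover the multi-dimensional index."""
    indices = []
    for n in reversed(shape):
        indices.append(offset % n)
        offset //= n
    return list(reversed(indices))
```

Any merged helper should satisfy `delinearize(linearize(idx, shape), shape) == idx` for all in-bounds indices.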

## bug fixes

* Currently the lowering of InsertSliceAsyncV2Op doesn't work if the mask is not a scalar, even though the op's semantics allow it.
* smem base is being passed to the patterns. This seems broken when using function calls.
* Revisit the constraint for the RewriteTensorPointer pass
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Dialect/TritonGPU/Transforms/RewriteTensorPointer.cpp#L644
* The IR output of `make_tensor_ptr` is wrong. `!tt.ptr` is ignored in the output.
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/include/triton/Dialect/Triton/IR/TritonOps.td#L562
* We currently rely on the `cuda-python` package, which prevents us from building Triton on any node without CUDA installed. We should instead invoke the TMA-related functions through our thin CUDA wrapper.
https://github.com/openai/triton-hopper/blob/b6a6b32b0ee79e93247d20c95f15fd75039a40b9/python/triton/compiler/utils.py#L3
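The thin-wrapper approach above would typically load the CUDA driver lazily, so that importing the module succeeds even on nodes without CUDA. A hedged `ctypes` sketch of that pattern (the library names tried and the fallback behavior are assumptions, not Triton's actual wrapper):

```python
import ctypes

def load_cuda_driver():
    """Try to load the CUDA driver library at runtime.

    Returns None when no driver is present, so callers can degrade
    gracefully instead of failing at import time like a hard
    dependency on cuda-python would.
    """
    for name in ("libcuda.so", "libcuda.so.1", "nvcuda.dll"):
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None

driver = load_cuda_driver()
if driver is not None:
    # cuInit(0) must be called before any other driver API function.
    status = driver.cuInit(0)
else:
    # No CUDA on this node: TMA helpers stay unavailable, but the
    # build and import still work.
    pass
```

Actual TMA descriptor creation would then go through driver entry points resolved from this handle rather than through `cuda-python`.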
