Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ttgir line number instead of source line number to help matching from ttgir to llir/ptx #3458

Merged
merged 5 commits into from
Apr 10, 2024

Conversation

manman-ren
Copy link
Collaborator

@manman-ren manman-ren commented Mar 25, 2024

This can help us analyzing the mapping from ttgir to llir/ptx. When used with performance tools, it can provide a breakdown on ttgir instructions.
Added one env variable:
"USE_TTGIR_LOC": will not emit location information in the dumped ttgir, will re-parse the ttgir file and use ttgir line numbers as debug info
When running on vector-add with USE_TTGIR_LOC=1:
ttgir: #loc = loc("/cache-path/add_kernel.ttgir":2:1)
llir: !3 = !DIFile(filename: "add_kernel.ttgir", directory: "/cache-path")

@manman-ren manman-ren requested a review from ptillet as a code owner March 25, 2024 16:55
@manman-ren
Copy link
Collaborator Author

Open for discussions. CC @Jokeren @bertmaher

@Jokeren
Copy link
Contributor

Jokeren commented Mar 25, 2024

DISABLE_LOC_DUMP is fine to me, but it might be better if DISABLE_LOC_DUMP refers to disabling all location information.
DISABLE_LOC_DUMP = 0, USE_TTGIR_LOC = 1, ttgir loc info.
DISABLE_LOC_DUMP = 1, USE_TTGIR_LOC = 1, no loc info.
DISABLE_LOC_DUMP = 0, USE_TTGIR_LOC = 0, python loc info.
DISABLE_LOC_DUMP = 1, USE_TTGIR_LOC = 0, no loc info.

@@ -276,6 +276,12 @@ def compile(src, target=None, options=None):
print(f"\nOverriding kernel with file {ir_filename}")
full_name = fn_override_manager.get_file(ir_filename)
next_module = parse(full_name, ext, context)
# use an env variable to parse ttgir from file
use_ttgir_loc = os.environ.get("USE_TTGIR_LOC", "0") == "1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this line outside of the loop

use_ttgir_loc = os.environ.get("USE_TTGIR_LOC", "0") == "1"
if use_ttgir_loc and ext == "ttgir":
ttgir_full_name = fn_cache_manager.get_file(ir_filename)
next_module = parse(ttgir_full_name, ext, context)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why parse it again if next_module is already a ttgir module?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to parse it again to get the line numbers of the ttgir file :] When there is no #loc in the ttgir file, parsing it again will set the location of ops in the ttgir to the line number in the ttgir file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it now. Thanks.

I think the following behavior might make more sense as I described above.

DISABLE_LOC_DUMP = 0, USE_TTGIR_LOC = 1, ttgir loc info.
DISABLE_LOC_DUMP = 1, USE_TTGIR_LOC = 1, no loc info.
DISABLE_LOC_DUMP = 0, USE_TTGIR_LOC = 0, python loc info.
DISABLE_LOC_DUMP = 1, USE_TTGIR_LOC = 0, no loc info.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bert is probably on PTO this week

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it now. Thanks.

I think the following behavior might make more sense as I described above.

DISABLE_LOC_DUMP = 0, USE_TTGIR_LOC = 1, ttgir loc info.
DISABLE_LOC_DUMP = 1, USE_TTGIR_LOC = 1, no loc info.
DISABLE_LOC_DUMP = 0, USE_TTGIR_LOC = 0, python loc info.
DISABLE_LOC_DUMP = 1, USE_TTGIR_LOC = 0, no loc info.

Yeah this makes sense. But we also have TRITON_DISABLE_LINE_INFO which decides whether we will have locinfo attached. DISABLE_LOC_DUMP is for whether or not to dump the locinfo when locinfo exists.

The question is when DISABLE_LOC_DUMP = 1 and TRITON_DISABLE_LINE_INFO = 0, should we have locinfo attached but not dumped?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to only use TRITON_DISABLE_LINE_INFO but not DISABLE_LOC_DUMP? I'm a bit worried about the growth of env flags.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to only use TRITON_DISABLE_LINE_INFO but not DISABLE_LOC_DUMP? I'm a bit worried about the growth of env flags.

If we set TRITON_DISABLE_LINE_INFO to 1, there will be no debug info/location info. I wonder if it makes sense to get rid of DISABLE_LOC_DUMP and only use USE_TTGIR_LOC, when USE_TTGIR_LOC is true, we will not dump location info in the cached ttgir so when we parse it again, it will use the ttgir line number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that seems better to me

@manman-ren manman-ren force-pushed the use-ttgir-line-number branch from 8f8e7a0 to f2342fe Compare April 3, 2024 22:34
@manman-ren
Copy link
Collaborator Author

@Jokeren Can you take a look at this diff again? Thanks!

Copy link
Contributor

@Jokeren Jokeren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one last comment. And maybe you want to refactor the commit message.

python/src/ir.cc Outdated
@@ -491,7 +495,8 @@ void init_triton_ir(py::module &&m) {
throw std::runtime_error("Parse MLIR file failed.");
// locations are incompatible with ptx < 7.5 !
module->walk([](Operation *op) {
op->setLoc(UnknownLoc::get(op->getContext()));
if (!::triton::tools::getBoolEnv("USE_TTGIR_LOC"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it outside of walk

@manman-ren manman-ren force-pushed the use-ttgir-line-number branch from f2342fe to 8558685 Compare April 5, 2024 20:47
@manman-ren
Copy link
Collaborator Author

@Jokeren Can you take another look? Thanks!

@jlebar
Copy link
Collaborator

jlebar commented Apr 9, 2024

Can you add this to the README in the new environment variables section?

@manman-ren
Copy link
Collaborator Author

Can you add this to the README in the new environment variables section?

Yes! I will. Saw your additions to the README for debugging env variables.

mren2 added 5 commits April 9, 2024 17:16
This can help use analyzing the mapping from ttgir to llir/ptx. When used with performance
tools, it can provide a breakdown on ttgir instructions.

Added two env variables:
"DISABLE_LOC_DUMP": the cached IR files will not have location info
"USE_TTGIR_LOC": we will re-parse the ttgir file and use ttgir line numbers

When running on vector-add:
ttgir: #loc = loc("/cache-path/add_kernel.ttgir":2:1)
llir: !3 = !DIFile(filename: "add_kernel.ttgir", directory: "/cache-path")
@manman-ren manman-ren force-pushed the use-ttgir-line-number branch from 8558685 to 93a91f5 Compare April 10, 2024 00:20
@jlebar jlebar enabled auto-merge (squash) April 10, 2024 00:22
@jlebar jlebar merged commit efbf37f into triton-lang:main Apr 10, 2024
5 checks passed
@manman-ren manman-ren deleted the use-ttgir-line-number branch May 2, 2024 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants