Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use llvm-symbolizer's JSON output for symbolizing #879

Merged
merged 2 commits into from
Jul 27, 2024

Conversation

LeszekSwirski
Copy link
Contributor

In some edge cases (e.g. injected JIT symbols), function names can have
new lines. This breaks the llvm-symbolizer output parsing, and makes
pprof hang.

Conveniently, as of LLVM 13, llvm-symbolizer has a JSON output mode,
which is robust against all kinds of weirdness like new lines. We can
use this instead of the line-based parsing, and as a bonus we get much
simpler handling of multiple frames in a stack, as the JSON output
already returns these as an array.

This also requires splitting the CODE and DATA processing into separate
functions, since their JSON output is incompatible. For now, we keep the
DATA output as before, a slightly hacky but functional concatenation of
start + size, but this could be improved.

In some edge cases (e.g. injected JIT symbols), function names can have
new lines. This breaks the llvm-symbolizer output parsing, and makes
pprof hang.

Conveniently, as of LLVM 13, llvm-symbolizer has a JSON output mode,
which is robust against all kinds of weirdness like new lines. We can
use this instead of the line-based parsing, and as a bonus we get much
simpler handling of multiple frames in a stack, as the JSON output
already returns these as an array.

This also requires splitting the CODE and DATA processing into separate
functions, since their JSON output is incompatible. For now, we keep the
DATA output as before, a slightly hacky but functional concatenation of
start + size, but this could be improved.
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 77.08333% with 11 lines in your changes missing coverage. Please review.

Project coverage is 67.00%. Comparing base (0ed6a68) to head (9c35335).
Report is 31 commits behind head on main.

Files Patch % Lines
internal/binutils/addr2liner_llvm.go 77.08% 8 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #879      +/-   ##
==========================================
+ Coverage   66.86%   67.00%   +0.13%     
==========================================
  Files          44       44              
  Lines        9824     9798      -26     
==========================================
- Hits         6569     6565       -4     
+ Misses       2794     2780      -14     
+ Partials      461      453       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@aalexand aalexand merged commit 813a5fb into google:main Jul 27, 2024
31 checks passed
@insilications
Copy link
Contributor

insilications commented Aug 11, 2024

With this merged, I think #823 can be easily solved? The same JSON frame data from llvm-symbolizer that is being read can provide StartLine for Function.start_line.

If this change is welcome, I can create a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants