Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print the detailed type on heap snapshot #47503

Merged
merged 3 commits into from
Nov 23, 2022

Conversation

gbaraldi
Copy link
Member

@gbaraldi gbaraldi commented Nov 9, 2022

This adds even more details to the node. It might even be too much.
It would be cool to be able to still group them, but keep the extra details.
Fixes #47502

@IanButterworth
Copy link
Sponsor Member

I think this is the right approach. Grouping can be handled by the viewer tooling.

Though, what impact does this have on collection time?

@gbaraldi
Copy link
Member Author

gbaraldi commented Nov 9, 2022

I haven't checked, if it's too much I can just yank the part of the code that specifically does the datatype printing. Though since this function already used to fallback to static_show I don't expect much of a difference.

@IanButterworth
Copy link
Sponsor Member

master

$ ./julia --startup-file=no
...
julia> @time Profile.take_heap_snapshot()
  2.125155 seconds (37 allocations: 3.156 KiB, 45.84% gc time)
"/home/ian/Documents/GitHub/julia2/julia/114986_40155041450905.heapsnapshot"

julia> @time Profile.take_heap_snapshot()
  2.156030 seconds (37 allocations: 3.156 KiB, 46.53% gc time)
"/home/ian/Documents/GitHub/julia2/julia/114986_40159572227999.heapsnapshot"

julia> @time Profile.take_heap_snapshot()
  2.110962 seconds (37 allocations: 3.156 KiB, 45.86% gc time)
"/home/ian/Documents/GitHub/julia2/julia/114986_40163564691475.heapsnapshot"

this PR

$ ./julia --startup-file=no
...
julia> @time Profile.take_heap_snapshot()
  2.907817 seconds (37 allocations: 3.117 KiB, 56.23% gc time)
"/home/ian/Documents/GitHub/julia/109127_37985089277755.heapsnapshot"

julia> @time Profile.take_heap_snapshot()
  2.911053 seconds (37 allocations: 3.117 KiB, 56.54% gc time)
"/home/ian/Documents/GitHub/julia/109127_37989526095958.heapsnapshot"

julia> @time Profile.take_heap_snapshot()
  2.872300 seconds (37 allocations: 3.117 KiB, 56.00% gc time)
"/home/ian/Documents/GitHub/julia/109127_37993550442678.heapsnapshot"

@gbaraldi
Copy link
Member Author

gbaraldi commented Nov 9, 2022

That's quite a bit. I guess yanking the code might be worth it then. Though it's also adding a lot more information so I guess it also makes sense that it takes longer.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Nov 9, 2022

Seems okay. It is not expected to be fast

Comment on lines 220 to 225
jl_static_show(str, a);
name = StringRef((const char*)str_.buf, str_.size);
self_size = sizeof(jl_datatype_t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should just use static show for everything in this function, instead of having all these custom if-statements for printing??

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static_show is a bit slow, and for some of these objects there isn't that much to be gained. Also static_show might be too specific in some cases, which might actually hinder usability

Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excelllllllent! :) Thanks @gbaraldi!

@gbaraldi
Copy link
Member Author

gbaraldi commented Nov 9, 2022

We might want to add more specific objects to record_node_to_gc_snapshot so that they are grouped by type not also by name.

@@ -446,7 +453,6 @@ static inline void _record_gc_edge(const char *node_type, const char *edge_type,
auto to_node_idx = record_node_to_gc_snapshot(b);

auto &from_node = g_snapshot->nodes[from_node_idx];
from_node.type = g_snapshot->node_types.find_or_create_string_id(node_type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 what was in the node_type param here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It got passed into the function. It's an argument to it, I still need to remove it :)

? sizeof(jl_array_t)
: (size_t)jl_datatype_size(type);

self_size = (size_t)jl_datatype_size(type);
// print full type into ios buffer and get StringRef to it.
// The ios is cleaned up below.
ios_need_close = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so what do the else-case nodes get for node_type? "object", still?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though i want to catch all the cases if possible.

@gbaraldi
Copy link
Member Author

Do we want to keep "normal" julia objects as "objects" or do we want to say they are for example jl_value_t so they can be grouped up? Or more generally, what level of granularity do we want. I think types, symbols, strings, arrays and the rest might be fine. Since we are static_showing most things now they aren't as grouped as before, so adding ways of grouping them up might be interesting.

@NHDaly
Copy link
Member

NHDaly commented Nov 11, 2022

yeah, interesting question. i'm not sure.

I think i'd probably say that keeping normal julia objects as "object" is the right call. For users who aren't using this tool to profile Julia itself, but are rather using this tool to profile their applications, i think it would be annoying to group all objects into one jl_value_t bucket, since that's where all of their valuable data is.

So i think keeping them as objects is right. 👍

@NHDaly
Copy link
Member

NHDaly commented Nov 16, 2022

Are you waiting for another review, @gbaraldi? Do you need more input anywhere?

@gbaraldi
Copy link
Member Author

The changes are fine, I just need to do some cleanup and it should be good to merge

@vchuravy vchuravy added the backport 1.9 Change should be backported to release-1.9 label Nov 16, 2022
@vtjnash
Copy link
Sponsor Member

vtjnash commented Nov 17, 2022

If performance is too much of an issue, we could attempt replacing find_or_create_string_id with something that instead maps a jl_value_t to a more-lazy-printed string, so that we only make the string once for each type

@IanButterworth
Copy link
Sponsor Member

@gbaraldi were you planning to try Jameson's idea out, or should we merge this?

@gbaraldi
Copy link
Member Author

I don't think there is much benefit into memoizing the strings here, we have repeated types, but they are also the simpler ones to print.

@IanButterworth IanButterworth merged commit 27ebaa7 into JuliaLang:master Nov 23, 2022
KristofferC pushed a commit that referenced this pull request Nov 28, 2022
@KristofferC KristofferC mentioned this pull request Dec 14, 2022
26 tasks
@KristofferC KristofferC removed the backport 1.9 Change should be backported to release-1.9 label Dec 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Heap Snapshot node names aren't parameterized for Types
6 participants