
Add thread_ids::Vector option to Profile.init() #39746

Closed
wants to merge 1 commit from nhd-profile-threadid_mask

Conversation

@NHDaly (Member) commented Feb 19, 2021

  • This option will configure julia's profiler to only run on the
    provided thread ids! :)

  • Adds a global int mask to allow toggling profiling for up to 64
    threads in a performant way. I think if you have more than 64 threads,
    it's okay that you can't profile individual threads since it's
    unlikely to be very meaningful by then...

Fixes #39743

@NHDaly NHDaly marked this pull request as draft February 19, 2021 04:45
@NHDaly NHDaly requested a review from vchuravy February 19, 2021 04:45
Review comment on the diff (context):

```julia
n_cur = ccall(:jl_profile_maxlen_data, Csize_t, ())
delay_cur = ccall(:jl_profile_delay_nsec, UInt64, ())/10^9
if n === nothing && delay === nothing
    _init_threadid_filter(thread_ids)
```

@NHDaly (Member Author) commented:

Hmm, now that I think about it, I guess thread_ids should be "sticky," too, like the other params? I was trying to be super safe here to make sure it's always reset back to normal to preserve backwards compatibility, but I think that if someone is using this, it would make more sense for it to work like the others.

I can change this in the morning. :)

Should this also get added to the return value? Would we want to return the mask or the array? If we want to return the array, I can either add a getter for the mask and reconstruct the vector from it, or we can store the vector as a global in Julia, in parallel with the mask. I think a getter for the mask makes more sense.

```julia
end
threadid_mask |= 0x1 << (tid-1)
end
ccall(:jl_profile_init_threadid_filter, Cvoid, (UInt64,), threadid_mask)
```

@NHDaly (Member Author) commented:

Oops, we should move this outside the if block, so that we can reset the filter when passing an empty vector.

@vchuravy (Member) commented:

Hmm, I would rather record on which thread the backtrace was collected. In pprof I think you can add that as a field. I suspect that would be more useful than limiting profiling to a specific set of threads (how do you know that they are of more interest than the others?).

@NHDaly (Member Author) commented Feb 22, 2021

Hrmmm yeah, you're maybe right.

... Can you filter out by threadid in pprof if we add it as a field? I guess in the worst case we could do it in our julia API even if you can't do it through the GUI.

Hehe, blech, but Valentin, it will be so much harder to make the changes to pipe this through everywhere! This is the easier/lazier option 😅

Still, I think you're probably right that this would be a nice thing to record, even if we didn't want to filter by it... it'll just be more work to get that set up, I think.

how do you know that they are of more interest than the others?

Yeah, I was basically only imagining using this on thread 1, because e.g. if you have some tasks that you only schedule via @async, you can be sure that they'll only run there (though of course you don't know what else might schedule there).

In this case, we were screwing around with trying to fix our server's responsiveness problems with this silly package: https://github.com/RelationalAI-oss/DevilSpawn.jl. But it didn't actually have much impact on responsiveness, and I was curious to see why: in theory we shouldn't be scheduling (as much) stuff on the main thread, so I was interested to see what was still scheduling there and blocking the thread.

@IanButterworth (Member) commented Jul 29, 2021

Thanks @NHDaly, this issue bugs me too (julia-vscode/julia-vscode#1881)

I think there's an argument that it would be good to have both the ability to select which threads to profile, and add the thread info into the backtraces.
Selecting which threads to profile in advance would save on CPU overhead and use up less of the profile buffer.

Given selecting which threads to profile is relatively simple and wouldn't need a change of any of the profile data reading downstream code, could this PR be considered before both are implemented?

@NHDaly I'd be happy to test this. Before I do, though, there seem to be some pending changes, and a rebase is needed given how much has changed since this was opened.

@NHDaly (Member Author) commented Jul 29, 2021

+1 Thanks @IanButterworth, I agree! I think those are good points. Especially if you're running on a system with close to a hundred threads, you'd need a super big buffer, and if you know ahead of time which threads you want to profile, it can help a lot to apply this filter up front 👍


Let's try to pick this up again 👍

@vchuravy (Member) commented:

and if you know ahead of time which threads you want to profile, it can help a lot to apply this filter up front +1

With migration, how do you know? I maintain my objection xD

@NHDaly (Member Author) commented Jul 29, 2021

With migration, how do you know? I maintain my objection xD

A few possible reasons:

  1. Because some people are (naughtily) running their tasks with @async
  2. You might be interested in just sampling a smaller number of threads to get a smaller profile? 🤷
  3. Assuming that there's some resolution to Severe thread starvation issues #41586 that involves a separate thread pool for "high-priority / low-latency tasks", we could want to profile only that thread pool. (This is the main motivation for me, personally, fwiw. We've been experimenting with this and it seems to be helping our liveness quite a bit.)

@vchuravy (Member) commented:

  1. Those folks don't know either, since they don't know from which thread their @async code got called.
  2. Sure, so can we make sampling more efficient?
  3. Then we should add it on a per thread-pool level once/if we have that.

(Nathan, I just really want to nerd-snipe you into doing the hard work xD)

@NHDaly (Member Author) commented Jan 2, 2022

Now that we've added thread_ids to the profile results (#41742), I think we can close this. Filtering by thread could now be done client-side in the profile viewers. :)

Thanks for pushing, @vchuravy 😉

@NHDaly NHDaly closed this Jan 2, 2022
@NHDaly NHDaly deleted the nhd-profile-threadid_mask branch January 2, 2022 16:39
Linked issue: Profiling specific threads?
3 participants