Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add measureme integration for profiling the interpreted program #1791

Merged
merged 6 commits into from
May 30, 2021

Conversation

Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented May 8, 2021

This PR uses the measureme crate to profile the call stack of the
program being interpreted by Miri. This is accomplished by starting a
measureme 'event' when we enter a function call, and ending the event
when we exit the call. The measureme tooling can be used to produce a
call stack from the generated profile data.

Limitations:

  • We currently record every single entry/exit. This might generate very
    large profile outputs for programs with a large number of function
    calls. In follow-up work, we might want to explore sampling (e.g. only
    recording every N function calls).
  • This does not integrate very well with Miri's concurrency support.
    Each event we record starts when we push a frame, and ends when we pop
    a frame. As a result, the timing recorded for a particular frame will include all of the work Miri does before that frame completes, including executing another thread.

The measureme integration is off by default, and must be enabled via
-Zmiri-measureme=<output_name>

@Aaron1011
Copy link
Member Author

Sample output:

fn main() {
    for i in 0..1000 {
        let a = i.to_string();
    }
    println!("Hello, world!");
}

produces the following Chromium performance profile (generated using the crox tool from https://github.com/rust-lang/measureme):

miri_profile

@oli-obk
Copy link
Contributor

oli-obk commented May 8, 2021

* This does not integrate very well with Miri's concurrency support.
  Each event we record starts when we push a frame, and ends when we pop
  a frame. As a result, switching between virtual threads will cause
  events from different threads to be interleaved. Additionally, the
  recorded for a particular frame will include all of the work Miri does
  before that frame completes, including executing another thread.

how is this done in real executions with threads? Can we reuse the same infrastructure?

@Aaron1011
Copy link
Member Author

how is this done in real executions with threads? Can we reuse the same infrastructure?

You can pass a thread_id parameter to measureme. I don't really know what it does, or what timings actually look like on a multi-threaded compiler. However, I suspect that the fact that the OS schedules the threads makes it easier to get accurate counts, since things like instruction counts will automatically do the right thing. Miri is responsible for 'scheduling' all of its virtual threads, so we'll probably need to give measureme the ability to 'pause' and 'unpause' an interval event.

@RalfJung
Copy link
Member

RalfJung commented May 8, 2021

Miri has "thread IDs", why can't we pass those to measureme?

@Aaron1011
Copy link
Member Author

The PR now passes the active thread ID to measureme, which appears to do the right thing:

multithread_miri

I suspect that it will take more work to get accurate timing counts when the execution is interleaved, but the stacks should now be shown separately.

@RalfJung
Copy link
Member

The PR now passes the active thread ID to measureme, which appears to do the right thing:

That's amazing. :)
Could you update the PR description? I assume at least one of the limitations is resolved by this.

Cargo.toml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
src/eval.rs Outdated Show resolved Hide resolved
src/machine.rs Outdated Show resolved Hide resolved
src/machine.rs Outdated Show resolved Hide resolved
src/machine.rs Outdated Show resolved Hide resolved
src/machine.rs Outdated Show resolved Hide resolved
src/machine.rs Outdated Show resolved Hide resolved
@bors
Copy link
Contributor

bors commented May 16, 2021

☔ The latest upstream changes (presumably #1801) made this pull request unmergeable. Please resolve the merge conflicts.

Aaron1011 added 3 commits May 29, 2021 17:01
This PR uses the `measureme` crate to profile the call stack of the
program being interpreted by Miri. This is accomplished by starting a
measureme 'event' when we enter a function call, and ending the event
when we exit the call. The `measureme` tooling can be used to produce a
call stack from the generated profile data.

Limitations:
* We currently record every single entry/exit. This might generate very
  large profile outputs for programs with a large number of function
  calls. In follow-up work, we might want to explore sampling (e.g. only
  recording every N function calls).
* This does not integrate very well with Miri's concurrency support.
  Each event we record starts when we push a frame, and ends when we pop
  a frame. As a result, switching between virtual threads will cause
  events from different threads to be interleaved. Additionally, the
  recorded for a particular frame will include all of the work Miri does
  before that frame completes, including executing another thread.

The `measureme` integration is off by default, and must be enabled via
`-Zmiri-measureme=<output_name>`
@Aaron1011
Copy link
Member Author

@RalfJung I've addressed your comments

src/machine.rs Outdated Show resolved Hide resolved
@RalfJung
Copy link
Member

This is a great step forward for performance debugging in Miri, thanks a lot @Aaron1011. :)
@bors r+

@bors
Copy link
Contributor

bors commented May 30, 2021

📌 Commit c89a5d6 has been approved by RalfJung

@bors
Copy link
Contributor

bors commented May 30, 2021

⌛ Testing commit c89a5d6 with merge 178ae8e...

@bors
Copy link
Contributor

bors commented May 30, 2021

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 178ae8e to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants