Easily find the bottleneck in your Ruby app.
RubyProf::FlameGraphPrinter is a ruby-prof printer
that outputs a fold stack file that's compatible with FlameGraph.
It is created based on RubyProf::CallStackPrinter
.
The result can be passed to FlameGraph to generate an interactive stack trace visualization (click on the image to see the demo).
FlameGraph is a way to visualize stack trace, making it very obvious where in the program takes the longest time. It is a Perl script takes a "fold stack" file and generates a nice, interactive SVG. The fold stack is usually generated from DTrace or Prof data using stackcollapse.pl, which is included with FlameGraph.
I created this gem because I want to find out where the bottleneck is in SlimWiki's specs, but I don't know DTrace and just want the result quick.
To learn more about Flame Graphs, check these out:
- Official Flame Graphs Website by Brendan Gregg
- Node.js in Flames, which is the article that introduced me to flame graph (via Node Weekly Issue #62)
- Blazing Performance with Flame Graphs talk at USENIX/LISA13 (slideshare) (video)
gem 'ruby-prof-flamegraph'
Just require 'ruby-prof-flamegraph'
and use RubyProf::FlameGraphPrinter
as your printer for ruby-prof.
For vanilla ruby-prof, see example.rb.
For rspec-prof, RSpecProf.printer_class = RubyProf::FlameGraphPrinter
Then, to generate an image, pipe the output to FlameGraph's flamegraph.pl
. Here's how the example SVG is generated:
bundle exec ruby example.rb | \
~/GitHub/FlameGraph/flamegraph.pl --countname=ms --width=728 > example.svg
Taken from [flamegraph.pl
]:
The input is stack frames and sample counts formatted as single lines. Each frame in the stack is semicolon separated, with a space and count at the end of the line.
Each line in the output looks like this:
Thread:ID;Fiber:ID;Class#method1 (call_count) time_taken_in_milliseconds
Here's an example:
Thread:123;Fiber:123;Capybara::Server#boot (1) 0.163681
Thread:123;Fiber:123;Capybara::Server#boot (1);<Module::Capybara>#server (1) 0.099445
Thread:123;Fiber:123;Capybara::Server#boot (1);<Module::Capybara>#server (1);Kernel#block_given? (1) 0.044838
Thread:123;Fiber:123;Capybara::Server#boot (1);Proc#call (1) 0.019682
Thread:123;Fiber:123;Capybara::Server#boot (1);Proc#call (1);<Module::Capybara>#run_default_server (1) 0.147611
The call count is included so that the number of calls is shown in the graph.
Sometimes, you may want to post process the data before passing it to flamegraph.pl
, such as...
-
Capybara spawns new WEBrick thread each time, putting each HTTP server instance in a different stack. We can strip the Thread ID and Fiber ID from the output first:
processed = 0 while data = gets data.gsub! %r{^Thread:\d+;Fiber:\d+;}, '' puts data processed += 1 if processed % 1000 == 0 $stderr.puts "Processed #{processed} lines" end end $stderr.puts "Finished processing #{processed} lines"
ruby remove-thread-id.rb < profile.txt > profile2.txt
-
Sometimes, you may want to concatenate multiple profiles. This causes the stack to separate based on the call count. To put calls to same method on the same stack, the call count should be removed.
processed = 0 while data = gets data.gsub! %r{\s\(\d+\);}, ';' puts data processed += 1 if processed % 1000 == 0 $stderr.puts "Processed #{processed} lines" end end $stderr.puts "Finished processing #{processed} lines"
ruby remove-call-count.rb < profile2.txt > profile3.txt
I am relying on external scripts to do this processing for maximum flexibility (so that I always have the raw data to adjust it as needed).