Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use framework-specific route info and handle consecutive path segments containing IDs in Collector #245

Merged
merged 3 commits into from
Feb 5, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 17 additions & 1 deletion UPGRADING.md
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ recommends exporting a default value for any time series you know will exist in
For series with no labels, other Prometheus clients (including Go, Java, and Python) do
this automatically, so we have matched that behaviour in the 3.x.x series.

## Path generation fix in Collector middleware
## Added `SCRIPT_NAME` to path labels in Collector middleware

Previously, we did not include `Rack::Request`'s `SCRIPT_NAME` when building paths in
`Prometheus::Middleware::Collector`. We have now added this, which means that any
Expand All @@ -151,6 +151,22 @@ This will most typically be present when mounting several Rack applications in t
server process, such as when using [Rails
Engines](https://guides.rubyonrails.org/engines.html).

## Improved stripping of IDs/UUIDs from paths in Collector middleware

Where available (currently for applications written in the Sinatra and Grape frameworks),
we now use framework-specific equivalents to `PATH_INFO` in
`Prometheus::Middleware::Collector`, which means that rather than having path segments
replaced with the generic `:id` and `:uuid` placeholders, you'll see the route as you
defined it in your framework.

For frameworks where that information isn't available to us (most notably Rails), we still
fall back to using `PATH_INFO`, though we have also improved how we strip IDs/UUIDs from
it. Previously, we would only strip them from alternating path segments due to the way we
were matching them. We have improved that matching so it works even when there are
IDs/UUIDs in consecutive path segments.

You may notice the path label change for some of your endpoints.

## Improved validation of label names

Earlier versions of the Ruby Prometheus client performed limited validation of label names
Expand Down
54 changes: 51 additions & 3 deletions lib/prometheus/middleware/collector.rb
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ def trace(env)
end

def record(env, code, duration)
path = [env["SCRIPT_NAME"], env['PATH_INFO']].join
path = generate_path(env)

counter_labels = {
code: code,
Expand All @@ -87,10 +87,58 @@ def record(env, code, duration)
nil
end

# While `PATH_INFO` is framework agnostic, and works for any Rack app, some Ruby web
# frameworks pass a more useful piece of information into the request env - the
# route that the request matched.
#
# This means that rather than using our generic `:id` and `:uuid` replacements in
# the `path` label for any path segments that look like dynamic IDs, we can put the
# actual route that matched in there, with correctly named parameters. For example,
# if a Sinatra app defined a route like:
#
# get "/foo/:bar" do
# ...
# end
#
# instead of containing `/foo/:id`, the `path` label would contain `/foo/:bar`.
#
# Sadly, Rails is a notable exception, and (as far as I can tell at the time of
# writing) doesn't provide this info in the request env.
def generate_path(env)
if env['sinatra.route']
route = env['sinatra.route'].partition(' ').last
elsif env['grape.routing_args']
# We are deep in the weeds of an object that Grape passes into the request env,
# but don't document any explicit guarantees about. Let's have a fallback in
# case they change it down the line.
#
# This code would be neater with the safe navigation operator (`&.`) here rather
# than the much more verbose `respond_to?` calls, but unlike Rails' `try`
# method, it still raises an error if the object is non-nil, but doesn't respond
# to the method being called on it.
route = nil

route_info = env.dig('grape.routing_args', :route_info)
if route_info.respond_to?(:pattern)
pattern = route_info.pattern
if pattern.respond_to?(:origin)
route = pattern.origin
end
end

# Fall back to PATH_INFO if Grape change the structure of `grape.routing_args`
route ||= env['PATH_INFO']
else
route = env['PATH_INFO']
end

[env['SCRIPT_NAME'], route].join
end

def strip_ids_from_path(path)
path
.gsub(%r{/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(/|$)}, '/:uuid\\1')
.gsub(%r{/\d+(/|$)}, '/:id\\1')
.gsub(%r{/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=/|$)}, '/:uuid\\1')
.gsub(%r{/\d+(?=/|$)}, '/:id\\1')
end
end
end
Expand Down
76 changes: 76 additions & 0 deletions spec/prometheus/middleware/collector_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,82 @@
expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1)
end

it 'handles consecutive path segments containing IDs' do
expect(Benchmark).to receive(:realtime).and_yield.and_return(0.3)

get '/foo/42/24'

metric = :http_server_requests_total
labels = { method: 'get', path: '/foo/:id/:id', code: '200' }
expect(registry.get(metric).get(labels: labels)).to eql(1.0)

metric = :http_server_request_duration_seconds
labels = { method: 'get', path: '/foo/:id/:id' }
expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1)
end

it 'handles consecutive path segments containing UUIDs' do
expect(Benchmark).to receive(:realtime).and_yield.and_return(0.3)

get '/foo/5180349d-a491-4d73-af30-4194a46bdff3/5180349d-a491-4d73-af30-4194a46bdff2'

metric = :http_server_requests_total
labels = { method: 'get', path: '/foo/:uuid/:uuid', code: '200' }
expect(registry.get(metric).get(labels: labels)).to eql(1.0)

metric = :http_server_request_duration_seconds
labels = { method: 'get', path: '/foo/:uuid/:uuid' }
expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1)
end

it 'prefers sinatra.route to PATH_INFO' do
metric = :http_server_requests_total

env('sinatra.route', 'GET /foo/:named_param')
get '/foo/7'
env('sinatra.route', nil)
expect(registry.get(metric).values.keys.last[:path]).to eql("/foo/:named_param")
end

it 'prefers grape.routing_args to PATH_INFO' do
metric = :http_server_requests_total

# request.env["grape.routing_args"][:route_info].pattern.origin
#
# Yes, this is the object you have to traverse to get the path.
#
# No, I'm not happy about it either.
grape_routing_args = {
route_info: double(:route_info,
pattern: double(:pattern,
origin: '/foo/:named_param'
)
)
}

env('grape.routing_args', grape_routing_args)
get '/foo/7'
env('grape.routing_args', nil)
expect(registry.get(metric).values.keys.last[:path]).to eql("/foo/:named_param")
end

it "falls back to PATH_INFO if the structure of grape.routing_args changes" do
metric = :http_server_requests_total

grape_routing_args = {
route_info: double(:route_info,
pattern: double(:pattern,
origin_but_different: '/foo/:named_param'
)
)
}

env('grape.routing_args', grape_routing_args)
get '/foo/7'
env('grape.routing_args', nil)
expect(registry.get(metric).values.keys.last[:path]).to eql("/foo/:id")
end

context 'when the app raises an exception' do
let(:original_app) do
lambda do |env|
Expand Down