-
Notifications
You must be signed in to change notification settings - Fork 576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[otelhttp] transport metrics #3769
base: main
Are you sure you want to change the base?
[otelhttp] transport metrics #3769
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #3769 +/- ##
=====================================
Coverage 79.4% 79.4%
=====================================
Files 166 166
Lines 10360 10401 +41
=====================================
+ Hits 8230 8266 +36
- Misses 1996 1999 +3
- Partials 134 136 +2
|
This would need a changelog, and tests. |
Add tests and changelog. |
Hmm CI shows a data race message that I can't reproduce locally. |
For reference: https://github.com/open-telemetry/opentelemetry-go-contrib/actions/runs/4883405160/jobs/8714784942 Have you tried running multiple times? E.g. by calling EDIT: EDIT 2: The data race is in your code. #3769 (comment). PS. I managed to get the data race on my machine (but the probability to hit it is extremely low. I had to run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for this. I was looking for this feature. Until this gets merged I decided to follow your lead and did something similar in what I'm writing. I'm not a maintainer but my review is some things I found.
@@ -37,6 +37,14 @@ const ( | |||
ServerLatency = "http.server.duration" // Incoming end to end duration, microseconds | |||
) | |||
|
|||
// Client HTTP metrics. | |||
const ( | |||
ClientRequestCount = "http.client.request_count" // Outgoing request count total |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer defined in the semantic convention. And doesn't look to be actually used in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is not used, we are supposed to get this information from the count of duration
. I left it there becase the one for server was also there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the server one is still there, so I'm keeping this one also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please adopt the v1.17
semantic convection for HTTP Client.
@RangelReale are you still willing to work on this? |
yes, I''m available either to follow up or if someone wants to takeover also fine. |
It is fine if you follow up. I will do my best to take look in the beginning of next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are build failures.
Moreover, this PR MUST NOT add anything more than defined in https://github.com/open-telemetry/opentelemetry-specification/blob/v1.17.0/specification/metrics/semantic_conventions/http-metrics.md#http-client
@Aneurysm9, can you please review this PR (as module codeowner)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did my best to carefully review this PR. Nice work. A left a few comments that needs to be addressed. Some should lead to simplification of the PR.
// WithRequestAttributeGetter extracts additional attributes from the request. | ||
func WithRequestAttributeGetter(fn func(req *http.Request) []attribute.KeyValue) Option { | ||
return optionFunc(func(o *config) { | ||
o.GetRequestAttributes = fn | ||
}) | ||
} | ||
|
||
// WithResponseAttributeGetter extracts additional attributes from the response. | ||
func WithResponseAttributeGetter(fn func(req *http.Response) []attribute.KeyValue) Option { | ||
return optionFunc(func(o *config) { | ||
o.GetResponseAttributes = fn | ||
}) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People may NOT want to add always the same attributes to spans and metrics. While for spans it is not a problem for metrics it leads to high cardinality.
I strongly suggest to remove this functionality from this PR. We can create a different solution in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
} | ||
metricdatatest.AssertEqual(t, want, sm.Metrics[1], metricdatatest.IgnoreTimestamp()) | ||
|
||
// Duration value is not predictable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We can check that it is "non-zero" 😉
- Would adding a
IgnoreValue
option tometricdatatest
help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a PR in the main repo to add this "IgnoreValue" option.
reader := metric.NewManualReader() | ||
meterProvider := metric.NewMeterProvider(metric.WithReader(reader)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the changes needed in this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, will remove.
"go.opentelemetry.io/otel/trace" | ||
) | ||
|
||
func assertClientScopeMetrics(t *testing.T, sm metricdata.ScopeMetrics, attrs attribute.Set) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we usually add helper functions at the bottom of the file (below tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
want = metricdata.Metrics{ | ||
Name: "http.client.response_content_length", | ||
Data: metricdata.Sum[int64]{ | ||
DataPoints: []metricdata.DataPoint[int64]{{Attributes: attrs, Value: 13}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value is tightly coupled to the test where the function is used. Maybe it would be better just to inline the function into the test as right now it is not really reusable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is a copy/change from the handler test, maybe it can be better to leave both equals for future changes?
counters map[string]metric.Int64Counter | ||
valueRecorders map[string]metric.Float64Histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using maps here? How about simply having the mutiple instrument fields (many metric.Int64Counter
and metric.Float64Histogram
fields instead of these two maps).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the same thing done by the server handler, this pattern seems to be used in a lot of instrumentations.
Co-authored-by: Robert Pająk <pellared@hotmail.com>
Curious if this is currently blocked by anything? Would be awesome to have this merged and released when possible |
} | ||
|
||
// Add metrics | ||
attributes := semconvutil.HTTPClientRequest(r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs something similar to
opentelemetry-go-contrib/instrumentation/net/http/otelhttp/internal/semconvutil/httpconv.go
Line 106 in fdfa6e3
func HTTPServerRequestMetrics(server string, req *http.Request) []attribute.KeyValue { |
o := metric.WithAttributes(attributes...) | ||
t.counters[ClientRequestContentLength].Add(ctx, bw.read.Load(), o) | ||
if err == nil { | ||
t.counters[ClientResponseContentLength].Add(ctx, res.ContentLength, o) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ContentLength may not be what we want here, could be negative. since bodyWrapper already tracks count of bytes read from the response, it's probably better to refer to that count here
[EDIT] I see that newWrappedBody
is used instead of bodyWrapper .. so maybe it's not a drop-in replacement. nonetheless we need a more accurate byte count than http.Response.ContentLength is guaranteed to provide.
@@ -109,21 +140,64 @@ func (t *Transport) RoundTrip(r *http.Request) (*http.Response, error) { | |||
ctx = httptrace.WithClientTrace(ctx, t.clientTrace(ctx)) | |||
} | |||
|
|||
readRecordFunc := func(int64) {} | |||
if t.readEvent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readEvent
and readRecordFunc
map to "read" in the span .. but it's the client's Request object, so it's actually a "write" event - right? because it's being written to the server?
"read" events should be recorded from the bytes consumed by the http.Response.Body, right?
@RangelReale What is the latest on this? Do you need any help? |
@Sovietaced I've not been following too much the latest changes, if you want to help/assume this fork, it is fine with me. |
Thanks! I have made a fork and will be going through the code tomorrow. I will open up another pull request once I deeply understand the code and have addressed any outstanding comments here. |
I have an initial draft of this work continued here: #4707 Would appreciate any reviews |
Add HTTP client metrics following the semantic conventions.
As users of
Transport
are not expected to have a handler where they can customize the attributes using aLabeler
, callback function options were added to add attributes from the request and the reponse objects.Fixes #3134
Fixes #1336