Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metrics query capability to query service #3061

Merged

Conversation

albertteoh
Copy link
Contributor

Signed-off-by: albertteoh albert.teoh@logz.io

Which problem is this PR solving?

Short description of the changes

  • Add metrics query capabilities to QueryService.
  • As it's an opt-in reader, it has been added to QueryServiceOptions instead of being a mandatory reader like the span or dependencies reader.

Signed-off-by: albertteoh <albert.teoh@logz.io>
@codecov
Copy link

codecov bot commented Jun 6, 2021

Codecov Report

Merging #3061 (3f91010) into master (7dedc46) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3061      +/-   ##
==========================================
+ Coverage   95.84%   95.85%   +0.01%     
==========================================
  Files         233      234       +1     
  Lines       10057    10071      +14     
==========================================
+ Hits         9639     9654      +15     
  Misses        347      347              
+ Partials       71       70       -1     
Impacted Files Coverage Δ
cmd/query/app/querysvc/metrics_query_service.go 100.00% <100.00%> (ø)
cmd/collector/app/server/zipkin.go 73.07% <0.00%> (-3.85%) ⬇️
cmd/query/app/static_handler.go 96.77% <0.00%> (ø)
plugin/storage/badger/spanstore/reader.go 96.08% <0.00%> (+0.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7dedc46...3f91010. Read the comment docs.

Signed-off-by: albertteoh <albert.teoh@logz.io>
@albertteoh albertteoh marked this pull request as ready for review June 6, 2021 09:50
@albertteoh albertteoh requested a review from a team as a code owner June 6, 2021 09:50
@albertteoh albertteoh requested a review from objectiser June 6, 2021 09:50
@@ -51,6 +54,9 @@ type QueryService struct {
options QueryServiceOptions
}

// QSOption is the functional option for configuring QueryServiceOptions.
type QSOption func(qOpts *QueryServiceOptions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this as public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original motivation was to use it here:

func (qOpts *QueryOptions) BuildQueryServiceOptions(storageFactory storage.Factory, logger *zap.Logger) *querysvc.QueryServiceOptions {

to avoid modifying the caller's signature if metrics querying is opted-out, but I think it's not a particularly large impact to simply add it as a parameter and pass in nil if opted-out, so I've removed this option for now.

@@ -125,6 +131,32 @@ func (qs QueryService) GetDependencies(ctx context.Context, endTs time.Time, loo
return qs.dependencyReader.GetDependencies(ctx, endTs, lookback)
}

// MetricsQueryEnabled returns whether if metric query capabilities are enabled, opted-in by setting the
// METRICS_STORAGE_TYPE environment variable.
func (qs QueryService) MetricsQueryEnabled() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed as public? For tests you can check qs.options directly

Copy link
Contributor Author

@albertteoh albertteoh Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To allow the HTTP and gRPC handlers to return early if the metrics query endpoints are called but not enabled like so:

HTTP Handler:

if !aH.queryService.MetricsQueryEnabled() {
	aH.handleError(w, errMetricsQueryDisabled, http.StatusNotImplemented)
	return
}

GRPC Handler:

if !g.queryService.MetricsQueryEnabled() {
	return nil, status.Error(codes.Unimplemented, "metrics querying is currently disabled")
}

An alternative could be to remove this endpoint and have the QueryService.GetLatencies, etc. functions check for nil MetricsReader then return a MetricsQueryDisabled error type and use errors.Is to handle the "not implemented" case.

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would not design it this way. If you expose a method on the API like GetLatencies, it should be safe to call this method. It may return an error, but it should not crash the process. So the nil check has to be inside the service implementation.

Another way to do this is to have a different API altogether, MetricsQueryService. That would work better with your approach, but instead of having MetricsQueryEnabled() the RPC handler will simply check if the service reference is not nil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for the suggestion, I like the MetricsQueryService idea as it mirrors the proto service design as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro I've moved metrics-related endpoints into a new MetricsQueryService and also added nil checks from each endpoint.


// GetLatencies is the queryService implementation of metricsstore.Reader.
func (qs QueryService) GetLatencies(ctx context.Context, params *metricsstore.LatenciesQueryParameters) (*metrics.MetricFamily, error) {
return qs.options.MetricsReader.GetLatencies(ctx, params)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will want to test for nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, these nil checks are performed by the handlers so at this point, MetricsReader should never be nil. However, if we take the proposed alternative of introducing a MetricsQueryDisabled error, then we could perform the nil check here.

@albertteoh
Copy link
Contributor Author

@yurishkuro, are there any further questions/concerns that need addressing in this PR?

albertteoh added 2 commits June 9, 2021 22:01

// GetLatencies is the queryService implementation of metricsstore.Reader.
func (mqs MetricsQueryService) GetLatencies(ctx context.Context, params *metricsstore.LatenciesQueryParameters) (*metrics.MetricFamily, error) {
if mqs.metricsReader == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that you have a dedicated service, I wouldn't do nil check in it. I would instead provide a "disabled implementation" that always returns error from all methods, and then use that implementation from the RPC handler when the reader is not configured. This way you don't have nil checks anywhere, yet everything is nil-safe.

@albertteoh albertteoh merged commit 3d1236c into jaegertracing:master Jun 9, 2021
@albertteoh albertteoh deleted the 2954-add-metrics-to-query-service branch June 9, 2021 22:16
@albertteoh
Copy link
Contributor Author

Thanks Yuri! I'll look into your suggestion for the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants