Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Restrict query by header tag #2053

Merged

Conversation

robskillington
Copy link
Collaborator

… and strip certain labels from results

What this PR does / why we need it:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:


Does this PR require updating code package or user-facing documentation?:


@arnikola arnikola changed the title [query] WIP add ability to restrict all fetches for a query by labels… [query] Restrict query by header tag Nov 28, 2019
}

fetchOpts.RestrictFetchOptions = newOrExistingRestrictFetchOptions(fetchOpts)
// TODO: filter through TagOptions
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this still needs to happen yeah? Otherwise we'll get wrong tag options being used for these matchers?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess; it's only really relevant (atm) if you're overriding the name tag, but will do it here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a follow up, this was refactored to no longer be neccessary

return &built, nil
}

func applyOptions(
Copy link
Collaborator Author

@robskillington robskillington Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so for query results the handlers filters it out but for complete tags storage filters it out?

Perhaps this should just be done at the handler level to make it consistent?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we discussed this offline; it's largely based on where parsing happens for the fetch itself; the DAG needs to be built before we can add it to the fetch path unfortunately

) (index.Query, error) {
matchers := fetchQuery.TagMatchers
// If no matchers provided, explicitly set this to an AllQuery
withOpts := fetchQuery.WithAppliedOptions(options)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps just fetchQuery = fetchQuery.WithAppliedOptions(options) so that code below doesn't accidently use fetchQuery.TagMatchers or anything like that? (Also can imagine in the future other things mutate the fetch query?)

@@ -121,6 +124,7 @@ func encodeFetchRequest(
func encodeTagMatchers(modelMatchers models.Matchers) (*rpc.TagMatchers, error) {
matchers := make([]*rpc.TagMatcher, len(modelMatchers))
for i, matcher := range modelMatchers {
fmt.Println("matcher", matcher)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed?

}

if len(o.MustApplyMatchers) > 0 {
fmt.Println("Applhying matchers", o.MustApplyMatchers)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed?

if err != nil {
return nil, err
}
fmt.Println(" matchers", matchers)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed?

@robskillington
Copy link
Collaborator Author

Looks like FetchStripLabels = "M3-Fetch-Strip-Labels" isn't being used anywhere. These should take a set of labels, perhaps as JSON, to strip yeah? Or we can purely take them out too, but it will make it less flexible what callers can do. (i.e. they can't restrict a label value to match regexp and get the actual value of the label back themselves on the resulting series).

Comment on lines 68 to 69
RestrictFetchType RestrictFetchType = 1;
RestrictFetchTags RestrictFetchTags = 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this may be contentious since I'm dropping backcompat here, but I figure query hosts are pretty trivial to upgrade; can make it more back compatible though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this is probably ok. I don't think there's many users using either of these fields.

Maybe we should make it fields 3, 4 instead of 1, 2 though so it doesn't try to unpack something during a deploy and it unpacks a weird value it's not expecting?

FetchRestrictLabels = "M3-Fetch-Restrict-Labels"

// FetchStripLabels strips certain labels from the result.
FetchStripLabels = "M3-Fetch-Strip-Labels"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we were moving towards using just one header, called M3-Query-Options-JSON?

Perhaps:

QueryOptionsJSONHeader = "M3-Query-Options-JSON"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah oops forgot to prune this one

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving to M3-Restrict-By-Tags-JSON

@@ -96,7 +96,9 @@ func (h *PromReadInstantHandler) ServeHTTP(w http.ResponseWriter, r *http.Reques
QueryContextOptions: models.QueryContextOptions{
LimitMaxTimeseries: fetchOpts.Limit,
}}
if restrictOpts := fetchOpts.RestrictFetchOptions; restrictOpts != nil {

Copy link
Collaborator Author

@robskillington robskillington Dec 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the result from read(...) not need to be filtered out too for instant queries with prometheus.FilterSeriesByOptions(result.series, fetchOpts)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it to a common file.

}

if str := req.Header.Get(FetchRestrictLabels); str != "" {
var opts *storage.RestrictFetchOptions
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should we call this storage.QueryOptions to match the header QueryOptionsJSONHeader = "M3-Query-Options-JSON" we landed on?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving to M3-Restrict-By-Tags-JSON

) (queryFanoutType, ClusterNamespaces, error) {
coversRangeFilter := newCoversRangeFilter(coversRangeFilterOptions{
now: now,
queryStart: start,
})

if fetchOptions == nil {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will restrict fetch options always be not nil if no headers were set?

Currently we don't set it to a non-nil value unless one of the headers were set:

if restrict := fetchOpts.RestrictFetchOptions; restrict != nil {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline.

func encodeRestrictFetchOptionsByType(
o *storage.RestrictByType,
) (*rpcpb.RestrictFetchType, error) {
if err := o.Validate(); err != nil {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does o == nil need to be checked?

},
}
for _, test := range tests {
// t.Run(fmt.Sprintf("%s", test.value), func(t *testing.T) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove or keep perhaps?

@@ -56,6 +56,9 @@ const (
// metrics type.
MetricsStoragePolicyHeader = "M3-Storage-Policy"

// QueryOptionsJSONHeader provides tag options to enforces on queries.
QueryOptionsJSONHeader = "M3-Restrict-By-Tags-JSON"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably update the header name to match how the header reads (to be consistent with other headers defined here).

So QueryOptionsJSONHeader -> RestrictByTagsJSONHeader?

@codecov
Copy link

codecov bot commented Dec 2, 2019

Codecov Report

Merging #2053 into master will decrease coverage by 12%.
The diff coverage is 2.3%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #2053      +/-   ##
=========================================
- Coverage    67.6%   55.5%   -12.1%     
=========================================
  Files        1005     971      -34     
  Lines       87887   85037    -2850     
=========================================
- Hits        59418   47257   -12161     
- Misses      24443   34469   +10026     
+ Partials     4026    3311     -715
Flag Coverage Δ
#aggregator 73.9% <ø> (-8.2%) ⬇️
#cluster 73.6% <ø> (+30.2%) ⬆️
#collector 41.5% <ø> (-37.2%) ⬇️
#dbnode 63.1% <ø> (-6.9%) ⬇️
#m3em 53.3% <ø> (-16.6%) ⬇️
#m3ninx 74.9% <ø> (+4%) ⬆️
#m3nsch 72.1% <100%> (+12.5%) ⬆️
#metrics 25% <ø> (+7.3%) ⬆️
#msg 74.7% <ø> (ø) ⬆️
#query 5.9% <0%> (-48.3%) ⬇️
#x 81.7% <ø> (+30.5%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 65b7f92...1ca734e. Read the comment docs.

type tester struct{}

// Ensure tester is a TestingT and set a global `t`.
var t require.TestingT = &tester{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we move these globals to the top of the file?

},
})

resp, err := queryWithHeader(url, string(m))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: m is already a string here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops yeah refactored some stuff and forgot to remove the conversion

for i, d := range data {
require.Equal(t, 2, len(d.Metric))
require.Equal(t, name, d.Metric["__name__"])
require.Equal(t, clusters[i], d.Metric["cluster"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check the value of d.Metric["val"] here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is testing the case where you set a Restrict field, but don't explicitly set any values in Strip. This is expected to strip out any tags which satisfy the Restrict matchers, so we shouldn't have any val tags in the output (ensured by requiring the length to be 2 above)

data := resp.Data.Result
data.Sort()
require.Equal(t, len(data), 2)
clusters := []string{"coordinator-cluster-a", "coordinator-cluster-b"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if the cluster/field names are the same, could we turn them into consts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, was definitely cutting corners on this file ha

keys := opts.RestrictQueryOptions.GetRestrictByTag().GetFilterByNames()
if len(keys) > 0 {
for i, s := range series {
series[i].Tags = s.Tags.TagsWithoutKeys(keys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever push these filters down to the dbnode?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We push them down through the FetchOptions, e.g. in the WithAppliedOptions function (here)[https://sourcegraph.com/github.com/m3db/m3/-/blob/src/query/storage/index.go#L128], where the additional matchers are added to the query as it's translated to an index.Query (the format used by the db)

This is actually on the path back from dbnode, and it's removing the unneeded tags. For full queries (as opposed to tag completion or search queries), we actually have to complete all processing steps first before we can strip the tags out, to ensure any functions that rely on matching tags to work correctly. If you remove them too early, a case where you add a matcher that looks like foo~=".*" is likely to break as you may have multiple different series with the same "ID", and any aggregations will likely not match series correctly

func matchByName(name string) (models.MatchType, error) {
t := models.MatchEqual
switch name {
case "EQUAL":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we make these consts as well? could re-use the consts in tests as well.

int64 lookbackDuration = 3;
FanoutOption unaggregated = 4;
FanoutOption aggregated = 5;
FanoutOption aggregatedOptimized = 6;
bool includeResolution = 7;
}

message RestrictFetchOptions {
message RestrictQueryOptions {
RestrictFetchType RestrictFetchType = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we change these to Query* as well? Since we changed the name of the outer message from Fetch -> Query.


func encodeRestrictQueryOptions(
o *storage.RestrictQueryOptions,
) (*rpcpb.RestrictQueryOptions, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to check o == nil here as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it will create a RestrictQueryOptions with the internals being nil, but having the RestrictQueryOptions probably makes more sense 👍


func decodeRestrictQueryOptions(
p *rpc.RestrictQueryOptions,
) (*storage.RestrictQueryOptions, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to check for nil here as well?

@robskillington
Copy link
Collaborator Author

LGTM, let's followup on feedback from @notbdu async since we'd like to test this on master

@robskillington robskillington merged commit 4c44f1d into master Dec 3, 2019
@robskillington robskillington deleted the r/force-fetch-tags-with-restrict-strip-labels branch December 3, 2019 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants