Add rudimentary multi-cluster support for m3coordinator #785
Conversation
Codecov Report
@@ Coverage Diff @@
## master #785 +/- ##
==========================================
- Coverage 78.32% 78.27% -0.06%
==========================================
Files 359 362 +3
Lines 30387 30766 +379
==========================================
+ Hits 23800 24081 +281
- Misses 5025 5088 +63
- Partials 1562 1597 +35
Continue to review full report at Codecov.
isolationGroup := strings.TrimSpace(host.IsolationGroup)
if isolationGroup == "" {
	isolationGroup = "local"
nit: add "local" as a var at top of file
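A minimal sketch of that suggestion (constant name is illustrative, not the PR's actual code):

// Declared once near the top of the file:
const defaultIsolationGroup = "local"

// ...then the lookup above becomes:
isolationGroup := strings.TrimSpace(host.IsolationGroup)
if isolationGroup == "" {
	isolationGroup = defaultIsolationGroup
}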
Endpoint: fmt.Sprintf("http://%s:%d", hostnameGroup.Hostname, port),
Hostname: hostnameGroup.Hostname,
Port:     uint32(port),
Id:       id,
Any particular reason we are accessing the protobuf directly here instead of using the placement.Instance interface?
This is because an internal handler is called where the instances are submitted by JSON but described by protobuf. (This is a composite handler and uses the internal handlers directly as if a user were sending a request.)
I see, it feels slightly dangerous though, because usually it's safe to rename a field in protobuf, but since we are converting it into JSON here (which is also pretty far away from m3cluster, where the pb is defined), it may change how the JSON is represented. Should we decouple the pb and JSON representations?
Sorry, it's the opposite: the caller (person/tool issuing the request) forms the JSON. The actual internal handler uses protobufs to create the instances from. This is another internal handler that calls this sub-internal handler, using protobuf (no JSON conversion there).
This is fine for now, these endpoints are for operators; if we were to create specific JSON shapes for every sub-component it would take a great deal of overhead to maintain these endpoints. We'd probably also make the same changes we are making to our protobuf schemas anyhow, so they would avoid drifting from each other.
IsolationGroup: isolationGroup,
Zone:           zone,
Weight:         weight,
Endpoint:       fmt.Sprintf("%s:%d", host.Address, host.Port),
I'd recommend filling in the Hostname field with id or host.Address for now, so that later when we rename that field in placement.Instance it's easier for us to update here. According to the protobuf definition, it seems host.Address is what you want here.
Sure thing.
	blockSize = minBlockSize
} else if blockSize > maxBlockSize {
	blockSize = maxBlockSize
if blockSize < minRecommendCalculateBlockSize {
nit: this isn't really a recommended size if it's the hard minimum for blockSize
The way this should be read is: if the "calculated block size" (here blockSize) is less than the "minimum to recommend when calculating block size" (here minRecommendCalculateBlockSize), then just set it to the "minimum to recommend when calculating block size" (i.e. minRecommendCalculateBlockSize).
Does that make sense? So it is recommended, it's just the minimum we would recommend when making a recommendation based on calculating block sizes from the expected datapoints per block.
Yeah, it's just a bit weird calling it Recommended since there's an implication that you can ignore the recommendation and override it? Might be getting too pedantic about it though, take it or leave it.
You can have smaller block sizes, you just have to specify it explicitly. Well, actually no, that's not true; I'll add it to the protobuf so you can use either the expected datapoints per block or an explicit block size.
Then that way it won't be the hard minimum, it'll be the minimum that we recommend. You can go below the minimum we recommend by explicitly using a smaller block size yourself.
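A rough sketch of the behavior being described once the explicit option exists (the helper and its inputs are hypothetical; minRecommendCalculateBlockSize and maxBlockSize are the values discussed above):

func recommendedBlockSize(explicit, calculated time.Duration) time.Duration {
	switch {
	case explicit > 0:
		// An explicit block size is honored as-is and may go below the
		// recommended minimum.
		return explicit
	case calculated > 0:
		// A block size derived from expected datapoints per block is
		// clamped to the minimum/maximum we recommend.
		if calculated < minRecommendCalculateBlockSize {
			return minRecommendCalculateBlockSize
		}
		if calculated > maxBlockSize {
			return maxBlockSize
		}
		return calculated
	default:
		// No fields set to recommend a block size from.
		return maxBlockSize
	}
}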
IsolationGroup: isolationGroup,
Zone:           zone,
Weight:         weight,
Endpoint:       fmt.Sprintf("%s:%d", host.Address, host.Port),
Is it worth a sanity check to verify that address is valid?
It can be a hostname or an IP address, unfortunately. So in that case, it's tough to know whether it's a hostname (alphanumeric), IPv4, or IPv6. I believe doing validation could potentially cause valid hostnames to be deemed invalid.
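For illustration only, a lenient classification along those lines (using the standard net package) would treat anything that doesn't parse as an IP as a hostname rather than rejecting it:

func describeAddress(addr string) string {
	if ip := net.ParseIP(addr); ip != nil {
		if ip.To4() != nil {
			return "ipv4"
		}
		return "ipv6"
	}
	// Not an IP literal; assume hostname instead of failing validation,
	// since strict hostname checks risk rejecting valid names.
	return "hostname"
}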
}
`
assert.Equal(t, stripAllWhitespace(expectedResponse), string(body),
	xtest.Diff(mustPrettyJSON(t, expectedResponse), mustPrettyJSON(t, string(body))))
👍
// DefaultServiceZone is the default service ID zone
DefaultServiceZone = "embedded"
// HeaderClusterServiceName is the header used to specify the service name.
HeaderClusterServiceName = "Cluster-Service-Name"
nit: Should we prefix custom headers with X-?
All the rage now is to abide by the "do not use X-..." RFC, hah: https://tools.ietf.org/html/rfc6648
@@ -48,20 +48,20 @@ func TestPlacementService(t *testing.T) {
	mockClient.EXPECT().Services(gomock.Not(nil)).Return(mockServices, nil)
	mockServices.EXPECT().PlacementService(gomock.Not(nil), gomock.Not(nil)).Return(mockPlacementService, nil)

-	placementService, err := Service(mockClient, config.Configuration{})
+	placementService, err := Service(mockClient, http.Header{})
Worth adding a test that parsing the new headers will pull out values correctly?
Good call, will do.
	continue
}

// Does not exist already or more finer grained, add result
nit: or finer
Does not exist already or lower resolution, add result
Opted for // Does not exist already or more precise, add result. The "higher" or "lower" is confusing with resolution, so I went with "precise" terminology.
	aggregated1MonthRetention1MinuteResolution *client.MockSession
}

func (s testSessions) forEach(fn func(session *client.MockSession)) {
👍
}
err := store.Write(context.TODO(), writeQuery)
assert.Error(t, err)
assert.True(t, strings.Contains(err.Error(), "no configured cluster namespace"),
I think you can do something like assert.EqualError? Might be a bit iffy if you don't have the exact error string we're expecting though.
Yeah, that's why I prefer the Contains... check.
In this case, could it be worth at some point pulling out a bunch of test utilities like this into a custom module to allow us to reuse them?
Possibly, although it's pretty simple so I'm fine to also just rely on the standard library and stretchr without adding another package into the mix for now.
case session == sessions.aggregated1MonthRetention1MinuteResolution:
	f = fetches[1]
default:
	panic("unexpected session")
Might be better to use require.FailNow to avoid a panic?
Sure thing.
result, err := store.FetchTags(context.TODO(), searchReq, &storage.FetchOptions{Limit: 100})
require.NoError(t, err)

require.Equal(t, len(fetches), len(result.Metrics))
nit: require.Len(t, result.Metrics, len(fetches))
require.Len(...) does some weird things, including spewing out weird output when the length is wrong rather than just "2 != 3", so I always avoid it now...
Oh damn, been telling people to use X.Len instead of Equal in reviews... 👀
It works fine until the test stops passing, and then you spend like a full minute or two wondering what's going on lol.
}

// Validate will validate the cluster namespace definition.
func (def UnaggregatedClusterNamespaceDefinition) Validate() error {
Instead of Validate, maybe have a NewUnaggregatedClusterNamespaceDefinition? That would eliminate the need to always call Validate after struct creation.
Hm, so the whole point I have this as a struct is so you can do UnaggregatedClusterNamespaceDefinition{NamespaceID: "foo", Retention: ...} etc. to see the names for the parameters you're specifying.
To turn this into just a NewThing(...) function would compromise the reason I opted for struct construction. Do you feel strongly about this?
I don't feel strongly, so feel free to ignore. It's an interesting approach since users are now supposed to call Validate right after object creation. In my mind, Validate is more useful when you're creating something in multiple stages or the creator is disjoint from the validator.
Yeah, so I tried to separate out the "definitions", or declarative options, to pass to the NewClusters(...) method, so you kind of build a declarative tree of what you want to ultimately construct; then NewClusters(...) basically turns these into concrete things that are wired up to clusters, etc.
So in some ways you can look at these "definition" structs as builders/object creation.
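For illustration, the declarative flow reads roughly like this (only fields named in the conversation are shown; values are placeholders):

def := UnaggregatedClusterNamespaceDefinition{
	NamespaceID: "foo",
	Retention:   48 * time.Hour,
}
// Validate the definition up front, then let NewClusters wire up the
// concrete cluster namespaces from the declarative tree.
if err := def.Validate(); err != nil {
	return nil, err
}
clusters, err := NewClusters(def)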
}

// NewClusters instantiates a new Clusters instance.
func NewClusters(
Shall we consider decoupling namespace from resolution/retention? E.g. allow support for multiple unaggregated namespaces, or have multiple aggregated namespaces for the same retention/resolution? Each of them would have a unique namespace ID. The users might eventually be allowed to pass in the namespace ID to the read request and we would only read from that namespace.
That would help us be more flexible in our definition of clusters.
If you have multiple unaggregated clusters or multiple aggregated namespaces for the same retention/resolution, it's not clear how they are load balanced and which clusters/namespaces cover which parts of the metrics space.
I'd be ok with creating a container type, however you won't be able to really use it, because I definitely don't want to add sharding to work out which unaggregated cluster to send a single write to, etc., in this changeset. So I'd just end up throwing an error if you actually did have multiple per unaggregated or retention/resolution pair... in which case is it useful, or should we just add this later?
I was more thinking in terms of multi-tenancy: https://github.com/grafana/metrictank/blob/master/docs/multi-tenancy.md. Do you see a way to achieve it or is it not worth having? Note that metrictank also allows some data to be shared across tenants.
We can and should support it first class, that's a good idea. I opened an issue here which discusses it more in depth:
#790
for _, namespace := range namespaces {
	namespace := namespace // Capture var

	clusterStart := now.Add(-1 * namespace.Attributes().Retention)
Instead of this, can you just check if !clusterStart.Before(query.Start) { ... }?
I want to include the case where query.Start matches clusterStart, however, which that logic would not cover.
So if query.Start is the same as clusterStart, you want to continue? If yes, then !Before should work since that includes the After as well as the Equal case? Maybe I'm missing something here.
So !clusterStart.Before(query.Start) { continue } is equivalent to clusterStart.Equal(query.Start) || clusterStart.After(query.Start) { continue }, which is undesirable here.
Basically I want to include the cluster in the fanout if clusterStart.Equal(query.Start), because that means it can fulfill the query (just). Hence why I don't want to use !clusterStart.Before(...).
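In other words, the fanout condition being defended is roughly (a sketch, not the exact PR code):

clusterStart := now.Add(-1 * namespace.Attributes().Retention)
if clusterStart.After(query.Start) {
	// Retention does not reach back to the query start; skip this namespace.
	continue
}
// clusterStart is before or equal to query.Start: the namespace can (just)
// fulfill the query, so include it in the fanout.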
wg.Add(1)
go func() {
	r, err := s.fetch(namespace, m3query, opts)
Do we want to exit on first error? If yes, then probably use ExecuteParallel; if not, then probably add a comment since we short-circuit on first error in most cases.
I'd have to create temporary objects (i.e. an allocation per fanout element), which is costly; that's why I use a single static waitgroup, and only if there are errors does it allocate a per-fanout element.
I see, so it's cheaper to do everything in parallel instead of failing on first error and needing allocations? I suppose I can change the other functions later then to stop using ExecuteParallel.
Basically the cheapest is zero allocation if possible, though this is not always possible; everywhere there is a new allocation that is then shared (i.e. escapes to heap) is a place that probably should be pooled if in the hot path.
That's why, at least in the very high frequency call path, I pretty much always opt for the least amount of allocations possible.
Unfortunately ExecuteParallel(...), while a nice abstraction, requires a struct that fits an interface for each executing goroutine, and because the other goroutine needs a ref to it they all need to be heap allocated. Which means you either need pooling to avoid allocating, or to just avoid using something that needs an allocation per goroutine, like for instance just a waitgroup.
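A rough reconstruction of the zero-allocation fanout pattern being described (the collector names are illustrative, not the PR's exact code):

var wg sync.WaitGroup
for _, namespace := range namespaces {
	namespace := namespace // Capture var
	wg.Add(1)
	go func() {
		r, err := s.fetch(namespace, m3query, opts)
		if err != nil {
			// Only the error path allocates (one multi-error entry per failure).
			errs.add(err)
		} else {
			result.add(r)
		}
		wg.Done()
	}()
}
wg.Wait()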
for _, namespace := range namespaces {
	namespace := namespace // Capture var

	clusterStart := now.Add(-1 * namespace.Attributes().Retention)
Instead of this, can you just check if !clusterStart.Before(query.Start) { ... }?
I want to include the case where query.Start matches clusterStart, however, which that logic would not cover.
wg.Add(1)
go func() {
	result.add(s.fetchTags(namespace, m3query, opts))
Same as above: do we want to exit on first error? If yes, then probably use ExecuteParallel; if not, then probably add a comment since we short-circuit on first error in most cases.
I'd have to create temporary objects (i.e. an allocation per fanout element) to pass to ExecuteParallel, which is costly; that's why I use a single static waitgroup, and only if there are errors does it allocate a per-fanout element.
id := s.Name()
existing, exists := r.dedupeMap[id]
if exists && existing.attrs.Resolution <= attrs.Resolution {
	// Already exists and resolution is already more finer grained
Already exists and resolution is higher?
	continue
}

// Does not exist already or more finer grained, add result
Does not exist already or lower resolution, add result
	return
}

if r.result == nil {
Is it a bug to initialize r.result but not r.dedupeMap? It seems like this may not work, since for the second result you will just add duplicate things to the back of the seriesList.
So in the case where you just have a single result, you don't need to initialize dedupeMap; it's initialized lazily the second time you add a result. At that point it builds the dedupe map with the previous result, then you start checking whether the current result overlaps with it.
Hence this logic should be correct.
Yeah, I missed the part about it creating the dedupe map with the previous result.
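The lazy initialization being described looks roughly like this (type, method, and field names are illustrative, not the PR's exact code):

func (r *multiFetchResult) add(attrs storage.Attributes, res *storage.FetchResult) {
	r.Lock()
	defer r.Unlock()

	if r.result == nil {
		// First result: store it directly, no dedupe map needed yet.
		r.result = res
		r.attrs = attrs
		return
	}
	if r.dedupeMap == nil {
		// Second result: lazily build the dedupe map, seeded from the
		// previously stored result, before merging anything new.
		r.buildDedupeMap()
	}
	r.dedupeAndMerge(attrs, res)
}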
	return
}

if r.result == nil {
Same as above, probably a bug to add to r.result but not to r.dedupeMap?
As above, this logic should be correct.
@@ -186,5 +190,5 @@ func (c ClustersStaticConfiguration) NewClusters() (Clusters, error) {
	}

	return NewClusters(unaggregatedClusterNamespace,
-		aggregatedClusterNamespaces)
+		aggregatedClusterNamespaces...)
nit: Since we're splitting newlines here anyway, might look better as
NewClusters(
	unaggregatedClusterNamespace,
	aggregatedClusterNamespaces...,
)
?
@@ -127,7 +128,7 @@ type clusters struct {
// NewClusters instantiates a new Clusters instance.
func NewClusters(
	unaggregatedClusterNamespace UnaggregatedClusterNamespaceDefinition,
-	aggregatedClusterNamespaces []AggregatedClusterNamespaceDefinition,
+	aggregatedClusterNamespaces ...AggregatedClusterNamespaceDefinition,
What's the difference between using vararg vs the array here?
Approving pending rebase on master for tags update; outstanding comments are regarding minor issues.
} else {
	// Use the maximum block size if we don't find a recommended one based on retention
default:
	// Use the maximum block size if we no fields set to recommend block size from
nit: wonky comment, Use the maximum block size if we don't find a recommended one based on request parameters maybe? Or ...have no fields...
Good catch, sure thing.
func (errs *syncMultiErrs) add(err error) {
	errs.Lock()
	errs.multiErr = errs.multiErr.Add(err)
	errs.Unlock()
Curious why you don't use defer here but do in the following func?
Yeah, no big reason; if it was in the hot path I'd always avoid defer since there is some overhead there. Since I was returning a value in the other one, it is slightly easier to write the defer way. I'll make these both avoid defer though, just in case this does get reused by someone else in a hot path.
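For illustration, the value-returning variant without defer stays short (method name is illustrative, assuming the wrapped multi-error exposes something like FinalError()):

func (errs *syncMultiErrs) finalError() error {
	errs.Lock()
	err := errs.multiErr.FinalError()
	errs.Unlock()
	return err
}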
numUnaggregatedClusterNamespaces int
numAggregatedClusterNamespaces   int
unaggregatedClusterNamespaceCfg  = &unaggregatedClusterNamespaceConfiguration{}
aggregatedClusterNamespacesCfgs  []*aggregatedClusterNamespacesConfiguration
Do you really need a slice of pointers here (locality)?
So the main reason I'm using pointers here is that I need to set the client session and error back on the struct itself once I try connecting to the cluster. It seemed simpler to just use a pointer, and since this is a slow path (i.e. once at initialization) I didn't think too much about it.
I'd be all about using non-pointers if it was a hot code path.
src/coordinator/models/tag.go
Outdated
@@ -125,25 +125,31 @@ func (m Matchers) ToTags() (Tags, error) {
	return tags, nil
}

var (
Did you rebase?
This adds rudimentary multi-cluster support for m3coordinator so that we can begin to send metrics to different namespaces for different retention periods, etc.
Right now this is not optimal, as it retrieves data from all matching namespaces that have retention long enough to cover the query, then returns the most granular datapoints that come back and discards any lower-granularity datapoints that also came back. This is because we do not have a distributed index in the current M3 open source offering. For read workloads that do not require tens of thousands or hundreds of thousands of realtime alert evaluations, this should be quite sufficient given that the write volume absolutely dwarfs the read volume.
At some point we'll hopefully have a distributed index that we can use for all use cases and deprecate this behavior.
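Conceptually, the read path described above boils down to a sketch like this (Series and namespaceResult are placeholder types for illustration, not the PR's exact code):

// seriesResult pairs a fetched series with the resolution of the namespace
// it was read from.
type seriesResult struct {
	series     Series
	resolution time.Duration
}

// keepMostGranular merges results from every namespace whose retention covers
// the query and, per series ID, keeps only the finest-resolution datapoints.
func keepMostGranular(results []namespaceResult) map[string]seriesResult {
	best := make(map[string]seriesResult)
	for _, res := range results {
		for _, s := range res.series {
			id := s.Name()
			if existing, ok := best[id]; ok && existing.resolution <= res.resolution {
				// Already have equal or finer-grained data for this series.
				continue
			}
			best[id] = seriesResult{series: s, resolution: res.resolution}
		}
	}
	return best
}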
Next: