[WIP] Add rudimentary downsampling for m3coordinator #744
Conversation
@xichen2020 @nikunjgit any feedback before I start productionizing this?
	return false
}

type rollupIDProvider struct {
Can you add some comments explaining what this type is responsible for? It's hard to tell by just reading the code so far.
Sure thing, I'll add a comment above the type. (It just constructs a rollup ID when necessary, and can be pooled)
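For readers of this thread, a doc comment along these lines would capture that (just a sketch of the comment; the struct body is elided):

```go
// rollupIDProvider constructs a rollup ID for a rolled-up metric when one is
// needed. It does so by implementing ident.TagIterator over the provided tag
// pairs plus an injected rollup tag, so it can be handed directly to the tag
// encoder without allocating a new TagPairs slice, and instances are pooled.
type rollupIDProvider struct {
	// fields as in the PR
}
```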
}

func (p *rollupIDProvider) Next() bool {
	p.index++
This is confusing to me. Can you instead have length simply return len(p.tagPairs), and here do a length check first before you increment the index?
Reading the code after this and coming back, it seems you are pretending there is an extra rollup tag pair here, presumably to satisfy the encoder interface(?). Is that what's happening here?
Yes, that's what's happening. I'll leave a comment to this effect. I wanted to avoid another alloc of TagPairs.
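A minimal sketch of that approach, assuming the iterator keeps an index that starts at -1 (method names and fields follow the diff above; the +1 accounts for the injected rollup tag pair):

```go
// length reports one more than the number of provided tag pairs: the rollup
// tag is treated as an extra, injected pair rather than being copied into a
// new TagPairs slice, which avoids the extra allocation.
func (p *rollupIDProvider) length() int {
	return len(p.tagPairs) + 1
}

// Next checks the bound first and only then advances, stopping once both the
// provided pairs and the injected rollup pair have been consumed.
func (p *rollupIDProvider) Next() bool {
	if p.index+1 >= p.length() {
		return false
	}
	p.index++
	return true
}
```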
	p.tagPairs = tagPairs
	p.rollupTagIndex = -1
	for idx, pair := range tagPairs {
		if bytes.Compare(rollupTagName, pair.Name) < 0 {
What if there is no rollup tag in the list of tag pairs provided?
(As per the other comment, the rollup tag is always meant to be injected here, so it will never already be present in the tag pairs provided; we inject it ourselves.)
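To make the injection concrete, Current presumably weaves the rollup tag into its sorted slot. A sketch under the same assumptions (the ident.Tag/ident.BytesID usage and the rollupTagValue constant are assumptions, not lifted from the PR):

```go
// Current returns the injected rollup tag when positioned at rollupTagIndex,
// otherwise the provided pair, skipping over the injected slot so iteration
// stays in sorted tag-name order.
func (p *rollupIDProvider) Current() ident.Tag {
	if p.index == p.rollupTagIndex {
		return ident.Tag{
			Name:  ident.BytesID(rollupTagName),
			Value: ident.BytesID(rollupTagValue),
		}
	}
	idx := p.index
	if idx > p.rollupTagIndex {
		idx-- // account for the injected slot that precedes this pair
	}
	pair := p.tagPairs[idx]
	return ident.Tag{
		Name:  ident.BytesID(pair.Name),
		Value: ident.BytesID(pair.Value),
	}
}
```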
	return p.length() - p.index - 1
}

func (p *rollupIDProvider) Duplicate() ident.TagIterator {
This only does a shallow copy (e.g., the tag encoder is shared between itself and the clone). Is that ok?
I only made this to fulfill the interface; in practice it won't be duplicated. I can make it a deep copy though, just to be careful.
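If it does become a deep copy, a sketch of the shape it could take (the pair and encoder types follow the PR's usage; newTagEncoderFn is hypothetical — the point is just that the clone gets its own pairs and its own encoder rather than sharing them):

```go
// Duplicate returns an independent iterator over a copy of the tag pairs,
// reset to the start, so consuming the clone cannot advance the original or
// race on a shared tag encoder.
func (p *rollupIDProvider) Duplicate() ident.TagIterator {
	pairs := make([]id.TagPair, len(p.tagPairs))
	copy(pairs, p.tagPairs)

	dup := &rollupIDProvider{
		tagEncoder: p.newTagEncoderFn(), // hypothetical: give the clone its own encoder
	}
	dup.Reset(pairs)
	return dup
}
```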
	iterPool *encodedTagsIteratorPool,
) ([]byte, []byte, error) {
	// ID is always the encoded tags for downsampling IDs
	metricTags := id
The id is passed in and returned without being used in this method; can we just remove it?
It's part of the required method signature, but perhaps I can move that to the lambda calling it.
	tagsFilterOptions := filters.TagsFilterOptions{
		NameTagKey: metricNameTagName,
		NameAndTagsFn: func(id []byte) ([]byte, []byte, error) {
Does this mean that for Prometheus metrics the name is a separate tag, just like any other normal tag?
Yes indeed.
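Prometheus models the metric name as just another label, __name__, so the NameAndTagsFn only needs to pull that one tag out of the encoded ID. A rough sketch using the encodedTagsIterator from this PR (the error value and the Close call are assumptions):

```go
// nameAndTags extracts the Prometheus metric name from the __name__ tag; for
// downsampled IDs the ID bytes are themselves the encoded tags, so they are
// returned unchanged as the tags.
func nameAndTags(id []byte, iterPool *encodedTagsIteratorPool) ([]byte, []byte, error) {
	it := iterPool.Get()
	defer it.Close()

	it.Reset(id)
	name, ok := it.TagValue(metricNameTagName) // metricNameTagName is "__name__" for Prometheus
	if !ok {
		return nil, nil, errNoMetricNameTag // hypothetical sentinel error
	}
	return name, id, nil
}
```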
		SetGaugePrefix(nil).
		SetTimerPrefix(nil)

	shardSet := make([]shard.Shard, numShards)
Is the placement always statically configured?
Yes, this downsampler always just runs local to the coordinator.
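For reference, a static single-instance placement means the shard set can be built inline with every shard immediately available. A sketch of that construction, assuming the m3cluster shard package:

```go
package downsample

import "github.com/m3db/m3cluster/shard"

// newLocalShardSet builds a shard set where every shard is owned by the one
// local instance and immediately available, since this downsampler always
// runs embedded in the coordinator and never rebalances.
func newLocalShardSet(numShards int) []shard.Shard {
	shardSet := make([]shard.Shard, numShards)
	for i := 0; i < numShards; i++ {
		shardSet[i] = shard.NewShard(uint32(i)).SetState(shard.Available)
	}
	return shardSet
}
```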
	}

	campaignOpts = campaignOpts.SetLeaderValue(leaderValue)
	electionCluster := integration.NewClusterV3(nil, &integration.ClusterConfig{
:\ so the coordinator also starts an embedded single-node etcd cluster? Why not pass in the etcd cluster externally so we can use the embedded KV cluster in M3DB nodes if we choose to?
So we don't have an external etcd cluster. What actually needs to happen here is to remove the dependency on a real etcd cluster from the leader service: it should just take a struct that fulfills a small interface and abstracts away the underlying implementation (which happens to be etcd).
That way we don't need etcd at all for a local single-node in-memory aggregator.
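A possible shape for that abstraction, purely as a sketch (the interface and the in-memory implementation below are hypothetical, not anything that exists in m3cluster today):

```go
package downsample

import "context"

// leaderService is the minimal surface the downsampler needs for leader
// election. The etcd-backed service can satisfy it in clustered setups, while
// the trivial in-memory implementation below is enough for a single-node
// aggregator and removes the embedded etcd dependency entirely.
type leaderService interface {
	// Campaign blocks until this instance becomes the leader or ctx is done.
	Campaign(ctx context.Context) error
	// IsLeader reports whether this instance currently holds leadership.
	IsLeader() bool
	// Resign gives up leadership if it is currently held.
	Resign(ctx context.Context) error
}

// localLeaderService always reports leadership: a single local instance has
// nothing to coordinate with.
type localLeaderService struct{}

func (localLeaderService) Campaign(ctx context.Context) error { return nil }
func (localLeaderService) IsLeader() bool                     { return true }
func (localLeaderService) Resign(ctx context.Context) error   { return nil }
```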
	// NonePolicy is the none downsampling policy.
	NonePolicy NonePolicy `yaml:"nonePolicy"`
	// AggregationPolicy is the aggregation downsampling policy.
	AggregationPolicy AggregationPolicy `yaml:"aggregationPolicy"`
So does this mean that if this is provided, downsampling is effectively achieved by aggregating datapoints within a resolution window?
Correct, although you can have multiple policies enabled.
	for _, s := range result.SeriesList {
		id := s.Name()
		existing, exists := r.dedupeMap[id]
		if exists && existing.attrs.Resolution < attrs.Resolution {
Do you want to make this configurable as opposed to always picking the finest resolution?
We should, yes. Perhaps I'll leave that as a follow-up though, since we probably don't need to provide it for v1.
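For that follow-up, one lightweight shape it could take (a sketch with hypothetical names; the default preserves the current behaviour of keeping the finest resolution):

```go
package coordinator

import "time"

// seriesAttributes mirrors the per-series storage attributes tracked in the
// dedupe map; only Resolution matters for this sketch.
type seriesAttributes struct {
	Resolution time.Duration
}

// seriesPreferenceFn decides whether a candidate series should replace the
// existing series for the same ID when deduplicating fanned-out results.
type seriesPreferenceFn func(existing, candidate seriesAttributes) bool

// preferFinestResolution is the current default: keep whichever series is
// stored at the finest (smallest) resolution.
func preferFinestResolution(existing, candidate seriesAttributes) bool {
	return candidate.Resolution < existing.Resolution
}
```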
Did a pass with a focus on the aggregator-related logic and some of the query logic; @nikunjgit can probably do a more thorough review of the query part.
Overall, I like the approach of making the handler figure out the downsampling and keeping that out of storage. At a high level, I have a few suggestions:
- Consider moving downsample out of the coordinator into m3aggregator. Downsampler is pretty generic already.
- Consider separating out the diffs for downsampling and clustering. Both are pretty big features and together they make this diff a little confusing.
- For clustering, we should probably discuss how to avoid fanning out to every cluster on every request. It doesn't look like we would ever want to do that for our internal use cases, and it might be too soon to assume that external users need it.
- For downsampling, provide a way for it to work while the storage is a Prometheus remote. It looks possible with multi-clusters, but maybe we shouldn't tie them together? People might want to downsample but may not be comfortable with managing many clusters.
@@ -0,0 +1,330 @@
// Copyright (c) 2018 Uber Technologies, Inc.
downsample_id is a little verbose since it's already in downsample?
// Ensure encodedTagsIterator implements id.ID
var _ id.ID = &encodedTagsIterator{}

type encodedTagsIterator struct {
Feels like the tag iterator stuff should belong somewhere else rather than with the downsampler. Is it not already in m3db?
It needs to live somewhere that cares about the metric ID format. Since m3aggregator and m3db are both agnostic to metric IDs and how tags are encoded, they aren't the best location for this code to live. m3coordinator, however, knows about the concrete formats, i.e. the Prometheus metric ID format. This package is the only one that needs to know how tags are encoded for internal use in the downsampler; it would be leaky to make it live elsewhere and be imported into the downsample package (especially since the types are pretty nasty and concern optimizations for their use locally in this package).
	instrumentOpts = o.InstrumentOptions
)
if o.StorageFlushConcurrency > 0 {
	storageFlushConcurrency = storageFlushConcurrency
typo
Cheers, thanks for the catch.
		request := newLocalWriteRequest(write, h.store)
		requests = append(requests, request)
	}
	writeRawErr = execution.ExecuteParallel(ctx, requests)
This will block. So essentially we'd be blocking for writeRaw but not for WriteAgg? Maybe just make everything an execution.Request and then ExecuteParallel.
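To illustrate that suggestion: both paths can produce execution.Requests and go through a single ExecuteParallel call. A sketch only — the aggregated-write constructor and the receiver/slice types other than those shown in the diff are hypothetical:

```go
func (h *localWriteHandler) writeAll(
	ctx context.Context,
	rawWrites, aggWrites []*writeRequest, // hypothetical types for the sketch
) error {
	// Wrap raw and aggregated writes in the same execution.Request shape so a
	// single ExecuteParallel call covers both, and neither path blocks the other.
	requests := make([]execution.Request, 0, len(rawWrites)+len(aggWrites))
	for _, w := range rawWrites {
		requests = append(requests, newLocalWriteRequest(w, h.store))
	}
	for _, w := range aggWrites {
		// hypothetical constructor mirroring newLocalWriteRequest for the
		// downsampled/aggregated path
		requests = append(requests, newAggregatedWriteRequest(w, h.downsampler))
	}
	return execution.ExecuteParallel(ctx, requests)
}
```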
@@ -0,0 +1,756 @@
// Copyright (c) 2018 Uber Technologies, Inc.
Does this even belong within the coordinator? Consider keeping this within m3aggregator? It seems pretty generic already. You're just using a few coordinator-specific things during flush, which can easily be abstracted out.
I'm not sure this belongs in m3aggregator. This is essentially using m3aggregator as a library, and this is the code that configures the various config options and creates the aggregator, which IMO should live outside the aggregator library. Additionally, there's a lot of logic here that parses tags etc., which should live outside m3aggregator since the aggregator has no notion of tags and it would be strange to introduce it there.
This downsampler is essentially a single-node aggregation package; currently it's not used outside of the coordinator, and I doubt it would really have a home anywhere else (also, the only two options we should ever really offer are simple non-HA downsampling in the m3coordinator, or proper HA downsampling with the m3aggregator in a clustered setup).
I'm inclined to leave it here until it's used anywhere else (and hopefully it never is, since you should be using the m3aggregator otherwise). This also helps avoid needing to make it super generic, and lets us optimize it for its use in the m3coordinator.
Codecov Report
@@ Coverage Diff @@
## master #744 +/- ##
==========================================
- Coverage 78.33% 56.7% -21.64%
==========================================
Files 355 355
Lines 30188 30730 +542
==========================================
- Hits 23649 17425 -6224
- Misses 4995 11745 +6750
- Partials 1544 1560 +16
Continue to review full report at Codecov.
) *encodedTagsIterator {
	return &encodedTagsIterator{
		tagDecoder: tagDecoder,
		bytes:      checked.NewBytes(nil, nil),
Worth getting this from a pool?
// TagValue returns the value for a tag value.
func (it *encodedTagsIterator) TagValue(tagName []byte) ([]byte, bool) {
	it.tagDecoder.Reset(it.bytes)
Should this clone the tag iterator first, then reset that? If I'm reading this right, if you ever call TagValue it'll mess with the current iterator position, and if you were trying to get the value for a missing name you'd use up the iterator. Would it be worth it to decode the tagDecoder to a list of tags on Reset for this?
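Something along these lines would avoid burning the iterator on lookups — a sketch only, assuming the decoder exposes the usual Next/Current iterator methods and that the iterator grows a decodedPairs field; it trades a copy of the tags on Reset for cheap, repeatable TagValue calls:

```go
// Reset decodes the encoded tags once into a flat list so TagValue can be
// called any number of times, in any order, without disturbing the position
// of the underlying tag decoder.
func (it *encodedTagsIterator) Reset(sortedTagPairs []byte) {
	it.bytes.Reset(sortedTagPairs)
	it.tagDecoder.Reset(it.bytes)

	it.decodedPairs = it.decodedPairs[:0]
	for it.tagDecoder.Next() {
		tag := it.tagDecoder.Current()
		// Copy out the bytes since the decoder may reuse its buffers on Next.
		it.decodedPairs = append(it.decodedPairs, decodedPair{
			name:  append([]byte(nil), tag.Name.Bytes()...),
			value: append([]byte(nil), tag.Value.Bytes()...),
		})
	}
}

// TagValue becomes a simple scan over the decoded pairs.
func (it *encodedTagsIterator) TagValue(tagName []byte) ([]byte, bool) {
	for _, p := range it.decodedPairs {
		if bytes.Equal(p.name, tagName) {
			return p.value, true
		}
	}
	return nil, false
}
```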
type encodedTagsIteratorPool struct {
	tagDecoderPool serialize.TagDecoderPool
	pool           pool.ObjectPool
Worth giving this guy an xpool.CheckedBytesWrapperPool too?
}

func (p *encodedTagsIteratorPool) Init() {
	p.tagDecoderPool.Init()
Is there a chance this is already initialized, e.g. if this will try to re-use an existing decoder pool?
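One way to sidestep that: only initialize the pool this type owns, and document that a shared decoder pool must arrive already initialized (a sketch; the allocator body just mirrors whatever the PR's constructor does):

```go
// Init initializes only the object pool owned by this type. The tag decoder
// pool is expected to be initialized exactly once by whoever constructs and
// shares it, so re-using an existing, already-initialized decoder pool here
// is safe.
func (p *encodedTagsIteratorPool) Init() {
	p.pool.Init(func() interface{} {
		return newEncodedTagsIterator(p.tagDecoderPool.Get(), p)
	})
}
```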
}

func (h *downsamplerFlushHandler) Close() {
}
nit: can we get a comment here? Either // noop or // TODO depending on what's required.
func (w *downsamplerFlushHandlerWriter) Write(
	mp aggregated.ChunkedMetricWithStoragePolicy,
) error {
	w.wg.Add(1)
It seems a little odd to me to have this function touching the internal wait group; what's the difference between doing it this way and calling Flush at the end vs. the calling function handling the parallelism?
func (a *metricsAppender) SamplesAppender() (SamplesAppender, error) {
	// Sort tags
	sort.Sort(a.tags)
Would it be better to insert the tags in the sorted location?
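For reference, inserting each tag at its sorted position with sort.Search would keep the slice ordered as it's built and make this sort unnecessary — a self-contained sketch with a stand-in tag type (the PR's actual tag type may differ):

```go
package downsample

import (
	"bytes"
	"sort"
)

// tag is a stand-in for the appender's tag type in this sketch.
type tag struct {
	Name, Value []byte
}

// appendSorted inserts t at its sorted position by name, so the tag list
// stays ordered as tags are appended and never needs a full sort.Sort.
func appendSorted(tags []tag, t tag) []tag {
	i := sort.Search(len(tags), func(j int) bool {
		return bytes.Compare(tags[j].Name, t.Name) >= 0
	})
	tags = append(tags, tag{}) // grow by one
	copy(tags[i+1:], tags[i:]) // shift the tail right
	tags[i] = t
	return tags
}
```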
	id := a.encodedTagsIteratorPool.Get()
	id.Reset(unownedID)
	now := time.Now()
	fromNanos, toNanos := now.Add(-1*a.clockOpts.MaxNegativeSkew()).UnixNano(),
nit: now.Sub, might be easier to read across two lines?
	numRollups := matchResult.NumRollups()
	for i := 0; i < numRollups; i++ {
		rollup, ok := matchResult.RollupsAt(i, now.UnixNano())
		if !ok {
nit: Better to flip this and add samplesAppender if ok
Deprecating in favor of new downsampling PR #796, now that the multi-cluster support has landed.