Use a single index Results when querying across blocks #1474

robskillington · 2019-03-19T00:48:00Z

…ss-blocks

codecov · 2019-03-19T15:54:30Z

Codecov Report

Merging #1474 into master will decrease coverage by 6.8%.
The diff coverage is 65%.

@@           Coverage Diff            @@
##           master   #1474     +/-   ##
========================================
- Coverage    70.9%     64%   -6.9%     
========================================
  Files         841     834      -7     
  Lines       71895   71174    -721     
========================================
- Hits        50990   45614   -5376     
- Misses      17561   22333   +4772     
+ Partials     3344    3227    -117

Flag	Coverage Δ
#aggregator	`69.3% <ø> (-13.1%)`	⬇️
#cluster	`67.7% <ø> (-18.2%)`	⬇️
#collector	`47.9% <ø> (-15.8%)`	⬇️
#dbnode	`76.4% <65%> (-4.4%)`	⬇️
#m3em	`68.3% <ø> (-4.9%)`	⬇️
#m3ninx	`71.2% <ø> (-3%)`	⬇️
#m3nsch	`51.1% <ø> (ø)`	⬆️
#metrics	`17.6% <ø> (ø)`	⬆️
#msg	`74.9% <ø> (ø)`	⬆️
#query	`56.3% <ø> (-9.8%)`	⬇️
#x	`69.4% <ø> (-7%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dec08fe...f05ce65.

codecov · 2019-03-19T15:54:56Z

Codecov Report

Merging #1474 into master will decrease coverage by 7.5%.
The diff coverage is 72.5%.

@@            Coverage Diff            @@
##           master   #1474      +/-   ##
=========================================
- Coverage    70.9%   63.3%    -7.6%     
=========================================
  Files         841     712     -129     
  Lines       71880   62527    -9353     
=========================================
- Hits        50982   39635   -11347     
- Misses      17564   19825    +2261     
+ Partials     3334    3067     -267

Flag	Coverage Δ
#aggregator	`61.1% <ø> (-21.2%)`	⬇️
#cluster	`84.3% <ø> (-1.5%)`	⬇️
#collector	`47.9% <ø> (-15.8%)`	⬇️
#dbnode	`68.8% <69.8%> (-12.1%)`	⬇️
#m3em	`68.3% <ø> (-4.9%)`	⬇️
#m3ninx	`69.8% <ø> (-4.5%)`	⬇️
#m3nsch	`51.1% <ø> (ø)`	⬆️
#metrics	`17.5% <ø> (ø)`	⬆️
#msg	`74.9% <ø> (ø)`	⬆️
#query	`72.2% <ø> (+6.2%)`	⬆️
#x	`63.8% <100%> (-12.6%)`	⬇️

Last update fec4334...a8f8a2d.

arnikola

Looks good in general

arnikola · 2019-03-19T15:01:20Z

src/dbnode/storage/index/results.go

 )

 type results struct {
+	sync.RWMutex


More of a question, but do we usually give a struct a named field (mu sync.RWMutex) or embed the mutex so its methods are promoted? Is there a difference besides accessors?

depends on access pattern mainly: A RWMutex is useful when you have potential multiple readers that could share underlying data without clobbering each other; if all accessors always need r/w access - then a mutex is fine.

Sorry, meant the distinction between something like

type results struct { sync.RWMutex }

v.s.

type results struct { mu sync.RWMutex }

I prefer the prior, unless it's an exported struct in which case the latter to reduce scope of calling those methods.
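To make the distinction in this thread concrete, here is a minimal sketch of the two layouts. The type names `embeddedResults` and `namedResults` and the `size` field are hypothetical and exist only for illustration; they are not the types from results.go.

```go
package main

import (
	"fmt"
	"sync"
)

// embeddedResults embeds the mutex, so Lock/Unlock/RLock/RUnlock are
// promoted and become callable on the struct itself.
type embeddedResults struct {
	sync.RWMutex
	size int
}

// namedResults keeps the mutex in an unexported field, so only code in
// the defining package can take the lock.
type namedResults struct {
	mu   sync.RWMutex
	size int
}

// Size reads the field under the promoted read lock.
func (r *embeddedResults) Size() int {
	r.RLock()
	defer r.RUnlock()
	return r.size
}

// Size reads the field under the named read lock.
func (r *namedResults) Size() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.size
}

func main() {
	e := &embeddedResults{size: 1}
	n := &namedResults{size: 2}

	// With embedding, any holder of e can lock it directly.
	e.Lock()
	e.size++
	e.Unlock()

	fmt.Println(e.Size(), n.Size())
}
```

If the struct were exported, the embedded form would also expose `Lock` and `Unlock` to every importing package, which is the scoping concern raised in the reply above.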

arnikola · 2019-03-19T15:07:33Z

src/dbnode/storage/index/results.go

@@ -55,6 +58,15 @@ func NewResults(opts Options) Results {

 func (r *results) AddDocument(
 	d doc.Document,
+) (added bool, size int, err error) {
+	r.Lock()
+	added, size, err = r.addDocumentWithLock(d)


nit: Do we commonly use named returns? In this file it's a bit mixed, e.g. Namespace() uses standard returns. Is that OK, or would it be better to refactor these to the more standard return type?

This is to keep the existing stuff here. Usually we prefer to return structs actually instead of named returns.
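As a rough illustration of the "return structs instead of named returns" preference mentioned in the reply, the following sketch uses a hypothetical `addResult` struct and a simplified `results` type keyed by string ID. Neither is taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// addResult is a hypothetical result struct; callers read fields by name
// instead of relying on positional (bool, int) returns.
type addResult struct {
	Added bool
	Size  int
}

// results is a simplified stand-in that only tracks document IDs.
type results struct {
	ids map[string]struct{}
}

// AddDocument ignores duplicates and reports the size after the call.
func (r *results) AddDocument(id string) (addResult, error) {
	if id == "" {
		return addResult{}, errors.New("empty document ID")
	}
	if _, ok := r.ids[id]; ok {
		return addResult{Added: false, Size: len(r.ids)}, nil
	}
	r.ids[id] = struct{}{}
	return addResult{Added: true, Size: len(r.ids)}, nil
}

func main() {
	r := &results{ids: make(map[string]struct{})}
	first, _ := r.AddDocument("foo")
	second, _ := r.AddDocument("foo")
	fmt.Println(first.Added, first.Size, second.Added, second.Size)
}
```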

arnikola · 2019-03-19T15:12:27Z

src/dbnode/storage/index/results.go

+	noFinalize := r.noFinalize
+	r.RUnlock()
+
+	if noFinalize {
 		return
 	}



Shouldn't this lock?

When putting back in the pool? Sure I'm not opposed to that, but technically just before you add it to the pool you're basically declaring you're done with the object.

On second thoughts, can't lock during entire call to Finalize as Reset itself acquires a write lock, that's why we release the lock earlier in the method.
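The locking order described here can be sketched as follows. This is a simplified stand-in, not the code from results.go: the `items` map and the boolean return from `Finalize` are assumptions added so the behavior is observable.

```go
package main

import (
	"fmt"
	"sync"
)

type results struct {
	sync.RWMutex
	noFinalize bool
	items      map[string]struct{}
}

// Reset acquires the write lock itself.
func (r *results) Reset() {
	r.Lock()
	r.items = make(map[string]struct{})
	r.Unlock()
}

// Finalize copies the flag under the read lock and releases the lock
// before calling Reset. Holding RLock across the call would deadlock,
// because sync.RWMutex is not reentrant and a read lock cannot be
// upgraded to a write lock.
func (r *results) Finalize() bool {
	r.RLock()
	noFinalize := r.noFinalize
	r.RUnlock()

	if noFinalize {
		return false
	}
	r.Reset()
	return true
}

func main() {
	r := &results{items: map[string]struct{}{"a": {}}}
	fmt.Println(r.Finalize(), len(r.items))
}
```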

arnikola · 2019-03-19T15:13:08Z

src/dbnode/storage/index/results.go

@@ -170,6 +243,9 @@ func (r *results) Finalize() {
 }

 func (r *results) NoFinalize() {
+	r.Lock()
+	defer r.Unlock()


Does this need a defer? Seems straightforward with no branching

Sure, probably don't need it here.

arnikola · 2019-03-19T15:15:03Z

src/dbnode/storage/index/results.go

+	}
+	return numPartialUpdates, size, nil
+}
+
 func (r *results) AddIDAndTags(


I think this method is not used any more, may be able to remove both it and AddDocument

Yeah true, will do - need to remove the tests.

arnikola · 2019-03-19T15:27:53Z

src/dbnode/storage/index/types.go

@@ -120,6 +132,22 @@ type Results interface {
 	) (added bool, size int, err error)
 }

+// AddDocumentsBatchResultsOptions is a set of options to use when adding
+// results for a query.
+type AddDocumentsBatchResultsOptions struct {


Would it make more sense to add this to Options? This doesn't seem like something that you'd necessarily change per query, but may be wrong

arnikola · 2019-03-19T15:36:04Z

src/dbnode/storage/index/results.go

 }

 func (r *results) Reset(nsID ident.ID) {
+	r.Lock()
+	defer r.Unlock()


Does this need a defer? Seems straightforward with no branching

Sure, probably don't need it here.

arnikola · 2019-03-19T16:08:01Z

src/dbnode/storage/index/block_test.go

-	require.True(t, ident.NewTagIterMatcher(
-		ident.MustNewTagStringsIterator("bar", "baz")).Matches(
-		ident.NewTagsIterator(t1)))
+	results.WithMap(func(rMap *ResultsMap) {


nit: this is used in multiple places, maybe make it into a test function?

arnikola · 2019-03-19T16:12:15Z

src/dbnode/storage/index/block.go

+		// Reset batch
+		var emptyDoc doc.Document
+		for i := range batch {
+			batch[i] = emptyDoc


Worth doing some sort of Memset optimization here instead?

This is the memset optimization thankfully: it uses a single zero value and sets every element to it with for i := range batch.
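For reference, the zeroing loop from the diff can be written as a small helper. The `document` type and the `resetBatch` name are hypothetical. Whether the compiler lowers this loop to a bulk memory clear depends on the Go version and on how the zero value is expressed, so treat the performance claim as an assumption to verify rather than a guarantee.

```go
package main

import "fmt"

// document is a hypothetical stand-in for doc.Document.
type document struct {
	ID     string
	Fields []string
}

// resetBatch overwrites every element with the zero value, so the slice
// no longer pins the old IDs and fields, then truncates to length zero
// while keeping the capacity for reuse.
func resetBatch(batch []document) []document {
	var empty document
	for i := range batch {
		batch[i] = empty
	}
	return batch[:0]
}

func main() {
	batch := []document{{ID: "a"}, {ID: "b", Fields: []string{"x"}}}
	batch = resetBatch(batch)
	fmt.Println(len(batch), cap(batch))
}
```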

arnikola · 2019-03-19T16:16:40Z

src/dbnode/storage/index/block.go

 	iterCloser := safeCloser{closable: iter}
 	execCloser := safeCloser{closable: exec}

 	defer func() {
+		b.docsPool.Put(batch)


Do we need to reset the batch here?

No thankfully the Put method on the array will zero out each elem and also resize it to zero.

richardartoul · 2019-03-19T15:55:27Z

src/dbnode/storage/index/results.go

@@ -157,7 +226,11 @@ func (r *results) Reset(nsID ident.ID) {
 }

 func (r *results) Finalize() {
-	if r.noFinalize {
+	r.RLock()


I don't know how I feel about this. If there is any amount of concurrency here then you've completely misused the object and you have a really bad bug anyway. Using a lock here kind of confuses that point

True, but I figured I was locking everything and it is kind of strange to lock only some state rather than all.

yeah I dont have strong feelings either way, just took me a second to think about why you might be doing it

richardartoul · 2019-03-19T15:56:05Z

src/dbnode/storage/index/results.go

@@ -170,6 +243,9 @@ func (r *results) Finalize() {
 }

 func (r *results) NoFinalize() {
+	r.Lock()
+	defer r.Unlock()
+
 	// Ensure neither the results object itself, or any of its underlying


I think this is my typo originally, but can you do or->nor

Sure thing.

richardartoul · 2019-03-19T15:57:36Z

src/dbnode/storage/index.go

@@ -895,24 +895,16 @@ func (i *nsIndex) Query(
 			exhaustive bool
 			returned   bool
 		}{
-			merged:     nil,
+			merged:     i.resultsPool.Get(),


Does it make sense to call this merged now that we push it all the way down? Maybe just call it results

Sure thing, renamed.

richardartoul · 2019-03-19T16:00:21Z

src/dbnode/storage/index.go

@@ -1064,8 +1017,6 @@ func (i *nsIndex) Query(
 	}

 	results.Lock()
-	// Signal not to add any further results since we've returned already.
-	results.returned = true


Are we losing the ability to do this signaling? Seems like this might make the impact of timed-out queries even worse, since they'll keep updating this map... Actually, isn't it broken because they'll keep trying to add results to a map that may have been returned to the pool?

Actually, where do the results get returned to the pool? I don't see it in this method or in the RPC method.

Regardless, if we want to pool this thing you may need to add ref-counting or some type of unique query identifier or something

So I now use a lifetime to protect against writing to the results once we return from the Query call to the index, this prevents writing to results during cancellation or any other early return code path.

richardartoul · 2019-03-19T16:54:34Z

src/dbnode/storage/index/block.go

+		batch = batch[:0]
+	}
+
+	// Put last batch if remainding


richardartoul · 2019-03-19T16:56:37Z

src/dbnode/storage/index/results.go

 }

-func (r *results) Map() *ResultsMap {
-	return r.resultsMap
+func (r *results) WithMap(fn func(results *ResultsMap)) {


Can you change this to WithReadOnlyMap? I initially thought this method made it safe to mutate the map but thats not the case

Sure thing.
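A minimal sketch of the callback-under-read-lock pattern discussed in this thread, using the requested `WithReadOnlyMap` name. The `map[string][]string` type is a simplified stand-in for `ResultsMap`, and nothing in Go prevents the callback from mutating the map, which is why the name has to carry the contract.

```go
package main

import (
	"fmt"
	"sync"
)

type results struct {
	sync.RWMutex
	resultsMap map[string][]string
}

// WithReadOnlyMap runs fn while the read lock is held. The callback must
// only read from the map and must not retain it after returning.
func (r *results) WithReadOnlyMap(fn func(m map[string][]string)) {
	r.RLock()
	defer r.RUnlock()
	fn(r.resultsMap)
}

func main() {
	r := &results{resultsMap: map[string][]string{
		"foo": {"bar", "baz"},
	}}

	var n int
	r.WithReadOnlyMap(func(m map[string][]string) {
		n = len(m["foo"])
	})
	fmt.Println(n)
}
```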

prateek · 2019-03-19T19:56:53Z

src/dbnode/storage/index/types.go

+	// from seriesID -> seriesTags, comprising index results.
+	// A function is required to ensure that the results is used
+	// while a read lock for the results is held.
+	WithMap(fn func(results *ResultsMap))


instead of this, how would you feel about adding a Seal() and changing Map() to return (*ResultsMap, error). Would allow callers to avoid the extra lambda.

Yeah, I think this is actually the approach I'll take also to address the pooling issues that Richie was concerned about (i.e. not knowing when it's finalized and potentially adding values to it after it is finalized)

@robskillington I think if you have an early timeout / cancel, you have to give up and not return it to the pool, because even if you seal it there is no way to know when it's safe to unseal it unless you ref-count it

So I now use a lifetime to protect against writing to the results once we return from the Query call to the index, this prevents writing to results during cancellation or any other early return code path.

richardartoul · 2019-03-21T15:54:04Z

src/dbnode/server/server.go

-	resultsPool.Init(func() index.Results { return index.NewResults(indexOpts) })
+	resultsPool.Init(func() index.Results {
+		// NB(r): Need to initialize after setting the index opts so
+		// it seems the same reference of the options as is set


This sentence reads like word salad lol. I think you were trying to say: "so it sees the same reference to the indexOptions"

richardartoul · 2019-03-21T16:01:55Z

src/dbnode/storage/index.go

-				}
-			}
-		}
-
 		// If block had more data but we stopped early, need to notify caller.


super nit: This would read a lot better if you moved this comment below the early return. Right now it looks like a bug at first glance, because the comment says we need to notify the caller but the code immediately returns without doing anything.

Sounds good, will do.

richardartoul · 2019-03-21T16:03:15Z

src/dbnode/storage/index.go

 		}
-		results.Unlock()

 		if alreadyNotExhaustive {


This is a little weird, can you just move the break into the previous conditional block that has the same condition?

Sure thing, this is a side effect of the refactor actually, good catch.

richardartoul · 2019-03-21T16:08:37Z

src/dbnode/storage/index.go

 		// If block had more data but we stopped early, need to notify caller.
 		if blockExhaustive {
 			return
 		}
-		results.exhaustive = false
+		state.exhaustive = false
 	}

 	for _, block := range blocks {
 		// Capture block for async query execution below.
 		block := block

 		// Terminate early if we know we don't need any more results.


Took me a second to understand this, could you clarify the comment to something like:

"We're looping through all the blocks that we need to query and kicking off parallel queries which are bounded by the queryWorkersPool's maximum concurrency. This means that it's possible at this point that we've completed querying one or more blocks and already exhausted the maximum number of results that we're allowed to return. If that's the case, there is no value in kicking off more parallel queries, so we break out of the loop."

I know it's a little verbose but I've seen multiple iterations of this code already and other people coming in with fresh eyes will probably have more trouble following.

Thank you for writing the comment, will do - hah 👍
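The loop described in the suggested comment can be sketched roughly as below. This is not the code from index.go: `queryBlocks`, the buffered-channel semaphore standing in for queryWorkersPool, and the use of string slices as blocks are all assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// queryBlocks queries blocks in parallel, bounded by concurrency (which
// must be at least 1), and stops scheduling new blocks once the limit is
// already reached. A limit of zero means no limit.
func queryBlocks(blocks [][]string, limit, concurrency int) ([]string, bool) {
	var (
		mu         sync.Mutex
		wg         sync.WaitGroup
		results    []string
		exhaustive = true
		sem        = make(chan struct{}, concurrency)
	)
	// limitReached must only be called with mu held.
	limitReached := func() bool { return limit > 0 && len(results) >= limit }

	for _, block := range blocks {
		block := block // Capture for the goroutine below.

		// Earlier blocks may have finished and filled the results, so
		// there is no value in kicking off more queries.
		mu.Lock()
		stop := limitReached()
		if stop {
			exhaustive = false
		}
		mu.Unlock()
		if stop {
			break
		}

		sem <- struct{}{} // Blocks while all workers are busy.
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }()

			mu.Lock()
			defer mu.Unlock()
			for _, id := range block {
				if limitReached() {
					// Block had more data but we stopped early.
					exhaustive = false
					return
				}
				results = append(results, id)
			}
		}()
	}

	wg.Wait()
	return results, exhaustive
}

func main() {
	blocks := [][]string{{"a", "b"}, {"c", "d"}, {"e"}}
	res, exhaustive := queryBlocks(blocks, 3, 1)
	fmt.Println(len(res), exhaustive)
}
```

The limit check before scheduling is only an optimization; the check inside the worker is what actually enforces the limit, since in-flight queries can still fill the results after the loop has moved on.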

richardartoul · 2019-03-21T16:14:03Z

src/dbnode/storage/index.go

-	results.Lock()
-	// Signal not to add any further results since we've returned already.
-	results.returned = true
+	state.Lock()
 	// Take reference to vars to return while locked, need to allow defer


I don't understand this comment. What deadlock are you protecting against? The only Lock/Defer that I see is in the execBlockQuery func and I don't see how that would deadlock with this code

I'll remove that statement, there used to be other stuff going on.

richardartoul · 2019-03-21T16:19:41Z

src/dbnode/storage/index/block.go

+	queryValid := cancellable.TryCheckout()
+	if !queryValid {
+		// Query not valid any longer, do not add results and return early
+		return batch, 0, errCancelledQuery


hmm this error will still get propagated up to the level above and stored in the state multierr.....I guess thats fine though since you take a copy of the multierr when you return

Yeah, we always lock when we access multierr.

richardartoul · 2019-03-21T16:25:01Z

src/dbnode/storage/index/results.go

+
+	r.opts = opts
+
+	// finalize existing held nsID


could you add periods to the comments in this file

Sure thing, this is old code but I'll update it.

PS we should really create a lint rule so you don't have to manually ask this btw (not urgent, but something we should think about)

richardartoul · 2019-03-21T16:26:05Z

src/dbnode/storage/index/results.go

+	r.nsID = nsID
+
+	// reset all values from map first
+	for _, entry := range r.resultsMap.Iter() {


shouldnt you be doing this in Finalize() not Reset?

Oh I see you call reset from finalize

richardartoul · 2019-03-21T16:54:34Z

src/dbnode/storage/index/results.go

 	}
+
+	// reset all keys in the map next


Can you add a quick comment saying this will finalize the keys

richardartoul · 2019-03-21T16:56:52Z

src/dbnode/storage/index/results.go

-) (added bool, size int, err error) {
-	added = false
+) (bool, int, error) {
+	added := false


super nit: I'd personally find this more readable if you just explicitly returned false/true in each code path

richardartoul · 2019-03-21T16:59:42Z

src/dbnode/storage/index/types.go

+	// take a copy of the bytes backing the documents so the original can be
+	// modified after this function returns without affecting the results map.
+	// If documents with duplicate IDs are added, they are simply ignored and
+	// the first document added with an ID is returned.


Maybe a TODO here saying we may need to change the exact behavior here once the index becomes mutable

Sure thing.

richardartoul · 2019-03-21T17:58:26Z

src/x/resource/lifetime.go

+// is already cancelled this will return false, otherwise it will return
+// true and guarantee the lifetime is not cancelled until the checkout
+// is returned.
+func (l *CancellableLifetime) TryCheckout() bool {


Can you add to the comment: If this returns true you MUST call ReleaseCheckout later, but if it returns false calling ReleaseCheckout will panic (pretty sure unlocking an unlocked lock panics).

Yeah sure thing.

Added comments.
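One way the checkout contract discussed above could work is sketched here. The method names `TryCheckout`, `ReleaseCheckout`, and the idea of cancelling come from the thread; the lowercase `cancellableLifetime` type, its fields, and the `addResult` helper are assumptions and are not copied from src/x/resource/lifetime.go.

```go
package main

import (
	"fmt"
	"sync"
)

// cancellableLifetime is a hypothetical sketch: a checkout holds a read
// lock, and Cancel takes the write lock, so Cancel waits for all
// outstanding checkouts to be released.
type cancellableLifetime struct {
	mu        sync.RWMutex
	cancelled bool
}

// TryCheckout returns false if already cancelled. If it returns true the
// caller MUST call ReleaseCheckout later; if it returns false the caller
// must NOT, since unlocking an unlocked RWMutex is a fatal error.
func (l *cancellableLifetime) TryCheckout() bool {
	l.mu.RLock()
	if l.cancelled {
		l.mu.RUnlock()
		return false
	}
	// Read lock stays held until ReleaseCheckout.
	return true
}

// ReleaseCheckout releases a checkout obtained from TryCheckout.
func (l *cancellableLifetime) ReleaseCheckout() {
	l.mu.RUnlock()
}

// Cancel blocks until all checkouts are released, then marks the
// lifetime cancelled so later checkouts fail.
func (l *cancellableLifetime) Cancel() {
	l.mu.Lock()
	l.cancelled = true
	l.mu.Unlock()
}

// addResult only writes while a checkout is held. The map itself is not
// protected here; a shared results object would need its own lock.
func addResult(l *cancellableLifetime, results map[string]bool, id string) bool {
	if !l.TryCheckout() {
		return false
	}
	defer l.ReleaseCheckout()
	results[id] = true
	return true
}

func main() {
	l := &cancellableLifetime{}
	results := make(map[string]bool)

	fmt.Println(addResult(l, results, "a"))
	l.Cancel() // For example when the query returns or times out.
	fmt.Println(addResult(l, results, "b"), len(results))
}
```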

arnikola · 2019-03-21T15:50:28Z

src/x/resource/lifetime_test.go

+	require.False(t, ok)
+
+	// Ensure that cancel finished
+	for {


nit: Instead of doing loop here, maybe add a channel that the cancel goroutine pushes a struct to after setting the bool?
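The suggestion could look something like the sketch below, which closes a channel instead of sending a struct on it; both signal completion. The `cancelFlag` type and `setAsync` helper are hypothetical and stand in for the lifetime and the cancel goroutine in the test.

```go
package main

import (
	"fmt"
	"sync"
)

// cancelFlag is a hypothetical stand-in for the state the test polls.
type cancelFlag struct {
	mu  sync.Mutex
	set bool
}

func (f *cancelFlag) Set() {
	f.mu.Lock()
	f.set = true
	f.mu.Unlock()
}

func (f *cancelFlag) IsSet() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.set
}

// setAsync sets the flag in a goroutine and closes the returned channel
// once the flag is set, so the caller can block instead of spinning.
func setAsync(f *cancelFlag) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		f.Set()
		close(done)
	}()
	return done
}

func main() {
	f := &cancelFlag{}
	done := setAsync(f)

	<-done // Wait for the goroutine without a polling loop.
	fmt.Println(f.IsSet())
}
```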

arnikola · 2019-03-21T18:39:19Z

src/dbnode/storage/index.go

-		}
-		alreadyNotExhaustive := opts.Limit > 0 && mergedSize >= opts.Limit
+		size := results.Size()
+		alreadyNotExhaustive := opts.LimitExceeded(size)


nit: can this be renamed to alreadyExceededLimit?
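Judging from the removed and added lines in this hunk, `LimitExceeded` wraps the old inline check. A minimal sketch under that assumption, with a hypothetical `queryOptions` type standing in for the real options struct:

```go
package main

import "fmt"

// queryOptions is a hypothetical stand-in for the query options type.
type queryOptions struct {
	Limit int
}

// LimitExceeded mirrors the expression removed in the diff:
// opts.Limit > 0 && size >= opts.Limit. A Limit of zero means no limit.
func (o queryOptions) LimitExceeded(size int) bool {
	return o.Limit > 0 && size >= o.Limit
}

func main() {
	opts := queryOptions{Limit: 10}
	fmt.Println(opts.LimitExceeded(9), opts.LimitExceeded(10))
	fmt.Println(queryOptions{}.LimitExceeded(1000))
}
```

Note that the check is `>=`, so it is true as soon as the limit is reached, not only once it is passed.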

arnikola · 2019-03-21T18:44:26Z

src/dbnode/storage/index/block.go

+		indexOpts.SegmentBuilderOptions(),
+		indexOpts.FSTSegmentOptions(),
+		compaction.CompactorOptions{
+			MmapDocsData: opts.ForegroundCompactorMmapDocsData,


Looks like there may be an error here; backgroundCompactor uses ForegroundCompactorMmapDocsData, and foregroundCompactor uses BackgroundCompactorMmapDocsData

Hah, good call - TY for this catch.

arnikola · 2019-03-21T18:53:00Z

src/dbnode/storage/index/types.go

-// Results is a collection of results for a query.
+// Results is a collection of results for a query, it is synchronized
+// when access to the results set is used as documented by the methods.
+// It cannot be written to after it is sealed, until it's reopened by


I don't think it can be sealed in the current implementation, so this comment may be wrong?

True, thanks will update.

arnikola · 2019-03-21T18:54:46Z

src/dbnode/storage/index/results.go

-	return r.resultsMap
+	r.RLock()
+	v := r.resultsMap
+	r.RUnlock()


Should this make a copy of the resultsMap? Otherwise won't downstream consumers be able to modify this map without a lock or have their map changed?

I added a comment to the call to Map about this being unsafe. We can't really take a copy since it will cause a ton of allocations.

arnikola · 2019-03-21T18:55:13Z

src/dbnode/storage/index_query_concurrent_test.go

-					DoAndReturn(func(q index.Query, opts index.QueryOptions, r index.Results) (bool, error) {
+					Query(gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any()).
+					DoAndReturn(func(
+						l *resource.CancellableLifetime,


meganit: these can all be _

arnikola · 2019-03-21T18:56:51Z

src/dbnode/storage/index/results.go

-) (added bool, size int, err error) {
-	added = false
+) (bool, int, error) {
+	added := false


arnikola · 2019-03-21T18:59:30Z

src/dbnode/storage/index/results_test.go

+	testOpts = optionsWithDocsArrayPool(NewOptions(), 1, 1)
+}
+
+func optionsWithDocsArrayPool(opts Options, size, capacity int) Options {


Just wondering what the purpose of this function is rather than just doing this in the init; just a convention?

We call this method elsewhere.

arnikola · 2019-03-21T19:02:11Z

src/dbnode/storage/index/compaction/compactor.go

+		copy(allDocsCopy, allDocs)
+		fstData.DocsReader = docs.NewSliceReader(0, allDocsCopy)
+	} else {
+		// Otherwise encode and reference the encoded bytes as mmap'd bytes.


I'm not really familiar with this path, but just verifying it's ok to have DocsReader be nil here?

Yup, if we supply the docs data and docs index data then its safe to have DocsReader be nil.

Rob Skillington and others added 2 commits March 19, 2019 00:47

Use a single index Results when querying across blocks

274bae1

Merge branch 'master' into r/reuse-single-index-results-querying-acro…

f05ce65

…ss-blocks

arnikola reviewed Mar 19, 2019

richardartoul reviewed Mar 19, 2019

prateek reviewed Mar 19, 2019

Rob Skillington added 7 commits March 20, 2019 23:18

Use cancellable lifetimes to defend against altering state after inde…

760b41f

…x query returns due timeout or other error

Merge branch 'r/reuse-single-index-results-querying-across-blocks' of…

57fe7ae

… github.com:m3db/m3 into r/reuse-single-index-results-querying-across-blocks

Fix typos in comments

c94cb18

Fix build error

6f434a2

Fix unit tests

c04b5e0

Fix prop test

fa8331b

Fix metalint and small limit for quorum fetch tagged tests

2f5db2d

arnikola mentioned this pull request Mar 21, 2019

Poor performance for tag completion endpoints #1453

Closed

6 tasks

Merge branch 'master' into r/reuse-single-index-results-querying-acro…

361747f

…ss-blocks

richardartoul reviewed Mar 21, 2019

Rob Skillington and others added 4 commits March 21, 2019 13:24

Add ability to mmap docs data for foreground/background compaction

fe32fc2

Fix comment typos

534de74

Fix NewBlock build errors

98205b5

Merge branch 'master' into r/reuse-single-index-results-querying-acro…

28e78d1

…ss-blocks

richardartoul approved these changes Mar 21, 2019

richardartoul reviewed Mar 21, 2019

arnikola reviewed Mar 21, 2019

robskillington and others added 3 commits March 21, 2019 16:35

Merge branch 'master' into r/reuse-single-index-results-querying-acro…

672886a

…ss-blocks

Address feedback

ff46b45

Add comment to block query method

a8f8a2d

robskillington merged commit b9205e8 into master Mar 21, 2019

robskillington deleted the r/reuse-single-index-results-querying-across-blocks branch March 21, 2019 21:25

arnikola mentioned this pull request Mar 25, 2019

[index] Aggregating results on storage side #1463

Merged

2 tasks
