Feat: A more robust provider finder for sessions (for now) and soon for all bitswap #60
Conversation
Some tests are mildly unreliable; I need to work on them a bit more.
Force-pushed from 7478032 to bdaf8b3
Tests for this package should be more reliable now (still need to work on the other bitswap ones).
Force-pushed from bdaf8b3 to 6af352a
Add a manager for querying providers on blocks, in charge of managing requests, deduping, and rate limiting
Add functionality to time out find-provider requests so they don't run forever
Add debug logging for the provider query manager and make tests more reliable
Removed a minor condition check that could fail in some cases purely due to timing rather than an actual code issue
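Taken together, those commits describe a query manager that dedupes requests per key, bounds how many provider queries run at once, and times each query out. A compact sketch of that shape is below; every identifier in it (providerQueryManager, findProviderWorker, the constants, the import paths, which vary by libp2p version) is an illustrative assumption rather than the PR's actual code.

```go
// Illustrative sketch only; not the PR's code.
package providerquerymanager

import (
	"context"
	"time"

	cid "github.com/ipfs/go-cid"
	peer "github.com/libp2p/go-libp2p/core/peer"
)

const (
	maxInProcessRequests = 6                // rate limit: concurrent provider queries
	maxProviders         = 10               // stop after this many providers per key
	findProviderTimeout  = 10 * time.Second // so a query can't run forever
)

// providerNetwork is the narrow slice of the bitswap network this manager needs.
type providerNetwork interface {
	ConnectTo(ctx context.Context, p peer.ID) error
	FindProvidersAsync(ctx context.Context, k cid.Cid, max int) <-chan peer.ID
}

type providerQueryManager struct {
	ctx        context.Context
	network    providerNetwork
	queryQueue chan cid.Cid // deduped keys waiting for a free worker
}

// startWorkers launches a fixed pool, which is what bounds concurrency.
func (pqm *providerQueryManager) startWorkers() {
	for i := 0; i < maxInProcessRequests; i++ {
		go pqm.findProviderWorker()
	}
}

// findProviderWorker runs one provider query at a time, each with its own timeout,
// and connects to every provider it finds so bitswap can request the block.
func (pqm *providerQueryManager) findProviderWorker() {
	for {
		select {
		case k := <-pqm.queryQueue:
			ctx, cancel := context.WithTimeout(pqm.ctx, findProviderTimeout)
			for p := range pqm.network.FindProvidersAsync(ctx, k, maxProviders) {
				_ = pqm.network.ConnectTo(ctx, p)
			}
			cancel() // release the timeout as soon as this query is done
		case <-pqm.ctx.Done():
			return
		}
	}
}
```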
Force-pushed from c5678f1 to 56d9e3f
}

type receivedProviderMessage struct {
	k cid.Cid
nitpick/nonblocking: should we get more explicit with these names? key instead of k, and session instead of ses?
Well, ses is gone. Honestly, I struggle with what to call k, because k is actually a CID, not a key. But key or k is widely used, and cid is usually the name of the package, unfortunately. Maybe contentID. What do you use?
Yeah, I've noticed the same; I've been using key and cid mixed in different places. Maybe we can start using just key consistently? And I'll update on my end too?
Yeah, I'd be down for that, but I want us all to agree on what we use, because it's a problem throughout the bitswap codebase.
What's the motivation for breaking up the "Network" interface that way?
wg.Add(1)
go func(p peer.ID) {
	defer wg.Done()
	err := pqm.network.ConnectTo(pqm.ctx, p)
Have you considered #51 (comment)? That plus switching to sessions everywhere seems like a more robust solution.
Yes, and I'm inclined not to address that until we get sessions-everywhere merged first. This is necessary to do that, hence the chicken-and-egg problem. Arguably it could all be one big PR, but I'm generally disinclined to do things that way. Let me know if it's ok to proceed in this manner.
findProviderCtx, cancel := context.WithTimeout(pqm.ctx, pqm.findProviderTimeout)
pqm.timeoutMutex.RUnlock()
defer cancel()
providers := pqm.network.FindProvidersAsync(findProviderCtx, k, maxProviders)
This is going to keep working even if we no longer need to find any providers.
I believe I have fixed this. The context is no longer derived directly from the overall query manager; instead, each key's request gets its own context, which is itself cancelled once all sessions requesting that key cancel their requests.
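A minimal sketch of that per-key cancellation, assuming a reference count of requestors per CID (the type and method names here are hypothetical, not the PR's; the PR does this bookkeeping in its serial run loop rather than behind a mutex):

```go
// Hypothetical sketch of per-key request contexts; not the PR's actual code.
package providerquerymanager

import (
	"context"
	"sync"

	cid "github.com/ipfs/go-cid"
)

// keyRequest tracks one in-flight find-provider query and how many sessions want it.
type keyRequest struct {
	ctx        context.Context
	cancel     context.CancelFunc
	requestors int
}

type requestTracker struct {
	managerCtx context.Context // the query manager's overall context
	mu         sync.Mutex
	inProgress map[cid.Cid]*keyRequest
}

// join returns the shared context for k, starting a new per-key request if needed.
func (t *requestTracker) join(k cid.Cid) context.Context {
	t.mu.Lock()
	defer t.mu.Unlock()
	req, ok := t.inProgress[k]
	if !ok {
		// Child of the manager's context, but cancellable on its own per key.
		ctx, cancel := context.WithCancel(t.managerCtx)
		req = &keyRequest{ctx: ctx, cancel: cancel}
		t.inProgress[k] = req
	}
	req.requestors++
	return req.ctx
}

// leave is called when one session cancels; the underlying query is cancelled
// only once every session that asked for this key has left.
func (t *requestTracker) leave(k cid.Cid) {
	t.mu.Lock()
	defer t.mu.Unlock()
	req, ok := t.inProgress[k]
	if !ok {
		return
	}
	req.requestors--
	if req.requestors == 0 {
		req.cancel()
		delete(t.inProgress, k)
	}
}
```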
	k: k,
}
// clear out any remaining providers
for range incomingProviders {
Out of curiosity, why do this drain?
I don't want to block in case there are outstanding sends.
but maybe there is a better way to do this.
Thought about it a bit more, and yeah, this is necessary. The reason is this: messages get processed serially in the run loop. One of those messages might be an incoming provider-found message that gets broadcast to everyone who is waiting. We don't want that broadcast to block, so we need to make sure the channel is read, emptied, and closed (which happens once the cancel is processed) before we proceed.
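In code, the drain is just a range over the channel until the run loop closes it. A tiny sketch (the function name and import path are assumptions, not the PR's):

```go
// Hypothetical sketch: drain leftover sends so the serial run loop never blocks.
package providerquerymanager

import peer "github.com/libp2p/go-libp2p/core/peer"

// drainAfterCancel discards providers that were already in flight when the
// requestor cancelled. Reading them keeps the run loop's broadcast sends from
// blocking; the loop exits once the run loop closes the channel, i.e. once the
// cancel has actually been processed.
func drainAfterCancel(incomingProviders <-chan peer.ID) {
	for range incomingProviders {
	}
}
```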
Removed session ID use completely from providerquerymanager
Force-pushed from 23f9b87 to 92717db
Make sure if all requestors cancel their request to find providers on a peer, the overall query gets cancelled
@Stebalien Well, the session peer manager now only uses the connection manager (other network calls go through the Provider Query Manager), so I figured why not simplify even further? It makes the tests simpler, among other things.
Almost good to go.
case <-pqm.ctx.Done():
	return
case <-sessionCtx.Done():
	pqm.providerQueryMessages <- &cancelRequestMessage{
Probably needs to select on pqm.ctx.Done().
Also, I think this needs to read from incomingProviders at the same time. Otherwise, we're relying on the buffer in providerQueryMessages being large enough to fit this message.
right. I hate go. :P will fix.
fixed, I think
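One way that fix can look, sketched under the same assumptions as above (hypothetical function shape; the real code may be structured differently): the requestor keeps draining incomingProviders while it tries to enqueue the cancel, and also bails out if the whole manager shuts down.

```go
// Hypothetical sketch of a cancel that can never deadlock the run loop.
package providerquerymanager

import (
	"context"

	cid "github.com/ipfs/go-cid"
	peer "github.com/libp2p/go-libp2p/core/peer"
)

// cancelRequestMessage is a stand-in for the run loop's cancel message type.
type cancelRequestMessage struct {
	k cid.Cid
}

// cancelProviderRequest tries to hand a cancel message to the run loop. While it
// waits for that send to go through, it keeps reading incomingProviders so the run
// loop's own sends never block, and it gives up if the whole manager shuts down.
func cancelProviderRequest(
	ctx context.Context, // the manager's context (pqm.ctx)
	k cid.Cid,
	messages chan<- *cancelRequestMessage, // pqm.providerQueryMessages
	incomingProviders <-chan peer.ID,
) {
	for {
		select {
		case messages <- &cancelRequestMessage{k: k}:
			// Cancel delivered: now drain until the run loop closes the channel.
			for range incomingProviders {
			}
			return
		case _, ok := <-incomingProviders:
			if !ok {
				// Channel already closed: the request finished before our cancel.
				return
			}
			// Discard the straggler and keep trying to send the cancel.
		case <-ctx.Done():
			return
		}
	}
}
```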
pqm.timeoutMutex.RLock()
findProviderCtx, cancel := context.WithTimeout(fpr.ctx, pqm.findProviderTimeout)
pqm.timeoutMutex.RUnlock()
defer cancel()
This defer won't run until the function returns.
Is that bad? This cancel now only serves to make sure the context gets cleaned up; the cancellation that fires when all the requests cancel belongs to the parent of this context.
Deferred functions get pushed onto a stack that keeps growing until the worker returns.
Basically, it'll leak a bunch of memory.
Got it. I get a linter error if I don't name it and call it. I moved the cancel call to right after the for-range-over-the-channel loop, which seems like the first safe time to cancel it.
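For illustration, the difference looks roughly like this (an illustrative worker loop, not the PR's code): the deferred cancels pile up until the worker goroutine exits, while cancelling right after the consuming loop releases each timeout context immediately.

```go
// Illustrative only: contrasts a deferred cancel inside a long-lived worker loop
// with cancelling as soon as each job is done.
package providerquerymanager

import (
	"context"
	"time"
)

const queryTimeout = 10 * time.Second // illustrative value

// leakyWorker defers cancel inside the loop: the defers stack up and none of them
// run until the worker returns, which for a long-lived worker is effectively never.
func leakyWorker(ctx context.Context, jobs <-chan func(context.Context)) {
	for job := range jobs {
		jobCtx, cancel := context.WithTimeout(ctx, queryTimeout)
		defer cancel() // BAD: one pending defer per job, held for the worker's lifetime
		job(jobCtx)
	}
}

// fixedWorker cancels right after the job (e.g. right after ranging over the
// provider channel), so each timeout context is released immediately.
func fixedWorker(ctx context.Context, jobs <-chan func(context.Context)) {
	for job := range jobs {
		jobCtx, cancel := context.WithTimeout(ctx, queryTimeout)
		job(jobCtx)
		cancel()
	}
}
```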
Keep channels unblocked when cancelling a request (refactored to a function). Also cancel the find-provider context as soon as it can be.
Force-pushed from 44fe93e to 30f40ec
We should probably test this on the gateway, if possible. This now blocks the provider workers on actually connecting to the providers, so it could have unintended consequences.
…provider: Feat: A more robust provider finder for sessions (for now) and soon for all bitswap. This commit was moved from ipfs/go-bitswap@bb89789.
Goals
Build a robust provider finder for sessions so that we can deduplicate provider logic (#52) and soon merge GetBlock w/ Session.GetBlocks (#49)
Implementation
A manager for find-provider requests that:
- dedupes requests for the same block across requestors
- rate limits how many provider queries run at once
- times out requests so they don't run forever
- cancels the underlying query once all requestors cancel their requests
A sketch of how a session would consume this follows below.
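From a session's point of view the whole thing collapses to ranging over a channel. A sketch of that consumption pattern, with the caveat that the interface, method name, and signature here are assumptions rather than the PR's exported API:

```go
// Hypothetical consumer-side view of the query manager.
package providerquerymanager

import (
	"context"

	cid "github.com/ipfs/go-cid"
	peer "github.com/libp2p/go-libp2p/core/peer"
)

// providerFinder is how a session might see the manager: one async method.
type providerFinder interface {
	FindProvidersAsync(ctx context.Context, k cid.Cid) <-chan peer.ID
}

// findAndUseProviders just ranges over the result channel; deduping, rate
// limiting, timeouts, and cancellation all live inside the manager, and the
// channel closes when the query finishes, times out, or is cancelled.
func findAndUseProviders(ctx context.Context, finder providerFinder, k cid.Cid, use func(peer.ID)) {
	for p := range finder.FindProvidersAsync(ctx, k) {
		use(p)
	}
}
```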
For discussion
Will write up the general concurrency architecture tomorrow