fix: improve sessions implementation #495

Merged
merged 10 commits into from
Apr 15, 2024

Conversation

achingbrain
Member

  • Sessions are created synchronously
  • The root CID of a session is filled on the first CID retrieval for zero-delay session creation
  • Providers are found and queried for the root block directly, any that have it are added to the session
  • Further peers are added to the session as more CIDs are requested
  • Further peers are searched for when no current peers have the block for a requested CID
  • Providers that have errored (e.g. protocol selection failure) are excluded from the session
  • Bitswap only queries provider peers, not directly connected peers
  • HTTP Gateways are loaded from the routing
  • When providers are returned without multiaddrs we try to load them without blocking yielding of other providers
  • Common session code has been moved into an abstract superclass to remove duplication.
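The session behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `AbstractSession`, `Provider`, `findProviders` and `StaticSession` are hypothetical names, CIDs and peer ids are plain strings, and lookups are synchronous to keep the example self-contained.

```typescript
// Sketch only: CIDs and peer ids are plain strings and lookups are synchronous.
type Cid = string

interface Provider {
  id: string
  // Hypothetical lookup: returns the block, returns undefined when the provider
  // does not have it, or throws (e.g. on protocol selection failure).
  get: (cid: Cid) => Uint8Array | undefined
}

abstract class AbstractSession {
  private root?: Cid
  private readonly providers: Provider[] = []
  private readonly failed = new Set<string>()

  // Hypothetical hook: each subclass decides how providers are discovered.
  protected abstract findProviders (cid: Cid): Provider[]

  get rootCid (): Cid | undefined { return this.root }
  get peerIds (): string[] { return this.providers.map(p => p.id) }

  retrieve (cid: Cid): Uint8Array {
    // Creation did no work, so the first requested CID becomes the root.
    if (this.root == null) this.root = cid

    // Ask peers already in the session (copy: query() may remove entries).
    for (const provider of [...this.providers]) {
      const block = this.query(provider, cid)
      if (block != null) return block
    }

    // No session peer had it: look for further providers, skipping peers
    // that are already in the session or that have errored.
    const known = new Set(this.peerIds)
    for (const provider of this.findProviders(cid)) {
      if (known.has(provider.id) || this.failed.has(provider.id)) continue
      const block = this.query(provider, cid)
      if (block != null) {
        this.providers.push(provider)
        return block
      }
    }

    throw new Error(`No provider had ${cid}`)
  }

  private query (provider: Provider, cid: Cid): Uint8Array | undefined {
    try {
      return provider.get(cid)
    } catch {
      // Errored providers are excluded from the session from now on.
      this.failed.add(provider.id)
      const index = this.providers.indexOf(provider)
      if (index !== -1) this.providers.splice(index, 1)
      return undefined
    }
  }
}

// Example subclass backed by a fixed provider list instead of real routing.
class StaticSession extends AbstractSession {
  private readonly routing: Provider[]

  constructor (routing: Provider[]) {
    super()
    this.routing = routing
  }

  protected findProviders (): Provider[] { return this.routing }
}
```

Constructing `new StaticSession([...])` does no I/O, and the first `retrieve` call both sets the root and populates the session with whichever provider returned the block.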

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

Moves most common session code into an abstract superclass to remove
duplication.

- Sessions are created synchronously
- The root CID of a session is filled on the first CID retrieval
- Providers are found and queried for the root block directly, any that have it are added to the session
- Providers that have errored (e.g. protocol selection failure) are excluded from the session
- Bitswap only queries provider peers, not directly connected peers
- HTTP Gateways are loaded from the routing
- When providers are returned without multiaddrs we try to load them without blocking yielding of other providers
Member

@SgtPooki SgtPooki left a comment

whew. I'm a big fan of all the listed improvements. Various comments for things I didn't understand and some minor improvements we could make, but overall lgtm

packages/bitswap/src/network.ts

  return {
    announce: async (cid, block, options) => {
      await this.bitswap.notify(cid, block, options)
    },

    retrieve: async (cid, options) => {
-     return session.want(cid, options)
+     return session.retrieve(cid, options)
Member

nice. I like the idea of normalizing session usage to blockBroker terminology 🎉

- async createSession (root: CID, options?: CreateSessionOptions<BitswapWantBlockProgressEvents>): Promise<BlockBroker<BitswapWantBlockProgressEvents, BitswapNotifyProgressEvents>> {
-   const session = await this.bitswap.createSession(root, options)
+ createSession (options?: CreateSessionOptions<BitswapWantBlockProgressEvents>): BlockBroker<BitswapWantBlockProgressEvents, BitswapNotifyProgressEvents> {
+   const session = this.bitswap.createSession(options)

    return {
      announce: async (cid, block, options) => {
        await this.bitswap.notify(cid, block, options)
Member

I was thinking maybe this should be something like session.announce, but.. we want all bitswap peers to know about blocks we have now right?

It feels like there may be something to optimize here. If we've got a session of bitswap negotiations going on, we want any peers we sent a WANT, to receive a HAVE/DONTWANT from us, but if we didn't send a WANT/WANTHAVE to a peer, and they didn't send a WANT/WANTHAVE to us, we don't necessarily need to notify that peer that we have the block, right?

Member Author

> we want all bitswap peers to know about blocks we have now right?

Yes, if any peers want the block we are announcing (e.g. it's in the want list they've previously sent us), the protocol says we are to send it to them.

> we want any peers we sent a WANT, to receive a HAVE/DONTWANT from us

If we've sent them a WANT-BLOCK or WANT-HAVE, and we acquire the block by other means before they reply with HAVE or DONT-HAVE, we send them an updated wantlist cancelling that particular WANT-*.

> if we didn't send a WANT/WANTHAVE to a peer, and they didn't send a WANT/WANTHAVE to us, we don't necessarily need to notify that peer that we have the block, right?

We'll only notify them if they've previously sent us a WANT-BLOCK or a WANT-HAVE and they've not subsequently cancelled that WANT-*.
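The rule described in this reply can be sketched as a small want ledger. This is an illustrative model only, not the bitswap implementation: `WantLedger`, `receiveWant`, `receiveCancel` and `Notification` are hypothetical names, and peers and CIDs are plain strings.

```typescript
// Simplified model: peers and CIDs are strings, messages are plain objects.
type WantType = 'want-block' | 'want-have'

interface Notification {
  peer: string
  send: 'block' | 'have'
}

class WantLedger {
  // peer id -> CID -> outstanding want type
  private readonly wants = new Map<string, Map<string, WantType>>()

  // Record a WANT-BLOCK or WANT-HAVE received from a peer.
  receiveWant (peer: string, cid: string, type: WantType): void {
    let peerWants = this.wants.get(peer)
    if (peerWants == null) {
      peerWants = new Map<string, WantType>()
      this.wants.set(peer, peerWants)
    }
    peerWants.set(cid, type)
  }

  // A peer cancelled an earlier want, so it must not be notified later.
  receiveCancel (peer: string, cid: string): void {
    this.wants.get(peer)?.delete(cid)
  }

  // Work out who to tell when we acquire a block: only peers with an
  // outstanding, uncancelled want for that CID.
  notify (cid: string): Notification[] {
    const out: Notification[] = []
    for (const [peer, peerWants] of this.wants) {
      const type = peerWants.get(cid)
      if (type == null) continue
      out.push({ peer, send: type === 'want-block' ? 'block' : 'have' })
    }
    return out
  }
}
```

Peers that never sent a want for the CID, or that cancelled it, do not appear in the result of `notify`, which is why a session-scoped announce is not needed in this model.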

Member

It sounds like the this.bitswap instance will handle this for us then, and we don't need to use a scoped-down session.notify.

thanks for the explainer

}
}

function filterMultiaddrs (multiaddrs: Multiaddr[], allowInsecure: boolean, allowLocal: boolean): Multiaddr[] {
Member

allow for future customizations with config obj?

Suggested change
- function filterMultiaddrs (multiaddrs: Multiaddr[], allowInsecure: boolean, allowLocal: boolean): Multiaddr[] {
+ function filterMultiaddrs (multiaddrs: Multiaddr[], { allowInsecure, allowLocal }: { allowInsecure: boolean, allowLocal: boolean }): Multiaddr[] {

Member Author

🤷 I usually think if the final arg is an object, then it's an options object, and all properties should be optional.
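For illustration, the convention described here (a trailing options object whose properties are all optional) might look like the sketch below. `filterUrls` and `FilterOptions` are hypothetical names, plain URL strings stand in for `Multiaddr` objects, and the `false` defaults are an assumption rather than anything decided in this PR.

```typescript
// Hypothetical options type: every property is optional.
interface FilterOptions {
  allowInsecure?: boolean
  allowLocal?: boolean
}

// Plain URL strings stand in for Multiaddr objects in this sketch.
function filterUrls (urls: string[], options: FilterOptions = {}): string[] {
  // Assumed defaults: reject insecure and local addresses unless asked.
  const { allowInsecure = false, allowLocal = false } = options

  return urls.filter(url => {
    const secure = url.startsWith('https://')
    const host = url.replace(/^https?:\/\//, '').split(/[/:]/)[0]
    const local = host === 'localhost' || host === '127.0.0.1'

    if (!secure && !allowInsecure) return false
    if (local && !allowLocal) return false
    return true
  })
}
```

Callers can then omit the final argument entirely, or pass only the flags they care about, e.g. `filterUrls(urls, { allowLocal: true })`.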

Comment on lines 59 to 60
// increase max providers so we can find another more suitable peer
this.maxProviders++
Member

we should provide a mechanism to cap how much this can increase. if a user customizes maxProviders and we increase based on failed queries to certain providers, we're not respecting that config.

maybe we can increase a validProviders count and leave maxProviders to the provided config value?

Member Author

The reason this increases is to allow adding extra session peers after the existing peers prove bad or unreliable.

When we add new peers we query the routing for new providers - if this is deterministic we'd get the same set of providers back, so we need a method to exclude known bad peers, which is why this sets a "failed" flag on the peer and allows the session size to increase.

This probably isn't very scalable since we end up storing an unbounded list of bad peers. I've refactored the code to use a bloom filter to exclude bad peers; this will only ever occupy the memory used by the filter, which is fixed based on the number of hashes it is expected to contain.
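As a rough illustration of why a bloom filter keeps memory fixed, here is a minimal generic sketch. It is not the filter used in this PR: `BloomFilter`, `excludeFailed`, the seeded FNV-1a hash and the sizes are all assumptions chosen for the example. Note that bloom filters can return false positives, so a good peer may occasionally be excluded, but a peer that was added is never missed.

```typescript
class BloomFilter {
  private readonly bits: Uint8Array
  private readonly sizeInBits: number
  private readonly hashCount: number

  constructor (sizeInBits: number, hashCount: number) {
    this.sizeInBits = sizeInBits
    this.hashCount = hashCount
    // Memory is allocated once and never grows, however many items are added.
    this.bits = new Uint8Array(Math.ceil(sizeInBits / 8))
  }

  get byteLength (): number { return this.bits.byteLength }

  // Seeded 32-bit FNV-1a hash, reduced to a bit index.
  private index (value: string, seed: number): number {
    let h = (0x811c9dc5 ^ seed) >>> 0
    for (let i = 0; i < value.length; i++) {
      h ^= value.charCodeAt(i)
      h = Math.imul(h, 0x01000193) >>> 0
    }
    return h % this.sizeInBits
  }

  add (value: string): void {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const i = this.index(value, seed)
      this.bits[i >> 3] |= 1 << (i & 7)
    }
  }

  // true means "possibly added", false means "definitely not added".
  has (value: string): boolean {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const i = this.index(value, seed)
      if ((this.bits[i >> 3] & (1 << (i & 7))) === 0) return false
    }
    return true
  }
}

// Drop providers that have previously been marked as failed.
function excludeFailed (peerIds: string[], failed: BloomFilter): string[] {
  return peerIds.filter(id => !failed.has(id))
}
```

When a provider errors, its id would be added to the filter, and later routing results would be passed through `excludeFailed` before joining the session.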

Comment on lines 136 to 138
if (options.signal?.aborted === true) {
return
}
Member

Suggested change
- if (options.signal?.aborted === true) {
-   return
- }
+ if (options.signal?.aborted === true) {
+   // not throwing the aborted signal because xxxxx
+   return
+ }

packages/utils/src/abstract-session.ts
packages/utils/src/abstract-session.ts (outdated)
packages/utils/src/index.ts
packages/utils/src/utils/networked-storage.ts (outdated)
@achingbrain achingbrain merged commit 9ea934e into main Apr 15, 2024
4 checks passed
@achingbrain achingbrain deleted the fix/sessions-improvements branch April 15, 2024 13:13
@achingbrain achingbrain mentioned this pull request Apr 15, 2024
2 participants