-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
large refactor of bitswap, implement wantmanager to manage wantlist #1218
Conversation
@@ -35,30 +34,6 @@ func TestClose(t *testing.T) { | |||
bitswap.Exchange.GetBlock(context.Background(), block.Key()) | |||
} | |||
|
|||
func TestProviderForKeyButNetworkCannotFind(t *testing.T) { // TODO revisit this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test made my head hurt. Pretty sure it wasnt testing anything useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well it's testing what error pops out of bitswap when: (a) we have a provider, but (b) it is not reachable. we need to keep these tests. if the name is too complicated, maybe:
TestUnreachableProvider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what you describe is not what is being tested here. all it appears to be testing is that a context expires
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test does appear to be testing it fine, see below. But the timeout may be way too small. maybe 1us not ns.
// create an identity and provide a block.
pinfo := p2ptestutil.RandTestBogusIdentityOrFatal(t)
rs.Client(pinfo).Provide(context.Background(), block.Key()) // but not on network
solo := g.Next()
defer solo.Exchange.Close()
ctx, _ := context.WithTimeout(context.Background(), time.Nanosecond)
// getting the block should fail, because the provider isn't there.
_, err := solo.Exchange.GetBlock(ctx, block.Key())
// the way it fails is that thectx expires.
if err != context.DeadlineExceeded {
The big problem here, is that we want backpressure everywhere but in bitswap... |
that's concerning. can you repro in an isolated case? i can sink into it and check it out |
// TODO move this to peerstore | ||
) | ||
|
||
type PeerManager struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may want to give an overview of what PeerManager is for here
without yet understanding it fully, this seems to me like a great step forward. |
I hope so! 😄 |
@jbenet this is RFCR, i'm going to rebase in a little bit, but the big stuff is there. This is MUCH faster for transfer speeds on the open internet, theres an occasional pause of ten seconds that i'm still tracking down, but it never truly hangs or degrades. It will always continue after bitswaps 10 second 're-request' timer ticks. In practice, ive only seen the pause once every three or four runs. |
cece493
to
0e20492
Compare
if err != nil { | ||
log.Error(err) | ||
// TODO: cant connect, what now? | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's ok to check if you really wil send to mq.p
. if there's a connection, this call returns early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, thoughts on what to do when it fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ConnectTo
fails-- connection failed. So it's like a disconnection, i'd say?
@whyrusleeping one thing i'm missing is: how do partial updates happen? |
partial wantlist updates happen the same way as they normally do. you just send your message through the peermanager instead of the network, and if multiple messages pile up while youre either dialing or sending a previous message, they get combined |
could you point me to funcs + flow? i'm failing to see it :( |
a partial update happen here: https://github.com/ipfs/go-ipfs/pull/1218/files#diff-9d5aff71f348a6f1cdcd472cd7bd4415R391 via a broadcast (which just means, send to everyone) |
}() | ||
|
||
if mq.wlmsg == nil || msg.Full() { | ||
mq.wlmsg = msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if partial updates come this way too, not sure this should be getting set? maybe only if Full()
? not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or is mq.wlmsg
a message "staged" to be sent out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the wlmsg
is the one we are staging to be sent out, and in runQueue
we grab it, and send it out upon receiving the work signal
@whyrusleeping i don't grasp this changeset well-- the datastructs aren't obviously clear, for example: type msgQueue struct {
p peer.ID
lk sync.Mutex
wlmsg bsmsg.BitSwapMessage
work chan struct{}
done chan struct{}
} turns out this type wantlistMsgQueue struct {
p peer.ID
lk sync.Mutex // guards ... only next? or what else?
next bsmsg.BitSwapMessage
work chan struct{}
done chan struct{}
} the other datastructures could use some clarifying too. the control flow is hard. maybe try drawing it? it's possible that drawing it will either reveal the complexity OR the simplicity -- i'm maybe not seeing it right. |
its only a full wantlist if the caller makes it a full wantlist. im not sure what youre seeing here |
if mq.wlmsg == nil || msg.Full() {
mq.wlmsg = msg
return
} here, if we arent holding a message, we take the one being added. Otherwise, down here, we add the entries from the passed in wantlist to the one we are holding, to combine them into one, saving us a send call // TODO: add a msg.Combine(...) method
for _, e := range msg.Wantlist() {
if e.Cancel {
mq.wlmsg.Cancel(e.Key)
} else {
mq.wlmsg.AddEntry(e.Key, e.Priority)
}
} |
@@ -326,7 +316,7 @@ func (bs *Bitswap) sendWantlistToProviders(ctx context.Context, entries []wantli | |||
|
|||
// TODO(brian): handle errors | |||
func (bs *Bitswap) ReceiveMessage(ctx context.Context, p peer.ID, incoming bsmsg.BitSwapMessage) error { | |||
defer log.EventBegin(ctx, "receiveMessage", p, incoming).Done() | |||
//defer log.EventBegin(ctx, "receiveMessage", p, incoming).Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is version controlled, so we should delete instead of commenting-out code we want to kill.
479cefa
to
f9f85ff
Compare
3c1e74f
to
b827e4e
Compare
@whyrusleeping b0eb0d2 breaks. all others pass. |
noted, ill rebase that out |
b827e4e
to
7f3bb79
Compare
Also more refactoring of bitswap in general, including some perf improvements and eventlog removal. clean up, and buffer channels move some things around correctly buffer work messages more cleanup, and improve test perf remove unneccessary test revert changes to bitswap message, they werent necessary
large refactor of bitswap, implement wantmanager to manage wantlist
After speaking with @whyrusleeping we decided to merge this in now, as it is already better than master. |
yeap, its better, not perfect. but we can address the new issues as they come up |
This PR moves all wantlist update logic into the wantmanager, this ensures that outgoing messages are synchronized and coalesced as needed. This greatly improves bitswap performance and correctness.