-
Notifications
You must be signed in to change notification settings - Fork 37
Conversation
Bugz: - Remove CancelDial from addConn, which would short-circuit the dials waiting for its result - Fix race in decref reaching 0 Test issues: - Adjust for the new signature of the DialFunc - Connections can be empty shims in tests, so Close can panic - Comment out TestBasicDialSync which is now not applicable and needs to be rewritten - Account for multerrors instead of flat DialErrors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about an approach where a new dial joins existing dials?
- Spawn a dial worker.
- Each dial sends a message down a channel to the dial worker with options & address filters.
- When a dial joins, check all undialed addresses (addresses in the peerstore minus the set of already tried addresses) against the address filter and consider adding them to the back of the dial queue.
- When we form connections, check them against the address filters.
- When a dial "leaves" (cancels), remove addresses that no longer match any filter.
I'm open to approaches, but the important parts are:
- Seamlessly handle multiple dials joining/leaving without wasting work.
- Eventually, we want this to morph into something like the "dialer v2 work". Whatever we do here, we want to be able to "replan" dials to pending addresses while the dial is active. The current approach will force us to wait for the current dial to complete.
defer ad.decref() | ||
|
||
dialCtx := ad.dialContext(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: by go convention, xContext
means "x but with a context". Rename to deriveDialContext
.
break startDial | ||
case <-ad.waitch: | ||
// we lost a race, we need to try again because the connection might not be what we want | ||
ad.decref() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we dereferencing? Won't that cause the in-progress dial to be canceled if the original caller bails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we'll just restart, but that's pretty inefficient if we were half-way through opening a bunch of connections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out, we only invoke this code if the active dial is currently "finishing". So we're not going to cancel anything.
// for mock tests | ||
if c == nil || c.conn == nil { | ||
return nil | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
???? This is really scary. Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary after fixing a bug, will remove!
ad.addrsLk.Lock() | ||
defer ad.addrsLk.Unlock() | ||
|
||
for _, a := range addrs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we never bother to dial one of these addresses because we're canceled? We'll just skip that overall.
Test case:
- Start dial A.
- Start dial B.
- Cancel dial A.
- Dial B should pick up where dial A left off.
With the current code, dial B will, I believe, fail to dial any of the addresses dial A was trying to dial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the dials will still happen -- they use the background context, so there should be no interference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see.
Closing in favor of #250 which accomplishes the same goals with much cleaner code. |
This implements the necessary logic for dealing with simultaneous dials which may have conflicting constraints.
TBD: tests