-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add explore state machine to expand population of routing table #934
Conversation
* Move coord and kadt packages to internal * go mod tidy * go fmt * Move kadt out of internal and add RoutingTable interface
exploreCfg.Clock = cfg.Clock | ||
exploreCfg.Timeout = cfg.QueryTimeout | ||
|
||
schedule, err := routing.NewDynamicExploreSchedule(14, cfg.Clock.Now(), time.Hour, 1, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also: // TODO: expose more config
v2/internal/coord/routing.go
Outdated
// A RoutingBehaviour provices the behaviours for bootstrapping and maintaining a DHT's routing table. | ||
const ( | ||
// IncludeQueryID is the id for connectivity checks performed by the include state machine. | ||
// This identifier used for routing network responses to the state machine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This identifier used for routing network responses to the state machine. | |
// This identifier is used for routing network responses to the state machine. |
v2/internal/coord/routing.go
Outdated
IncludeQueryID = coordt.QueryID("include") | ||
|
||
// ProbeQueryID is the id for connectivity checks performed by the probe state machine | ||
// This identifier used for routing network responses to the state machine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This identifier used for routing network responses to the state machine. | |
// This identifier is used for routing network responses to the state machine. |
v2/internal/coord/routing.go
Outdated
@@ -28,6 +38,9 @@ type RoutingBehaviour struct { | |||
// probe is the node probing state machine, responsible for periodically checking connectivity of nodes in the routing table | |||
probe coordt.StateMachine[routing.ProbeEvent, routing.ProbeState] | |||
|
|||
// probe is the routing table explore state machine, responsible for increasing the occupanct of the routing table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// probe is the routing table explore state machine, responsible for increasing the occupanct of the routing table | |
// explore is the routing table explore state machine, responsible for increasing the occupancy of the routing table |
@@ -112,7 +127,7 @@ func (r *RoutingBehaviour) notify(ctx context.Context, ev BehaviourEvent) { | |||
case *EventGetCloserNodesSuccess: | |||
span.SetAttributes(attribute.String("event", "EventGetCloserNodesSuccess"), attribute.String("queryid", string(ev.QueryID)), attribute.String("nodeid", ev.To.String())) | |||
switch ev.QueryID { | |||
case "bootstrap": | |||
case routing.BootstrapQueryID: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible to also lift this constant out of the routing
package or move all constants into that? just to be consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends where the query or request is initiated. The connectivity checks performed by probe and include are initiated by the routing behaviour so the constants belong there. The bootstrap and explore queries are initiated by the state machines.
I'd like to move the behaviours into the packages with the state machines at some point so the constants would all end up there too.
queryID := coordt.QueryID("bootstrap") | ||
|
||
qry, err := query.NewFindCloserQuery[K, N, any](b.self, queryID, b.self.Key(), iter, tev.KnownClosestNodes, qryCfg) | ||
qry, err := query.NewFindCloserQuery[K, N, any](b.self, BootstrapQueryID, b.self.Key(), iter, tev.KnownClosestNodes, qryCfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a great benefit now but there's a coordt.Message
type
qry, err := query.NewFindCloserQuery[K, N, any](b.self, BootstrapQueryID, b.self.Key(), iter, tev.KnownClosestNodes, qryCfg) | |
qry, err := query.NewFindCloserQuery[K, N, coordt.Message](b.self, BootstrapQueryID, b.self.Key(), iter, tev.KnownClosestNodes, qryCfg) |
// before proceeding to the next bucket. | ||
// | ||
// The frequency of running an explore varies by bucket distance, such that closer buckets are processed more frequently. | ||
type Explore[K kad.Key[K], N kad.NodeID[K]] struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will an expore operation also be triggered when we are low on peers in the routing table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not automatically. But it can be added. I will create an issue for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Target: st.Target, | ||
} | ||
case *query.StateQueryFinished[K, N]: | ||
span.SetAttributes(attribute.String("out_state", "StateExploreFinished")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in other places I used tele.AttrOutEvent(...)
. I also used the defer to capture the event that's passed out of this function.
v2/internal/coord/routing/explore.go
Outdated
func (s *DynamicExploreSchedule) NextCpl(ts time.Time) (int, bool) { | ||
// is an explore due yet? | ||
next := (*s.cpls)[0] | ||
if !next.Due.After(ts) { | ||
// update its schedule | ||
|
||
interval := float64(s.interval) + float64(s.interval)*float64(s.maxCpl-next.Cpl)*s.multiplier | ||
interval *= 1 + rand.Float64()*s.jitter | ||
|
||
next.Due = ts.Add(s.cplInterval(next.Cpl)) | ||
heap.Fix(s.cpls, 0) // update the heap | ||
return next.Cpl, true | ||
} | ||
|
||
return -1, false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more go-like to unnest the happy-path:
func (s *DynamicExploreSchedule) NextCpl(ts time.Time) (int, bool) {
// is an explore due yet?
next := (*s.cpls)[0]
if next.Due.After(ts) {
return -1, false
}
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, will make that change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
v2/internal/coord/routing/explore.go
Outdated
interval := float64(s.interval) + float64(s.interval)*float64(s.maxCpl-next.Cpl)*s.multiplier | ||
interval *= 1 + rand.Float64()*s.jitter | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to cplInterval
interval := float64(s.interval) + float64(s.interval)*float64(s.maxCpl-next.Cpl)*s.multiplier | |
interval *= 1 + rand.Float64()*s.jitter |
Two general comments:
|
Yes, by design. Do you think that is an issue?
We had some discussion on this in the design issue and @guillaumemichel seemed to be of the opinion that exploring closer buckets more frequently was useful (one comment here). My reasoning is that since every peer discovered is a candidate for inclusion in the routing table, querying for nearby peers will almost certainly discover peers that populate further buckets first. Bucket 0 is likely to fill up with every explore we do.
Yes, this would be quite simple to add. |
Cool, thanks for the clarification! Makes sense 👍
Nope, all good! Just wanted to verify my understanding of what's going on. |
No description provided.