-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: remove Liveness and Acceptance from NodeInfo #3785
Conversation
Important Auto Review SkippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
@coderabbitai /review |
Tip For best results, initiate chat on the files or code changes. @frrist, I have reviewed the changes made in the PR as per your request. Here's a summary of the key modifications relevant to the PR's objectives: Changes in
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Definitely an improvement to break into different types rather than jamming more fields into the existing one.
Couple of comments about the word liveness, as it feels a bit weird to use it in CLI output, but it might just be me.
Slight concern about the deletions and whether they affect libp2p usage, as we no longer test with the libp2p transport, it might be worth testing manually (if you haven't already) to ensure it continues to work.
cmd/cli/job/describe.go
Outdated
@@ -102,7 +102,7 @@ func (o *DescribeOptions) printHeaderData(cmd *cobra.Command, job *models.Job) { | |||
{Left: "Name", Right: job.Name}, | |||
{Left: "Namespace", Right: job.Namespace}, | |||
{Left: "Type", Right: job.Type}, | |||
{Left: "State", Right: job.State.StateType}, | |||
{Left: "Liveness", Right: job.State.StateType}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the state of the job, rather than the node, is Liveness the correct word here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops, my editor was a little overzealous with the name change, will revert this.
cmd/cli/job/executions.go
Outdated
@@ -97,7 +97,7 @@ var ( | |||
Value: func(e *models.Execution) string { return strconv.FormatUint(e.Revision, 10) }, | |||
} | |||
executionColumnState = output.TableColumn[*models.Execution]{ | |||
ColumnConfig: table.ColumnConfig{Name: "State", WidthMax: 10, WidthMaxEnforcer: text.WrapText}, | |||
ColumnConfig: table.ColumnConfig{Name: "Liveness", WidthMax: 10, WidthMaxEnforcer: text.WrapText}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question, is Liveness the word we want to use for the execution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this was a mistake - continuing to blame the enthusiasm of my editor
cmd/cli/job/history.go
Outdated
@@ -108,7 +108,7 @@ var historyColumns = []output.TableColumn[*models.JobHistory]{ | |||
}, | |||
}, | |||
{ | |||
ColumnConfig: table.ColumnConfig{Name: "New State", WidthMax: 20, WidthMaxEnforcer: text.WrapText}, | |||
ColumnConfig: table.ColumnConfig{Name: "New Liveness", WidthMax: 20, WidthMaxEnforcer: text.WrapText}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New Liveness sounds a bit weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will revert
pkg/compute/management_client.go
Outdated
@@ -101,7 +102,7 @@ func (m *ManagementClient) RegisterNode(ctx context.Context) error { | |||
func (m *ManagementClient) deliverInfo(ctx context.Context) { | |||
// We _could_ avoid attempting an update if we are not registered, but | |||
// by doing so we will get frequent errors that the node is not | |||
// registered. | |||
// registered.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// registered.c | |
// registered. |
if err := n.store.Add(ctx, models.NodeState{ | ||
Info: request.Info, | ||
Approval: n.defaultApprovalState, | ||
// NB(forrest): by virtue of a compute node calling this endpoint we can consider it connected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be okay, but using other signals than the heartbeat might lead to flapping, so if that happens, this might be a reasonable place to look.
Info: request.Info, | ||
// the nodes approval state is assumed to be approved here, but re-use existing state | ||
Approval: existing.Approval, | ||
// TODO can we assume the node is connected here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. It might be best to inject a heartbeat as the liveness is held in memory rather than the store and we probably want to avoid situations where the node is not in the liveness map and we accidentally send Connected rather than the default of Disconnected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These aren't still needed for libp2p are they?
Based on conversation with @wdbaruni and @seanmtracey we need to evaluate if this change breaks backwards compatibility and if it does handle it gracefully by adjusting marshaling or adjusting the topic we exchange these messages on to include a version |
- fixes #3783 - Introduces NodeState type used to track NodeInfo, Liveness, and Acceptance. Removes the idea of Livenss and Acceptance from data sent by compute nodes to the Requester(s) since compute nodes should not influence their Liveness or Acceptance. Those are values related to the Requesters view of the network.
- its not needed and necessary obfuscation
4e640d9
to
e9641c3
Compare
e9641c3
to
2c3a92b
Compare
This change has been validated with a v1.3.0 Requester communicating with a v1.3.1-rc Compute node and v1.3.0 Compute node as well as a v1.3.1-rc Requester node communicating with a v1.3.1-rc Compute node and v1.3.0 Compute node. Meaning it’s backwards and forwards compatible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor comments
// TODO if we ever pass a pointer to this type and use `==` comparison on it we're gonna have a bad time | ||
// implement an `Equal()` method for this type and default to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this sounds bad. can you open an issue to track this TODO?
type livenessContainer struct { | ||
CONNECTED NodeConnectionState | ||
DISCONNECTED NodeConnectionState | ||
HEALTHY NodeConnectionState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is HEALTHY?
@@ -85,7 +86,7 @@ type NATSTransport struct { | |||
natsClient *nats_helper.ClientManager | |||
computeProxy compute.Endpoint | |||
callbackProxy compute.Callback | |||
nodeInfoPubSub pubsub.PubSub[models.NodeInfo] | |||
nodeInfoPubSub pubsub.PubSub[models.NodeState] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't PubSub be about NodeInfo and not NodeState?
resourceMap *concurrency.StripedMap[models.Resources] | ||
heartbeats *heartbeat.HeartbeatServer | ||
defaultApprovalState models.NodeApproval | ||
defaultApprovalState models.NodeMembershipState | ||
} | ||
|
||
type NodeManagerParams struct { | ||
NodeInfo routing.NodeInfoStore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be Store
type Chain struct { | ||
discoverers []orchestrator.NodeDiscoverer | ||
ignoreErrors bool | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember you saying at one point that you are removing all chain implementations and replacing them with simple arrays. While the Chain looks like a simple array, it implements NodeDiscoverer
interface which allows different components to just accept the interface rather than an array of interfaces. This simplifies testing, error handling (e.g. ignoreErrors
), and can allow chain implementations with AND
or OR
logic.
Ahh .. I didn't see auto-merge was enabled. Anyways, there are few comments that can be addressed in a follow-up PR if the comments are valid |
- fixes #3783 - Introduces NodeState type used to track NodeInfo, Connection, and Membershio. Removes the idea of Connection and Membership from data sent by compute nodes to the Requester(s) since compute nodes should not influence their Connection state or mmembership. Those are values related to the Requesters view of the network. --------- Co-authored-by: frrist <forrest@expanso.io>
NodeInfo
withunknown
approval status which overrides previous approvals/rejections #3783