x-snap-config: Do not check if node is part of the cluster #688

bschimke95 · 2024-09-20T16:31:00Z

Follow-up fix for: #633

In #633, we wait until k8sd is up before syncing the config, and we exit early if the node is not part of a cluster. However, a closer look at the implementation of NodeStatus shows that this early exit would also trigger on any error, bypassing the error handling that we wait for:

func (c *k8sd) NodeStatus(ctx context.Context) (apiv1.NodeStatusResponse, bool, error) {
	response, err := query(ctx, c, "GET", apiv1.NodeStatusRPC, nil, &apiv1.NodeStatusResponse{})
	if err != nil {
		// Error 503 means the node is not initialized yet
		var statusErr api.StatusError
		if errors.As(err, &statusErr) {
			if statusErr.Status() == http.StatusServiceUnavailable {
				return apiv1.NodeStatusResponse{}, false, nil
			}
		}

		// partOfCluster is also false whenever an error is returned.
		return apiv1.NodeStatusResponse{}, false, err
	}

	return response, true, nil
}

This PR removes the wrong check.

addyess · 2024-09-20T17:01:57Z

src/k8s/cmd/k8s/k8s_x_snapd_config.go

-				_, partOfCluster, err := client.NodeStatus(cmd.Context())
-				if !partOfCluster {
-					cmd.PrintErrf("Node is not part of a cluster: %v\n", err)
-					env.Exit(1)
-				}
+				_, _, err := client.NodeStatus(cmd.Context())


So, if NodeStatus() always returns partOfCluster=false for any error, why do we no longer need this early exit?

I'm not sure I have the context from #633 to properly review this. Maybe @mateoflorido does.

Basically, you're saying that any error from NodeStatus at all exits early, and that's wrong?

So, WaitUntilReady will retry as long as the bool return value is false. If the error return value is non-nil, it exits immediately.

NodeStatus's second return value indicates whether the node is part of a cluster. However, it will also be false if an error is returned (third return value). The current implementation would always exit with "Node is not part of a cluster", even if NodeStatus returned an error (which means the "is part of cluster" return value is not valid).

AHH! I didn't look down at lines 70-71. Thank you

addyess

LGTM, thanks for the explanation!

Ignore part of cluster check

135b3d1

bschimke95 requested a review from a team as a code owner September 20, 2024 16:31

addyess reviewed Sep 20, 2024

addyess approved these changes Sep 20, 2024

bschimke95 merged commit 6b15893 into main Sep 23, 2024
19 of 20 checks passed

bschimke95 deleted the bschimke95/fix-snap-sync branch September 23, 2024 11:11

evilnick pushed a commit to evilnick/k8s-snap that referenced this pull request Nov 14, 2024

Ignore part of cluster check (canonical#688)

c859b1a
