
fix(kwok): prevent quitting when scaling down node group #6336

Merged: 1 commit merged into kubernetes:master on Jan 18, 2024

Conversation

@qianlei90 (Contributor) commented Dec 2, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

When using the Kwok provider, CA quits when scaling down a node group because the Kwok provider cannot retrieve the node group name from a fake node. This PR primarily aims to fix this issue.

Additionally, I have fixed the target-size accounting when scaling the node group up and down.

Which issue(s) this PR fixes:

kwok-provider-config
apiVersion: v1
data:
  config: |-
    apiVersion: v1alpha1
    readNodesFrom: configmap
    nodegroups:
      fromNodeLabelKey: "kwok-nodegroup"
    nodes:
    configmap:
      name: kwok-provider-templates
    kwok:
      install: false
kind: ConfigMap
metadata:
  name: kwok-provider-config
  namespace: default
kwok-provider-templates
apiVersion: v1
data:
  templates: |-
    apiVersion: v1
    items:
    - apiVersion: v1
      kind: Node
      metadata:
        annotations:
          node.alpha.kubernetes.io/ttl: "0"
          kwok.x-k8s.io/node: fake
        labels:
          beta.kubernetes.io/arch: amd64
          beta.kubernetes.io/os: linux
          kubernetes.io/arch: amd64
          kubernetes.io/hostname: kwok-node-0
          kubernetes.io/os: linux
          kubernetes.io/role: agent
          node-role.kubernetes.io/agent: ""
          type: kwok
          kwok-nodegroup: cluster-autoscaler
        name: kwok-node-0
      spec: {}
      status:
        allocatable:
          cpu: 32
          memory: 256Gi
          pods: 110
        capacity:
          cpu: 32
          memory: 256Gi
          pods: 110
        nodeInfo:
          architecture: amd64
          bootID: ""
          containerRuntimeVersion: ""
          kernelVersion: ""
          kubeProxyVersion: fake
          kubeletVersion: fake
          machineID: ""
          operatingSystem: linux
          osImage: ""
          systemUUID: ""
        phase: Running
    kind: List
    metadata:
      resourceVersion: ""
kind: ConfigMap
metadata:
  name: kwok-provider-templates
  namespace: default
starting CA
POD_NAMESPACE=default KWOK_PROVIDER_MODE=local ./cluster-autoscaler-amd64 \
    --cloud-provider=kwok \
    --namespace=default \
    --kubeconfig=<kubeconfig> \
    --expander=random \
    --scale-down-enabled=true \
    --scale-down-utilization-threshold=0.5 \
    --scale-down-gpu-utilization-threshold=0.5 \
    --scale-down-delay-after-add=10s \
    --scale-down-delay-after-failure=10s \
    --scale-down-unneeded-time=0s \
    --skip-nodes-with-system-pods=true \
    --skip-nodes-with-local-storage=true \
    --logtostderr=true \
    --stderrthreshold=info \
    --leader-elect=false \
    --v=4 \
    --scan-interval=3s
scale this deployment to test scale up and down
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployments-simple-deployment-deployment
  namespace: default
spec:
  replicas: 0
  selector:
    matchLabels:
      app: deployments-simple-deployment-app
  template:
    metadata:
      labels:
        app: deployments-simple-deployment-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kwok-nodegroup
                operator: In
                values:
                - cluster-autoscaler
      containers:
      - command:
        - sleep
        - "3600"
        image: busybox
        imagePullPolicy: Always
        name: busybox
        resources:
          requests:
            cpu: "31"
      terminationGracePeriodSeconds: 0
      tolerations:
      - effect: NoSchedule
        key: kwok-provider
        operator: Equal
        value: "true"
CA log
I1202 22:49:14.738401 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:14.738533 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:14.738623 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:14.738654 1229586 filter_out_schedulable.go:63] Filtering out schedulables
I1202 22:49:14.738720 1229586 klogx.go:87] failed to find place for default/deployments-simple-deployment-deployment-59994f79f6-r85f5: cannot put pod deployments-simple-deployment-deployment-59994f79f6-r85f5 on any node
I1202 22:49:14.738732 1229586 filter_out_schedulable.go:120] 0 pods marked as unschedulable can be scheduled.
I1202 22:49:14.738740 1229586 filter_out_schedulable.go:83] No schedulable pods
I1202 22:49:14.738746 1229586 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1202 22:49:14.738751 1229586 filter_out_daemon_sets.go:49] Filtered out 0 daemon set pods, 1 unschedulable pods left
I1202 22:49:14.738768 1229586 klogx.go:87] Pod default/deployments-simple-deployment-deployment-59994f79f6-r85f5 is unschedulable
I1202 22:49:14.738839 1229586 orchestrator.go:108] Upcoming 0 nodes
I1202 22:49:14.738847 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:14.739038 1229586 orchestrator.go:181] Best option to resize: cluster-autoscaler-1701528461
I1202 22:49:14.739047 1229586 orchestrator.go:185] Estimated 1 nodes needed in cluster-autoscaler-1701528461
I1202 22:49:14.739061 1229586 orchestrator.go:291] Final scale-up plan: [{cluster-autoscaler-1701528461 0->1 (max: 200)}]
I1202 22:49:14.739077 1229586 executor.go:147] Scale-up: setting group cluster-autoscaler-1701528461 size to 1
I1202 22:49:14.739181 1229586 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"default", Name:"cluster-autoscaler-status", UID:"095e2c8c-de6b-44b1-bda9-7174134f7a6e", APIVersion:"v1", ResourceVersion:"24451", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: setting group cluster-autoscaler-1701528461 size to 1 instead of 0 (max: 200)
I1202 22:49:14.743388 1229586 eventing_scale_up_processor.go:47] Skipping event processing for unschedulable pods since there is a ScaleUp attempt this loop
I1202 22:49:14.743494 1229586 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"deployments-simple-deployment-deployment-59994f79f6-r85f5", UID:"fbfb034f-f5a8-42b8-9e81-411caaa49042", APIVersion:"v1", ResourceVersion:"24446", FieldPath:""}): type: 'Normal' reason: 'TriggeredScaleUp' pod triggered scale-up: [{cluster-autoscaler-1701528461 0->1 (max: 200)}]
I1202 22:49:17.749563 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:17.749713 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:17.749828 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:17.749844 1229586 clusterstate.go:260] Scale up in group cluster-autoscaler-1701528461 finished successfully in 3.006191994s
I1202 22:49:17.749875 1229586 filter_out_schedulable.go:63] Filtering out schedulables
I1202 22:49:17.749884 1229586 filter_out_schedulable.go:120] 0 pods marked as unschedulable can be scheduled.
I1202 22:49:17.749892 1229586 filter_out_schedulable.go:83] No schedulable pods
I1202 22:49:17.749901 1229586 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1202 22:49:17.749905 1229586 filter_out_daemon_sets.go:49] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1202 22:49:17.749912 1229586 static_autoscaler.go:547] No unschedulable pods
I1202 22:49:17.749918 1229586 static_autoscaler.go:570] Calculating unneeded nodes
I1202 22:49:17.749924 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:17.749930 1229586 pre_filtering_processor.go:57] Node minikube should not be processed by cluster autoscaler (no node group config)
I1202 22:49:17.749961 1229586 eligibility.go:162] Node cluster-autoscaler-1701528461-xghmv unremovable: cpu requested (96.875% of allocatable) is above the scale-down utilization threshold
I1202 22:49:17.749989 1229586 static_autoscaler.go:617] Scale down status: lastScaleUpTime=2023-12-02 22:49:14.738369867 +0800 CST m=+97.234587115 lastScaleDownDeleteTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 lastScaleDownFailTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 scaleDownForbidden=false scaleDownInCooldown=true
I1202 22:49:20.754194 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:20.754386 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:20.754575 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:20.754637 1229586 filter_out_schedulable.go:63] Filtering out schedulables
I1202 22:49:20.754651 1229586 filter_out_schedulable.go:120] 0 pods marked as unschedulable can be scheduled.
I1202 22:49:20.754666 1229586 filter_out_schedulable.go:83] No schedulable pods
I1202 22:49:20.754674 1229586 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1202 22:49:20.754683 1229586 filter_out_daemon_sets.go:49] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1202 22:49:20.754698 1229586 static_autoscaler.go:547] No unschedulable pods
I1202 22:49:20.754709 1229586 static_autoscaler.go:570] Calculating unneeded nodes
I1202 22:49:20.754719 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:20.754727 1229586 pre_filtering_processor.go:57] Node minikube should not be processed by cluster autoscaler (no node group config)
I1202 22:49:20.754761 1229586 klogx.go:87] Node cluster-autoscaler-1701528461-xghmv - memory requested is 0% of allocatable
I1202 22:49:20.754785 1229586 cluster.go:156] Simulating node cluster-autoscaler-1701528461-xghmv removal
I1202 22:49:20.754805 1229586 cluster.go:179] node cluster-autoscaler-1701528461-xghmv may be removed
I1202 22:49:20.754817 1229586 nodes.go:84] cluster-autoscaler-1701528461-xghmv is unneeded since 2023-12-02 22:49:20.754082099 +0800 CST m=+103.250299367 duration 0s
I1202 22:49:20.754857 1229586 static_autoscaler.go:617] Scale down status: lastScaleUpTime=2023-12-02 22:49:14.738369867 +0800 CST m=+97.234587115 lastScaleDownDeleteTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 lastScaleDownFailTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 scaleDownForbidden=false scaleDownInCooldown=true
I1202 22:49:23.760153 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:23.760340 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:23.760477 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:23.760546 1229586 filter_out_schedulable.go:63] Filtering out schedulables
I1202 22:49:23.760560 1229586 filter_out_schedulable.go:120] 0 pods marked as unschedulable can be scheduled.
I1202 22:49:23.760572 1229586 filter_out_schedulable.go:83] No schedulable pods
I1202 22:49:23.760580 1229586 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1202 22:49:23.760587 1229586 filter_out_daemon_sets.go:49] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1202 22:49:23.760600 1229586 static_autoscaler.go:547] No unschedulable pods
I1202 22:49:23.760611 1229586 static_autoscaler.go:570] Calculating unneeded nodes
I1202 22:49:23.760620 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:23.760633 1229586 pre_filtering_processor.go:57] Node minikube should not be processed by cluster autoscaler (no node group config)
I1202 22:49:23.760665 1229586 klogx.go:87] Node cluster-autoscaler-1701528461-xghmv - memory requested is 0% of allocatable
I1202 22:49:23.760690 1229586 cluster.go:156] Simulating node cluster-autoscaler-1701528461-xghmv removal
I1202 22:49:23.760709 1229586 cluster.go:179] node cluster-autoscaler-1701528461-xghmv may be removed
I1202 22:49:23.760722 1229586 nodes.go:84] cluster-autoscaler-1701528461-xghmv is unneeded since 2023-12-02 22:49:20.754082099 +0800 CST m=+103.250299367 duration 3.006038664s
I1202 22:49:23.760758 1229586 static_autoscaler.go:617] Scale down status: lastScaleUpTime=2023-12-02 22:49:14.738369867 +0800 CST m=+97.234587115 lastScaleDownDeleteTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 lastScaleDownFailTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 scaleDownForbidden=false scaleDownInCooldown=true
I1202 22:49:26.766561 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:26.766740 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:26.766880 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:26.766936 1229586 filter_out_schedulable.go:63] Filtering out schedulables
I1202 22:49:26.766951 1229586 filter_out_schedulable.go:120] 0 pods marked as unschedulable can be scheduled.
I1202 22:49:26.766964 1229586 filter_out_schedulable.go:83] No schedulable pods
I1202 22:49:26.766972 1229586 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1202 22:49:26.766980 1229586 filter_out_daemon_sets.go:49] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1202 22:49:26.766992 1229586 static_autoscaler.go:547] No unschedulable pods
I1202 22:49:26.767003 1229586 static_autoscaler.go:570] Calculating unneeded nodes
I1202 22:49:26.767014 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:26.767022 1229586 pre_filtering_processor.go:57] Node minikube should not be processed by cluster autoscaler (no node group config)
I1202 22:49:26.767052 1229586 klogx.go:87] Node cluster-autoscaler-1701528461-xghmv - memory requested is 0% of allocatable
I1202 22:49:26.767073 1229586 cluster.go:156] Simulating node cluster-autoscaler-1701528461-xghmv removal
I1202 22:49:26.767093 1229586 cluster.go:179] node cluster-autoscaler-1701528461-xghmv may be removed
I1202 22:49:26.767106 1229586 nodes.go:84] cluster-autoscaler-1701528461-xghmv is unneeded since 2023-12-02 22:49:20.754082099 +0800 CST m=+103.250299367 duration 6.012451453s
I1202 22:49:26.767143 1229586 static_autoscaler.go:617] Scale down status: lastScaleUpTime=2023-12-02 22:49:14.738369867 +0800 CST m=+97.234587115 lastScaleDownDeleteTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 lastScaleDownFailTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 scaleDownForbidden=false scaleDownInCooldown=false
I1202 22:49:26.767173 1229586 static_autoscaler.go:642] Starting scale down
I1202 22:49:26.767203 1229586 nodes.go:126] cluster-autoscaler-1701528461-xghmv was unneeded for 6.012451453s
I1202 22:49:26.767222 1229586 scale_down_set_processor.go:103] Considering node cluster-autoscaler-1701528461-xghmv for standard scale down
I1202 22:49:26.776287 1229586 taints.go:221] Successfully added ToBeDeletedTaint on node cluster-autoscaler-1701528461-xghmv
I1202 22:49:26.776372 1229586 actuator.go:143] Scale-down: removing empty node "cluster-autoscaler-1701528461-xghmv"
I1202 22:49:26.776470 1229586 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"cluster-autoscaler-1701528461-xghmv", UID:"88187c9f-7dc0-419b-9edb-47223f904a76", APIVersion:"v1", ResourceVersion:"24471", FieldPath:""}): type: 'Normal' reason: 'ScaleDown' marked the node as toBeDeleted/unschedulable
I1202 22:49:26.776627 1229586 actuator.go:238] Scale-down: waiting 5s before trying to delete nodes
I1202 22:49:26.779502 1229586 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"default", Name:"cluster-autoscaler-status", UID:"095e2c8c-de6b-44b1-bda9-7174134f7a6e", APIVersion:"v1", ResourceVersion:"24498", FieldPath:""}): type: 'Normal' reason: 'ScaleDownEmpty' Scale-down: removing empty node "cluster-autoscaler-1701528461-xghmv"
I1202 22:49:29.781983 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:29.782130 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:29.782214 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:29.782247 1229586 filter_out_schedulable.go:63] Filtering out schedulables
I1202 22:49:29.782257 1229586 filter_out_schedulable.go:120] 0 pods marked as unschedulable can be scheduled.
I1202 22:49:29.782264 1229586 filter_out_schedulable.go:83] No schedulable pods
I1202 22:49:29.782269 1229586 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1202 22:49:29.782274 1229586 filter_out_daemon_sets.go:49] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1202 22:49:29.782281 1229586 static_autoscaler.go:547] No unschedulable pods
I1202 22:49:29.782287 1229586 static_autoscaler.go:570] Calculating unneeded nodes
I1202 22:49:29.782294 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:29.782299 1229586 pre_filtering_processor.go:57] Node minikube should not be processed by cluster autoscaler (no node group config)
I1202 22:49:29.782323 1229586 static_autoscaler.go:617] Scale down status: lastScaleUpTime=2023-12-02 22:49:14.738369867 +0800 CST m=+97.234587115 lastScaleDownDeleteTime=2023-12-02 22:49:26.766533552 +0800 CST m=+109.262750820 lastScaleDownFailTime=2023-12-02 21:47:41.492276216 +0800 CST m=-3596.011506536 scaleDownForbidden=false scaleDownInCooldown=false
I1202 22:49:29.782346 1229586 static_autoscaler.go:642] Starting scale down
I1202 22:49:32.788976 1229586 static_autoscaler.go:290] Starting main loop
I1202 22:49:32.789123 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
I1202 22:49:32.789226 1229586 kwok_provider.go:58] ignoring node 'minikube' because it is not managed by kwok
F1202 22:49:32.789237 1229586 kwok_helpers.go:270] label 'kwok-nodegroup' for node 'kwok:cluster-autoscaler-1701528461-xghmv' not present in the manifest

Debugger finished with the exit code 0

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/cluster-autoscaler labels Dec 2, 2023
Comment on lines +84 to -87
nodeGroup.targetSize += 1
}

nodeGroup.targetSize = newSize

Contributor Author

This handles the case in which some nodes are created successfully and some fail.

Member

Makes sense. Should we add a test case around this (for both IncreaseSize and DeleteNodes)?
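For instance, a regression test along these lines could capture the partial-failure case for IncreaseSize (toy types and testify assertions, for illustration only; this is not the test that was eventually added in this PR, and a similar one would cover DeleteNodes):

package kwok_test // illustrative sketch only

import (
	"errors"
	"testing"

	"github.com/stretchr/testify/assert"
)

// fakeNodeGroup is a toy stand-in for the kwok provider's node group type.
type fakeNodeGroup struct {
	targetSize int
	failAfter  int // how many nodes can be created before creation starts failing
	created    int
}

// IncreaseSize bumps targetSize once per node that is actually created, which
// is the behavior the diff above switches to.
func (ng *fakeNodeGroup) IncreaseSize(delta int) error {
	for i := 0; i < delta; i++ {
		if ng.created >= ng.failAfter {
			return errors.New("node creation failed")
		}
		ng.created++
		ng.targetSize += 1
	}
	return nil
}

func TestIncreaseSizePartialFailure(t *testing.T) {
	ng := &fakeNodeGroup{failAfter: 2}

	err := ng.IncreaseSize(5) // only 2 of the 5 requested nodes can be created
	assert.Error(t, err)
	// Per-node accounting keeps targetSize in sync with what actually exists;
	// the old "targetSize = newSize" logic would have reported 5 here.
	assert.Equal(t, 2, ng.targetSize)
}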

Contributor Author

OK

@qianlei90 (Contributor Author)

/assign @vadasambar

@vadasambar (Member)

Thank you for the PR!

@vadasambar (Member)

I can't reproduce the issue. I used the same commands and configmap you used.

Here's what I did:

kubectl scale deploy deployments-simple-deployment-deployment --replicas=2

kwok provider created 2 fake nodes.

And then

kubectl scale deploy deployments-simple-deployment-deployment --replicas=0

kwok provider scaled down the 2 fake nodes.

Logs for reference: https://gist.github.com/vadasambar/56ac07f2eedbd97e5d8aaa1424df3481

@vadasambar (Member)

Maybe you can share the error you saw?

@qianlei90 (Contributor Author)

@vadasambar Sorry for the confusion caused by the title. CA does not panic; it simply quits without printing any stack trace. The last line in your log shows the reason:

F1205 23:17:08.902773  116597 kwok_helpers.go:270] label 'kwok-nodegroup' for node 'kwok:cluster-autoscaler-1701798315-296wg' not present in the manifest

@qianlei90 changed the title from "fix(kwok): fix panic when scale down node group" to "fix(kwok): prevent quitting when scaling down node group" on Dec 6, 2023
no.GetName())
}

ngName = fmt.Sprintf("%s-%v", ngName, time.Now().Unix())
Contributor Author

I think it might be better to keep the node group name unchanged, especially in cases where nodes still remain in the cluster and CA is restarted.

@vadasambar (Member) commented Dec 11, 2023

Line 275 is clearly a bug. The original intention was to ensure that the targetSize of a nodegroup correctly reflects the number of nodes with a matching annotation/label. Making sure each nodegroup has a unique name on every pod restart achieves this.

If we already have nodes in the cluster with matching nodegroup annotations/labels and we remove the time.Now().Unix() suffix, then when the nodegroups are recreated (imagine the CA pod got restarted), the target size won't accurately reflect the actual nodegroup size, because more nodes in the cluster now match the nodegroup label.

One solution can be to implement Refresh and update the nodegroup target size based on the actual matching nodes in the cluster.
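For illustration, a rough sketch of such a Refresh (getNGName and the targetSizeInCluster map mirror names that appear in the diff hunks later in this thread; the listing, error handling, and the assumption that getNGName returns an empty string for unlabeled nodes are illustrative, and the snippet reuses the imports of the Cleanup example below):

// Refresh reconciles each node group's target size with the nodes that
// actually exist in the cluster. Sketch only, not the merged implementation.
func (kwok *KwokCloudProvider) Refresh() error {
	nodeList, err := kwok.kubeClient.CoreV1().Nodes().List(context.Background(), v1.ListOptions{})
	if err != nil {
		return fmt.Errorf("error listing nodes during refresh: %v", err)
	}

	// Count existing nodes per node group using the configured label/annotation.
	targetSizeInCluster := map[string]int{}
	for i := range nodeList.Items {
		ngName := getNGName(&nodeList.Items[i], kwok.config) // assume "" when the label is missing
		if ngName == "" {
			continue
		}
		targetSizeInCluster[ngName]++
	}

	// Reconcile each node group's target size with what is actually running.
	for _, ng := range kwok.nodeGroups {
		ng.targetSize = targetSizeInCluster[ng.Id()]
	}

	return nil
}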

Member

Let me know what you think.

Contributor Author

The Kwok provider will calculate the target sizes during startup, and the Cluster Autoscaler will scale these nodegroups to the appropriate size. I think it's not necessary to calculate the target size in the Refresh function; leaving this work to CA may be a better choice.

Contributor Author

The kwok provider will clean up all the fake nodes when CA quits:

// Cleanup cleans up all resources before the cloud provider is removed
func (kwok *KwokCloudProvider) Cleanup() error {
	for _, ng := range kwok.nodeGroups {
		nodeNames, err := ng.getNodeNamesForNodeGroup()
		if err != nil {
			return fmt.Errorf("error cleaning up: %v", err)
		}

		for _, node := range nodeNames {
			err := kwok.kubeClient.CoreV1().Nodes().Delete(context.Background(), node, v1.DeleteOptions{})
			if err != nil {
				klog.Errorf("error cleaning up kwok provider nodes '%v'", node)
			}
		}
	}

	return nil
}

Member

True. I was considering a case where Cleanup doesn't clean up all the nodes (i.e., it doesn't work correctly for whatever reason).

The Kwok provider will calculate the target sizes during startup, and the Cluster Autoscaler will scale these nodegroups to the appropriate size. I think it's not necessary to calculate the target size in the Refresh function; leaving this work to CA may be a better choice.

I think we might have to try this out to confirm if CA can handle such a situation. If you can try it out as a part of this PR, great. If not, we can take care of it in another issue.

Contributor Author

Hi @vadasambar, I have implemented the Refresh() function. Please take a look.

@vadasambar (Member)

@vadasambar Sorry for the confusion caused by the title. CA does not panic; it simply quits without printing any stack trace. The last line in your log shows the reason:

F1205 23:17:08.902773  116597 kwok_helpers.go:270] label 'kwok-nodegroup' for node 'kwok:cluster-autoscaler-1701798315-296wg' not present in the manifest

This is clearly a bug. Thank you for the explanation!

@vadasambar (Member) left a comment

Also, I think we should add a test case which fails with the current code and passes with the fix.

@qianlei90 (Contributor Author) commented Dec 19, 2023

Also, I think we should add a test case which fails with the current code and passes with the fix.

Thanks for your advice; it will be done in a few days.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 19, 2023
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 28, 2023
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 28, 2023
@qianlei90 (Contributor Author) commented Dec 29, 2023

Also, I think we should add a test case which fails with the current code and passes with the fix.

Done.

/unhold

switch opts.CloudProviderName {
case cloudprovider.KwokProviderName:
return kwok.BuildKwokCloudProvider(opts, do, rl)(opts, do, rl)
return kwok.BuildKwok(opts, do, rl, informerFactory)
Member

👍

@vadasambar (Member)

@qianlei90 apologies for the delay (was out on vacation). I plan to review this PR this week.

// ngs = append(ngs, ng)
// }
for _, node := range allNodes {
ngName := getNGName(node, kwok.config)
Member

This will lead to klog.Fatal for cases where the node doesn't have the nodegroup label or annotation.

You might have to use a filter function to select only nodes which have the nodegroup label/annotation, OR convert the klog.Fatal into a non-fatal error log.
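For example, the non-fatal variant could look roughly like this (the config field names are guesses based on the fromNodeLabelKey setting shown earlier, and the helper's real signature in kwok_helpers.go may differ):

// getNGName returns the node group name for a node, or "" if the configured
// label is missing. Sketch only; assumes klog and the core/v1 API are imported.
func getNGName(node *apiv1.Node, cfg *KwokProviderConfig) string {
	ngName := node.GetLabels()[cfg.Nodegroups.FromNodeLabelKey]
	if ngName == "" {
		// Warn instead of calling klog.Fatal so nodes without the nodegroup
		// label (e.g. the 'minikube' node in the logs above) no longer make CA quit.
		klog.Warningf("label %q for node %q not present in the manifest",
			cfg.Nodegroups.FromNodeLabelKey, node.GetName())
	}
	return ngName
}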

Contributor Author

OK. I converted klog.Fatal to klog.Warning and added comments to getNGName.

}

for _, ng := range kwok.nodeGroups {
ng.targetSize = targetSizeInCluster[ng.Id()]
Member

Maybe something for the future: I wonder if we should delete nodes whose nodegroup is not part of kwok.nodeGroups.

assert.NoError(t, err)
assert.NotNil(t, p)

err = p.Refresh()
Member

kwokConfig.status is coming out nil here for some reason, which is why the node4 test case is not throwing an error.

Contributor Author

[screenshot of debugger output]

Do you mean that p.config.status is nil? I ran this test case in debug mode and found that this field was not nil. Is this the expected value?

Member

Sorry, it was a misunderstanding on my end. I ran the test again. Looks good to me.

@vadasambar (Member)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 12, 2024
@vadasambar (Member)

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 12, 2024
@vadasambar (Member)

Thank you @qianlei90. LGTM.

@BigDarkClown can you please merge the PR 🙏
I am the approver and reviewer for kwok cloud provider (this PR contains only kwok provider changes) but I can't seem to merge the PR.

@vadasambar (Member)

/assign @towca

@towca (Collaborator) commented Jan 18, 2024

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qianlei90, towca, vadasambar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 18, 2024
@k8s-ci-robot k8s-ci-robot merged commit df0ce2d into kubernetes:master Jan 18, 2024
6 checks passed