
Enable garbage collection #509

Closed
wants to merge 1 commit into from

Conversation

@kvaps (Contributor) commented Aug 7, 2024

fixes: #508
downstream: aenix-io/cozystack#263

netlify bot commented Aug 7, 2024

Deploy Preview for kamaji-documentation canceled.

Latest commit: f47df4a
Latest deploy log: https://app.netlify.com/sites/kamaji-documentation/deploys/66ba55283cd3590008269805

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@prometherion (Member)

For some reason, e2e is failing. I'm debugging locally, but in the meantime I think there's an improvement worth making here: since we're already importing the whole Kubernetes code-base, we can take advantage of the upstream defaults for the Kubelet configuration type.

I'm sharing my patch here; it would be great if you could apply it.

diff --git a/internal/kubeadm/uploadconfig.go b/internal/kubeadm/uploadconfig.go
index 0dc9e71..158f54f 100644
--- a/internal/kubeadm/uploadconfig.go
+++ b/internal/kubeadm/uploadconfig.go
@@ -17,7 +17,7 @@ import (
 	"k8s.io/kubernetes/cmd/kubeadm/app/phases/uploadconfig"
 	"k8s.io/kubernetes/cmd/kubeadm/app/util/apiclient"
 	"k8s.io/kubernetes/pkg/apis/rbac"
-	pointer "k8s.io/utils/ptr"
+	kubeletv1beta1 "k8s.io/kubernetes/pkg/kubelet/apis/config/v1beta1"
 
 	"github.com/clastix/kamaji/internal/utilities"
 )
@@ -72,58 +72,16 @@ func UploadKubeletConfig(client kubernetes.Interface, config *Configuration) ([]
 }
 
 func getKubeletConfigmapContent(kubeletConfiguration KubeletConfiguration) ([]byte, error) {
-	zeroDuration := metav1.Duration{Duration: 0}
+	var kc kubelettypes.KubeletConfiguration
 
-	kc := kubelettypes.KubeletConfiguration{
-		TypeMeta: metav1.TypeMeta{
-			Kind:       "KubeletConfiguration",
-			APIVersion: "kubelet.config.k8s.io/v1beta1",
-		},
-		Authentication: kubelettypes.KubeletAuthentication{
-			Anonymous: kubelettypes.KubeletAnonymousAuthentication{
-				Enabled: pointer.To(false),
-			},
-			Webhook: kubelettypes.KubeletWebhookAuthentication{
-				Enabled:  pointer.To(true),
-				CacheTTL: zeroDuration,
-			},
-			X509: kubelettypes.KubeletX509Authentication{
-				ClientCAFile: "/etc/kubernetes/pki/ca.crt",
-			},
-		},
-		Authorization: kubelettypes.KubeletAuthorization{
-			Mode: kubelettypes.KubeletAuthorizationModeWebhook,
-			Webhook: kubelettypes.KubeletWebhookAuthorization{
-				CacheAuthorizedTTL:   zeroDuration,
-				CacheUnauthorizedTTL: zeroDuration,
-			},
-		},
-		CgroupDriver:              kubeletConfiguration.TenantControlPlaneCgroupDriver,
-		ClusterDNS:                kubeletConfiguration.TenantControlPlaneDNSServiceIPs,
-		ClusterDomain:             kubeletConfiguration.TenantControlPlaneDomain,
-		CPUManagerReconcilePeriod: zeroDuration,
-		EvictionHard: map[string]string{
-			"imagefs.available": "0%",
-			"nodefs.available":  "0%",
-			"nodefs.inodesFree": "0%",
-		},
-		EvictionPressureTransitionPeriod: zeroDuration,
-		FileCheckFrequency:               zeroDuration,
-		HealthzBindAddress:               "127.0.0.1",
-		HealthzPort:                      pointer.To(int32(10248)),
-		HTTPCheckFrequency:               zeroDuration,
-		ImageGCHighThresholdPercent:      pointer.To(int32(100)),
-		NodeStatusUpdateFrequency:        zeroDuration,
-		NodeStatusReportFrequency:        zeroDuration,
-		RotateCertificates:               true,
-		RuntimeRequestTimeout:            zeroDuration,
-		ShutdownGracePeriod:              zeroDuration,
-		ShutdownGracePeriodCriticalPods:  zeroDuration,
-		StaticPodPath:                    "/etc/kubernetes/manifests",
-		StreamingConnectionIdleTimeout:   zeroDuration,
-		SyncFrequency:                    zeroDuration,
-		VolumeStatsAggPeriod:             zeroDuration,
-	}
+	kubeletv1beta1.SetDefaults_KubeletConfiguration(&kc)
+
+	kc.Authentication.X509.ClientCAFile = "/etc/kubernetes/pki/ca.crt"
+	kc.CgroupDriver = kubeletConfiguration.TenantControlPlaneCgroupDriver
+	kc.ClusterDNS = kubeletConfiguration.TenantControlPlaneDNSServiceIPs
+	kc.ClusterDomain = kubeletConfiguration.TenantControlPlaneDomain
+	kc.RotateCertificates = true
+	kc.StaticPodPath = "/etc/kubernetes/manifests"
 
 	return utilities.EncodeToYaml(&kc)
 }

A go mod tidy will be required.
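
For reference, a minimal, self-contained sketch of the same idea: populate the upstream v1beta1 defaults, then override only the tenant-specific fields. The package aliases, the hard-coded override values, and the use of sigs.k8s.io/yaml in place of the internal utilities.EncodeToYaml helper are illustrative assumptions, and the snippet assumes a module that already pins k8s.io/kubernetes the way Kamaji's go.mod does:

package main

import (
	"fmt"

	kubelettypes "k8s.io/kubelet/config/v1beta1"
	kubeletv1beta1 "k8s.io/kubernetes/pkg/kubelet/apis/config/v1beta1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Start from a zero value and let upstream fill in the defaults
	// (eviction thresholds, QPS limits, cgroupsPerQOS, and so on).
	var kc kubelettypes.KubeletConfiguration
	kubeletv1beta1.SetDefaults_KubeletConfiguration(&kc)

	// Override only what Kamaji derives from the Tenant Control Plane
	// (example values assumed here).
	kc.Authentication.X509.ClientCAFile = "/etc/kubernetes/pki/ca.crt"
	kc.CgroupDriver = "systemd"
	kc.ClusterDNS = []string{"10.96.0.10"}
	kc.ClusterDomain = "cluster.local"
	kc.RotateCertificates = true
	kc.StaticPodPath = "/etc/kubernetes/manifests"

	out, err := yaml.Marshal(&kc)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}

Printing the result should give output close to the YAML shown below; note that kind and apiVersion are omitted because TypeMeta is left empty.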

Once the TCP is ready I can extract the kubelet configuration, and the result seems fine to me:

address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
cgroupsPerQOS: true
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
configMapAndSecretChangeDetectionStrategy: Watch
containerLogMaxFiles: 5
containerLogMaxSize: 10Mi
containerLogMaxWorkers: 1
containerLogMonitorInterval: 10s
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
contentType: application/vnd.kubernetes.protobuf
cpuCFSQuota: true
cpuCFSQuotaPeriod: 100ms
cpuManagerPolicy: none
cpuManagerReconcilePeriod: 10s
enableControllerAttachDetach: true
enableDebugFlagsHandler: true
enableDebuggingHandlers: true
enableProfilingHandler: true
enableServer: true
enableSystemLogHandler: true
enforceNodeAllocatable:
- pods
eventBurst: 100
eventRecordQPS: 50
evictionPressureTransitionPeriod: 5m0s
failSwapOn: true
fileCheckFrequency: 20s
hairpinMode: promiscuous-bridge
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 20s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMaximumGCAge: 0s
imageMinimumGCAge: 2m0s
iptablesDropBit: 15
iptablesMasqueradeBit: 14
kubeAPIBurst: 100
kubeAPIQPS: 50
localStorageCapacityIsolation: true
logging:
  flushFrequency: 5s
  format: text
  options:
    json:
      infoBufferSize: "0"
    text:
      infoBufferSize: "0"
  verbosity: 0
makeIPTablesUtilChains: true
maxOpenFiles: 1000000
maxPods: 110
memoryManagerPolicy: None
memorySwap: {}
memoryThrottlingFactor: 0.9
nodeLeaseDurationSeconds: 40
nodeStatusMaxImages: 50
nodeStatusReportFrequency: 5m0s
nodeStatusUpdateFrequency: 10s
oomScoreAdj: -999
podLogsDir: /var/log/pods
podPidsLimit: -1
port: 10250
registerNode: true
registryBurst: 10
registryPullQPS: 5
resolvConf: /etc/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 2m0s
seccompDefault: false
serializeImagePulls: true
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
topologyManagerPolicy: none
topologyManagerScope: container
volumePluginDir: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
volumeStatsAggPeriod: 1m0s

@prometherion (Member)

@kvaps unfortunately, #510 introduced a bug and #511 should revert it: once that's merged, may I ask you to rebase with my suggested changes?

@kvaps (Contributor, Author) commented Aug 12, 2024

Hi @prometherion, I tried your patch and found that it is not working: nodes can't join the cluster for some reason:

[   11.272369] cloud-init[1351]: [preflight] Running pre-flight checks
[   12.103807] cloud-init[1351]: [preflight] Reading configuration from the cluster...
[   12.106741] cloud-init[1351]: [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[   12.749235] cloud-init[1351]: W0812 18:28:59.461088    1368 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:ixumoo" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[   12.830133] cloud-init[1351]: error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get component configs: could not download the kubelet configuration from ConfigMap "kubelet-config": invalid configuration for GroupVersionKind /, Kind=: kind and apiVersion is mandatory information that must be specified
[   12.840614] cloud-init[1351]: To see the stack trace of this error execute with --v=5 or higher
[   12.846246] cloud-init[1351]: 2024-08-12 18:28:59,557 - cc_scripts_user.py[WARNING]: Failed to run module scripts_user (scripts in /var/lib/cloud/instance/scripts)
[   12.850916] cloud-init[1351]: 2024-08-12 18:28:59,558 - util.py[WARNING]: Running module scripts_user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_scripts_user.py'>) failed

Unfortunately I haven't investigated the reason yet, I've been busy all the time, sorry :0

@prometherion (Member) commented Aug 12, 2024

@kvaps may I ask you which version gave you that error?

Tested my changes with a v1.30 node and it worked smoothly.

edit: my proposed changes are working with v1.30, v1.29, and v1.28; starting from v1.27 there's a marshaling error:

error unmarshaling configuration schema.GroupVersionKind{Group:"kubelet.config.k8s.io", Version:"v1beta1", Kind:"KubeletConfiguration"}: json: cannot unmarshal string into Go struct field LoggingConfiguration.logging.flushFrequency of type time.Duration

@bsctl since v1.27 is EOL, I would suggest dropping support for Tenant Control Planes lower than this version: WDYT?

I found a way to keep backward compatibility. If you don't mind, @kvaps, I would push the changes to your branch: no pressure at all, I can open a different PR if you prefer to review it without losing your contributions.

With this, we'll still be able to spin up TCPs both pre and post v1.27 🎉
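
Purely as an illustration of such a gate (an assumption about the shape of the fix, not necessarily what ended up in #542), the tenant version could be compared against v1.28, the first release whose kubeadm accepts the string-serialized logging.flushFrequency according to the tests above. The helper name is hypothetical:

package compat

import "k8s.io/apimachinery/pkg/util/version"

// canUseDefaultedKubeletConfig is a hypothetical helper: kubeadm in v1.27
// and older cannot unmarshal logging.flushFrequency once it is serialized
// as a string such as "5s", so older tenants would keep the explicit config.
func canUseDefaultedKubeletConfig(tenantVersion string) (bool, error) {
	v, err := version.ParseSemantic(tenantVersion)
	if err != nil {
		return false, err
	}
	return v.AtLeast(version.MustParseSemantic("v1.28.0")), nil
}

For example, canUseDefaultedKubeletConfig("v1.30.2") would return true, while "v1.26.15" would not.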

@kvaps (Contributor, Author) commented Aug 12, 2024

I used v1.30. Sorry, push or open a new PR, I don't mind :)

@prometherion (Member)

Thanks for the tests and the contributions, Andrei; I just merged #542, which supersedes this one.

Successfully merging this pull request may close these issues.

Image garbage collection is not working on workers