Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Helm update to version 1.7.x resulting in pods Readiness probe failed: HTTP probe failed with statuscode: 404 #3582

Closed
myevit opened this issue Feb 16, 2024 · 8 comments

Comments

@myevit
Copy link

myevit commented Feb 16, 2024

Helm update to version 1.7.x resulting in Pod: aws-load-balancer-controller Readiness probe failed: HTTP probe failed with statuscode: 404

helm versions 1.4.1 through 1.6.2 working as expected

1.7.x pod logs:

{"level":"info","ts":1708126775.380832,"msg":"version","GitVersion":"v2.4.6","GitCommit":"a92e689dfe464f5b24784f398947e0fef31dc470","BuildDate":"2023-01-12T06:29:16+0000"}
{"level":"info","ts":1708126775.410416,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1708126775.4145405,"logger":"setup","msg":"adding health check for controller"}
{"level":"info","ts":1708126775.414611,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-v1-pod"}
{"level":"info","ts":1708126775.4146783,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1708126775.414728,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1708126775.4147651,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1-ingress"}
{"level":"info","ts":1708126775.4148157,"logger":"setup","msg":"starting podInfo repo"}
{"level":"info","ts":1708126777.415071,"msg":"starting metrics server","path":"/metrics"}
I0216 23:39:37.415030       1 leaderelection.go:243] attempting to acquire leader lease kube-system/aws-load-balancer-controller-leader...
{"level":"info","ts":1708126777.415177,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
{"level":"info","ts":1708126777.4158258,"logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":1708126777.415922,"logger":"controller-runtime.webhook","msg":"serving webhook server","host":"","port":9443}
{"level":"info","ts":1708126777.4159951,"logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
@greg-anetac
Copy link

+1 started seeing this today

@wweiwei-li
Copy link
Collaborator

This is known issue. Can you confirm that you are using latest version for controller also. If you are updating chart version, you need also to use v2.7.0 controller version

@myevit
Copy link
Author

myevit commented Feb 22, 2024

Helm chart reporting App Version v 2.7.1
Also, in the helm chart 1.7.1 it set to public.ecr.aws/eks/aws-load-balancer-controller:v2.4.6
whatever version is in that container the one is running in chart 1.7.1...

image:
  pullPolicy: IfNotPresent
  repository: public.ecr.aws/eks/aws-load-balancer-controller
  tag: v2.4.6

@wweiwei-li
Copy link
Collaborator

https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml. Here is the most recent helm chart, which links to the right controller version. Can you tell us from where did you see the v2.4.6.

@myevit
Copy link
Author

myevit commented Feb 29, 2024

https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml. Here is the most recent helm chart, which links to the right controller version. Can you tell us from where did you see the v2.4.6.

Manual image tagging in the published helm chart resolving the issue.

I am using Lens to update helm chart deployments. For some reason Lens is thinking that Image tag is user supplied value and keeping it unchanged.

additionalLabels: {}
affinity: {}
autoscaling:
  enabled: false
  maxReplicas: 5
  minReplicas: 1
  targetCPUUtilizationPercentage: 80
awsApiEndpoints: null
awsApiThrottle: null
awsMaxRetries: null
backendSecurityGroup: null
cluster:
  dnsDomain: cluster.local
clusterName: production
clusterSecretsPermissions:
  allowAllSecrets: false
configureDefaultAffinity: true
controllerConfig:
  featureGates: {}
createIngressClassResource: true
defaultSSLPolicy: null
defaultTags: {}
defaultTargetType: instance
deploymentAnnotations: {}
disableIngressClassAnnotation: null
disableIngressGroupNameAnnotation: null
disableRestrictedSecurityGroupRules: null
dnsPolicy: null
enableBackendSecurityGroup: null
enableCertManager: false
enableEndpointSlices: null
enablePodReadinessGateInject: null
enableServiceMutatorWebhook: true
enableShield: null
enableWaf: null
enableWafv2: null
env: null
externalManagedTags: []
extraVolumeMounts: null
extraVolumes: null
fullnameOverride: ""
hostNetwork: false
image:
  pullPolicy: IfNotPresent
  repository: public.ecr.aws/eks/aws-load-balancer-controller
  tag: v2.4.6
imagePullSecrets: []
ingressClass: alb
ingressClassConfig:
  default: false
ingressClassParams:
  create: true
  name: null
  spec: {}
ingressMaxConcurrentReconciles: null
keepTLSSecret: true
livenessProbe:
  failureThreshold: 2
  httpGet:
    path: /healthz
    port: 61779
    scheme: HTTP
  initialDelaySeconds: 30
  timeoutSeconds: 10
logLevel: null
metricsBindAddr: ""
nameOverride: ""
nodeSelector: {}
objectSelector:
  matchExpressions: null
  matchLabels: null
podAnnotations: {}
podDisruptionBudget: {}
podLabels: {}
podSecurityContext:
  fsGroup: 65534
priorityClassName: system-cluster-critical
rbac:
  create: true
readinessProbe:
  failureThreshold: 2
  httpGet:
    path: /readyz
    port: 61779
    scheme: HTTP
  initialDelaySeconds: 10
  successThreshold: 1
  timeoutSeconds: 10
region: ca-central-1
replicaCount: 2
resources: {}
revisionHistoryLimit: 10
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
serviceAccount:
  annotations: {}
  automountServiceAccountToken: true
  create: false
  imagePullSecrets: null
  name: aws-load-balancer-controller
serviceAnnotations: {}
serviceMaxConcurrentReconciles: null
serviceMonitor:
  additionalLabels: {}
  enabled: false
  interval: 1m
  namespace: null
serviceTargetENISGTags: null
syncPeriod: null
targetgroupbindingMaxConcurrentReconciles: null
targetgroupbindingMaxExponentialBackoffDelay: null
terminationGracePeriodSeconds: 10
tolerations: []
topologySpreadConstraints: {}
updateStrategy: {}
vpcId: ""
watchNamespace: null
webhookBindPort: null
webhookNamespaceSelectors: null
webhookTLS:
  caCert: null
  cert: null
  key: null

@ravikyada
Copy link

For this time, I changed the readiness probe health check back to "healthz", and its worked properly.

@davidkl97 Please check whether it is an issue or if any workaround needs to be done here.

@davidkl97
Copy link
Contributor

Hey, @ravikyada which image version you are using?
please note that readiness probe change requires newer images as suggested here due to the the fact that readyz endpoint registered in newer version(image version 2.7.0)

@ravikyada
Copy link

ravikyada commented Feb 29, 2024

ok thank you @davidkl97 for the help!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants