cluster-autoscaler AWS "runtime error: invalid memory address or nil pointer dereference" #1556

Closed
amitlt opened this issue Jan 6, 2019 · 6 comments


amitlt commented Jan 6, 2019

Kubernetes version: 1.10.11
CA version: 1.2.2

Instance type: r5.large

This issue describes the problem as the specific instance type not being in the configuration, but looking into the 1.2.2 release I see that the instance type does indeed exist.

Furthermore, from my testing, scaling up to 1 instance and scaling back down to 0 instances worked multiple times, so I'm not entirely sure what triggered this issue.

values.yaml

autoDiscovery:
# Only cloudProvider `aws` and `gce` are supported by auto-discovery at this time
# AWS: Set tags as described in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md#auto-discovery-setup
  clusterName:

autoscalingGroups:
# At least one element is required if not using autoDiscovery
  - name: jobs-nodes.k8s.test.site.com
    maxSize: 20
    minSize: 0
  # - name: asg2
  #   maxSize: 2
  #   minSize: 1

autoscalingGroupsnamePrefix: []
# At least one element is required if not using autoDiscovery
  # - name: ig01
  #   maxSize: 10
  #   minSize: 0
  # - name: ig02
  #   maxSize: 10
  #   minSize: 0

# Required if cloudProvider=aws
awsRegion: us-east-1

# Currently only `gce`, `aws` & `spotinst` are supported
cloudProvider: aws

sslCertPath: /etc/ssl/certs/ca-certificates.crt

# Configuration file for cloud provider
cloudConfigPath: /etc/gce.conf

image:
  repository: gcr.io/google_containers/cluster-autoscaler
  tag: v1.2.2
  pullPolicy: Always

tolerations: []

extraArgs:
  v: 4
  stderrthreshold: info
  logtostderr: true
  skip-nodes-with-system-pods: "false"
  # write-status-configmap: true
  # leader-elect: true
  # skip-nodes-with-local-storage: false
  # expander: least-waste
  # scale-down-enabled: true
  # balance-similar-node-groups: true
  # min-replica-count: 2
  # scale-down-utilization-threshold: 0.5
  # scale-down-non-empty-candidates-count: 5
  # max-node-provision-time: 15m0s
  # scan-interval: 10s
  # scale-down-delay: 10m
  # scale-down-unneeded-time: 10m
  # skip-nodes-with-local-storage: false


## Affinity for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
## affinity: {}

podDisruptionBudget: |
  maxUnavailable: 1
  # minAvailable: 2

## Node labels for pod assignment
## Ref: https://kubernetes.io/docs/user-guide/node-selection/
nodeSelector: {}

podAnnotations: {
  "iam.amazonaws.com/role": "arn:aws:iam::671587110562:role/system/k8s-autoscaler-test-role"
}
podLabels: {}
replicaCount: 1

rbac:
  ## If true, create & use RBAC resources
  ##
  create: true
  ## If true, create & use Pod Security Policy resources
  ## https://kubernetes.io/docs/concepts/policy/pod-security-policy/
  pspEnabled: true
  ## Ignored if rbac.create is true
  ##
  serviceAccountName: default

resources: {}
  # limits:
  #   cpu: 100m
  #   memory: 300Mi
  # requests:
  #   cpu: 100m
  #   memory: 300Mi

priorityClassName: ""

service:
  annotations: {}
  clusterIP: ""

  ## List of IP addresses at which the service is available
  ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
  ##
  externalIPs: []

  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  servicePort: 8085
  portName: http
  type: ClusterIP

spotinst:
  account: ""
  token: ""
  image:
    repository: spotinst/kubernetes-cluster-autoscaler
    tag: 0.6.0
    pullPolicy: IfNotPresent

CA starts with the following flags:

spec:
  containers:
  - command:
    - ./cluster-autoscaler
    - --cloud-provider=aws
    - --namespace=kube-system
    - --nodes=0:20:jobs-nodes.k8s.test.site.com
    - --logtostderr=true
    - --skip-nodes-with-system-pods=false
    - --stderrthreshold=info
    - --v=4

CA logs:

I0106 08:13:18.701677       1 flags.go:52] FLAG: --address=":8085"
I0106 08:13:18.701711       1 flags.go:52] FLAG: --alsologtostderr="false"
I0106 08:13:18.701719       1 flags.go:52] FLAG: --application-metrics-count-limit="100"
I0106 08:13:18.701722       1 flags.go:52] FLAG: --azure-container-registry-config=""
I0106 08:13:18.701727       1 flags.go:52] FLAG: --balance-similar-node-groups="false"
I0106 08:13:18.701731       1 flags.go:52] FLAG: --boot-id-file="/proc/sys/kernel/random/boot_id"
I0106 08:13:18.701735       1 flags.go:52] FLAG: --cloud-config=""
I0106 08:13:18.701786       1 flags.go:52] FLAG: --cloud-provider="aws"
I0106 08:13:18.701793       1 flags.go:52] FLAG: --cloud-provider-gce-lb-src-cidrs="130.211.0.0/22,209.85.152.0/22,209.85.204.0/22,35.191.0.0/16"
I0106 08:13:18.701800       1 flags.go:52] FLAG: --cluster-name=""
I0106 08:13:18.701804       1 flags.go:52] FLAG: --configmap=""
I0106 08:13:18.701807       1 flags.go:52] FLAG: --container-hints="/etc/cadvisor/container_hints.json"
I0106 08:13:18.701811       1 flags.go:52] FLAG: --containerd="unix:///var/run/containerd.sock"
I0106 08:13:18.701816       1 flags.go:52] FLAG: --cores-total="0:320000"
I0106 08:13:18.701819       1 flags.go:52] FLAG: --docker="unix:///var/run/docker.sock"
I0106 08:13:18.701824       1 flags.go:52] FLAG: --docker-env-metadata-whitelist=""
I0106 08:13:18.701827       1 flags.go:52] FLAG: --docker-only="false"
I0106 08:13:18.701831       1 flags.go:52] FLAG: --docker-root="/var/lib/docker"
I0106 08:13:18.701835       1 flags.go:52] FLAG: --docker-tls="false"
I0106 08:13:18.701839       1 flags.go:52] FLAG: --docker-tls-ca="ca.pem"
I0106 08:13:18.701842       1 flags.go:52] FLAG: --docker-tls-cert="cert.pem"
I0106 08:13:18.701846       1 flags.go:52] FLAG: --docker-tls-key="key.pem"
I0106 08:13:18.701849       1 flags.go:52] FLAG: --enable-load-reader="false"
I0106 08:13:18.701852       1 flags.go:52] FLAG: --estimator="binpacking"
I0106 08:13:18.701856       1 flags.go:52] FLAG: --event-storage-age-limit="default=0"
I0106 08:13:18.701860       1 flags.go:52] FLAG: --event-storage-event-limit="default=0"
I0106 08:13:18.701864       1 flags.go:52] FLAG: --expander="random"
I0106 08:13:18.701868       1 flags.go:52] FLAG: --expendable-pods-priority-cutoff="0"
I0106 08:13:18.701872       1 flags.go:52] FLAG: --gke-api-endpoint=""
I0106 08:13:18.701875       1 flags.go:52] FLAG: --global-housekeeping-interval="1m0s"
I0106 08:13:18.701878       1 flags.go:52] FLAG: --google-json-key=""
I0106 08:13:18.701882       1 flags.go:52] FLAG: --housekeeping-interval="10s"
I0106 08:13:18.701886       1 flags.go:52] FLAG: --httptest.serve=""
I0106 08:13:18.701889       1 flags.go:52] FLAG: --kubeconfig=""
I0106 08:13:18.701893       1 flags.go:52] FLAG: --kubernetes=""
I0106 08:13:18.701896       1 flags.go:52] FLAG: --leader-elect="true"
I0106 08:13:18.701902       1 flags.go:52] FLAG: --leader-elect-lease-duration="15s"
I0106 08:13:18.701908       1 flags.go:52] FLAG: --leader-elect-renew-deadline="10s"
I0106 08:13:18.701912       1 flags.go:52] FLAG: --leader-elect-resource-lock="endpoints"
I0106 08:13:18.701916       1 flags.go:52] FLAG: --leader-elect-retry-period="2s"
I0106 08:13:18.701919       1 flags.go:52] FLAG: --log-backtrace-at=":0"
I0106 08:13:18.701926       1 flags.go:52] FLAG: --log-cadvisor-usage="false"
I0106 08:13:18.701930       1 flags.go:52] FLAG: --log-dir=""
I0106 08:13:18.701933       1 flags.go:52] FLAG: --log-flush-frequency="5s"
I0106 08:13:18.701937       1 flags.go:52] FLAG: --logtostderr="true"
I0106 08:13:18.701941       1 flags.go:52] FLAG: --machine-id-file="/etc/machine-id,/var/lib/dbus/machine-id"
I0106 08:13:18.701946       1 flags.go:52] FLAG: --max-autoprovisioned-node-group-count="15"
I0106 08:13:18.701949       1 flags.go:52] FLAG: --max-empty-bulk-delete="10"
I0106 08:13:18.701954       1 flags.go:52] FLAG: --max-failing-time="15m0s"
I0106 08:13:18.701957       1 flags.go:52] FLAG: --max-graceful-termination-sec="600"
I0106 08:13:18.701961       1 flags.go:52] FLAG: --max-inactivity="10m0s"
I0106 08:13:18.701965       1 flags.go:52] FLAG: --max-node-provision-time="15m0s"
I0106 08:13:18.701968       1 flags.go:52] FLAG: --max-nodes-total="0"
I0106 08:13:18.701971       1 flags.go:52] FLAG: --max-total-unready-percentage="45"
I0106 08:13:18.701976       1 flags.go:52] FLAG: --memory-total="0:6400000"
I0106 08:13:18.701979       1 flags.go:52] FLAG: --min-replica-count="0"
I0106 08:13:18.701982       1 flags.go:52] FLAG: --namespace="kube-system"
I0106 08:13:18.701986       1 flags.go:52] FLAG: --node-autoprovisioning-enabled="false"
I0106 08:13:18.701990       1 flags.go:52] FLAG: --node-group-auto-discovery="[]"
I0106 08:13:18.701994       1 flags.go:52] FLAG: --nodes="[0:20:jobs-nodes.k8s.test.site.com]"
I0106 08:13:18.701999       1 flags.go:52] FLAG: --ok-total-unready-count="3"
I0106 08:13:18.702002       1 flags.go:52] FLAG: --regional="false"
I0106 08:13:18.702007       1 flags.go:52] FLAG: --scale-down-candidates-pool-min-count="50"
I0106 08:13:18.702010       1 flags.go:52] FLAG: --scale-down-candidates-pool-ratio="0.1"
I0106 08:13:18.702014       1 flags.go:52] FLAG: --scale-down-delay-after-add="10m0s"
I0106 08:13:18.702017       1 flags.go:52] FLAG: --scale-down-delay-after-delete="10s"
I0106 08:13:18.702021       1 flags.go:52] FLAG: --scale-down-delay-after-failure="3m0s"
I0106 08:13:18.702025       1 flags.go:52] FLAG: --scale-down-enabled="true"
I0106 08:13:18.702029       1 flags.go:52] FLAG: --scale-down-non-empty-candidates-count="30"
I0106 08:13:18.702032       1 flags.go:52] FLAG: --scale-down-unneeded-time="10m0s"
I0106 08:13:18.702036       1 flags.go:52] FLAG: --scale-down-unready-time="20m0s"
I0106 08:13:18.702040       1 flags.go:52] FLAG: --scale-down-utilization-threshold="0.5"
I0106 08:13:18.702043       1 flags.go:52] FLAG: --scan-interval="10s"
I0106 08:13:18.702048       1 flags.go:52] FLAG: --skip-nodes-with-local-storage="true"
I0106 08:13:18.702051       1 flags.go:52] FLAG: --skip-nodes-with-system-pods="false"
I0106 08:13:18.702054       1 flags.go:52] FLAG: --stderrthreshold="0"
I0106 08:13:18.702059       1 flags.go:52] FLAG: --storage-driver-buffer-duration="1m0s"
I0106 08:13:18.702062       1 flags.go:52] FLAG: --storage-driver-db="cadvisor"
I0106 08:13:18.702066       1 flags.go:52] FLAG: --storage-driver-host="localhost:8086"
I0106 08:13:18.702070       1 flags.go:52] FLAG: --storage-driver-password="root"
I0106 08:13:18.702073       1 flags.go:52] FLAG: --storage-driver-secure="false"
I0106 08:13:18.702077       1 flags.go:52] FLAG: --storage-driver-table="stats"
I0106 08:13:18.702080       1 flags.go:52] FLAG: --storage-driver-user="root"
I0106 08:13:18.702084       1 flags.go:52] FLAG: --test.bench=""
I0106 08:13:18.702088       1 flags.go:52] FLAG: --test.benchmem="false"
I0106 08:13:18.702091       1 flags.go:52] FLAG: --test.benchtime="1s"
I0106 08:13:18.702094       1 flags.go:52] FLAG: --test.blockprofile=""
I0106 08:13:18.702098       1 flags.go:52] FLAG: --test.blockprofilerate="1"
I0106 08:13:18.702101       1 flags.go:52] FLAG: --test.count="1"
I0106 08:13:18.702105       1 flags.go:52] FLAG: --test.coverprofile=""
I0106 08:13:18.702108       1 flags.go:52] FLAG: --test.cpu=""
I0106 08:13:18.702111       1 flags.go:52] FLAG: --test.cpuprofile=""
I0106 08:13:18.702116       1 flags.go:52] FLAG: --test.memprofile=""
I0106 08:13:18.702119       1 flags.go:52] FLAG: --test.memprofilerate="0"
I0106 08:13:18.702122       1 flags.go:52] FLAG: --test.mutexprofile=""
I0106 08:13:18.702127       1 flags.go:52] FLAG: --test.mutexprofilefraction="1"
I0106 08:13:18.702130       1 flags.go:52] FLAG: --test.outputdir=""
I0106 08:13:18.702133       1 flags.go:52] FLAG: --test.parallel="2"
I0106 08:13:18.702137       1 flags.go:52] FLAG: --test.run=""
I0106 08:13:18.702141       1 flags.go:52] FLAG: --test.short="false"
I0106 08:13:18.702144       1 flags.go:52] FLAG: --test.timeout="0s"
I0106 08:13:18.702148       1 flags.go:52] FLAG: --test.trace=""
I0106 08:13:18.702151       1 flags.go:52] FLAG: --test.v="false"
I0106 08:13:18.702155       1 flags.go:52] FLAG: --v="4"
I0106 08:13:18.702161       1 flags.go:52] FLAG: --version="false"
I0106 08:13:18.702169       1 flags.go:52] FLAG: --vmodule=""
I0106 08:13:18.702172       1 flags.go:52] FLAG: --write-status-configmap="true"
I0106 08:13:18.702178       1 main.go:298] Cluster Autoscaler 1.2.2
I0106 08:13:18.734980       1 leaderelection.go:175] attempting to acquire leader lease  kube-system/cluster-autoscaler...
I0106 08:13:18.746266       1 leaderelection.go:184] successfully acquired lease kube-system/cluster-autoscaler
I0106 08:13:18.746698       1 factory.go:33] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"cluster-autoscaler", UID:"38678a25-0f52-11e9-baa4-0e4b20da8fb0", APIVersion:"v1", ResourceVersion:"36472506", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k8s-autoscaler-aws-cluster-autoscaler-d6bd46fc9-j42v8 became leader
I0106 08:13:18.747459       1 predicates.go:125] Using predicate PodFitsResources
I0106 08:13:18.747471       1 predicates.go:125] Using predicate GeneralPredicates
I0106 08:13:18.747476       1 predicates.go:125] Using predicate PodToleratesNodeTaints
I0106 08:13:18.747481       1 predicates.go:125] Using predicate CheckNodeDiskPressure
I0106 08:13:18.747485       1 predicates.go:125] Using predicate NoDiskConflict
I0106 08:13:18.747491       1 predicates.go:125] Using predicate NoVolumeZoneConflict
I0106 08:13:18.747494       1 predicates.go:125] Using predicate CheckNodeCondition
I0106 08:13:18.747498       1 predicates.go:125] Using predicate CheckNodeMemoryPressure
I0106 08:13:18.747502       1 predicates.go:125] Using predicate MaxAzureDiskVolumeCount
I0106 08:13:18.747507       1 predicates.go:125] Using predicate ready
I0106 08:13:18.747512       1 predicates.go:125] Using predicate MatchInterPodAffinity
I0106 08:13:18.747517       1 predicates.go:125] Using predicate MaxEBSVolumeCount
I0106 08:13:18.747521       1 predicates.go:125] Using predicate CheckVolumeBinding
I0106 08:13:18.747525       1 predicates.go:125] Using predicate MaxGCEPDVolumeCount
I0106 08:13:18.747761       1 reflector.go:202] Starting reflector *v1beta1.DaemonSet (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:293
I0106 08:13:18.747789       1 reflector.go:240] Listing and watching *v1beta1.DaemonSet from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:293
I0106 08:13:18.747984       1 reflector.go:202] Starting reflector *v1beta1.PodDisruptionBudget (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748025       1 reflector.go:240] Listing and watching *v1beta1.PodDisruptionBudget from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748180       1 reflector.go:202] Starting reflector *v1.Pod (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748191       1 reflector.go:240] Listing and watching *v1.Pod from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748452       1 reflector.go:202] Starting reflector *v1.StorageClass (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748464       1 reflector.go:240] Listing and watching *v1.StorageClass from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748687       1 reflector.go:202] Starting reflector *v1.PersistentVolumeClaim (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748725       1 reflector.go:240] Listing and watching *v1.PersistentVolumeClaim from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748787       1 reflector.go:202] Starting reflector *v1.PersistentVolume (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.748794       1 reflector.go:240] Listing and watching *v1.PersistentVolume from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749082       1 reflector.go:202] Starting reflector *v1.Service (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749093       1 reflector.go:240] Listing and watching *v1.Service from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749311       1 reflector.go:202] Starting reflector *v1.ReplicationController (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749349       1 reflector.go:240] Listing and watching *v1.ReplicationController from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749407       1 reflector.go:202] Starting reflector *v1.Node (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749414       1 reflector.go:240] Listing and watching *v1.Node from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749734       1 reflector.go:202] Starting reflector *v1beta1.ReplicaSet (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.749796       1 reflector.go:240] Listing and watching *v1beta1.ReplicaSet from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.750085       1 reflector.go:202] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:149
I0106 08:13:18.750131       1 reflector.go:240] Listing and watching *v1.Pod from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:149
I0106 08:13:18.750241       1 reflector.go:202] Starting reflector *v1beta1.StatefulSet (0s) from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.750272       1 reflector.go:240] Listing and watching *v1beta1.StatefulSet from k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/informers/factory.go:87
I0106 08:13:18.750359       1 reflector.go:202] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:174
I0106 08:13:18.750370       1 reflector.go:240] Listing and watching *v1.Pod from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:174
I0106 08:13:18.750444       1 reflector.go:202] Starting reflector *v1.Node (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:212
I0106 08:13:18.750453       1 reflector.go:240] Listing and watching *v1.Node from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:212
I0106 08:13:18.750513       1 reflector.go:202] Starting reflector *v1.Node (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:239
I0106 08:13:18.750555       1 reflector.go:240] Listing and watching *v1.Node from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:239
I0106 08:13:18.750607       1 reflector.go:202] Starting reflector *v1beta1.PodDisruptionBudget (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:266
I0106 08:13:18.750612       1 reflector.go:240] Listing and watching *v1beta1.PodDisruptionBudget from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:266
I0106 08:13:18.947752       1 request.go:481] Throttling request took 197.141227ms, request: GET:https://100.64.0.1:443/api/v1/nodes?limit=500&resourceVersion=0
I0106 08:13:19.147770       1 request.go:481] Throttling request took 396.132055ms, request: PUT:https://100.64.0.1:443/api/v1/namespaces/kube-system/configmaps/cluster-autoscaler-status
I0106 08:13:19.155802       1 cloud_provider_builder.go:72] Building aws cloud provider.
I0106 08:13:19.155902       1 auto_scaling_groups.go:78] Registering ASG jobs-nodes.k8s.test.site.com
I0106 08:13:19.155913       1 auto_scaling_groups.go:139] Invalidating unowned instance cache
I0106 08:13:19.155919       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:19.345600       1 aws_manager.go:241] Refreshed ASG list, next refresh after 2019-01-06 08:14:19.345591774 +0000 UTC
I0106 08:13:19.345724       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:19.547761       1 request.go:481] Throttling request took 196.79672ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-63-118.ec2.internal
I0106 08:13:19.747763       1 request.go:481] Throttling request took 196.744941ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-61-191.ec2.internal
I0106 08:13:19.947758       1 request.go:481] Throttling request took 196.622592ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-63-16.ec2.internal
I0106 08:13:20.147755       1 request.go:481] Throttling request took 197.004768ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-63-153.ec2.internal
I0106 08:13:20.347759       1 request.go:481] Throttling request took 196.74824ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-61-151.ec2.internal
I0106 08:13:20.547762       1 request.go:481] Throttling request took 197.238579ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-63-86.ec2.internal
I0106 08:13:20.747735       1 request.go:481] Throttling request took 197.072671ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-62-182.ec2.internal
I0106 08:13:20.757579       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:20.947758       1 request.go:481] Throttling request took 196.767459ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-62-168.ec2.internal
I0106 08:13:21.147761       1 request.go:481] Throttling request took 196.564581ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-62-233.ec2.internal
I0106 08:13:21.347777       1 request.go:481] Throttling request took 196.680603ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-61-134.ec2.internal
I0106 08:13:21.547756       1 request.go:481] Throttling request took 196.794005ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-62-155.ec2.internal
I0106 08:13:21.747790       1 request.go:481] Throttling request took 196.72087ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-61-94.ec2.internal
I0106 08:13:21.947758       1 request.go:481] Throttling request took 196.413705ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-62-138.ec2.internal
I0106 08:13:22.147756       1 request.go:481] Throttling request took 196.80818ms, request: GET:https://100.64.0.1:443/api/v1/nodes/ip-10-0-62-74.ec2.internal
I0106 08:13:22.151032       1 main.go:228] Registered cleanup signal handler
I0106 08:13:22.767123       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:24.777707       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:26.787862       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:28.800381       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:30.811593       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:32.151181       1 static_autoscaler.go:114] Starting main loop
I0106 08:13:32.460952       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:32.580709       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:32.685289       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:32.821598       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0106 08:13:32.839600       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.012626       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.145119       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.260650       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.405939       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.572608       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.724017       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.843445       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:33.964336       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:34.076162       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:34.165768       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:34.323805       1 auto_scaling_groups.go:153] Regenerating instance to ASG map for ASGs: [jobs-nodes.k8s.test.site.com]
I0106 08:13:34.475068       1 static_autoscaler.go:263] Filtering out schedulables
I0106 08:13:34.475464       1 static_autoscaler.go:273] No schedulable pods
I0106 08:13:34.475485       1 scale_up.go:59] Pod default/ai-service-job-1546732800-m2ckm is unschedulable
W0106 08:13:34.665288       1 aws_manager.go:380] Found multiple availability zones, using us-east-1b
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x145dccd]

goroutine 74 [running]:
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws.(*AwsManager).buildNodeFromTemplate(0xc421782b90, 0xc420e7ba40, 0xc42162d200, 0xc42162d200, 0x0, 0x0)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws/aws_manager.go:407 +0x48d
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws.(*Asg).TemplateNodeInfo(0xc420e7ba40, 0xc42129fc80, 0x7ffdd5459730, 0x1f)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go:294 +0x78
k8s.io/autoscaler/cluster-autoscaler/core.GetNodeInfosForGroups(0xc421ce2300, 0xf, 0x10, 0x5a07ca0, 0xc4217da1d0, 0x5a14620, 0xc4200d83c0, 0xc421123b60, 0x3, 0x4, ...)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/utils.go:228 +0x65a
k8s.io/autoscaler/cluster-autoscaler/core.ScaleUp(0xc420dba700, 0xc421ef1368, 0x1, 0x1, 0xc421ce2300, 0xf, 0x10, 0xc421123b60, 0x3, 0x4, ...)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/scale_up.go:63 +0x393
k8s.io/autoscaler/cluster-autoscaler/core.(*StaticAutoscaler).RunOnce(0xc421092980, 0xed3c3afac, 0xe090276c9, 0x5bf5180, 0x0, 0x0)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/static_autoscaler.go:299 +0x296a
main.run(0xc420a93400)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:269 +0x474
main.main.func2(0xc420b894a0)
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:356 +0x2a
created by k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
	/gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:145 +0x97
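
For what it's worth, the crash pattern reduces to an unchecked map lookup: the trace points at buildNodeFromTemplate dereferencing a static per-instance-type spec that the lookup never found. A minimal Go sketch of that failure mode (all names below are illustrative, not the actual CA code):

package main

import "fmt"

// instanceType mirrors the kind of static spec the AWS provider keeps
// per EC2 instance type (illustrative fields only).
type instanceType struct {
    VCPU     int64
    MemoryMb int64
}

// instanceTypes stands in for the generated map compiled into each CA
// release; a build that predates r5 support simply has no "r5.large" key.
var instanceTypes = map[string]*instanceType{
    "r4.large": {VCPU: 2, MemoryMb: 15616},
}

func buildNodeFromTemplate(typ string) {
    spec := instanceTypes[typ] // missing key yields spec == nil, with no error
    // Dereferencing the nil pointer panics with the same SIGSEGV as above.
    fmt.Printf("%s: %d vCPU, %d MiB\n", typ, spec.VCPU, spec.MemoryMb)
}

func main() {
    buildNodeFromTemplate("r5.large")
}

Running this produces the same "invalid memory address or nil pointer dereference", which is consistent with the running binary simply not knowing the r5 family.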

Jeffwan commented Jan 6, 2019

You checked the code in the cluster-autoscaler-release-1.2 branch, but you are using the v1.2.2 release: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.2.2. The cluster-autoscaler-release-1.2 branch tracks ongoing changes for v1.2.x; the latest release cut from it is v1.2.3, which doesn't have r5 family support. Issue #1387 discusses a v1.2.4 release that would cut the recent changes into a new version.
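
A quick way to confirm what a given checkout actually ships is to drop a tiny test into the aws cloudprovider package and run it on each tag or branch. A sketch, assuming the generated InstanceTypes map (the name used on recent branches; older tags may organize this differently):

package aws

import "testing"

// Checks whether the static, generated instance-type list in this
// checkout knows about r5.large.
func TestR5LargeKnown(t *testing.T) {
    if _, ok := InstanceTypes["r5.large"]; !ok {
        t.Fatal("r5.large is missing from this checkout's static instance type list")
    }
}

On the cluster-autoscaler-1.2.3 tag this should fail, while on the head of cluster-autoscaler-release-1.2 it should pass.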

@losipiuk Any plan to release v1.2.4? It looks like there are still lots of Kubernetes 1.10 users. I can help with the release.


Jeffwan commented Jan 6, 2019

Here are the changes from 1.2.3 to the current 1.2.x branch:

cluster-autoscaler-1.2.3...cluster-autoscaler-release-1.2



losipiuk commented Jan 7, 2019

So far the CA lifecycle has matched that of k8s, so we consider CA 1.2 end-of-life (as k8s 1.10 is). We were not planning any official releases of CA from the 1.2 branch.

If I find some time today I can prepare a release, though without any strong guarantees, as I will not have time for thorough testing. @Jeffwan, are all the changes you need already on the 1.2 release branch?



Jeffwan commented Jan 7, 2019

Thanks @losipiuk and @aleksandra-malinowska. I don't expect new features to be merged back to v1.2.x; new instance types are what's most needed.
