[aws-eks] EKS cluster fails to update with helm chart added #6381

Closed
lkoniecz opened this issue Feb 20, 2020 · 1 comment · Fixed by #6522
Labels
@aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service bug This issue is a bug. p1

Comments

@lkoniecz

We've recently added the Prometheus chart to our cluster resources:

eks_cluster.add_chart(
    id='Prometheus',
    chart='prometheus',
    repository='https://kubernetes-charts.storage.googleapis.com/',
    release='prometheus',
    version='10.4.0',
    namespace='monitoring'
)

An attempt to update the cluster fails (logs below). This happens only occasionally and might be related to the race condition issues we've faced in the past: #4087

Reproduction Steps

Add a Helm chart to your EKS cluster and deploy the stack.

Error Log

sandbox-eks-cluster: creating CloudFormation changeset...
  0/72 | 8:59:57 | UPDATE_IN_PROGRESS   | AWS::ElasticLoadBalancing::LoadBalancer | SandboxNginxLoadBalancer/SandboxNginxLoadBalancer (SandboxNginxLoadBalancer16F6BF5F) 
  1/72 | 8:59:57 | UPDATE_COMPLETE      | AWS::ElasticLoadBalancing::LoadBalancer | SandboxNginxLoadBalancer/SandboxNginxLoadBalancer (SandboxNginxLoadBalancer16F6BF5F) 
  1/72 | 8:59:57 | UPDATE_IN_PROGRESS   | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.ClusterResourceProvider.NestedStack/@aws-cdk--aws-eks.ClusterResourceProvider.NestedStackResource (awscdkawseksClusterResourceProviderNestedStackawscdkawseksClusterResourceProviderNestedStackResource9827C454) 
  1/72 | 8:59:58 | UPDATE_IN_PROGRESS   | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.KubectlProvider.NestedStack/@aws-cdk--aws-eks.KubectlProvider.NestedStackResource (awscdkawseksKubectlProviderNestedStackawscdkawseksKubectlProviderNestedStackResourceA7AEBA6B) 
  2/72 | 8:59:58 | UPDATE_COMPLETE      | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.ClusterResourceProvider.NestedStack/@aws-cdk--aws-eks.ClusterResourceProvider.NestedStackResource (awscdkawseksClusterResourceProviderNestedStackawscdkawseksClusterResourceProviderNestedStackResource9827C454) 
 2/72 Currently in progress: awscdkawseksKubectlProviderNestedStackawscdkawseksKubectlProviderNestedStackResourceA7AEBA6B
  3/72 | 9:01:00 | UPDATE_COMPLETE      | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.KubectlProvider.NestedStack/@aws-cdk--aws-eks.KubectlProvider.NestedStackResource (awscdkawseksKubectlProviderNestedStackawscdkawseksKubectlProviderNestedStackResourceA7AEBA6B) 
  3/72 | 9:01:17 | CREATE_IN_PROGRESS   | Custom::AWSCDK-EKS-HelmChart            | SandboxEksCluster/SandboxEksCluster/chart-Prometheus/Resource/Default (SandboxEksClusterchartPrometheusD00EF105) 
  3/72 | 9:01:17 | UPDATE_IN_PROGRESS   | AWS::Lambda::Function                   | SandboxEksClusterLogs/EksCloudwatchFunction (SandboxEksClusterLogsEksCloudwatchFunction33328456) 
  3/72 | 9:01:18 | UPDATE_IN_PROGRESS   | Custom::AWSCDK-EKS-KubernetesResource   | SandboxEksCluster/SandboxEksCluster/manifest-ClusterAutoscalerDeployment/Resource/Default (SandboxEksClustermanifestClusterAutoscalerDeployment4CF7C8B5) 
  4/72 | 9:01:18 | UPDATE_COMPLETE      | AWS::Lambda::Function                   | SandboxEksClusterLogs/EksCloudwatchFunction (SandboxEksClusterLogsEksCloudwatchFunction33328456) 
  5/72 | 9:01:36 | UPDATE_COMPLETE      | Custom::AWSCDK-EKS-KubernetesResource   | SandboxEksCluster/SandboxEksCluster/manifest-ClusterAutoscalerDeployment/Resource/Default (SandboxEksClustermanifestClusterAutoscalerDeployment4CF7C8B5) 
 5/72 Currently in progress: SandboxEksClusterchartPrometheusD00EF105
  5/72 | 9:02:34 | CREATE_IN_PROGRESS   | Custom::AWSCDK-EKS-HelmChart            | SandboxEksCluster/SandboxEksCluster/chart-Prometheus/Resource/Default (SandboxEksClusterchartPrometheusD00EF105) Resource creation Initiated
  6/72 | 9:02:35 | CREATE_FAILED        | Custom::AWSCDK-EKS-HelmChart            | SandboxEksCluster/SandboxEksCluster/chart-Prometheus/Resource/Default (SandboxEksClusterchartPrometheusD00EF105) Failed to create resource. Error: b'\n[Errno 32] Broken pipe\n'
    at invokeUserFunction (/var/task/framework.js:85:19)
    at process._tickCallback (internal/process/next_tick.js:68:7)
	new CustomResource (/tmp/jsii-kernel-yFL5X8/node_modules/@aws-cdk/aws-cloudformation/lib/custom-resource.js:56:25)
	\_ new HelmChart (/tmp/jsii-kernel-yFL5X8/node_modules/@aws-cdk/aws-eks/lib/helm-chart.js:16:9)
	\_ Cluster.addChart (/tmp/jsii-kernel-yFL5X8/node_modules/@aws-cdk/aws-eks/lib/cluster.js:265:16)
	\_ /home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:7589:51
	\_ Kernel._wrapSandboxCode (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:8222:20)
	\_ /home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:7589:25
	\_ Kernel._ensureSync (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:8198:20)
	\_ Kernel.invoke (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:7588:26)
	\_ KernelHost.processRequest (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:7296:28)
	\_ KernelHost.run (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:7236:14)
	\_ Immediate._onImmediate (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/.env/lib/python3.7/site-packages/jsii/_embedded/jsii/jsii-runtime.js:7239:37)
	\_ processImmediate (internal/timers.js:439:21)
  6/72 | 9:02:36 | UPDATE_ROLLBACK_IN_P | AWS::CloudFormation::Stack              | sandbox-eks-cluster The following resource(s) failed to create: [SandboxEksClusterchartPrometheusD00EF105]. 
  6/72 | 9:02:47 | UPDATE_IN_PROGRESS   | AWS::ElasticLoadBalancing::LoadBalancer | SandboxNginxLoadBalancer/SandboxNginxLoadBalancer (SandboxNginxLoadBalancer16F6BF5F) 
  6/72 | 9:02:47 | UPDATE_IN_PROGRESS   | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.ClusterResourceProvider.NestedStack/@aws-cdk--aws-eks.ClusterResourceProvider.NestedStackResource (awscdkawseksClusterResourceProviderNestedStackawscdkawseksClusterResourceProviderNestedStackResource9827C454) 
  6/72 | 9:02:47 | UPDATE_IN_PROGRESS   | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.KubectlProvider.NestedStack/@aws-cdk--aws-eks.KubectlProvider.NestedStackResource (awscdkawseksKubectlProviderNestedStackawscdkawseksKubectlProviderNestedStackResourceA7AEBA6B) 
  7/72 | 9:02:47 | UPDATE_COMPLETE      | AWS::ElasticLoadBalancing::LoadBalancer | SandboxNginxLoadBalancer/SandboxNginxLoadBalancer (SandboxNginxLoadBalancer16F6BF5F) 
  8/72 | 9:02:47 | UPDATE_COMPLETE      | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.ClusterResourceProvider.NestedStack/@aws-cdk--aws-eks.ClusterResourceProvider.NestedStackResource (awscdkawseksClusterResourceProviderNestedStackawscdkawseksClusterResourceProviderNestedStackResource9827C454) 
  9/72 | 9:03:22 | UPDATE_COMPLETE      | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.KubectlProvider.NestedStack/@aws-cdk--aws-eks.KubectlProvider.NestedStackResource (awscdkawseksKubectlProviderNestedStackawscdkawseksKubectlProviderNestedStackResourceA7AEBA6B) 
  9/72 | 9:03:25 | UPDATE_IN_PROGRESS   | Custom::AWSCDK-EKS-KubernetesResource   | SandboxEksCluster/SandboxEksCluster/manifest-ClusterAutoscalerDeployment/Resource/Default (SandboxEksClustermanifestClusterAutoscalerDeployment4CF7C8B5) 
  9/72 | 9:03:26 | UPDATE_IN_PROGRESS   | AWS::Lambda::Function                   | SandboxEksClusterLogs/EksCloudwatchFunction (SandboxEksClusterLogsEksCloudwatchFunction33328456) 
 10/72 | 9:03:27 | UPDATE_COMPLETE      | AWS::Lambda::Function                   | SandboxEksClusterLogs/EksCloudwatchFunction (SandboxEksClusterLogsEksCloudwatchFunction33328456) 
 11/72 | 9:03:41 | UPDATE_COMPLETE      | Custom::AWSCDK-EKS-KubernetesResource   | SandboxEksCluster/SandboxEksCluster/manifest-ClusterAutoscalerDeployment/Resource/Default (SandboxEksClustermanifestClusterAutoscalerDeployment4CF7C8B5) 
 11/72 | 9:03:42 | UPDATE_ROLLBACK_COMP | AWS::CloudFormation::Stack              | sandbox-eks-cluster 
 11/72 | 9:03:43 | DELETE_IN_PROGRESS   | AWS::CloudFormation::CustomResource     | SandboxEksCluster/SandboxEksCluster/chart-Prometheus/Resource/Default (SandboxEksClusterchartPrometheusD00EF105) 
 12/72 | 9:03:44 | UPDATE_COMPLETE      | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.ClusterResourceProvider.NestedStack/@aws-cdk--aws-eks.ClusterResourceProvider.NestedStackResource (awscdkawseksClusterResourceProviderNestedStackawscdkawseksClusterResourceProviderNestedStackResource9827C454) 
 13/72 | 9:03:45 | DELETE_COMPLETE      | AWS::CloudFormation::CustomResource     | SandboxEksCluster/SandboxEksCluster/chart-Prometheus/Resource/Default (SandboxEksClusterchartPrometheusD00EF105) 
 14/72 | 9:04:05 | UPDATE_COMPLETE      | AWS::CloudFormation::Stack              | @aws-cdk--aws-eks.KubectlProvider.NestedStack/@aws-cdk--aws-eks.KubectlProvider.NestedStackResource (awscdkawseksKubectlProviderNestedStackawscdkawseksKubectlProviderNestedStackResourceA7AEBA6B) 
 15/72 | 9:04:05 | UPDATE_ROLLBACK_COMP | AWS::CloudFormation::Stack              | sandbox-eks-cluster 

 ❌  sandbox-eks-cluster failed: Error: The stack named sandbox-eks-cluster is in a failed state: UPDATE_ROLLBACK_COMPLETE
    at /home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/node_modules/aws-cdk/lib/api/util/cloudformation.ts:165:13
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at waitFor (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/node_modules/aws-cdk/lib/api/util/cloudformation.ts:76:20)
    at Object.deployStack (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/node_modules/aws-cdk/lib/api/deploy-stack.ts:107:5)
    at CdkToolkit.deploy (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/node_modules/aws-cdk/lib/cdk-toolkit.ts:137:24)
    at main (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/node_modules/aws-cdk/bin/cdk.ts:212:16)
    at initCommandLine (/home/lky/Repositories/CasinoServerCloudformation/cdk/eks/cluster/node_modules/aws-cdk/bin/cdk.ts:164:9)
The stack named sandbox-eks-cluster is in a failed state: UPDATE_ROLLBACK_COMPLETE

Handler logs:

[ERROR] Exception: b'\n[Errno 32] Broken pipe\n'
Traceback (most recent call last):
  File "/var/task/index.py", line 16, in handler
    return helm_handler(event, context)
  File "/var/task/helm/__init__.py", line 47, in helm_handler
    helm('upgrade', release, chart, repository, values_file, namespace, version)
  File "/var/task/helm/__init__.py", line 74, in helm
    raise Exception(exc.output)

Environment

  • CLI Version : 1.22.0
  • Framework Version: 1.22.0
  • OS : any
  • Language : any

Other


This is 🐛 Bug Report

@lkoniecz lkoniecz added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Feb 20, 2020
@SomayaB SomayaB added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label Feb 21, 2020
@eladb
Contributor

eladb commented Mar 1, 2020

I am wondering if this is related to helm/helm#3480

eladb pushed a commit that referenced this issue Mar 1, 2020
Retry three times if helm fails with a “broken pipe” error.

Fixes #6381
@eladb eladb added p1 in-progress This issue is being actively worked on. labels Mar 1, 2020
@SomayaB SomayaB removed the needs-triage This issue or PR still needs to be triaged. label Mar 2, 2020
@ccfife ccfife mentioned this issue Mar 3, 2020
19 tasks
@mergify mergify bot closed this as completed in #6522 Mar 10, 2020
mergify bot pushed a commit that referenced this issue Mar 10, 2020
fix(eks): sporadic broken pipe when deploying helm charts (#6522)

Retry three times if helm fails with a “broken pipe” error.

Fixes #6381
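
For readers wondering what a retry of this kind might look like, here is a minimal sketch in Python. It is not the code from #6522: the function name `run_with_retry`, the `max_attempts`, `delay_seconds`, and `runner` parameters, and the reading of "three times" as three attempts in total are all assumptions made for illustration. The only details taken from this thread are that the handler shells out to helm, that the failure output contains "Broken pipe", and that the handler raises `Exception(exc.output)` on failure.

```python
import subprocess
import time


def run_with_retry(cmd, max_attempts=3, delay_seconds=0.0,
                   runner=subprocess.check_output):
    """Run cmd and return its output, retrying only on "Broken pipe".

    Hypothetical helper: the name and all parameters are assumptions,
    not the actual aws-cdk handler code.
    """
    attempt = 1
    while True:
        try:
            # Capture stderr together with stdout so the error text
            # is available on the raised CalledProcessError.
            return runner(cmd, stderr=subprocess.STDOUT)
        except subprocess.CalledProcessError as exc:
            output = exc.output or b""
            retryable = b"Broken pipe" in output
            if not retryable or attempt >= max_attempts:
                # Mirror the handler's behaviour seen in the traceback:
                # surface the command output as the exception message.
                raise Exception(output) from exc
            attempt += 1
            time.sleep(delay_seconds)
```

Any other failure is raised immediately, so genuine chart or cluster errors are not masked by the retry. The `runner` parameter exists only so the command execution can be swapped out when testing.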
@iliapolo iliapolo changed the title EKS cluster fails to update with helm chart added [aws-eks] EKS cluster fails to update with helm chart added Aug 16, 2020
@iliapolo iliapolo removed the in-progress This issue is being actively worked on. label Aug 16, 2020