Explicitly set aws-node-termination-handler queue region so crash-loops are avoided, allowing faster startup #977
base: main
/run cluster-test-suites
cluster-test-suites
📋 View full results in Tekton Dashboard Rerun trigger: Tip To only re-run the failed test suites you can provide a
/run cluster-test-suites TARGET_SUITES=./providers/capa/china,./providers/capa/private
/run cluster-test-suites TARGET_SUITES=./providers/capa/china
/run cluster-test-suites TARGET_SUITES=./providers/capa/china
/run cluster-test-suites
There were differences in the rendered Helm template, please check!
Oh No! 😱 At least one test suite has failed during the Be sure to check the full results in Tekton Dashboard to see which test suite has failed and then run the following on the associated MC to list all leftover resources:
PIPELINE_RUN="pr-cluster-aws-977-cluster-test-suiteslksjr"
NAMES="$(kubectl api-resources --verbs list -o name | tr '\n' ,)"
kubectl get "${NAMES:0:${#NAMES}-1}" --show-kind --ignore-not-found -l cicd.giantswarm.io/pipelinerun=${PIPELINE_RUN} -A 2>/dev/null
What this PR does / why we need it
Towards giantswarm/roadmap#3802
Until now, NTH only started operating minutes after the cluster came up, or even later if the cluster was unhealthy. That could slow down ASG instance refreshes, node termination, etc. NTH only came up because the `AWS_REGION` environment variable is injected by IRSA.
The crash-looping message `FTL Unable to find the AWS region to process queue events.` goes away with this fix. The pod still requires IRSA credentials injection to operate, so it may still take a few minutes to start up, but the error becomes clearer and we avoid getting alerted.
Checklist