fix: sets fail-fast to false for matrix workflows #995

Merged (3 commits) on Nov 8, 2024

@noahpb noahpb commented Nov 8, 2024

Description

We've noticed that on the nightly CI pipelines, one failed parallel job caused other in-progress jobs in the same workflow run to terminate ungracefully once the cancellation timeout was reached. This led to inconsistent behavior in subsequent workflow runs involving Terraform, because pre-existing state files were left locked. Workflows keep failing until a new commit is made, which forces the workflow to generate a new state key. That circumvents the issue, but it leaves orphaned resources behind.

The intent of this PR is to ensure that failed jobs exit gracefully and do not affect the status of other jobs running in the same workflow. This will add time to workflow runs, but it ensures that resources are properly cleaned up and that processes terminate gracefully before the pipeline completes.
...
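
For reference, a minimal sketch of what this setting looks like in a matrix workflow. This is not the actual diff from this PR: the workflow name, schedule, matrix values, and script names below are hypothetical, and the teardown step is only an assumption about how cleanup might be wired up.

```yaml
# Hypothetical workflow for illustration only; names and scripts are placeholders.
name: nightly-example
on:
  schedule:
    - cron: "0 6 * * *"

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      # GitHub Actions defaults fail-fast to true, which cancels in-progress
      # and queued matrix jobs as soon as one job in the matrix fails.
      # Setting it to false lets every matrix job run to completion.
      fail-fast: false
      matrix:
        flavor: [flavor-a, flavor-b] # placeholder matrix values
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: ./run-tests.sh "${{ matrix.flavor }}" # hypothetical script
      - name: Teardown
        # always() runs this step even if an earlier step in the job failed.
        if: always()
        run: ./teardown.sh "${{ matrix.flavor }}" # hypothetical script
```

With `fail-fast: false`, a failure in one matrix job no longer cancels its siblings, so each job gets the chance to reach its own cleanup and release any Terraform state lock it holds.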

Related Issue

Example workflow run:
https://github.com/defenseunicorns/uds-core/actions/runs/11747339383/job/32731009925?pr=989#step:9:63

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Other (security config, docs update, etc)

Checklist before merging

@noahpb noahpb marked this pull request as ready for review November 8, 2024 20:56
@noahpb noahpb requested a review from a team as a code owner November 8, 2024 20:56

@UnicornChance UnicornChance left a comment

I think the time trade-off is worth the cleanup

@noahpb noahpb merged commit 3008788 into main Nov 8, 2024
14 checks passed
@noahpb noahpb deleted the fix/fail-strategy branch November 8, 2024 22:46
UnicornChance pushed a commit that referenced this pull request Nov 12, 2024
🤖 I have created a release *beep* *boop*
---


## [0.31.0](v0.30.0...v0.31.0) (2024-11-12)


### ⚠ BREAKING CHANGES

* Remove the generated exception block from the remoteCidr generation.
This change means that a cidr containing the META_IP could be set.

### Bug Fixes

* avoids memory leak in istio sidecar termination
([#972](#972))
([bfd415e](bfd415e))
* ensure grafana does not install plugins from the internet
([#993](#993))
([f3def45](f3def45))
* remove remoteCidr exception block
([#987](#987))
([264fbf6](264fbf6))
* renovate config updated to track tests
([#981](#981))
([2494448](2494448))
* sets `fail-fast` to `false` for matrix workflows
([#995](#995))
([3008788](3008788))
* sort auth chains when building the authservice config
([#969](#969))
([15487fb](15487fb))


### Miscellaneous

* add prometheus, loki, and vector e2e testing
([#939](#939))
([f271ce2](f271ce2))
* add the scorecard supply chain security workflow
([#917](#917))
([5626f2f](5626f2f))
* **deps:** update authservice to v1.0.3
([#893](#893))
([5585a3c](5585a3c))
* **deps:** update grafana curl-fips image to v8.11.0
([#994](#994))
([dfc4c8c](dfc4c8c))
* **deps:** update grafana to 11.3.0
([#921](#921))
([7cdd742](7cdd742))
* **deps:** update loki to 3.2.1
([#918](#918))
([5fa6a24](5fa6a24))
* **deps:** update loki to v6.19.0
([#990](#990))
([8bbac53](8bbac53))
* **deps:** update pepr to v0.39.0
([#932](#932))
([27eb1bd](27eb1bd))
* **deps:** update support dependencies to v3.27.2
([#1001](#1001))
([8702952](8702952))
* **deps:** update support dependencies to v3.3.0
([#985](#985))
([4636a38](4636a38))
* **deps:** update support dependencies to v3.3.1
([#1002](#1002))
([8c20b49](8c20b49))
* **deps:** update support-deps
([#928](#928))
([a9cf1f2](a9cf1f2))
* **deps:** update support-deps
([#983](#983))
([dc3084b](dc3084b))
* **deps:** update support-deps
([#989](#989))
([7a1c74e](7a1c74e))
* **deps:** update velero
([#956](#956))
([7746092](7746092))
* regroup renovate support dependencies
([#979](#979))
([6491be9](6491be9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>