More resilient DRA packaging (#39332)
Occasionally packaging steps from the DRA pipeline may get stuck[^1].
This causes a breach of the global pipeline timeout (currently 1hr) and
cancels the job.

This commit increases the global timeout to 90min, adds one retry per
step and limits the runtime per step to 40min (so that a single stuck
step doesn't exhaust the entire global timeout).
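
The per-step settings described above can be sketched as a single Buildkite step. This is an illustrative fragment rather than a copy of the pipeline: the label and command are placeholders, and only the `timeout_in_minutes` and `retry` values come from this commit message.

```yaml
steps:
  - label: "example packaging step"   # hypothetical label, not from the pipeline
    command: "make package"           # hypothetical command, not from the pipeline
    timeout_in_minutes: 40            # cancel the job once it has run for 40 minutes
    retry:
      automatic:
        - limit: 1                    # automatically retry the job at most once
```

Under these settings, a step that gets stuck on both attempts would be cut off after roughly 2 × 40 = 80 minutes of run time (not counting queueing or agent start-up), compared with the 90-minute global timeout set in this commit.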

Finally, we suppress Slack notifications if the retry recovered the step.

In a future PR we will consider also adding a daily DRA build to cover
for cases where the retries didn't help and there were no subsequent
commits to trigger a new build.

[^1]: https://buildkite.com/elastic/beats-packaging-pipeline/builds/114

(cherry picked from commit 726f6e9)

# Conflicts:
#	catalog-info.yaml
dliappis authored and mergify[bot] committed May 1, 2024
1 parent 6947755 commit efafd15
Showing 2 changed files with 169 additions and 0 deletions.
32 changes: 32 additions & 0 deletions .buildkite/packaging.pipeline.yml
@@ -44,6 +44,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
commands:
- make build/distributions/dependencies.csv
- make beats-dashboards
@@ -62,6 +66,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
commands:
- make build/distributions/dependencies.csv
- make beats-dashboards
@@ -86,6 +94,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -116,6 +128,10 @@ steps:
provider: "aws"
imagePrefix: "${AWS_IMAGE_UBUNTU_ARM_64}"
instanceType: "${AWS_ARM_INSTANCE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -142,6 +158,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "c2-standard-16"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*

@@ -161,6 +181,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -191,6 +215,10 @@ steps:
provider: "aws"
imagePrefix: "${AWS_IMAGE_UBUNTU_ARM_64}"
instanceType: "${AWS_ARM_INSTANCE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -217,6 +245,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "c2-standard-16"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*

137 changes: 137 additions & 0 deletions catalog-info.yaml
@@ -1015,4 +1015,141 @@ spec:
release-eng:
access_level: BUILD_AND_READ
everyone:
<<<<<<< HEAD
access_level: READ_ONLY
=======
access_level: BUILD_AND_READ

---
# yaml-language-server: $schema=https://gist.githubusercontent.com/elasticmachine/988b80dae436cafea07d9a4a460a011d/raw/rre.schema.json
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
name: beats-packaging-pipeline
description: Buildkite pipeline for packaging and publishing to DRA
links:
- title: Pipeline
url: https://buildkite.com/elastic/beats-packaging-pipeline
spec:
type: buildkite-pipeline
owner: group:ingest-fp
system: buildkite
implementation:
apiVersion: buildkite.elastic.dev/v1
kind: Pipeline
metadata:
name: beats-packaging-pipeline
description: Pipeline for Beats packaging and publishing DRA artifacts
spec:
repository: elastic/beats
pipeline_file: ".buildkite/packaging.pipeline.yml"
branch_configuration: "main 8.14"
# TODO enable after packaging backports for release branches
# branch_configuration: "main 8.* 7.17"
cancel_intermediate_builds: false
skip_intermediate_builds: false
maximum_timeout_in_minutes: 90
provider_settings:
build_branches: true
build_pull_request_forks: false
build_pull_requests: false
build_tags: false
filter_condition: >-
build.branch =~ /^[0-9]+\.[0-9]+$$/ || build.branch == "main"
filter_enabled: true
trigger_mode: code
env:
ELASTIC_SLACK_NOTIFICATIONS_ENABLED: 'true'
SLACK_NOTIFICATIONS_CHANNEL: '#ingest-notifications'
SLACK_NOTIFICATIONS_ON_SUCCESS: 'false'
SLACK_NOTIFICATIONS_SKIP_FOR_RETRIES: 'true'
teams:
ingest-fp:
access_level: MANAGE_BUILD_AND_READ
release-eng:
access_level: BUILD_AND_READ
everyone:
access_level: BUILD_AND_READ

---
# yaml-language-server: $schema=https://gist.githubusercontent.com/elasticmachine/988b80dae436cafea07d9a4a460a011d/raw/rre.schema.json
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
name: beats-ironbank-validation
description: Buildkite pipeline for validating the Ironbank docker context
links:
- title: Pipeline
url: https://buildkite.com/elastic/beats-ironbank-validation
spec:
type: buildkite-pipeline
owner: group:ingest-fp
system: buildkite
implementation:
apiVersion: buildkite.elastic.dev/v1
kind: Pipeline
metadata:
name: beats-ironbank-validation
description: Buildkite pipeline for validating the Ironbank docker context
spec:
repository: elastic/beats
pipeline_file: ".buildkite/ironbank-validation.yml"
branch_configuration: "main 8.* 7.17"
cancel_intermediate_builds: false
skip_intermediate_builds: false
provider_settings:
trigger_mode: none
teams:
ingest-fp:
access_level: MANAGE_BUILD_AND_READ
release-eng:
access_level: BUILD_AND_READ
everyone:
access_level: BUILD_AND_READ

---
# yaml-language-server: $schema=https://gist.githubusercontent.com/elasticmachine/988b80dae436cafea07d9a4a460a011d/raw/rre.schema.json
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
name: beats-pipeline-scheduler
description: 'Scheduled runs of various Beats pipelines per release branch'
links:
- title: 'Scheduled runs of Beats pipelines per release branch'
url: https://buildkite.com/elastic/logstash-pipeline-scheduler
spec:
type: buildkite-pipeline
owner: group:ingest-fp
system: buildkite
implementation:
apiVersion: buildkite.elastic.dev/v1
kind: Pipeline
metadata:
name: beats-pipeline-scheduler
description: ':alarm_clock: Scheduled runs of various Beats pipelines per release branch'
spec:
repository: elastic/beats
pipeline_file: ".buildkite/pipeline-scheduler.yml"
maximum_timeout_in_minutes: 240
schedules:
Daily run of Iron Bank validation:
branch: main
cronline: 30 02 * * *
message: Daily trigger of Iron Bank validation Pipeline per branch
env:
PIPELINES_TO_TRIGGER: 'beats-ironbank-validation'
skip_intermediate_builds: true
provider_settings:
trigger_mode: none
env:
ELASTIC_SLACK_NOTIFICATIONS_ENABLED: 'true'
SLACK_NOTIFICATIONS_CHANNEL: '#ingest-notifications'
SLACK_NOTIFICATIONS_ON_SUCCESS: 'false'
teams:
ingest-fp:
access_level: MANAGE_BUILD_AND_READ
release-eng:
access_level: BUILD_AND_READ
everyone:
access_level: BUILD_AND_READ
>>>>>>> 726f6e9bde (More resilient DRA packaging (#39332))
