feat(scale-down): Update Owner Logic #1065
Can you also give some thought to migration? Your change will most likely use the tags to decide which runners to tear down, and runners that already exist at upgrade time will most likely not have the right tags. Maybe a simple solution would do the trick, such as a small script that adds the tags to existing instances and is executed once after the upgrade.
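The one-off migration suggested above could be sketched roughly as follows. This is only an illustration: the `Ec2Instance` shape, the helper names and the `Owner` tag key used in the usage note are assumptions, not the module's actual types or tag names. The returned object follows the `Resources` / `Tags` shape of the EC2 CreateTags request; actually sending it is left to the caller.

```typescript
// Hypothetical instance shape; the real module's types may differ.
interface Ec2Instance {
  instanceId: string;
  tags: Record<string, string>;
}

// Return the ids of instances that lack at least one of the required tag keys.
function findInstancesMissingTags(instances: Ec2Instance[], requiredKeys: string[]): string[] {
  return instances
    .filter((instance) => requiredKeys.some((key) => !(key in instance.tags)))
    .map((instance) => instance.instanceId);
}

// Build a request body in the shape of the EC2 CreateTags parameters
// (Resources + Tags). Nothing is sent here.
function buildTagRequest(instanceIds: string[], tags: Record<string, string>) {
  return {
    Resources: instanceIds,
    Tags: Object.entries(tags).map(([Key, Value]) => ({ Key, Value })),
  };
}
```

For example, `buildTagRequest(findInstancesMissingTags(existing, ['Owner']), { Owner: 'my-org/my-repo' })` would describe tagging every legacy instance once after the upgrade, where the tag key and value are placeholders.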
Would it be worthwhile to add a function to deal with runners that got pulled in with the environment tag but then do not have any other tags?
@npalm I am marking this ready for review. I am working on some very small code changes to deal with orphaned runners. That will solve the bug I identified and also address your concerns. I will update the PR description with that callout.
@npalm I created mcaulifn#2 as a way to deal with legacy runners. If the approach is good, I'll merge.
@mcaulifn I did a review on mcaulifn#2, only some small comments. If that's merged in here, it looks good to me. Nice work and thanks for improving this project!
* Terminate legacy runners
* Update modules/runners/lambdas/runners/src/scale-runners/scale-down.ts

Co-authored-by: Gertjan Maas <gertjan@maas.codes>

* Move find index to new function
* Removing old comment

Co-authored-by: Gertjan Maas <gertjan@maas.codes>
@gertjanmaas @npalm I think this should be ready. I will be testing it out in our dev environment next week.
@mcaulifn I will check the PR early next week as well. @gertjanmaas thanks for checking.
Co-authored-by: Niek Palm <npalm@users.noreply.github.com>
    }
  } else {
    console.debug(`Runner '${ec2Runner.instanceId}' is orphaned and will be removed.`);
    terminateOrphan(ec2Runner.instanceId);
If the diff shows it correctly, it seems orphan runners are removed if not (runnerMinimumTimeExceeded(ec2Runner, minimumRunningTimeInMinutes)). This will remove all runners that are NOT exceeding the minimal time, so they are potentially still executing the user_data script. Orphan runners are all runners that are not registered in GitHub but have been running in AWS as an EC2 instance for longer than the minimal time.
So removal should be done if NOT ghRunner and the minimal time is exceeded. I suggest you swap the if clauses on lines 113 and 114, then move the terminateOrphan call to the else of the innermost if.
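The orphan definition in this comment depends on how long the instance has been running. A minimal sketch of such a check, assuming the instance exposes a `launchTime` date; `hasRunLongerThan` and `isOrphan` are hypothetical helpers for illustration and are not the project's `runnerMinimumTimeExceeded`.

```typescript
// Hypothetical helper: true once the instance has been up for at least `minutes`.
function hasRunLongerThan(launchTime: Date, minutes: number, now: Date = new Date()): boolean {
  const ageInMs = now.getTime() - launchTime.getTime();
  return ageInMs >= minutes * 60 * 1000;
}

// Orphan as defined in the comment above: not registered in GitHub
// AND running for longer than the minimal time.
function isOrphan(
  registeredInGitHub: boolean,
  launchTime: Date,
  minimumMinutes: number,
  now: Date = new Date(),
): boolean {
  return !registeredInGitHub && hasRunLongerThan(launchTime, minimumMinutes, now);
}
```

Passing `now` explicitly keeps the check deterministic in tests; in production the default of the current time would be used.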
I don't think the diff is rendering correctly. I copied that section below and added two notes.
if (ghRunner) {
  if (runnerMinimumTimeExceeded(ec2Runner, minimumRunningTimeInMinutes)) {
    if (idleCounter > 0) {
      idleCounter--;
      console.debug(`Runner '${ec2Runner.instanceId}' will kept idle.`);
    } else { // If the idle counter is not greater than 0
      await removeRunner(ec2Runner, ghRunner.id);
    }
    // implied else - If the runner min time not exceeded, continue
  }
} else {
  console.debug(`Runner '${ec2Runner.instanceId}' is orphaned and will be removed.`);
  terminateOrphan(ec2Runner.instanceId);
}
I still think the code is not correct. See below: the first part is the code in the PR, the second part is my proposal.
if (ghRunner) { // runner registered in GitHub
  if (runnerMinimumTimeExceeded(ec2Runner, minimumRunningTimeInMinutes)) { // runner is not running minimum time
    if (idleCounter > 0) {
      idleCounter--;
      console.debug(`Runner '${ec2Runner.instanceId}' will kept idle.`);
    } else { // enough idle runners so remove
      await removeRunner(ec2Runner, ghRunner.id);
    }
    // implied else - If the runner min time not exceeded, continue
  }
} else { // runner not registered in GitHub, potentially not exceeding minimal time; current code (dev) has this check
  console.debug(`Runner '${ec2Runner.instanceId}' is orphaned and will be removed.`);
  terminateOrphan(ec2Runner.instanceId);
}
Proposed change: we swap the checks for minimal time and ghRunner, so if the runner is not a ghRunner we can safely remove it once it has exceeded the minimal time required to boot.
if (runnerMinimumTimeExceeded(ec2Runner, minimumRunningTimeInMinutes)) {
  if (ghRunner) {
    if (idleCounter > 0) {
      idleCounter--;
      console.debug(`Runner '${ec2Runner.instanceId}' will kept idle.`);
    } else { // enough idle runners so remove
      await removeRunner(ec2Runner, ghRunner.id);
    }
  } else { // runner not registered
    console.debug(`Runner '${ec2Runner.instanceId}' is orphaned and will be removed.`);
    terminateOrphan(ec2Runner.instanceId);
  }
}
if (runnerMinimumTimeExceeded(ec2Runner, minimumRunningTimeInMinutes)) { // runner is not running minimum time
Is the "not" intended here? I removed the ! so it is now checking if the minimum time has been met.
I realized I missed the comment "So potential still executing the user_data script". That shouldn't be the case, as the default is 5 minutes and should be enough time to register. It might be worth calling out that this value should be the user data execution time plus an additional allowance. In our case, with a 60 minute minimum run time, we don't want an orphaned runner hanging around for that long.
@mcaulifn do you agree with my view on the if / else statements above? I see no change in the code yet.
No, I do not. The updated description covers the not enough time to boot scenario. The intention is to remove orphan runners sooner and not leave them online unnecessarily.
@mcaulifn I looked at the code in your branch and I still believe it is possible to have an EC2 instance booting up, not yet being registered in Github and then being marked as orphan and thus removed.
It might not happen often, because the window for this happening is probably small.
I suggest splitting the functionalities @npalm mentioned above into two variables:
* runner_boot_time_in_minutes: make sure the runner has enough time to boot without it being marked as orphan. Default to 5 mins (or whatever the idle time default is now). This time should be big enough to cover the user_data script and registering the runner to Github.
* minimum_running_time_in_minutes: Revert to previous description.
This way the runners will be marked as orphan faster when a big minimum_running_time_in_minutes is used, but still avoid runners falsely being marked as orphan.
Not sure if this should be a part of this PR.
@gertjanmaas That sounds good to me. Would the check be for runner_boot_time_in_minutes + minimum_running_time_in_minutes? It will need to be separate.
I think it should be a part of this PR as I am changing the logic and that would satisfy both sets of requirements.
OK, I think I have it solved now 😆
If the runner is identified as orphaned, there is now a check to see if the minimum boot time has passed. If not, it is left alone for this run. I added a new test instance with a launchTime of "now" that should replicate that.
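The behavior described here, combined with the two-variable split proposed earlier in the thread, can be sketched as a pure decision function. This is a hedged illustration and not the code merged in the PR: the `Action` type, the `Thresholds` interface and `decideAction` are hypothetical names, and the caller is assumed to decrement its idle counter whenever `'keep-idle'` is returned.

```typescript
type Action = 'skip' | 'keep-idle' | 'remove' | 'terminate-orphan';

// Hypothetical configuration mirroring the two variables discussed in the thread.
interface Thresholds {
  bootTimeInMinutes: number; // time allowed for user_data and registration
  minimumRunningTimeInMinutes: number; // minimum lifetime of a registered runner
}

function decideAction(
  registeredInGitHub: boolean,
  launchTime: Date,
  idleSlotsLeft: number,
  thresholds: Thresholds,
  now: Date = new Date(),
): Action {
  const ageInMinutes = (now.getTime() - launchTime.getTime()) / 60000;

  if (!registeredInGitHub) {
    // Unregistered instances only count as orphans once the boot window has passed;
    // younger ones are left alone for this run.
    return ageInMinutes >= thresholds.bootTimeInMinutes ? 'terminate-orphan' : 'skip';
  }

  // Registered runners are untouched until the minimum running time is exceeded.
  if (ageInMinutes < thresholds.minimumRunningTimeInMinutes) {
    return 'skip';
  }
  return idleSlotsLeft > 0 ? 'keep-idle' : 'remove';
}
```

An unregistered instance whose `launchTime` equals `now`, like the test instance mentioned above, yields `'skip'`, while the same instance ten minutes later with a five minute boot window yields `'terminate-orphan'` even if the minimum running time is 60 minutes.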
@npalm I deployed locally and I see a legacy runner and new runners being handled correctly.
@gertjanmaas Anything else on this one?
Looks good to me, tested the default example and works as expected. @npalm do you want to take a look at it? |
@gertjanmaas thanks for checking, please feel free to merge |
## [0.19.0](v0.18.1...v0.19.0) (2021-09-30)

### Features

* **scale-down:** Update Owner Logic ([#1065](#1065)) ([ba2536b](ba2536b)), closes [#2](#2)

### Bug Fixes

* explicit set region for downloading runner distribution from S3 ([#1204](#1204)) ([439fb1b](439fb1b))
* upgrade jest ([#1219](#1219)) ([c8b8139](c8b8139))
* use dynamic block to ignore null market opts ([#1202](#1202)) ([df9bd78](df9bd78))
* use dynamic block to ignore null market opts ([#1202](#1202)) ([06a5598](06a5598))
* **logging:** Additional Logging ([#1135](#1135)) ([f7f194d](f7f194d))
* **scale-down:** Clearing cache between runs ([#1164](#1164)) ([e72227b](e72227b))
## [0.17.0-develop.1](v0.16.0...v0.17.0-develop.1) (2022-03-25) ### Features * Add associate_public_ip_address variable to windows AMI too ([philips-labs#1819](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1819)) ([0b8e1fc](0b8e1fc)), closes [/github.com/philips-labs/pull/1816#issuecomment-1060650668](https://github.com/enverus-cts//github.com/philips-labs/terraform-aws-github-runner/pull/1816/issues/issuecomment-1060650668) * Add associate_public_ip_address variable ([philips-labs#1816](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1816)) ([052e9f8](052e9f8)) * add format checking for lambdas in CI ([#899](#899)) ([#1080](#1080)) ([ae9c277](ae9c277)) * Add hooks for prebuilt images (AMI), including amazon linux packer example ([philips-labs#1444](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1444)) ([060daac](060daac)) * add option ephemeral runners ([philips-labs#1374](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1374)) ([2f323d6](2f323d6)), closes [philips-labs#1399](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1399) [philips-labs#1444](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1444) * Add option for ephemeral to check builds status before scaling ([philips-labs#1854](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1854)) ([7eb0bda](7eb0bda)) * Add option for KMS encryption for cloudwatch log groups ([philips-labs#1833](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1833)) ([3f1a67f](3f1a67f)) * Add option to configure concurrent running scale up lambda ([philips-labs#1415](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1415)) ([23ee630](23ee630)) * Add option to disable SSL verification support for GitHub Enterprise Server ([#1216](#1216)) ([3c3ef19](3c3ef19)), closes [#1207](#1207) * add option to format logging in JSON for lambdas ([#1228](#1228)) ([a250b96](a250b96)) * add option 
to overwrite / disable egress [#748](#748) ([#1112](#1112)) ([9c2548d](9c2548d)) * add option to specify SSE config for dist bucket ([philips-labs#1324](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1324)) ([ae84302](ae84302)) * Add output image id used in launch template ([philips-labs#1676](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1676)) ([a49fab4](a49fab4)) * Add possibility to create multiple ebs ([philips-labs#1845](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1845)) ([7a2ca0d](7a2ca0d)) * Add scheduled / pull based scaling for org level runners ([philips-labs#1577](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1577)) ([8197432](8197432)) * Add SQS queue resource policy to improve security ([philips-labs#1798](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1798)) ([96def9a](96def9a)) * Add Support for Alternative Partitions in ARNs (like govcloud) ([philips-labs#1815](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1815)) ([0ba06c8](0ba06c8)) * Add variable to specify custom commands while building the AMI ([philips-labs#1838](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1838)) ([8f9c342](8f9c342)) * add windows support ([philips-labs#1476](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1476)) ([dbba705](dbba705)) * adding message retention seconds ([philips-labs#1354](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1354)) ([a19929f](a19929f)) * Adding support for new workflow_job event. 
([#1019](#1019)) ([a74e10b](a74e10b)) * adding var for tags for ec2s ([philips-labs#1357](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1357)) ([31cf02d](31cf02d)) * Change default location of runner to `/opt` and fix Ubuntu example ([philips-labs#1572](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1572)) ([77f350b](77f350b)) * Ignore github managed labels and add check disable option ([#1244](#1244)) ([859fa38](859fa38)) * **images:** Added ubuntu-focual example packer configuration ([philips-labs#1644](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1644)) ([997b171](997b171)) * **packer:** add vars and minor clean up ([philips-labs#1611](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1611)) ([1c897a4](1c897a4)) * Parameterise delete_on_termination ([philips-labs#1758](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1758)) ([6282351](6282351)), closes [philips-labs#1745](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1745) * remove unused app client since SSH key is used to secure app authorization ([#1223](#1223)) ([4cb5cf1](4cb5cf1)) * Replace run instance API by create fleet API ([philips-labs#1556](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1556)) ([27e974d](27e974d)) * **runner:** Ability to disable default runner security group creation ([philips-labs#1718](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1718)) ([94779f8](94779f8)) * **runner:** Add option to disable auto update ([philips-labs#1791](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1791)) ([c2a834f](c2a834f)) * **runner:** Replace patch by install ICU package for ARM runners ([philips-labs#1624](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1624)) ([74cfa51](74cfa51)) * **scale-down:** Update Owner Logic ([#1065](#1065)) ([ba2536b](ba2536b)), closes [#2](#2) * Strict label check and replace 
disable_check_wokflow_job_labels by opt in enable_workflow_job_labels_check ([philips-labs#1591](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1591)) ([405b11d](405b11d)) * support single line for app private key ([philips-labs#1368](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1368)) ([14183ac](14183ac)) * Support t4g Graviton instance type ([philips-labs#1561](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1561)) ([3fa5896](3fa5896)) * upgrade Terraform version of module 1.0.x ([philips-labs#1254](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1254)) ([2a817dc](2a817dc)) ### Bug Fixes * `instance_types` from a Set to a List, so instance order preference is preserved ([#1154](#1154)) ([150d227](150d227)) * add --preserve-env to start-runner.sh to enable RUNNER_ALLOW_RUNASROOT ([philips-labs#1537](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1537)) ([1cd9cd3](1cd9cd3)) * Add config for windows ami ([philips-labs#1525](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1525)) ([7907984](7907984)) * add logging context to runner lambda ([philips-labs#1399](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1399)) ([0ba0930](0ba0930)) * Add required providers to module ssm ([philips-labs#1423](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1423)) ([5b68b7b](5b68b7b)) * add runners binaries bucket as terraform output ([5809fee](5809fee)) * add validation to distribution_bucket_name variable ([philips-labs#1356](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1356)) ([6522317](6522317)) * added more detailed logging for scaling up and down ([#1222](#1222)) ([9aa7456](9aa7456)) * Autoupdate should be disabled by default ([philips-labs#1797](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1797)) ([828bed6](828bed6)) * clean up non used variables in examples 
([philips-labs#1416](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1416)) ([fe65a5f](fe65a5f)) * configurable metadata options for runners ([philips-labs#1377](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1377)) ([f37df23](f37df23)) * Create SQS DLQ policy only if DLQ is created ([philips-labs#1839](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1839)) ([c88a005](c88a005)) * Don't delete busy runners ([philips-labs#1832](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1832)) ([0e9b083](0e9b083)) * Dowload lambda see [philips-labs#1541](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1541) for details. ([philips-labs#1542](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1542)) ([7cb73c8](7cb73c8)) * Download lambda ([philips-labs#1480](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1480)) ([f1b99d9](f1b99d9)) * **examples:** Update AMI filter ([philips-labs#1673](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1673)) ([39c019c](39c019c)) * explicit set region for downloading runner distribution from S3 ([#1204](#1204)) ([439fb1b](439fb1b)) * **images:** use new runner install location ([philips-labs#1628](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1628)) ([36c1bf5](36c1bf5)) * install_config_runner -> install_runner ([philips-labs#1479](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1479)) ([de5b93f](de5b93f)) * Limit AWS Terraform Provider to 3.* ([philips-labs#1741](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1741)) ([0cf2b5d](0cf2b5d)) * **logging:** Add context to webhook logs ([philips-labs#1401](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1401)) ([8094576](8094576)) * **logging:** Additional Logging ([#1135](#1135)) ([f7f194d](f7f194d)) * **logging:** Adjusting scale logging messages and levels 
([philips-labs#1286](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1286)) ([665e1a6](665e1a6)) * **logging:** Adjusting webhook logs and levels ([philips-labs#1287](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1287)) ([9df5fb8](9df5fb8)) * **packer:** Add missing RUNNER_ARCHITECTURE for amazn-linux2 ([philips-labs#1647](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1647)) ([ec497a2](ec497a2)) * reducing verbosity of role and profile ([philips-labs#1358](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1358)) ([922ef99](922ef99)) * remove export from install script. ([philips-labs#1538](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1538)) ([d32ca1b](d32ca1b)) * replace depcrated 'request' dependency by 'node-fetch' ([#903](#903)) ([#1082](#1082)) ([fb51756](fb51756)) * Retention days was used instead of kms key id for pool ([philips-labs#1855](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1855)) ([aa29d93](aa29d93)) * **runner:** Cannot disable cloudwatch agent ([philips-labs#1738](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1738)) ([0f798ca](0f798ca)) * **runnrs:** Pool runners to allow multiple pool_config objects ([philips-labs#1621](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1621)) ([c9c7c69](c9c7c69)) * **scale-down:** Clearing cache between runs ([#1164](#1164)) ([e72227b](e72227b)) * **syncer:** Add tests, coverage report, and refactor lambda / naming ([philips-labs#1478](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1478)) ([8266442](8266442)) * **syncer:** Fix for windows binaries in action runner syncer ([philips-labs#1716](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1716)) ([63e0e27](63e0e27)) * Update launch template to use metadata service v2 ([philips-labs#1278](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1278)) 
([ef16287](ef16287)) * update return codes, no error code for job that are ignored ([philips-labs#1381](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1381)) ([f9f705f](f9f705f)) * Upgrade Amazon base AMI to Amazon Linux 2 kernel 5x ([philips-labs#1812](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1812)) ([9aa5532](9aa5532)) * upgrade jest ([#1219](#1219)) ([c8b8139](c8b8139)) * Upgrade lambda runtime to node 14.x ([#1203](#1203)) ([570949a](570949a)) * use dynamic block to ignore null market opts ([#1202](#1202)) ([df9bd78](df9bd78)) * use dynamic block to ignore null market opts ([#1202](#1202)) ([06a5598](06a5598)) * webhook labels for `workflow_job` ([#1133](#1133)) ([4b39fb9](4b39fb9)) * **webhook:** depcrated warning on ts-jest mocked ([philips-labs#1615](https://github.com/enverus-cts/terraform-aws-github-runner/issues/1615)) ([56c1ece](56c1ece)) * **webhook:** remove node fetch ([ca14ac5](ca14ac5)) * **webhook:** replace node-fetch by axios [#1247](#1247) ([80fff4b](80fff4b))
* chore(release): 0.17.0 [skip ci] * Adding support for new workflow_job event. ([#1019](#1019)) ([a74e10b](a74e10b)) * chore(release): 0.18.0 [skip ci] * add format checking for lambdas in CI ([#899](#899)) ([#1080](#1080)) ([ae9c277](ae9c277)) * add option to overwrite / disable egress [#748](#748) ([#1112](#1112)) ([9c2548d](9c2548d)) * replace depcrated 'request' dependency by 'node-fetch' ([#903](#903)) ([#1082](#1082)) ([fb51756](fb51756)) * chore(release): 0.18.1 [skip ci] * webhook labels for `workflow_job` ([#1133](#1133)) ([4b39fb9](4b39fb9)) * chore(release): 0.19.0 [skip ci] * **scale-down:** Update Owner Logic ([#1065](#1065)) ([ba2536b](ba2536b)), closes [#2](#2) * explicit set region for downloading runner distribution from S3 ([#1204](#1204)) ([439fb1b](439fb1b)) * upgrade jest ([#1219](#1219)) ([c8b8139](c8b8139)) * use dynamic block to ignore null market opts ([#1202](#1202)) ([df9bd78](df9bd78)) * use dynamic block to ignore null market opts ([#1202](#1202)) ([06a5598](06a5598)) * **logging:** Additional Logging ([#1135](#1135)) ([f7f194d](f7f194d)) * **scale-down:** Clearing cache between runs ([#1164](#1164)) ([e72227b](e72227b)) * chore(release): 0.19.1 [skip ci] * `instance_types` from a Set to a List, so instance order preference is preserved ([#1154](#1154)) ([150d227](150d227)) * chore(release): 0.20.0 [skip ci] * Add option to disable SSL verification support for GitHub Enterprise Server ([#1216](#1216)) ([3c3ef19](3c3ef19)), closes [#1207](#1207) * chore(release): 0.20.1 [skip ci] * Upgrade lambda runtime to node 14.x ([#1203](#1203)) ([570949a](570949a)) * **webhook:** remove node fetch ([ca14ac5](ca14ac5)) * **webhook:** replace node-fetch by axios [#1247](#1247) ([80fff4b](80fff4b)) * added more detailed logging for scaling up and down ([#1222](#1222)) ([9aa7456](9aa7456)) * chore(release): 0.21.0 [skip ci] * Ignore github managed labels and add check disable option ([#1244](#1244)) ([859fa38](859fa38)) * remove unused app client 
since SSH key is used to secure app authorization ([#1223](#1223)) ([4cb5cf1](4cb5cf1)) * upgrade Terraform version of module 1.0.x ([#1254](#1254)) ([2a817dc](2a817dc)) * chore(release): 0.21.1 [skip ci] * **logging:** Adjusting scale logging messages and levels ([#1286](#1286)) ([665e1a6](665e1a6)) * **logging:** Adjusting webhook logs and levels ([#1287](#1287)) ([9df5fb8](9df5fb8)) * Update launch template to use metadata service v2 ([#1278](#1278)) ([ef16287](ef16287)) * chore(release): 0.22.0 [skip ci] * adding message retention seconds ([#1354](#1354)) ([a19929f](a19929f)) * adding var for tags for ec2s ([#1357](#1357)) ([31cf02d](31cf02d)) * add validation to distribution_bucket_name variable ([#1356](#1356)) ([6522317](6522317)) * chore(release): 0.23.0 [skip ci] * add option to format logging in JSON for lambdas ([#1228](#1228)) ([a250b96](a250b96)) * add option to specify SSE config for dist bucket ([#1324](#1324)) ([ae84302](ae84302)) * reducing verbosity of role and profile ([#1358](#1358)) ([922ef99](922ef99)) * chore(release): 0.23.1 [skip ci] * configurable metadata options for runners ([#1377](#1377)) ([f37df23](f37df23)) * chore(release): 0.24.0 [skip ci] * support single line for app private key ([#1368](#1368)) ([14183ac](14183ac)) * update return codes, no error code for job that are ignored ([#1381](#1381)) ([f9f705f](f9f705f)) * chore(release): 0.25.0 [skip ci] * Add option to configure concurrent running scale up lambda ([#1415](#1415)) ([23ee630](23ee630)) * clean up non used variables in examples ([#1416](#1416)) ([fe65a5f](fe65a5f)) * chore(release): 0.25.1 [skip ci] * Add required providers to module ssm ([#1423](#1423)) ([5b68b7b](5b68b7b)) * chore(release): 0.25.2 [skip ci] * add logging context to runner lambda ([#1399](#1399)) ([0ba0930](0ba0930)) * **logging:** Add context to webhook logs ([#1401](#1401)) ([8094576](8094576)) * chore(release): 0.26.0 [skip ci] * Add hooks for prebuilt images (AMI), including amazon linux packer 
example ([#1444](#1444)) ([060daac](060daac)) * add runners binaries bucket as terraform output ([5809fee](5809fee)) * chore(release): 0.26.1 [skip ci] * Download lambda ([#1480](#1480)) ([f1b99d9](f1b99d9)) * **syncer:** Add tests, coverage report, and refactor lambda / naming ([#1478](#1478)) ([8266442](8266442)) * install_config_runner -> install_runner ([#1479](#1479)) ([de5b93f](de5b93f)) * chore(release): 0.27.0 [skip ci] * add windows support ([#1476](#1476)) ([dbba705](dbba705)) * chore(release): 0.27.1 [skip ci] * add --preserve-env to start-runner.sh to enable RUNNER_ALLOW_RUNASROOT ([#1537](#1537)) ([1cd9cd3](1cd9cd3)) * remove export from install script. ([#1538](#1538)) ([d32ca1b](d32ca1b)) * chore(release): 0.27.2 [skip ci] * Dowload lambda see [#1541](#1541) for details. ([#1542](#1542)) ([7cb73c8](7cb73c8)) * chore(release): 0.28.0 [skip ci] * add option ephemeral runners ([#1374](#1374)) ([2f323d6](2f323d6)), closes [#1399](#1399) [#1444](#1444) * Change default location of runner to `/opt` and fix Ubuntu example ([#1572](#1572)) ([77f350b](77f350b)) * Replace run instance API by create fleet API ([#1556](#1556)) ([27e974d](27e974d)) * Support t4g Graviton instance type ([#1561](#1561)) ([3fa5896](3fa5896)) * Add config for windows ami ([#1525](#1525)) ([7907984](7907984)) * chore(release): 0.29.0 [skip ci] * Strict label check and replace disable_check_wokflow_job_labels by opt in enable_workflow_job_labels_check ([#1591](#1591)) ([405b11d](405b11d)) * chore(release): 0.30.0 [skip ci] * Add scheduled / pull based scaling for org level runners ([#1577](#1577)) ([8197432](8197432)) * chore(release): 0.30.1 [skip ci] * **runnrs:** Pool runners to allow multiple pool_config objects ([#1621](#1621)) ([c9c7c69](c9c7c69)) * chore(release): 0.31.0 [skip ci] * **packer:** add vars and minor clean up ([#1611](#1611)) ([1c897a4](1c897a4)) * **webhook:** depcrated warning on ts-jest mocked ([#1615](#1615)) ([56c1ece](56c1ece)) * chore(release): 0.32.0 [skip 
ci] * **runner:** Replace patch by install ICU package for ARM runners ([#1624](#1624)) ([74cfa51](74cfa51)) * **images:** use new runner install location ([#1628](#1628)) ([36c1bf5](36c1bf5)) * **packer:** Add missing RUNNER_ARCHITECTURE for amazn-linux2 ([#1647](#1647)) ([ec497a2](ec497a2)) * chore(release): 0.33.0 [skip ci] * **images:** Added ubuntu-focual example packer configuration ([#1644](#1644)) ([997b171](997b171)) * **examples:** Update AMI filter ([#1673](#1673)) ([39c019c](39c019c)) * chore(release): 0.34.0 [skip ci] * Add output image id used in launch template ([#1676](#1676)) ([a49fab4](a49fab4)) * chore(release): 0.34.1 [skip ci] * **syncer:** Fix for windows binaries in action runner syncer ([#1716](#1716)) ([63e0e27](63e0e27)) * chore(release): 0.34.2 [skip ci] * Limit AWS Terraform Provider to 3.* ([#1741](#1741)) ([0cf2b5d](0cf2b5d)) * **runner:** Cannot disable cloudwatch agent ([#1738](#1738)) ([0f798ca](0f798ca)) * chore(release): 0.35.0 [skip ci] * Parameterise delete_on_termination ([#1758](#1758)) ([6282351](6282351)), closes [#1745](#1745) * **runner:** Ability to disable default runner security group creation ([#1718](#1718)) ([94779f8](94779f8)) * chore(release): 0.36.0 [skip ci] * **runner:** Add option to disable auto update ([#1791](#1791)) ([c2a834f](c2a834f)) * chore(release): 0.37.0 [skip ci] * Add associate_public_ip_address variable to windows AMI too ([#1819](#1819)) ([0b8e1fc](0b8e1fc)), closes [/github.com//pull/1816#issuecomment-1060650668](https://github.com/philips-labs//github.com/philips-labs/terraform-aws-github-runner/pull/1816/issues/issuecomment-1060650668) * Add associate_public_ip_address variable ([#1816](#1816)) ([052e9f8](052e9f8)) * Add option for KMS encryption for cloudwatch log groups ([#1833](#1833)) ([3f1a67f](3f1a67f)) * Add SQS queue resource policy to improve security ([#1798](#1798)) ([96def9a](96def9a)) * Add Support for Alternative Partitions in ARNs (like govcloud) ([#1815](#1815)) 
([0ba06c8](0ba06c8))
* Add variable to specify custom commands while building the AMI ([#1838](#1838)) ([8f9c342](8f9c342))
* Autoupdate should be disabled by default ([#1797](#1797)) ([828bed6](828bed6))
* Create SQS DLQ policy only if DLQ is created ([#1839](#1839)) ([c88a005](c88a005))
* Upgrade Amazon base AMI to Amazon Linux 2 kernel 5x ([#1812](#1812)) ([9aa5532](9aa5532))
* chore(release): 0.38.0 [skip ci]
* Add option for ephemeral to check builds status before scaling ([#1854](#1854)) ([7eb0bda](7eb0bda))
* Retention days was used instead of kms key id for pool ([#1855](#1855)) ([aa29d93](aa29d93))
* chore(release): 0.39.0 [skip ci]
* Add possibility to create multiple ebs ([#1845](#1845)) ([7a2ca0d](7a2ca0d))
* Don't delete busy runners ([#1832](#1832)) ([0e9b083](0e9b083))
* chore(release): 0.40.0 [skip ci]
* Support multi runner process support for runner scale down. ([#1859](#1859)) ([3658d6a](3658d6a))
* Set the minimal AWS provider to 3.50 ([#1937](#1937)) ([16095d8](16095d8))
* chore(release): 0.40.1 [skip ci]
* Avoid non semantic commontes can be merged. ([#1969](#1969)) ([ad1c872](ad1c872))
* chore(release): 0.40.2 [skip ci]
* Outputs for pool need to account for complexity ([#1970](#1970)) ([2d92906](2d92906))
* chore(release): 0.40.3 [skip ci]
* Volume size is ingored ([#2014](#2014)) ([b733248](b733248)), closes [#1954](#1954)
* chore(release): 0.40.4 [skip ci]
* Wrong block device mapping ([#2019](#2019)) ([c42a467](c42a467))
* chore(release): 1.0.0 [skip ci]
* var.volume_size replaced by var.block_device_mappings
* The module is upgraded to AWS Terraform provider 4.x
* Improve syncer s3 kms encryption ([38ed5be](38ed5be))
* Remove var.volume_size in favour of var.block_device_mappings ([4e97048](4e97048))
* Support AWS 4.x Terraform provider ([#1739](#1739)) ([cfb6da2](cfb6da2))
* Wrong block device mapping ([#2019](#2019)) ([185ef20](185ef20))
* chore(release): 1.1.0 [skip ci]
* Add option to enable detailed monitoring for runner launch template ([#2024](#2024)) ([e73a267](e73a267))
* chore(release): 1.1.1 [skip ci]
* **runner:** Don't treat the string "false" as true. ([#2051](#2051)) ([b67c7dc](b67c7dc))
* chore(release): 1.2.0 [skip ci]
* Replace environment variable by prefix ([#1858](#1858)) ([e2f9a27](e2f9a27))
* docs: fix hyperlinks in the Terraform Registry documentation (#2085) This makes the hyperlink correct in the Terraform Registry documentation
* chore(release): 1.3.0 [skip ci]
* Support arm64 lambda functions ([#2121](#2121)) ([9e2a7b6](9e2a7b6))
* Support Node16 for AWS Lambda ([#2073](#2073)) ([68a2014](68a2014))
* replaced old environment variable ([#2146](#2146)) ([f2072f7](f2072f7))
* set explicit permissions on s3 for syncer lambda ([#2145](#2145)) ([aa7edd1](aa7edd1))
* set kms key on aws_s3_object when encryption is enabled ([#2147](#2147)) ([b4dc706](b4dc706))
* chore(release): 1.4.0 [skip ci]
* Add option to match some of the labes instead of all [#2122](#2122) ([#2123](#2123)) ([c5e3c21](c5e3c21))
* don't apply extra labels unless defined ([#2181](#2181)) ([c0b11bb](c0b11bb))
* Remove asterik in permission for runner lambda to describe instances ([9b9da03](9b9da03))
* chore(release): 1.4.1 [skip ci]
* added server_side_encryption key to download trigger for distribution ([#2207](#2207)) ([404e3b6](404e3b6))
* chore(release): 1.5.0 [skip ci]
* Add ubuntu-jammy example image based on existing ubuntu-focal ([#2102](#2102)) ([486ae91](486ae91))
* **images:** avoid wrong AMI could be selected for ubuntu focal ([#2214](#2214)) ([76be94b](76be94b))
* chore(release): 1.6.0 [skip ci]
* Add options extra option to ebs block device mapping ([#2052](#2052)) ([7cd2524](7cd2524))
* Enable node16 default ([#2074](#2074)) ([58aa5ed](58aa5ed))
* Incorrect path of Runner logs ([#2233](#2233)) ([98eff98](98eff98))
* Preventing that lambda webhook fails when it tries to process an installation_repositories event ([#2288](#2288)) ([8656c83](8656c83))
* Update ubuntu example to fix /opt/hostedtoolcache ([#2302](#2302)) ([8eea748](8eea748))
* Webhook lambda misleading log ([#2291](#2291)) ([c6275f9](c6275f9))
* chore(release): 1.7.0 [skip ci]
* Webhook accept jobs where not all labels are provided in job. ([#2209](#2209)) ([6d9116f](6d9116f))
* Ignore case for runner labels. ([#2315](#2315)) ([014985a](014985a))
* chore(release): 1.8.0 [skip ci]
* Add option to disable lambda to sync runner binaries ([#2314](#2314)) ([9f7d32d](9f7d32d))
* **examples:** Upgrading ubuntu example to 22.04 ([#2250](#2250)) ([d4b7650](d4b7650)), closes [#2103](#2103)
* chore(release): 1.8.1 [skip ci]
* **runners:** Pass allocation strategy ([#2345](#2345)) ([68d3445](68d3445))
* chore(release): 1.9.0 [skip ci]
* Add option to enable access log for API gateway ([#2387](#2387)) ([fcd9fba](fcd9fba))
* add s3_location_runner_distribution var as expandable for userdata ([#2371](#2371)) ([05fe737](05fe737))
* Encrypted data at REST on SQS by default ([#2431](#2431)) ([7f3f4bf](7f3f4bf))
* **images:** Allow passing instance type when building windows image ([#2369](#2369)) ([eca23bf](eca23bf))
* **runners:** Fetch instance environment tag though metadata ([#2346](#2346)) ([27db290](27db290))
* **runners:** Set the default Windows AMI to Server 2022 ([#2325](#2325)) ([78e99d1](78e99d1))
* chore(release): 1.9.1 [skip ci]
* **webhook:** Use `x-hub-signature-256` header as default ([#2434](#2434)) ([9c3e495](9c3e495))
* chore(release): 1.10.0 [skip ci]
* Download runner release via latest release API ([#2455](#2455)) ([e75e092](e75e092))
* fix: Execute runner in own process, mask token in logs
* Add option to disable user_data logging
* Enforcing debug is disabled, and introduce option to enable debug logging.
* add section related to security considerations
* add section related to security considerations

Co-authored-by: semantic-release-bot <semantic-release-bot@martynus.net>
Co-authored-by: Derek Crosson <derekcrosson18@gmail.com>
`tag = type` and `tag = owner` instead of `tag = type` and `value = owner`
NOTE: Runners that do not contain the new tags will be terminated when the scale down lambda runs.
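To make the note above concrete, here is a minimal sketch of how runners missing the new tags could be separated from properly tagged ones before scale-down. It is not the code in `scale-down.ts`: the `RunnerInstance` shape, the `partitionRunners` helper, and the tag keys `Type` and `Owner` are all assumptions made for illustration.

```typescript
// Hypothetical shape of a runner instance; not the module's actual type.
interface RunnerInstance {
  instanceId: string;
  tags: Record<string, string>;
}

// Assumed tag keys, chosen for illustration only.
const TYPE_TAG = 'Type';
const OWNER_TAG = 'Owner';

// Split runners into those carrying both tags and legacy ones missing either.
function partitionRunners(runners: RunnerInstance[]): {
  tagged: RunnerInstance[];
  legacy: RunnerInstance[];
} {
  const tagged: RunnerInstance[] = [];
  const legacy: RunnerInstance[] = [];
  for (const runner of runners) {
    const hasBoth =
      Boolean(runner.tags[TYPE_TAG]) && Boolean(runner.tags[OWNER_TAG]);
    (hasBoth ? tagged : legacy).push(runner);
  }
  return { tagged, legacy };
}

// Example: the second instance has no owner tag, so it is treated as legacy.
const result = partitionRunners([
  { instanceId: 'i-aaa', tags: { Type: 'Org', Owner: 'my-org' } },
  { instanceId: 'i-bbb', tags: { Type: 'Repo' } },
]);
console.log(result.legacy.map((r) => r.instanceId));
```

Under these assumptions, the `legacy` list corresponds to the runners the note says will be terminated on the next scale-down run, so tagging existing instances before upgrading would keep them out of it.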
Resolves #1021