feat: Strict label check and replace disable_check_wokflow_job_labels by opt in enable_workflow_job_labels_check (philips-labs#1591)

* Check strict labels

* feat: Replace disable_check_wokflow_job_labels by opt in enable_workflow_job_labels_check, and check all labels.

* Make check strict

* update docs

* cleanup
npalm authored Jan 10, 2022
1 parent 27e974d commit 405b11d
Showing 16 changed files with 233 additions and 124 deletions.
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
@@ -83,7 +83,7 @@ Besides these permissions, the lambdas also need permission to CloudWatch (for l
To be able to support a number of use-cases the module has quite a lot configuration options. We try to choose reasonable defaults. The several examples also shows for the main cases how to configure the runners.

- Org vs Repo level. You can configure the module to connect the runners in GitHub on a org level and share the runners in your org. Or set the runners on repo level. The module will install the runner to the repo. This can be multiple repo's but runners are not shared between repo's.
- Checkrun vs Workflow job event. You can configure the webhook in GitHub to send checkrun or workflow job events to the webhook. Workflow job events are introduced by GitHub in September 2021 and are designed to support scalable runners. We advise when possible to use the workflow job event, you can set `disable_check_wokflow_job_labels = true` to disable the label check.
- Checkrun vs Workflow job event. You can configure the webhook in GitHub to send checkrun or workflow job events to the webhook. Workflow job events were introduced by GitHub in September 2021 and are designed to support scalable runners. We advise using the workflow job event when possible. You can set `runner_enable_workflow_job_labels_check = true` to let the webhook accept only jobs whose labels match the configured labels. The webhook checks the custom labels provided via the variable `runner_extra_labels` and the GitHub managed labels: "self-hosted", OS and architecture. The OS and architecture are derived from the settings. By default the check is disabled.
- Linux vs Windows. You can configure the OS types linux and windows. Linux will be used by default.
- Re-use vs Ephemeral. By default runners are re-used for till detected idle, once idle they will be removed from the pool. To improve security we are introducing ephemeral runners. Those runners are only used for one job. Ephemeral runners are only working in combination with the workflow job event. We also suggest to use a pre-build AMI to improve the start time of jobs.
- GitHub cloud vs GitHub enterprise server (GHES). The runner support GitHub cloud as well GitHub enterprise service. For GHES we rely on our community to test and support. We have no possibility to test ourselves on GHES.
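
The opt-in label check described in the "Checkrun vs Workflow job event" item above can be pictured with a minimal sketch. This is not the webhook's actual code: the helper names `parseLabels` and `jobLabelsAccepted`, the case-insensitive comparison, and the sample label values are assumptions made for illustration. The only detail taken from this commit is that the runner labels arrive as one comma-separated string, which may end in a trailing comma when `runner_extra_labels` is empty.

```typescript
// Hypothetical helper (an assumption, not the module's code): split a
// comma-separated label string, trim each entry and drop empty ones,
// e.g. the empty entry left by a trailing comma.
function parseLabels(csv: string): string[] {
  return csv
    .split(',')
    .map((label) => label.trim().toLowerCase())
    .filter((label) => label.length > 0);
}

// Hypothetical strict check: accept the job only if every label requested by
// the workflow job is among the runner labels. Comparing case-insensitively
// is an assumption of this sketch.
function jobLabelsAccepted(jobLabels: string[], runnerLabelsCsv: string): boolean {
  const runnerLabels = parseLabels(runnerLabelsCsv);
  return jobLabels.every((label) => runnerLabels.indexOf(label.trim().toLowerCase()) !== -1);
}

// Illustrative values; the trailing comma mimics an empty `runner_extra_labels`.
const configured = 'self-hosted,linux,x64,';
console.log(jobLabelsAccepted(['self-hosted', 'linux', 'x64'], configured)); // true
console.log(jobLabelsAccepted(['self-hosted', 'linux', 'gpu'], configured)); // false
```

Note that in this sketch a job with an empty label list is accepted, because `every` returns `true` for an empty array; whether the real webhook behaves that way is not stated here.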
@@ -382,7 +382,6 @@ In case the setup does not work as intended follow the trace of events:
| <a name="input_cloudwatch_config"></a> [cloudwatch\_config](#input\_cloudwatch\_config) | (optional) Replaces the module default cloudwatch log config. See https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html for details. | `string` | `null` | no |
| <a name="input_create_service_linked_role_spot"></a> [create\_service\_linked\_role\_spot](#input\_create\_service\_linked\_role\_spot) | (optional) create the serviced linked role for spot instances that is required by the scale-up lambda. | `bool` | `false` | no |
| <a name="input_delay_webhook_event"></a> [delay\_webhook\_event](#input\_delay\_webhook\_event) | The number of seconds the event accepted by the webhook is invisible on the queue before the scale up lambda will receive the event. | `number` | `30` | no |
| <a name="input_disable_check_wokflow_job_labels"></a> [disable\_check\_wokflow\_job\_labels](#input\_disable\_check\_wokflow\_job\_labels) | Disable the the check of workflow labels for received workflow job events. | `bool` | `false` | no |
| <a name="input_enable_cloudwatch_agent"></a> [enable\_cloudwatch\_agent](#input\_enable\_cloudwatch\_agent) | Enabling the cloudwatch agent on the ec2 runner instances, the runner contains default config. Configuration can be overridden via `cloudwatch_config`. | `bool` | `true` | no |
| <a name="input_enable_ephemeral_runners"></a> [enable\_ephemeral\_runners](#input\_enable\_ephemeral\_runners) | Enable ephemeral runners, runners will only be used once. | `bool` | `false` | no |
| <a name="input_enable_organization_runners"></a> [enable\_organization\_runners](#input\_enable\_organization\_runners) | Register runners to organization, instead of repo level | `bool` | `false` | no |
@@ -426,7 +425,8 @@ In case the setup does not work as intended follow the trace of events:
| <a name="input_runner_boot_time_in_minutes"></a> [runner\_boot\_time\_in\_minutes](#input\_runner\_boot\_time\_in\_minutes) | The minimum time for an EC2 runner to boot and register as a runner. | `number` | `5` | no |
| <a name="input_runner_ec2_tags"></a> [runner\_ec2\_tags](#input\_runner\_ec2\_tags) | Map of tags that will be added to the launch template instance tag specificatons. | `map(string)` | `{}` | no |
| <a name="input_runner_egress_rules"></a> [runner\_egress\_rules](#input\_runner\_egress\_rules) | List of egress rules for the GitHub runner instances. | <pre>list(object({<br> cidr_blocks = list(string)<br> ipv6_cidr_blocks = list(string)<br> prefix_list_ids = list(string)<br> from_port = number<br> protocol = string<br> security_groups = list(string)<br> self = bool<br> to_port = number<br> description = string<br> }))</pre> | <pre>[<br> {<br> "cidr_blocks": [<br> "0.0.0.0/0"<br> ],<br> "description": null,<br> "from_port": 0,<br> "ipv6_cidr_blocks": [<br> "::/0"<br> ],<br> "prefix_list_ids": null,<br> "protocol": "-1",<br> "security_groups": null,<br> "self": null,<br> "to_port": 0<br> }<br>]</pre> | no |
| <a name="input_runner_extra_labels"></a> [runner\_extra\_labels](#input\_runner\_extra\_labels) | Extra labels for the runners (GitHub). Separate each label by a comma | `string` | `""` | no |
| <a name="input_runner_enable_workflow_job_labels_check"></a> [runner\_enable\_workflow\_job\_labels\_check](#input\_runner\_enable\_workflow\_job\_labels\_check) | If set to true all labels in the workflow job event are matched against the custom labels and GitHub labels (os, architecture and `self-hosted`). When the labels do not match, the event is dropped at the webhook. | `bool` | `false` | no |
| <a name="input_runner_extra_labels"></a> [runner\_extra\_labels](#input\_runner\_extra\_labels) | Extra (custom) labels for the runners (GitHub). Separate each label by a comma. Label checks on the webhook can be enforced by setting `runner_enable_workflow_job_labels_check`. GitHub read-only labels should not be provided. | `string` | `""` | no |
| <a name="input_runner_group_name"></a> [runner\_group\_name](#input\_runner\_group\_name) | Name of the runner group. | `string` | `"Default"` | no |
| <a name="input_runner_iam_role_managed_policy_arns"></a> [runner\_iam\_role\_managed\_policy\_arns](#input\_runner\_iam\_role\_managed\_policy\_arns) | Attach AWS or customer-managed IAM policies (by ARN) to the runner IAM role | `list(string)` | `[]` | no |
| <a name="input_runner_log_files"></a> [runner\_log\_files](#input\_runner\_log\_files) | (optional) Replaces the module default cloudwatch log config. See https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html for details. | <pre>list(object({<br> log_group_name = string<br> prefix_log_group = bool<br> file_path = string<br> log_stream_name = string<br> }))</pre> | `null` | no |
11 changes: 7 additions & 4 deletions examples/ephemeral/main.tf
@@ -35,6 +35,9 @@ module "runners" {
enable_organization_runners = true
runner_extra_labels = "default,example"

# enable workflow labels check
# runner_enable_workflow_job_labels_check = true

# enable access to the runners via SSM
enable_ssm_on_runners = true

@@ -55,12 +58,12 @@ module "runners" {
enable_ephemeral_runners = true

# configure your pre-built AMI
# enabled_userdata = false
# ami_filter = { name = ["github-runner-amzn2-x86_64-2021*"] }
# ami_owners = [data.aws_caller_identity.current.account_id]
enabled_userdata = false
ami_filter = { name = ["github-runner-amzn2-x86_64-2021*"] }
ami_owners = [data.aws_caller_identity.current.account_id]

# Enable logging
# log_level = "debug"
log_level = "debug"

# Set up a dead letter queue; by default the scale-up lambda will keep retrying to process an event in case of a scaling error.
# redrive_policy_build_queue = {
2 changes: 1 addition & 1 deletion examples/windows/main.tf
@@ -31,7 +31,7 @@ module "runners" {
runner_extra_labels = "default,example"

# Set the OS to Windows
runner_os = "win"
runner_os = "windows"
# we need to give the runner time to start because this is windows.
runner_boot_time_in_minutes = 20

6 changes: 4 additions & 2 deletions main.tf
@@ -67,8 +67,10 @@ module "webhook" {
lambda_zip = var.webhook_lambda_zip
lambda_timeout = var.webhook_lambda_timeout
logging_retention_in_days = var.logging_retention_in_days
runner_extra_labels = var.runner_extra_labels
disable_check_wokflow_job_labels = var.disable_check_wokflow_job_labels

# labels
enable_workflow_job_labels_check = var.runner_enable_workflow_job_labels_check
runner_labels = "self-hosted,${var.runner_os},${var.runner_architecture},${var.runner_extra_labels}"

role_path = var.role_path
role_permissions_boundary = var.role_permissions_boundary
6 changes: 3 additions & 3 deletions modules/runners/logging.tf
@@ -12,19 +12,19 @@ locals {
{
"log_group_name" : "user_data",
"prefix_log_group" : true,
"file_path" : var.runner_os == "win" ? "C:/UserData.log" : "/var/log/user-data.log",
"file_path" : var.runner_os == "windows" ? "C:/UserData.log" : "/var/log/user-data.log",
"log_stream_name" : "{instance_id}"
},
{
"log_group_name" : "runner",
"prefix_log_group" : true,
"file_path" : var.runner_os == "win" ? "C:/actions-runner/_diag/Runner_*.log" : "/home/runners/actions-runner/_diag/Runner_**.log",
"file_path" : var.runner_os == "windows" ? "C:/actions-runner/_diag/Runner_*.log" : "/home/runners/actions-runner/_diag/Runner_**.log",
"log_stream_name" : "{instance_id}"
},
{
"log_group_name" : "runner-startup",
"prefix_log_group" : true,
"file_path" : var.runner_os == "win" ? "C:/runner-startup.log" : "/var/log/runner-startup.log",
"file_path" : var.runner_os == "windows" ? "C:/runner-startup.log" : "/var/log/runner-startup.log",
"log_stream_name" : "{instance_id}"
}
]
16 changes: 8 additions & 8 deletions modules/runners/main.tf
@@ -16,23 +16,23 @@ locals {
kms_key_arn = var.kms_key_arn != null ? var.kms_key_arn : ""

default_ami = {
"win" = { name = ["Windows_Server-20H2-English-Core-ContainersLatest-*"] }
"linux" = var.runner_architecture == "arm64" ? { name = ["amzn2-ami-hvm-2*-arm64-gp2"] } : { name = ["amzn2-ami-hvm-2.*-x86_64-ebs"] }
"windows" = { name = ["Windows_Server-20H2-English-Core-ContainersLatest-*"] }
"linux" = var.runner_architecture == "arm64" ? { name = ["amzn2-ami-hvm-2*-arm64-gp2"] } : { name = ["amzn2-ami-hvm-2.*-x86_64-ebs"] }
}

default_userdata_template = {
"win" = "${path.module}/templates/user-data.ps1"
"linux" = "${path.module}/templates/user-data.sh"
"windows" = "${path.module}/templates/user-data.ps1"
"linux" = "${path.module}/templates/user-data.sh"
}

userdata_install_runner = {
"win" = "${path.module}/templates/install-runner.ps1"
"linux" = "${path.module}/templates/install-runner.sh"
"windows" = "${path.module}/templates/install-runner.ps1"
"linux" = "${path.module}/templates/install-runner.sh"
}

userdata_start_runner = {
"win" = "${path.module}/templates/start-runner.ps1"
"linux" = "${path.module}/templates/start-runner.sh"
"windows" = "${path.module}/templates/start-runner.ps1"
"linux" = "${path.module}/templates/start-runner.sh"
}

ami_filter = coalesce(var.ami_filter, local.default_ami[var.runner_os])
4 changes: 2 additions & 2 deletions modules/runners/scale-down.tf
@@ -1,8 +1,8 @@
locals {
# Windows Runners can take their sweet time to do anything
min_runtime_defaults = {
"win" = 15
"linux" = 5
"windows" = 15
"linux" = 5
}
}
resource "aws_lambda_function" "scale_down" {
2 changes: 1 addition & 1 deletion modules/runners/variables.tf
@@ -96,7 +96,7 @@ variable "runner_os" {
default = "linux"

validation {
condition = contains(["linux", "win"], var.runner_os)
condition = contains(["linux", "windows"], var.runner_os)
error_message = "Valid values for runner_os are (linux, windows)."
}
}
3 changes: 2 additions & 1 deletion modules/webhook/README.md
@@ -74,6 +74,7 @@ No modules.
|------|-------------|------|---------|:--------:|
| <a name="input_aws_region"></a> [aws\_region](#input\_aws\_region) | AWS region. | `string` | n/a | yes |
| <a name="input_disable_check_wokflow_job_labels"></a> [disable\_check\_wokflow\_job\_labels](#input\_disable\_check\_wokflow\_job\_labels) | Disable the the check of workflow labels. | `bool` | `false` | no |
| <a name="input_enable_workflow_job_labels_check"></a> [enable\_workflow\_job\_labels\_check](#input\_enable\_workflow\_job\_labels\_check) | If set to true all labels in the workflow job event are matched against the custom labels and GitHub labels (os, architecture and `self-hosted`). When the labels do not match, the event is dropped at the webhook. | `bool` | `false` | no |
| <a name="input_environment"></a> [environment](#input\_environment) | A name that identifies the environment, used as prefix and for tagging. | `string` | n/a | yes |
| <a name="input_github_app_webhook_secret_arn"></a> [github\_app\_webhook\_secret\_arn](#input\_github\_app\_webhook\_secret\_arn) | n/a | `string` | n/a | yes |
| <a name="input_kms_key_arn"></a> [kms\_key\_arn](#input\_kms\_key\_arn) | Optional CMK Key ARN to be used for Parameter Store. | `string` | `null` | no |
@@ -86,7 +87,7 @@ No modules.
| <a name="input_repository_white_list"></a> [repository\_white\_list](#input\_repository\_white\_list) | List of repositories allowed to use the github app | `list(string)` | `[]` | no |
| <a name="input_role_path"></a> [role\_path](#input\_role\_path) | The path that will be added to the role; if not set, the environment name will be used. | `string` | `null` | no |
| <a name="input_role_permissions_boundary"></a> [role\_permissions\_boundary](#input\_role\_permissions\_boundary) | Permissions boundary that will be added to the created role for the lambda. | `string` | `null` | no |
| <a name="input_runner_extra_labels"></a> [runner\_extra\_labels](#input\_runner\_extra\_labels) | Extra labels for the runners (GitHub). Separate each label by a comma | `string` | `""` | no |
| <a name="input_runner_labels"></a> [runner\_labels](#input\_runner\_labels) | Labels for the runners (GitHub). Separate each label by a comma. Labels are used to check events when `runner_enable_workflow_job_labels_check` is set to `true`. | `string` | `""` | no |
| <a name="input_sqs_build_queue"></a> [sqs\_build\_queue](#input\_sqs\_build\_queue) | SQS queue to publish accepted build events. | <pre>object({<br> id = string<br> arn = string<br> })</pre> | n/a | yes |
| <a name="input_sqs_build_queue_fifo"></a> [sqs\_build\_queue\_fifo](#input\_sqs\_build\_queue\_fifo) | Enable a FIFO queue to remain the order of events received by the webhook. Suggest to set to true for repo level runners. | `bool` | `false` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | Map of tags that will be added to created resources. By default resources will be tagged with name and environment. | `map(string)` | `{}` | no |
108 changes: 108 additions & 0 deletions modules/webhook/lambdas/webhook/src/lambda.test.ts
@@ -0,0 +1,108 @@
import { APIGatewayEvent, Context } from 'aws-lambda';
import { mocked } from 'ts-jest/utils';
import { githubWebhook } from './lambda';
import { handle } from './webhook/handler';
import { logger } from './webhook/logger';

const event: APIGatewayEvent = {
body: JSON.stringify(''),
headers: { abc: undefined },
httpMethod: '',
isBase64Encoded: false,
multiValueHeaders: { abc: undefined },
multiValueQueryStringParameters: null,
path: '',
pathParameters: null,
queryStringParameters: null,
stageVariables: null,
resource: '',
requestContext: {
authorizer: null,
accountId: '123456789012',
resourceId: '123456',
stage: 'prod',
requestId: 'c6af9ac6-7b61-11e6-9a41-93e8deadbeef',
requestTime: '09/Apr/2015:12:34:56 +0000',
requestTimeEpoch: 1428582896000,
identity: {
cognitoIdentityPoolId: null,
accountId: null,
cognitoIdentityId: null,
caller: null,
accessKey: null,
sourceIp: '127.0.0.1',
cognitoAuthenticationType: null,
cognitoAuthenticationProvider: null,
userArn: null,
userAgent: 'Custom User Agent String',
user: null,
clientCert: null,
apiKey: null,
apiKeyId: null,
principalOrgId: null,
},
path: '/prod/path/to/resource',
resourcePath: '/{proxy+}',
httpMethod: 'POST',
apiId: '1234567890',
protocol: 'HTTP/1.1',
},
};

const context: Context = {
awsRequestId: '1',
callbackWaitsForEmptyEventLoop: false,
functionName: '',
functionVersion: '',
getRemainingTimeInMillis: () => 0,
invokedFunctionArn: '',
logGroupName: '',
logStreamName: '',
memoryLimitInMB: '',
done: () => {
return;
},
fail: () => {
return;
},
succeed: () => {
return;
},
};

jest.mock('./webhook/handler');

describe('Test webhook lambda wrapper.', () => {
it('Happy flow, resolve.', async () => {
const mock = mocked(handle);
mock.mockImplementation(() => {
return new Promise((resolve) => {
resolve({ statusCode: 200 });
});
});

const result = await githubWebhook(event, context);
expect(result).toEqual({ statusCode: 200 });
});

it('An expected error, resolve.', async () => {
const mock = mocked(handle);
mock.mockImplementation(() => {
return new Promise((resolve) => {
resolve({ statusCode: 400 });
});
});

const result = await githubWebhook(event, context);
expect(result).toEqual({ statusCode: 400 });
});

it('Errors are not thrown.', async () => {
const mock = mocked(handle);
const logSpy = jest.spyOn(logger, 'error');
mock.mockRejectedValue(new Error('some error'));
const result = await githubWebhook(event, context);
expect(result).toMatchObject({ statusCode: 500 });
expect(logSpy).toBeCalledTimes(1);
});
});
16 changes: 10 additions & 6 deletions modules/webhook/lambdas/webhook/src/lambda.ts
@@ -6,14 +6,18 @@ export interface Response {
statusCode: number;
body?: string;
}

export const githubWebhook = async (event: APIGatewayEvent, context: Context, callback: Callback): Promise<void> => {
export async function githubWebhook(event: APIGatewayEvent, context: Context): Promise<Response> {
logger.setSettings({ requestId: context.awsRequestId });
logger.debug(JSON.stringify(event));
let result: Response;
try {
const response = await handle(event.headers, event.body as string);
callback(null, response);
result = await handle(event.headers, event.body as string);
} catch (e) {
callback(e as Error);
logger.error(e);
result = {
statusCode: 500,
body: 'Check the Lambda logs for the error details.',
};
}
};
return result;
}
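
The change to `lambda.ts` above replaces the callback with a returned `Response` and turns any thrown error into a 500. A self-contained sketch of that pattern follows. The names `WebhookResponse`, `Handler` and `respondSafely` are hypothetical stand-ins introduced for illustration (the real code uses `handle`, `logger` and the Lambda event types), and `console.error` is used here only in place of the module's logger.

```typescript
// Hypothetical response shape mirroring the statusCode/body pair in the diff.
interface WebhookResponse {
  statusCode: number;
  body?: string;
}

// Hypothetical stand-in for the wrapped webhook handler.
type Handler = () => Promise<WebhookResponse>;

// Never rejects: a failure inside the handler is logged and mapped to a
// generic 500 response, so the caller always receives a response object.
async function respondSafely(handler: Handler): Promise<WebhookResponse> {
  try {
    return await handler();
  } catch (e) {
    console.error(e);
    return { statusCode: 500, body: 'Check the Lambda logs for the error details.' };
  }
}

// A failing handler still yields a response instead of an unhandled error.
respondSafely(() => Promise.reject(new Error('some error'))).then((result) => {
  console.log(result.statusCode); // 500
});
```

Keeping the error details in the logs and returning only a generic body avoids leaking internals to the webhook caller, which matches the message used in the diff.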