Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add github action based check enforcer #3287

Merged
merged 2 commits into from
May 18, 2022

Conversation

benbp
Copy link
Member

@benbp benbp commented May 6, 2022

NOTE Based on feedback, I have moved the main action implementation into its own repository here. This PR now contains a simple yaml file for referencing the external repository action.


This adds a new implementation of check enforcer intended to be run within a github actions context. To see an example of it running for a repository, check out the actions in my fork.

Design Goals and Current Problems:

  • Reduce ongoing maintenance and knowledge costs of the current check enforcer. By using github actions we won't have to maintain a check enforcer service ourselves.
  • Improve check enforcer scaling. While the current check enforcer app can process webhook events much more quickly than github actions (since it does not have to spin up a runner VM per event), it hits a few key limitations:

Details

  • Check enforcer maintenance
  • Check enforcer scaling issues
    • The github API has a secondary rate limit for parallel requests from a single github app identity.
    • check enforcer handles events for all of our repositories.
    • check enforcer operates as part of a check suite, and queries the status of all check runs in the suite every time an individual run completes. When a large changeset is put in a pull request or merged to main, it causes a build storm and therefore a check run storm. Check enforcer not only has to query all check runs for each of these, but github paginates the responses multiplying the high number of requests by around 50 in the worst cases.
    • The above and other behaviors cause big queue backups due to github throttling in extreme cases, which happen with relative frequency (every few weeks) and bring check enforcer to a halt for hours.
  • Github action benefits to scaling
    • When github runs an action, it passes a generated token in the workflow context that can be used for requests. From what I can tell, this token will not have the same "one request at a time" limitations as our current github app identity, but I need to do more load testing from the action to understand when we'll get rate limited.
    • Since actions are local to the repo, even if we hit throttling issues due to build storms (or queueing issues due to event storms), this will only affect a single repo rather than all repos.

Design Decisions

There are a few key design decisions made with this change, that I am eager will spark discussion.

  1. As an organization we are starting to recognize more areas in which github actions can provide value that is not available today via an azure pipelines solution. These primarily center around ease of onboarding, discovery and understanding, as well as much more flexibility around various repository events.
    • As we continue to add github actions, I think it is going to be optimal to provide common platform support from the engsys team. Some ideas I have here are having centralized action configs for each event type, to prevent more than one action being queued for the same event. We can handle all the event details, routing, status checking, logging, testing, etc. and reduce duplicate efforts by providing a framework that developers can build their actions in.
  2. The tool is written in Go. This decision is influenced by a future vision for our usage of github actions as an organization.
    • A top-level design goal should be to reduce the overhead of running the tool as much as possible, as our usage of actions grows (especially if we move more of our internal pipeline workloads into actions).
    • This tool or similar tools are likely to grow in complexity, usage and coverage. Tasks that contribute to setup time could include tool downloading, dependency installation, runtime startup, JIT compiling, etc. Go has a very minimal overhead in this area vs. Python, C#, Powershell, etc.
    • Any approach we choose needs to be easy to pick up by different language teams. Currently we use powershell as a common denominator, but there are many benefits to using a static language with static binaries that powershell cannot offer. Additionally, enforcing powershell quality, conventions and testing standards is extremely difficult today. Go has a much more opinionated and strict approach to how programs are written, which will help with consistency and ease of understanding of programs written across the repositories.

Implementation Details

The program is currently written to take a single argument which is a path to a payload file. The payload is expected to be some form of a github event. Currently, it supports two payloads: check_suite and issue_comment.

Currently I wrote the program with no dependencies (except for local testing) in order to support the fastest possible runtime of a check enforcer action. If we go with this approach and roll it out, and things look good, I plan to design and add support for publishing a tool instead of building and running from source. At this point dependencies won't have as large of a download overhead and we can introduce them. Primarily I plan to replace my homerolled github client with the primary github client for go.

For behavior and UX details, as well as instructions to enable, see the README file in this PR at tools/check-enforcer-actions/README.md. This tool has functional parity with the current check enforcer.

@benbp benbp requested review from weshaggard and scbedd May 6, 2022 23:28
@benbp benbp requested review from a team, AlexGhiondea and jsquire as code owners May 6, 2022 23:28
@check-enforcer-staging
Copy link

This pull request is protected by Check Enforcer.

What is Check Enforcer?

Check Enforcer helps ensure all pull requests are covered by at least one check-run (typically an Azure Pipeline). When all check-runs associated with this pull request pass then Check Enforcer itself will pass.

Why am I getting this message?

You are getting this message because Check Enforcer did not detect any check-runs being associated with this pull request within five minutes. This may indicate that your pull request is not covered by any pipelines and so Check Enforcer is correctly blocking the pull request being merged.

What should I do now?

If the check-enforcer check-run is not passing and all other check-runs associated with this PR are passing (excluding license-cla) then you could try telling Check Enforcer to evaluate your pull request again. You can do this by adding a comment to this pull request as follows:
/check-enforcer evaluate
Typically evaulation only takes a few seconds. If you know that your pull request is not covered by a pipeline and this is expected you can override Check Enforcer using the following command:
/check-enforcer override
Note that using the override command triggers alerts so that follow-up investigations can occur (PRs still need to be approved as normal).

@benbp benbp force-pushed the benbp/check-enforcer-actions-pr branch from bc37562 to ce12eb9 Compare May 7, 2022 00:46
@benbp benbp self-assigned this May 9, 2022
@benbp benbp added the Central-EngSys This issue is owned by the Engineering System team. label May 9, 2022
GITHUB_PAYLOAD: ${{ toJson(github.event) }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
echo "Sparse checkout azure-sdk-tools..."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets see what this looks like without needing to do a sparse checkout. The normal checkout I believe is already shallow by default. However if we can move this to be an action outside of the repo maybe we don't need to explicitly checkout at all because github will handle it.

Copy link
Member

@scbedd scbedd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've walked the go code a few times, and nothing jumps out at me. To be entirely honest I haven't gotten to the nitty-gritty into go code enough to see any hidden bugs here though.

@benbp benbp force-pushed the benbp/check-enforcer-actions-pr branch from e5bb54c to bd3b104 Compare May 16, 2022 20:00
@benbp benbp requested a review from mikeharder as a code owner May 16, 2022 20:00
@benbp benbp force-pushed the benbp/check-enforcer-actions-pr branch from bd3b104 to ac13351 Compare May 16, 2022 20:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Central-EngSys This issue is owned by the Engineering System team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants