Optimize git fetch-depth #296

korthout · 2022-11-22T21:57:13Z

This PR improves the performance of the action's git fetches. For repositories with large histories, this can dramatically decrease the action's execution time.

Note
Users do not have to adjust anything to benefit from this change unless they have configured actions/checkout's fetch-depth to anything other than the default: fetch-depth: 1

closes #267
closes #269

Motivation

Since #162 this action supports shallow clones, which is the default when cloning a repo using actions/checkout. This means that the action fetches all the necessary git refs itself.

For repositories with large git histories (i.e. containing many commits), this increased the execution time of the action dramatically. The action would spend most of its time fetching this history, apparently also fetching common parts of the history multiple times.

What has changed?

A core mechanism had to be adjusted to reduce the number of commits that have to be fetched. This action previously determined the commits to cherry-pick by looking for the merge base between the pull request's head and base refs. However, the number of commits added to the pull request's base (i.e. the PR target) since the PR's head was branched from it is unknown in advance. So no fixed fetch depth is guaranteed to reach the merge base without fetching more than necessary.

As an alternative, this PR uses the GitHub API to find these same commits. This reduces the depth of the fetch used to discover all the commits to cherry-pick to n+1, where n is the number of commits on the pull request. An additional fetch with depth 1 is still required for each backport target branch. This means that the fetch depth is now optimal.
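The fetch-depth arithmetic described above can be sketched as follows. This is a minimal illustration, not the action's actual code: the function names `fetchDepthForPullRequest` and `fetchCommands`, the remote name `origin`, the use of `refs/pull/<number>/head` as the ref to fetch, and the branch names in the example are all assumptions. The commit count is expected to come from the GitHub API.

```typescript
// Hypothetical helper: the depth described above, i.e. the n commits on the
// pull request plus one.
function fetchDepthForPullRequest(commitCount: number): number {
  if (!Number.isInteger(commitCount) || commitCount < 1) {
    throw new Error(`expected a positive commit count, got ${commitCount}`);
  }
  return commitCount + 1;
}

// Builds the git commands for one backport run: a single fetch of the pull
// request head at depth n + 1, then one depth-1 fetch per target branch.
// The remote name and ref layout are assumptions for this sketch.
function fetchCommands(
  pullNumber: number,
  commitCount: number,
  targetBranches: string[],
): string[] {
  const depth = fetchDepthForPullRequest(commitCount);
  const result = [
    `git fetch --depth=${depth} origin refs/pull/${pullNumber}/head`,
  ];
  for (const branch of targetBranches) {
    result.push(`git fetch --depth=1 origin ${branch}`);
  }
  return result;
}

// Example: a pull request with 3 commits, backported to two branches.
const exampleCommands = fetchCommands(296, 3, ["stable/1.0", "stable/2.0"]);
console.log(exampleCommands.join("\n"));
```

The number of commits fetched then depends only on the size of the pull request and the number of target branches, not on how far the base branch has moved since the pull request was branched from it.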

What about those removed tests?

Many of the tests in this project have been bothering me for a long time. They've restricted progress much more than they've helped to make it safe to change things. This became clear to me once more when I wanted to adjust the core of this action. So, I've removed many of the tests.

To ensure the action works correctly, I've mostly switched to using korthout/backport-action-test. That repo verifies the main functionality but does not cover all scenarios yet. I plan to expand these in the future. I also plan to re-introduce fast-running unit tests at a later time. But now, I do not see any reason to keep any of the deleted tests.

The base ref was necessary to determine which commits exist on the pull request. The common ancestor of the base ref and the pull request head could be found using git merge-base. However, many commits may exist between the base ref and this common ancestor. In some cases this could lead to a slow fetch of the base ref, because it requires a deep fetch.

Instead of determining the commits on the pull request using git only, we can simply ask GitHub about the commits on the pull request. GitHub can provide the number of commits, and there is an API to retrieve data about the commits in a paged way. Sadly this is limited to 250 commits, but that should cover most cases. We can improve upon this in the future.

Using the number of commits on the pull request, we can fetch exactly those commits needed for cherry-picking. We can then retrieve all the commits on the PR to find the first and last commit shas. Finally, we can cherry-pick the entire range of commits: `git cherry-pick firstCommitSha..lastCommitSha`

Note that this completely removes the old backport script method, which most tests depend on. It means we have to remove or disable most tests in this repo. In the meantime, backport-action-test provides a way to validate the correct behavior of this action.
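To illustrate the paged retrieval and the 250-commit limit mentioned above, here is a small sketch that plans which pages to request for a given commit count. The names `pagesToRequest`, `isTruncated` and `MAX_LISTED_COMMITS` are hypothetical, and the default page size of 100 is an assumption based on the usual maximum `per_page` of the GitHub REST API. It does not show how the action itself calls the API.

```typescript
// Limit on listable pull request commits, as mentioned above.
const MAX_LISTED_COMMITS = 250;

// Returns the 1-based page numbers to request in order to list the commits
// on a pull request. Commits beyond the limit cannot be listed, so the plan
// stops at the limit.
function pagesToRequest(commitCount: number, perPage: number = 100): number[] {
  if (!Number.isInteger(commitCount) || commitCount < 0) {
    throw new Error(`invalid commit count: ${commitCount}`);
  }
  if (!Number.isInteger(perPage) || perPage < 1) {
    throw new Error(`invalid page size: ${perPage}`);
  }
  const listable = Math.min(commitCount, MAX_LISTED_COMMITS);
  const pageCount = Math.ceil(listable / perPage);
  return Array.from({ length: pageCount }, (_, i) => i + 1);
}

// True when the pull request has more commits than can be listed, in which
// case the first and last commit shas cannot both be discovered this way.
function isTruncated(commitCount: number): boolean {
  return commitCount > MAX_LISTED_COMMITS;
}

console.log(pagesToRequest(42)); // a single page
console.log(pagesToRequest(250)); // three pages
console.log(isTruncated(300)); // more commits than can be listed
```

Checking `isTruncated` before cherry-picking would let a caller fail early with a clear message instead of backporting an incomplete set of commits.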

It seemed that ranges behave strangely when cherry-picking in a shallow clone, but I was mistaken. The commit range `sha1..sha2` is shorthand for `^sha1 sha2`, which means all commits reachable from sha2 but not reachable from sha1. That means that sha1 is excluded.

So to make the cherry-pick work correctly with the shorthand (..) notation, we'd need to use something like `git cherry-pick <firstCommitSha>^..<lastCommitSha>`, i.e. take all commits reachable from <lastCommitSha> but not reachable from the first parent of <firstCommitSha>. This would include the commit referenced by <firstCommitSha> as well.

Instead of using ranges, we can pass all specific commits directly to the cherry-pick command. This isn't easier, but it makes the command more explicit. We can always choose to rewrite this back to using the shorthand (..) notation.
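The range semantics above can be checked with a small model of a linear history. This is only a sketch: `commitsInRange`, `cherryPickWithRange` and `cherryPickExplicit` are hypothetical names, the commit identifiers are placeholders, and the model ignores merges.

```typescript
// Commits selected by `from..to` on a linear history listed oldest first:
// everything reachable from `to` that is not reachable from `from`.
function commitsInRange(
  linearHistory: string[],
  from: string,
  to: string,
): string[] {
  const fromIndex = linearHistory.indexOf(from);
  const toIndex = linearHistory.indexOf(to);
  if (fromIndex === -1 || toIndex === -1) {
    throw new Error("unknown commit");
  }
  return linearHistory.slice(fromIndex + 1, toIndex + 1);
}

// Range form: `<first>^..<last>` starts from the parent of the first commit,
// so the first commit itself is included.
function cherryPickWithRange(shas: string[]): string {
  if (shas.length === 0) throw new Error("no commits to cherry-pick");
  return `git cherry-pick ${shas[0]}^..${shas[shas.length - 1]}`;
}

// Explicit form: every commit is named on the command line.
function cherryPickExplicit(shas: string[]): string {
  if (shas.length === 0) throw new Error("no commits to cherry-pick");
  return `git cherry-pick ${shas.join(" ")}`;
}

// "base" is the parent of "c1"; the pull request consists of c1, c2 and c3.
const linearHistory = ["base", "c1", "c2", "c3"];
const pullCommits = ["c1", "c2", "c3"];

console.log(commitsInRange(linearHistory, "c1", "c3")); // c1 is excluded
console.log(commitsInRange(linearHistory, "base", "c3")); // c1 is included
console.log(cherryPickWithRange(pullCommits));
console.log(cherryPickExplicit(pullCommits));
```

In this model, `c1..c3` selects only `c2` and `c3`, while starting the range at the parent of `c1` selects all three commits, which matches the `<firstCommitSha>^..<lastCommitSha>` form described above.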

Most of the tests in this repo have bothered me. They've restricted progress more than they have helped me make sure I can safely make changes. To make sure the action works correctly, I've mostly switched to github.com/korthout/backport-action-test. That repo verifies the main functionality, but does not cover all scenarios. I plan to expand these in the future. However, edge cases should still be covered by unit tests. I'm deleting most of the existing ones in this commit because they restrict me. It's also noteworthy that they don't really test the edge cases that they should be testing, nor do they test them well. I plan to re-introduce fast-running unit tests at a later time. But at this time, I do not see any reason to keep any of the tests deleted in this commit.

This commit also updates the README, because the integration tests (which were aptly called acceptance tests) have been completely removed. The README still covers how to run tests because we still have some unit tests that run fast and make sense to have.

korthout · 2022-11-23T17:00:14Z

Tested with https://github.com/korthout/backport-action-test

korthout · 2022-11-28T16:15:30Z

Backporting a pull request to 2 branches went from ~1m on v0.0.9 to just 11s on v1-rc1 in the camunda/zeebe repo 🌱

korthout · 2022-12-10T11:38:13Z

Backporting a pull request to 1 branch went from ~5m to just 16s on v1-rc for the NixOS/nixpkgs repo 🌱 🌱 🌱

original pull request: test: add lines to backport nixpkgs#2
created backport pull request: [Backport target] test: add lines to backport nixpkgs#3

korthout added 5 commits November 22, 2022 22:11

deps: remove @octokit/webhooks-types

2b7bbde

This is no longer needed for anything. And this dependency is updated extremely often, which became very annoying. Happy to get rid of it.

dist: build new release

2328883

korthout mentioned this pull request Nov 22, 2022

Prepare for a v1 release #289

Closed

9 tasks

korthout marked this pull request as ready for review November 23, 2022 17:03

korthout merged commit 85b0de7 into master Nov 23, 2022

korthout deleted the korthout-optimal-fetch-depth branch November 23, 2022 17:08

korthout mentioned this pull request Nov 28, 2022

Speed regression when using fetch-depth = 1 #267

Closed

This was referenced Dec 10, 2022

Copy labels as specified by copy_labels_pattern #303

Closed

Port origin PR labels to backport PR #87

Closed

korthout mentioned this pull request Dec 21, 2022

Release v1 #306

Merged
