Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected LAGOON_GIT_SHA in PR environments #229

Open
smlx opened this issue Apr 26, 2022 · 6 comments
Open

Unexpected LAGOON_GIT_SHA in PR environments #229

smlx opened this issue Apr 26, 2022 · 6 comments

Comments

@smlx
Copy link
Member

smlx commented Apr 26, 2022

Describe the bug

When an environment is created from a PR, the LAGOON_GIT_SHA runtime environment variable is populated with a merge commit which only exists in the current build.

There are a couple of levels of surprising behaviour here in my view.

  1. Why is the PR branch merged during the build? If the merge target has had commits added to it since the PR branch diverged then there will be:
    a) possibly a merge conflict which will fail the build; and
    b) code from commits in the merge target added to the build which isn't in the PR branch.

  2. Assuming that merging makes sense by ignoring the problems in 1., the LAGOON_GIT_SHA should be some value which is useful to the developer. A merge commit which only exists in the current build doesn't seem very useful, although it is technically accurate.

To Reproduce

Steps to reproduce the behavior:

  1. Check the LAGOON_GIT_SHA value in a PR environment.
  2. See that it doesn't exist in the git repository.

Expected behavior

I would expect Lagoon not to merge the PR during the build.

If that is required for compatibility or other reasons then the method for injecting the PR HEAD commit into a runtime variable should be documented. That would be something like this I think:

ARG LAGOON_PR_HEAD_SHA
ENV LAGOON_PR_HEAD_SHA=$LAGOON_PR_HEAD_SHA

I had a look for this snippet in the Lagoon documentation but couldn't find it.

Screenshots

n/a

Additional context

git merge during build occurs here.

@seanhamlin
Copy link
Contributor

I have seen builds fail when the PR fails to merge cleanly into destination branch too.

@rocketeerbkw
Copy link
Member

Why is the PR branch merged during the build?

There's no other step in the process where it can happen? Github doesn't commit/save the result of a PR anywhere except inside Github. When we get a PR webhook, the payload only includes references to the head (what gets merged) and the base (where it gets merged). If we don't merge the two at some point ourselves, we wouldn't be able to create a valid PR environment.

If the merge target has had commits added to it since the PR branch diverged then there will be:
a) possibly a merge conflict which will fail the build; and
b) code from commits in the merge target added to the build which isn't in the PR branch.

In order to prevent these issues we pull the head commit sha and the base commit sha out of the PR payload and we don't use branch names. When we merge during build, we're merging one commit into another, not one branch into another.

@smlx
Copy link
Member Author

smlx commented May 30, 2022

There's no other step in the process where it can happen? Github doesn't commit/save the result of a PR anywhere except inside Github. When we get a PR webhook, the payload only includes references to the head (what gets merged) and the base (where it gets merged). If we don't merge the two at some point ourselves, we wouldn't be able to create a valid PR environment.

What do you mean here by "valid PR environment"? Isn't the head of the PR branch the commit that should be deployed?

In order to prevent these issues we pull the head commit sha and the base commit sha out of the PR payload and we don't use branch names. When we merge during build, we're merging one commit into another, not one branch into another.

The example payload you linked illustrates the problem with merging. If you take a look at the contents you'll see that base contains the current head commit on the master branch. Importantly, this commit is not an ancestor of the PR branch head. Instead, a commit has been added to the master branch after the PR branch was started:

    "base": {
      "label": "Codertocat:master",
      "ref": "master",
      "sha": "f95f852bd8fca8fcc58a9a2d6c842781e32a215e",

git tree

If f95f852 does not have a conflict with the PR branch head (ec26c3e AKA changes in this example) then the merge might work, but it will be confusing for the user because the Lagoon deploy will include the changes contained in f95f852 which do not exist in the PR branch.

Even in the case that there has not been a commit to the base branch since the PR branch was started, git merge will resolve the merge as a fast-forward anyway, so the git merge will simply have the effect of checking out the head of the PR branch, making the merge redundant. Here's an example from a real Lagoon build:

Screenshot from 2022-05-30 20-58-17

So it seems as though the possible effects of the git merge in Lagoon deploy scripts right now are:

  1. in the case that the head of the base branch is an ancestor of the PR branch, just fast-forward to the head of the PR branch; or
  2. if there has been a commit to the base branch since the PR branch was started, either:
    a) if there are no conflicts, create an ephemeral merge commit with changes from both base and head of the PR branch.
    b) if there are conflicts, exit with an error and fail the build.

IMO 2a is not desirable, since creating an ephemeral merge commit instead of using the head of the PR branch is very unusual, and quite different to the way other CI/CD systems such as Github actions or Jenkins work. Obviously 2b isn't great either.

Instead it seems that it would be faster, simpler and less surprising to do something like this and avoid merging altogether:

git init
git remote add origin git@github.com:Codertocat/Hello-World.git
git fetch --depth=1 origin ec26c3e57ca3a959ca5aad62de7213c562f8c821
git checkout ec26c3e57ca3a959ca5aad62de7213c562f8c821

Where ec26c3e is the PR branch head.

@rocketeerbkw
Copy link
Member

What do you mean here by "valid PR environment"? Isn't the head of the PR branch the commit that should be deployed?

Instead it seems that it would be faster, simpler and less surprising to do something like this and avoid merging altogether

I guess there's some disconnect on what purpose a PR environment serves? Merging is a fundamental feature of a Lagoon PR environment:

The idea of pull request-based workflows lies behind that idea that you can test a feature together with a target branch, without actually needing to merge yet, as Lagoon will do the merging for you during the build.

If we never merge base and head then why bother with PR environments at all? Because head will always be a branch and we build branch environments already. I'm all aboard for adjusting environment variables like LAGOON_GIT_SHA as you suggested previously, but not merging at all would be a new issue.

quite different to the way other CI/CD systems such as Github actions or Jenkins work

But it's not different to how our competitors work. We're not a CI/CD system, our concept of a PR environment is not the same as theirs. PR environments are how Lagoon solves this problem for our users:

As a user, I want an environment that has my feature/new-stuff branch and main so I can make sure I didn't break anything, even if it merges cleanly. My options to do that are 1) manually maintaining a staging branch and manually keep it synced and also collaborate with 6 other developers all trying to test on staging too OR 2) create a PR that automatically spins up a new environment for me of just main and feature/new-stuff.

@smlx
Copy link
Member Author

smlx commented May 31, 2022

Ah okay. Thanks for the explanation. Indeed I wasn't aware that this is the documented behaviour of Lagoon. 🙂

If we never merge base and head then why bother with PR environments at all?

I thought it was so that you could push branches to the repo, and then only deploy PRs as environments. But I guess I should have read the documentation more thoroughly.

As a user, I want an environment that has my feature/new-stuff branch and main so I can make sure I didn't break anything, even if it merges cleanly. My options to do that are 1) manually maintaining a staging branch and manually keep it synced and also collaborate with 6 other developers all trying to test on staging too OR 2) create a PR that automatically spins up a new environment for me of just main and feature/new-stuff.

Several potential problems with doing the merge in the deploy are:

  • It conflicts with one of the basic features of git, which is that the exact state of the repository is known from the commit hash. Instead Lagoon will deploy an ephemeral commit which is impossible for the developer to inspect locally.
  • It is racy because it depends on the state of the base branch at the moment of the git fetch during the build. If another developer updates base after the deploy there will be changes in the final merge to base which aren't reflected in the deploy.
  • git merge can produce different results depending on the strategy, and that strategy is not just configurable on the command line, but it also varies depending on the version of git in use.
  • git merge is not the only way of getting PR changes into a base branch. For example Github offers at least three different options through its UI which can change the outcome of the merge.

Maybe the Lagoon documentation should mention these pitfalls and also document that an alternative is to add a branch protection rule to your PRs? E.g. Github has:

Screenshot from 2022-05-31 09-53-16

@AlexSkrypnyk
Copy link

Workaround

Add these lines to the Dockerfile

ARG LAGOON_PR_HEAD_BRANCH=""
ENV LAGOON_PR_HEAD_BRANCH=$LAGOON_PR_HEAD_BRANCH
ARG LAGOON_PR_HEAD_SHA=""
ENV LAGOON_PR_HEAD_SHA=$LAGOON_PR_HEAD_SHA

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants