Add support for annotated git tags #2570

Merged: 1 commit, Jan 29, 2022

alexcb
Collaborator

@alexcb alexcb commented Jan 19, 2022

This fixes errors such as:

error: cannot update ref 'refs/heads/v2.30.0': trying to write non-commit object 2d9685d47a7e516281aa093bf0cddc8aafa72448 to branch 'refs/heads/v2.30.0'

which occur when cloning a tag rather than a branch.

Signed-off-by: Alex Couture-Beil <alex@earthly.dev>

@tonistiigi
Member

How do I reproduce this? docker buildx build "git://github.com/docker/buildx#v0.7.1" seems to work fine as is.

@alexcb
Collaborator Author

alexcb commented Jan 21, 2022

Here's a repro-case (please forgive the size):

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/moby/buildkit/client"
	"github.com/moby/buildkit/client/llb"
)

func buildkitAddr() string {
	buildkitAddr := os.Getenv("BUILDKIT_ADDR")
	if buildkitAddr == "" {
		buildkitAddr = "tcp://127.0.0.1:8372"
	}
	return buildkitAddr
}

func main() {
	src := llb.Git("https://git.kernel.org/pub/scm/git/git.git", "v2.30.0", llb.KeepGitDir())
	state := llb.Image("alpine:latest").
		File(llb.Copy(src, ".", "/git-src")).
		Run(llb.Shlex("find /git-src"))

	bkClient, err := client.New(context.TODO(), buildkitAddr())
	if err != nil {
		fmt.Printf("failed to connect to buildkit: %v\n", err)
		os.Exit(1)
	}
	defer bkClient.Close()

	llbDef, err := state.Marshal(context.TODO())
	if err != nil {
		fmt.Printf("failed to marshal llb: %v\n", err)
		os.Exit(1)
	}

	ch := make(chan *client.SolveStatus)
	go logStatus(ch)

	if err = llb.WriteTo(llbDef, os.Stdout); err != nil {
		fmt.Printf("failed to write LLB definition to stdout: %v\n", err)
		os.Exit(1)
	}

	solveOpt := client.SolveOpt{}
	_, err = bkClient.Solve(context.TODO(), llbDef, solveOpt, ch)
	if err != nil {
		fmt.Printf("failed to solve llb: %v\n", err)
		os.Exit(1)
	}
}

func logStatus(ch chan *client.SolveStatus) {
	for {
		status := <-ch
		if status == nil {
			break
		}
		fmt.Printf("status is %v\n", status)
		for _, v := range status.Vertexes {
			fmt.Printf("====vertex: %+v\n", v)
		}
		for _, s := range status.Statuses {
			fmt.Printf("====status: %+v\n", s)
		}
		for _, l := range status.Logs {
			fmt.Printf("====log: %s\n", string(l.Data))
		}
	}
}

I then start up buildkit, which listens on /run/buildkit/buildkitd.sock, and run my program with:

BUILDKIT_ADDR=unix:///run/buildkit/buildkitd.sock go run main.go
... lots of text ...

====status: &{ID:resolve docker.io/library/alpine:latest Vertex:sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7 Name: Total:0 Current:0 Timestamp:2022-01-21 18:20:19.298224196 +0000 UTC Started:2022-01-21 18:20:19.298223846 +0000 UTC Completed:<nil>}
status is &{[] [] [0xc00035cfa0]}
====log: Initialized empty Git repository in /var/lib/buildkit/runc-overlayfs/snapshots/snapshots/6/fs/

status is &{[0xc0005b8300] [0xc0001cc0e0] []}
====vertex: &{Digest:sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7 Inputs:[] Name:docker-image://docker.io/library/alpine:latest Started:2022-01-21 18:20:19.297345807 +0000 UTC Completed:2022-01-21 18:20:19.735310232 +0000 UTC Cached:false Error:}
====status: &{ID:resolve docker.io/library/alpine:latest Vertex:sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7 Name: Total:0 Current:0 Timestamp:2022-01-21 18:20:19.735224189 +0000 UTC Started:2022-01-21 18:20:19.298223846 +0000 UTC Completed:2022-01-21 18:20:19.735222589 +0000 UTC}
status is &{[] [] [0xc000132320]}
====log: 2d9685d47a7e516281aa093bf0cddc8aafa72448	refs/tags/v2.30.0

status is &{[0xc0005306c0] [] []}
====vertex: &{Digest:sha256:2f1a6f2a3b95aeac93451143c00ce68ce6d2c0813c0dcadadc9230b756c1d7e1 Inputs:[] Name:git://git.kernel.org/pub/scm/git/git.git#v2.30.0 Started:2022-01-21 18:20:19.297316646 +0000 UTC Completed:2022-01-21 18:20:20.020331906 +0000 UTC Cached:false Error:}
status is &{[0xc0005b83c0] [] []}
====vertex: &{Digest:sha256:2f1a6f2a3b95aeac93451143c00ce68ce6d2c0813c0dcadadc9230b756c1d7e1 Inputs:[] Name:git://git.kernel.org/pub/scm/git/git.git#v2.30.0 Started:2022-01-21 18:20:20.020751545 +0000 UTC Completed:<nil> Cached:false Error:}
status is &{[] [] [0xc000132870]}
====log: From https://git.kernel.org/pub/scm/git/git
 * [new tag]         v2.30.0    -> v2.30.0

status is &{[] [] [0xc0005b6b90]}
====log: Initialized empty Git repository in /var/lib/buildkit/runc-overlayfs/snapshots/snapshots/7/fs/.git/

status is &{[] [] [0xc0005b6c80 0xc0005b6cd0]}
====log: error: cannot update ref 'refs/heads/v2.30.0': trying to write non-commit object 2d9685d47a7e516281aa093bf0cddc8aafa72448 to branch 'refs/heads/v2.30.0'
From /var/lib/buildkit/runc-overlayfs/snapshots/snapshots/6/fs
 ! [new tag]         v2.30.0    -> v2.30.0  (unable to update local ref)

====log:  * [new tag]         v2.30.0    -> v2.30.0

status is &{[0xc0005b87e0] [] []}
====vertex: &{Digest:sha256:2f1a6f2a3b95aeac93451143c00ce68ce6d2c0813c0dcadadc9230b756c1d7e1 Inputs:[] Name:git://git.kernel.org/pub/scm/git/git.git#v2.30.0 Started:2022-01-21 18:20:20.020751545 +0000 UTC Completed:2022-01-21 18:20:26.066649322 +0000 UTC Cached:false Error:exit status 1}
failed to solve llb: failed to solve: exit status 1

One thing that's interesting to note is that when I used

src := llb.Git("git://github.com/docker/buildx", "v0.7.1", llb.KeepGitDir())

it works.

So what's the difference?

git rev-parse v2.30.0 shows 2d9685d47a7e516281aa093bf0cddc8aafa72448

So when I run git show 2d9685d47a7e516281aa093bf0cddc8aafa72448

I see:

tag v2.30.0
Tagger: Junio C Hamano <gitster@pobox.com>
Date:   Sun Dec 27 15:15:54 2020 -0800

Git 2.30
-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAl/pFasACgkQsLXohpav
5stUdg//WdTwkeKh4l3m8q3v8CoSx5P4qVVi17x5j67XjRuH1MvkPEe94WVNXv7b
loZNmFEJrdjGlvIW3ND9FbN6f2pv0hGgLwhKKlgOtMpFi6vDwQ7Y5FZUSmDODGyG
cGmbphqPLrow+EAIaA0OLT4kstQkHCPTYpoM5NJYOu051LatQsy1ROGDhbI+TfSK
QvMFVwHyd/cK3Qs/LTwEvqfL44PQWb88XFjvC3BDG9opgPpkbLXxngsIvrGQf7xZ
/ansFgYRI+hADDbzUB0PTI1xZluP4vDnFgMf/IANVMaUZJSZbqh7BZtPX5tdUt1W
IPFcX+QfMbe4g80YUFEUbZktjwB9CfCfsfpPKC1EzUJtjyLbGH3RnNXP+/5WG33h
VW+2gVmIWs7v9LfvP+tMRUDKiaLKOsGOgR94AbP1GXyAKk8pQMBszT4yIQeHVcOw
FRq02dkZf3z1BpoG3ClTUoz/rf6MadxwNpb8si5J78dscoMWSmXPjl0UHA9ouhGx
XoJeovIwgPMGrcSqRQLqxb1rWsJFER/XSa5IRdXh/gS2PdbZLSxuR0ZZmPJ2qITx
ev1FO8MY0OLSWwxi+r/UsZ/2ozzF9ZB1SOFdV+VLaYr6tRnXKIJRc0aQ2+fZCM00
/fqV6ki/O8BqnyPiVy1Nxqa+gAF8fAb2KK+OWZ6de43iwyNfPsY=
=QuDj
-----END PGP SIGNATURE-----

commit 71ca53e8125e36efbda17293c50027d31681a41f (grafted, HEAD, tag: v2.30.0)
Author: Junio C Hamano <gitster@pobox.com>
Date:   Sun Dec 27 15:15:23 2020 -0800

    Git 2.30
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --git a/.cirrus.yml b/.cirrus.yml
new file mode 100644
index 0000000000..c2f5fe385a
--- /dev/null
+++ b/.cirrus.yml
@@ -0,0 +1,15 @@
+env:
+  CIRRUS_CLONE_DEPTH: 1
+
+freebsd_12_task:

However in the case of the buildx git example:

git rev-parse v0.7.1 gives me 05846896d149da05f3d6fd1e7770da187b52a247

and git show 05846896d149da05f3d6fd1e7770da187b52a247, shows:

Merge: 90694878 e89ed1bc
Author: Tõnis Tiigi <tonistiigi@gmail.com>
Date:   Thu Nov 25 12:11:11 2021 -0800

    Merge pull request #858 from tonistiigi/v0.7-fix-dockerignore-star
    
    [v0.7] vendor: patch docker and fsutil pattern matching

so it appears that the difference is related to the handling of signed vs unsigned commits.

Member

@tonistiigi tonistiigi left a comment


I'm still quite confused about this.

It looks like it's the annotated tag that causes the issue, but I still don't really understand the case or what that error means.

I'd like to avoid refs with random IDs where possible. I think this one also has the issue that if a user looked up the current tag from the checkout, it would not show the correct information.

I tried to run these commands manually. One of the things I found was that:

» git fetch -u --depth=1 origin  v2.30.0:v2.30.0
remote: Enumerating objects: 3961, done.
remote: Counting objects: 100% (3961/3961), done.
remote: Compressing objects: 100% (3480/3480), done.
remote: Total 3961 (delta 314), reused 3961 (delta 314), pack-reused 0
Receiving objects: 100% (3961/3961), 9.60 MiB | 65.99 MiB/s, done.
Resolving deltas: 100% (314/314), done.
error: cannot update ref 'refs/heads/v2.30.0': trying to write non-commit object 2d9685d47a7e516281aa093bf0cddc8aafa72448 to branch 'refs/heads/v2.30.0'
From ../a
 ! [new tag]         v2.30.0    -> v2.30.0  (unable to update local ref)
 * [new tag]         v2.30.0    -> v2.30.0

is an error (although afaics it does actually create the tag correctly; just the error code is wrong).

» git fetch -u --depth=1 origin  v2.30.0:refs/tags/v2.30.0
remote: Enumerating objects: 3961, done.
remote: Counting objects: 100% (3961/3961), done.
remote: Compressing objects: 100% (3480/3480), done.
remote: Total 3961 (delta 314), reused 3961 (delta 314), pack-reused 0
Receiving objects: 100% (3961/3961), 9.60 MiB | 63.85 MiB/s, done.
Resolving deltas: 100% (314/314), done.
From ../a
 * [new tag]         v2.30.0    -> v2.30.0

No error.

I can't fully understand what defines this behavior though.

@alexcb
Collaborator Author

alexcb commented Jan 26, 2022

git fetch -u --depth=1 origin v2.30.0:refs/tags/v2.30.0

That's an interesting observation!

If it's helpful, I can re-work the gitsource_test.go test case I added to generate a pgp key and sign a tag.

@tonistiigi
Member

If it's helpful, I can re-work the gitsource_test.go test case I added to generate a pgp key and sign a tag.

Only if this is needed to trigger this error case. I doubt the actual signature makes a difference, but the tag annotation could. Also, please validate that the requested tag is visible from the final snapshot; it doesn't look like that would be the case with the current implementation.
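
One possible way to express the "tag is visible" check, as a minimal sketch only: it assumes the test captures the output of `git show-ref --tags` from the checked-out snapshot, and `tagVisible` is a hypothetical helper name, not something in buildkit. The sample input in `main` is illustrative, not captured output.

```go
package main

import (
	"fmt"
	"strings"
)

// tagVisible is a hypothetical check: given the output of
// `git show-ref --tags` run inside the checkout, it reports whether
// refs/tags/<tag> is listed and returns the object ID it points at.
func tagVisible(showRefOutput, tag string) (string, bool) {
	want := "refs/tags/" + tag
	for _, line := range strings.Split(showRefOutput, "\n") {
		// Each show-ref line has the form "<object-id> <refname>".
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[1] == want {
			return fields[0], true
		}
	}
	return "", false
}

func main() {
	// Illustrative input modeled on the tag discussed in this thread.
	out := "2d9685d47a7e516281aa093bf0cddc8aafa72448 refs/tags/v2.30.0\n"
	id, ok := tagVisible(out, "v2.30.0")
	fmt.Println(id, ok)
}
```

Returning the object ID as well would let a test assert that the tag still points at the annotated tag object rather than at a dereferenced commit.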

@rrjjvv

rrjjvv commented Jan 27, 2022

I might be able to shed some light on some of this behavior. I'm not an expert in this domain, but I've been following this with interest.

is an error (although afaics it does actually create the tag correctly; just the error code is wrong).

I actually get slightly different results (probably due to how buildkit clones/pulls vs. my end-user clone), but with a "full" clone:

$ git tag -l | wc -l
825
$ git rev-parse v2.30.0
2d9685d47a7e516281aa093bf0cddc8aafa72448
$ git fetch -u --depth=1 origin v2.30.0:v2.30.0
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
Unpacking objects: 100% (1/1), 783 bytes | 783.00 KiB/s, done.
error: cannot update ref 'refs/heads/v2.30.0': trying to write non-commit object 2d9685d47a7e516281aa093bf0cddc8aafa72448 to branch 'refs/heads/v2.30.0'
From https://git.kernel.org/pub/scm/git/git
 ! [new tag]               v2.30.0    -> v2.30.0  (unable to update local ref)
$ git rev-parse v2.30.0
2d9685d47a7e516281aa093bf0cddc8aafa72448

I already had the tag (thus no new tag was created) but I had the same error. But, if I clone without tags (git clone --no-tags):

$ git tag -l | wc -l
0
$ git rev-parse v2.30.0
v2.30.0
fatal: ambiguous argument 'v2.30.0': unknown revision or path not in the working tree.
<snip>
$ git fetch -u --depth=1 origin v2.30.0:v2.30.0
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
Unpacking objects: 100% (1/1), 783 bytes | 783.00 KiB/s, done.
error: cannot update ref 'refs/heads/v2.30.0': trying to write non-commit object 2d9685d47a7e516281aa093bf0cddc8aafa72448 to branch 'refs/heads/v2.30.0'
From https://git.kernel.org/pub/scm/git/git
 ! [new tag]               v2.30.0    -> v2.30.0  (unable to update local ref)
$ git tag -l | wc -l
0
$ git rev-parse v2.30.0
v2.30.0
fatal: ambiguous argument 'v2.30.0': unknown revision or path not in the working tree.

So I got the exact same error and no tag was created. The lack of existing tag was not a trigger to create one despite the error (for me at least). Why your invocation created one but mine didn't is curious, but probably not relevant.

No error.
I can't fully understand what defines this behavior though.
That's an interesting observation!

This does appear to have a logical explanation from what I'm reading. If you look closely, the unqualified v2.30.0:v2.30.0 variant results in failure to update refs/heads/v2.30.0, but the qualified variant v2.30.0:refs/tags/v2.30.0 is successful. Note "heads" vs. "tags". That's significant. From the git-fetch docs, where it defines <refspec>:

As with pushing with git-push(1), all of the rules described above about what’s not allowed as an update can be overridden
by adding an the optional leading + to a refspec (or using --force command line option). The only exception to this is
that no amount of forcing will make the refs/heads/* namespace accept a non-commit object
(emphasis mine)

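That explains the difference in behavior between those two invocations. Although 71ca53e8125e36efbda17293c50027d31681a41f (the commit the tag points to, which is what a lightweight tag would reference directly) and 2d9685d47a7e516281aa093bf0cddc8aafa72448 (the annotated tag object) are both reachable through the tag name, they are different kinds of object: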

$ git cat-file -t v2.30.0 
tag
$ git cat-file -t 2d9685d47a7e516281aa093bf0cddc8aafa72448
tag
$ git cat-file -t 71ca53e8125e36efbda17293c50027d31681a41f
commit

I'm just now seeing that this distinction is also made on the very first line of a standard git show, but this reinforces that it's meaningful, not just descriptive. Thus, the lightweight tag can be written to a head because it truly is a commit object, but not so for an annotated tag. And when using annotated tags from the CLI,

A command that takes a <commit-ish> argument ultimately wants to operate on a
<commit> object but automatically dereferences <tag> objects that point at a <commit>.

In summary:

  • annotated tags themselves cannot be within "heads" (they are not commit objects)
  • a dereferenced annotated tag can be within "heads" (which is what the CLI will do for free in most porcelain commands)
  • it's a free-for-all outside of "tags" and "heads" (I didn't quote the docs, but I'm mentioning it as a footnote since I see references to refs/buildkit in the source; I don't want to draw any conclusions)
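
The first two points of that summary can be modeled in a few lines of Go. This is only a simplified sketch of the rule as summarized in this comment, not git's actual implementation; `refAccepts` is a hypothetical name, and treating every namespace other than refs/heads/ as accepting any object type is an assumption taken from the third bullet.

```go
package main

import (
	"fmt"
	"strings"
)

// refAccepts is a hypothetical, simplified model of the summary above:
// refs/heads/* only accepts commit objects, while every other namespace
// is assumed to accept any object type.
func refAccepts(refName, objType string) bool {
	if strings.HasPrefix(refName, "refs/heads/") {
		return objType == "commit"
	}
	return true
}

func main() {
	// Annotated tag object written to a branch: the failing case in this thread.
	fmt.Println(refAccepts("refs/heads/v2.30.0", "tag"))
	// Dereferenced tag (a commit) written to a branch.
	fmt.Println(refAccepts("refs/heads/v2.30.0", "commit"))
	// Annotated tag object written under refs/tags.
	fmt.Println(refAccepts("refs/tags/v2.30.0", "tag"))
}
```

The object type passed in is the string that `git cat-file -t` reports, as in the output shown earlier in this comment.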

Again, hopefully this is a help and not a hindrance... I just have a vested interest in this being implemented. Thanks for the work and effort!

@tonistiigi
Member

@rrjjvv Thanks for the debug.

probably due to how buildkit clones/pulls vs. my end-user clone

https://gist.github.com/tonistiigi/d5391881660eb8554235df5252515108 has the commands I captured that buildkit currently runs.

So I guess the option left is to detect this case and use refs/tags? As long as we can make git rev-parse <name> work for the user after checkout that would be ok with me. Dereferencing a tag would also change its sha so not really an option (or am I missing something?).
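
For reference, the dereferencing mentioned here can be requested explicitly with git's `<rev>^{commit}` peel syntax. The sketch below only builds the `git rev-parse` arguments; `peelArgs` is a hypothetical helper name and is not how buildkit resolves refs.

```go
package main

import (
	"fmt"
	"strings"
)

// peelArgs is a hypothetical helper: it returns the arguments for a
// `git rev-parse` call that resolves rev to a commit ID. The ^{commit}
// suffix asks git to follow tag objects until it reaches a commit.
func peelArgs(rev string) []string {
	return []string{"rev-parse", "--verify", rev + "^{commit}"}
}

func main() {
	// Print the command line that would be run for the tag in this thread.
	fmt.Println("git " + strings.Join(peelArgs("v2.30.0"), " "))
}
```

For the tag in this thread, plain `git rev-parse v2.30.0` reports the tag object 2d9685d47a7e516281aa093bf0cddc8aafa72448, while the peeled form would be expected to report the commit 71ca53e8125e36efbda17293c50027d31681a41f, which is the change in SHA being discussed.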

@rrjjvv

rrjjvv commented Jan 27, 2022

So I guess the option left is to detect this case and use refs/tags?

That should solve the technical issue. I guess the real question is whether you see this as "we support using tags", or "you can use tags as a shortcut". That could drive whether you truly store them as tags or whether you translate non-SHAs to SHAs on the front end. But I'm guessing that you already "support branches" (and store them as such), so using refs/tags probably is the obvious choice.

Dereferencing a tag would also change its sha so not really an option (or am I missing something?).

No, I don't think you're missing something, but it might not be a simple binary right-or-wrong. But I think there are two separate (but related) issues and breaking them apart might add some clarity.

My original desire, in Alex's project, was to check out a tag. Even more specifically, I wanted to build git version 2.33.0. Between my trust of that project and my requirements, cloning v2.33.0 was "good enough". But it resulted in a panic (which we now know is due to it being an annotated tag), and is what led to this report.

I then took the next logical step and figured "fine, what commit does that tag point to?". Having the code already cloned, I did a simple git log and saw I was on commit 71ca53e8125e36efbda17293c50027d31681a41f (the actual commit object). For reasons that still aren't entirely clear, that didn't work (in their project) either.

That's when I started "trying stuff" in preparation for a bug report. I literally went to https://git.kernel.org/pub/scm/git/git.git/tag/?h=v2.30.0 and plugged in that SHA (2d9685d47a7e516281aa093bf0cddc8aafa72448, which is the annotated tag). That actually ended in success, though it still had the same panic.

I think Alex's PR was truly focused on the first scenario. As a git end-user (using clone, pull, checkout, etc.), I don't even see the annotated tag; it appears that git is not only doing the dereference, but is then using that as the (false) source of truth. Since Earthly is exposing this as a clone operation, I would probably expect to see the commit (not the annotated tag). I guess for buildkit, it would depend on what exactly the intent of this "Snapshot" is? If it's supposed to be a high-level abstraction, akin to a simple clone/checkout, I'd guess it would be acceptable (or even preferable) to lie about the SHA. It looks like the Snapshot is wrapping a number of operations, but I don't have a good feel for whether it's intended to be "do what I say" or "do what I mean".

I'm not clear whether "what happens if I directly use a commit SHA" or "what happens if I directly use an annotated tag SHA" are questions that need answering. I'm guessing both already have well-defined behaviors (though I would think that being truthful would be the logical and expected behavior)

TL;DR: after my rambling, I'm too new to your world to give (educated) feedback or opinions. But there is precedent for fibbing about the SHA (showing the dereferenced commit rather than the annotated tag) in high-level git tooling. And by specifying a tag, you're implicitly giving up exactness for convenience. I think both (dereferencing and not) are equally defensible.

@alexcb alexcb force-pushed the fix-git-clone-tag branch 6 times, most recently from 2e806b8 to 9c1cd60 on January 28, 2022 19:49
@alexcb
Collaborator Author

alexcb commented Jan 28, 2022

Thanks for the great discussion @rrjjvv and @tonistiigi,

I added in an explicit check for lightweight tags vs annotated tags:

When we run git cat-file -t <tag>, we'll get tag for annotated tags, and commit for lightweight tags, which will allow us to keep the existing logic for lightweight tags. Here's an example with lightweight buildkit tags, vs annotated git tags:

lightweight

~/gh/moby/buildkit$ git cat-file -t v0.7.1
commit

In this case we'll keep the existing pullRef such as v0.7.1:v0.7.1.

annotated

~/testing/git$ git cat-file -t v2.30.0
tag

In this case, we'll set pullRef to something similar to: v2.30.0:refs/tags/v2.30.0.
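
The selection described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `objectType` and `pullRefFor` are hypothetical names, and `objectType` assumes the ref is already resolvable in the local repository at `dir`.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// objectType runs `git cat-file -t <ref>` in dir and returns the trimmed
// output: "tag" for an annotated tag, "commit" for a lightweight one.
// It is defined for illustration and not called from main.
func objectType(dir, ref string) (string, error) {
	out, err := exec.Command("git", "-C", dir, "cat-file", "-t", ref).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

// pullRefFor is a hypothetical helper that picks the fetch refspec:
// annotated tags get an explicit refs/tags/ destination, anything else
// keeps the existing unqualified <ref>:<ref> form.
func pullRefFor(ref, objType string) string {
	if objType == "tag" {
		return ref + ":refs/tags/" + ref
	}
	return ref + ":" + ref
}

func main() {
	fmt.Println(pullRefFor("v0.7.1", "commit")) // lightweight tag
	fmt.Println(pullRefFor("v2.30.0", "tag"))   // annotated tag
}
```

Keeping the refspec choice in a pure function separate from the `git` invocation makes the two tag cases easy to cover in a unit test without a repository.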

@tonistiigi I expanded the tests to cover both lightweight and annotated tags. I was torn between creating two separate testFetchByTag and testFetchByAnnotatedTag functions vs a single function which accepts a few different flags to control what is being tested; ultimately I settled on the latter. Let me know if you have a preference.

@tonistiigi tonistiigi changed the title Add support for git tags Add support for annotated git tags Jan 28, 2022
@tonistiigi tonistiigi merged commit edfc3f0 into moby:master Jan 29, 2022
@crazy-max crazy-max added this to the v0.10.0 milestone Feb 4, 2022