Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out docker-daemon: tagging canonicalization rules #72

Closed
mtrmac opened this issue Sep 5, 2016 · 15 comments
Closed

Figure out docker-daemon: tagging canonicalization rules #72

mtrmac opened this issue Sep 5, 2016 · 15 comments

Comments

@mtrmac
Copy link
Collaborator

mtrmac commented Sep 5, 2016

Right now docker-daemon: destination just passes the input string as a tag in the stream. Docker seems to be doing some canonicalization, but it is unclear what exactly the rules are.

  • docker pull busybox gets the image tagged as docker.io/busybox (per docker inspect)
  • docker-daemon:busybox gets it tagged as docker:busybox
  • docker-daemon:docker.io/library/busybox gets it tagged as docker.io/busybox
  • docker inspect busybox:latest finds docker.io/busybox:latest
  • docker inspect docker.io/busybox.latest does not find busybox:latest

We need to figure out what is going on, document the relevant formats, and perhaps do some normalization in daemonReference.

cc @baude

@runcom
Copy link
Member

runcom commented Sep 6, 2016

Note that AFAICT our projectatomic/docker and docker/docker behave differently wrt how images are tagged (or retrieved, need to look that up)

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 6, 2016

@runcom What do you mean exactly? Examples?

@runcom
Copy link
Member

runcom commented Sep 6, 2016

docker pull busybox gets the image tagged as docker.io/busybox (per docker inspect)

This, for instance, is valid for projectatomic/docker but not docker/docker - upstream does have that reference as just "busybox" (per docker inspect)

docker inspect busybox:latest finds docker.io/busybox:latest
docker inspect docker.io/busybox.latest does not find busybox:latest

These last two, same as above, in docker/docker you just get "busybox:latest" in RepoTags

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 6, 2016

This, for instance, is valid for projectatomic/docker but not docker/docker

Oh. Thanks for pointing this out, I would have never thought to look there. Does that mean that we should be vendoring in projectatomic/docker/reference instead of docker/docker/reference? (Should be doable but it is not completely trivial, we would need to check whether that affects reference matching in signatures.)

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 6, 2016

Alternatively,

  • docker pull busybox gets the image tagged as docker.io/busybox (per docker inspect)
    • docker-daemon:busybox gets it tagged as docker:busybox

This suggests that the different canonicalization projectatomic/docker does might not be handled consistently, i.e. the docker pull path was changed but not the docker load path. (I wish this to be true because if it were, skopeo and atomic wouldn’t need to worry about reference canonicalization at all and we would change the docker load canonicalization.)

@runcom
Copy link
Member

runcom commented Sep 6, 2016

This suggests that the different canonicalization projectatomic/docker does might not be handled consistently, i.e. the docker pull path was changed but not the docker load path. (I wish this to be true because if it were, skopeo and atomic wouldn’t need to worry about reference canonicalization at all and we would change the docker load canonicalization.)

I fear it's like you said, and docker load doesn't get the canonicalization - I'm trying right now though.

@runcom
Copy link
Member

runcom commented Sep 6, 2016

With projectatomic/docker, after switching from docker/docker with busybox already in storage:

$ docker pull busybox
Using default tag: latest
Trying to pull repository docker.io/library/busybox ... 
latest: Pulling from docker.io/library/busybox
Digest: sha256:a59906e33509d14c036c8678d687bd4eec81ed7c4b8ce907b888c607f6a1e0e6
Status: Downloaded newer image for docker.io/busybox:latest

$ docker images | grep busybox
busybox              latest              2b8fd9751c4c        10 weeks ago        1.093 MB
docker.io/busybox    latest              2b8fd9751c4c        10 weeks ago        1.093 MB

$ docker inspect busybox | grep busybox
            "busybox:latest",
            "docker.io/busybox:latest",

on RHELs we can safely assume people are using projectatomic/docker so only the second lines are valid. Though, I feel we'll need to deal with both because we can't expect people to be using projectatomic/docker.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 19, 2016

(Mostly notes to self:)

Yeah, this is the “Add --add-registry and --block-registry options to docker daemon” patch.

docker/docker/reference has the concept of normalization, where semantically equivalent reference.Named values are always mapped to the same string; the normalization trims the default docker.io hostname and the default /library namespace within it (docker.io/library/busyboxbusybox).

  • This impacts any code using reference.Named as a data structure instead of strings; even a fully qualified string input by the user is always presented in the normalized (minimal) form on output, through .Name() or .String().
  • OTOH, namedRef.FullName always returns a fully qualified version (including a host name). [Throughout, .Hostname() and .RemoteName() behave consistently with .FullName(), i.e. the two repositories differ.]
  • namedRef.FullName() also always returns a fully qualified namespace (adding library/ for docker.io hostname, whether explicit or implied); i.e. for both busybox and docker.io/library/busybox, namedRef.Name() is busybox and namedRef.FullName is docker.io/library/busybox.

projectatomic/docker/reference does not have this kind of normalization; instead, it introduces “unqualified” references (which do not contain a hostname on input). ONLY references with an explicit docker.io hostname are normalized by trimming the default /library, but not by trimming the hostname: busyboxbusybox, library/busyboxlibrary/busybox, docker.io/library/busyboxdocker.io/busybox.

  • OTOH, namedRef.FullName for unqualified references does not include a host name!
  • namedRef.FullName(), because it semantically does not default the hostname to docker.io, does not return a fully qualified namespace: for busybox, both namedRef.Name() and namedRef.FullName() is busybox, and for docker.io/library/busybox, namedRef.Name() is docker.io/busybox and namedRef.FullName is docker.io/library/busybox.

In docker/docker, any image fetched by docker pull is always tagged using the (normalized! i.e. minimal) version of the input reference (tagged or digested), and if the input was a tagged reference, also using the digested version.

In projectatomic/docker, only qualified images are, after fetching, tagged using the input reference (normalized to drop /library, but including the host name!); when fetching an unqualified reference, the code fetches an image from the first matching registry in the "default registry list”, and tags it (tagged+digested as above) using only a qualified variant of the reference (i.e. with a hostname); not the input unqualified reference.

  • This happens only in the pull path; other ways to tag an image do not convert the reference into a qualified one.

This is compensated for in the reference lookup code (reference/store.Get), which looks up qualified references directly as expected EDITalways looks up the input reference directly first, but for unqualified references it also iterates through the “default registry list” and uses the first qualified variant which exists.

So, docker pull busybox tags the image as docker.io/busybox and future references to both docker.io/busybox and busybox look up docker.io/busybox; the raw busybox tag is never fetched.

OTOH, an explicit docker tag busybox, or a docker load which includes an unqualified busybox:something tag, tags the image using this unqualified name; future references to this unqualified name do not look up the unqualified tag and only search for docker.io/busybox (and other hostnames in the "default registry list"). EDIT this image can then only be referenced using this unqualified name, references to a hostname-qualified name do not look up the unqualified tag.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 19, 2016

@runcom If ^^^ is correct (I will take another look on Monday, actually testing all of that instead of just reading code), it seems to me that it would be best for projectatomic/docker to modify reference.store.AddTag to convert attempts to tag with an unqualified reference to tagging with a reference qualified with the first hostname on the default registry list.

This should in one place deal with docker tag, docker load, and all other places where it is possible to tag using an unqualified reference which can then never be found.

Though, admittedly, it is not at all clear that the user intent was to use the first hostname on the list for unqualified references… OTOH just storing an unqualified tag and extending the lookup code to also look up for unqualified tags seems worse, then we can have docker.io/busybox, r.a.redhat.com/busybox and busybox pointing at three different images…

The only certain thing is that I am confused and I will need to return to this on Monday :)

(Sure, the docker-daemon: destination code could ask for the image to be tagged using several tags, e.g. both busybox and docker.io/library/busybox, and AFAICS that should be harmless on the normalizing docker/docker implementation and it would make the image accessible with projectatomic/docker without patching projectatomic/docker; but the docker tag situation would still be broken.)

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 21, 2016

This is compensated for in the reference lookup code (reference/store.Get), which looks up qualified references directly as expected, but for unqualified references it also iterates through the “default registry list” and uses the first qualified variant which exists.

Correction: Every reference, qualified or unqualified, is looked up directly as expected. So, the case of docker tag something $whatever; docker inspect $whatever always works fine.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 21, 2016

Overall, the lookup behavior of projectatomic/docker ends up being like this: row indicates how the image was tagged, column what is being looked up. d = docker.io/, l = library/, busybox (e.g. db means docker.io/busybox). ✔️ = tag is found, ❌ = not found.

⬇️Tag \ Reference➡️ b lb db dlb Actual tag
b ✔️ busybox
lb ✔️ library/busybox
db ✔️ ✔️ ✔️ ✔️ docker.io/busybox
dlb ✔️ ✔️ ✔️ ✔️ docker.io/busybox

(With docker/docker, the table is all ✔️ ; all of these references are canonicalized to busybox.)

Locally, the situation is somewhat surprising but understandable: unqualified references have an indeterminate host name, so busybox and library/busybox are treated as separate because with an indeterminate host name the docker.io/library special casing does not apply, and docker.io/*/busybox does not find indeterminatehostname/*/busybox because otherwise it would be reasonable to expect indeterminatehostname/*/busybox to match anyregistryinthedefaultlist/*/busybox at the same time, which does not really make sense.

The trouble with this is (as identified earlier by Scott McCarty) is that docker tag $something busybox:tag; docker push busybox:tag; docker pull busybox:tag does not update the busybox:tag tag, only the docker.io/busybox:tag tag. That does seem clearly wrong.

The fallout for docker-daemon: is that the generated tarball for docker load should include a fully qualified name; docker/docker will normalize it away, but in projectatomic/docker this gets moves us from the top row of the table to the two lower, more interoperable, rows of the table.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 21, 2016

docker/docker will normalize it away, but in projectatomic/docker this gets moves us from the top row of the table to the two lower, more interoperable, rows of the table.

… and, equally important, this way docker load will update the same tags as docker pull. (OTOH, if the user does have a busybox tag, now neither docker pull nor docker load will update it, and references to busybox will use that tag instead of the result of pull/load.)

@runcom
Copy link
Member

runcom commented Nov 21, 2016

@mtrmac could you open a bugzilla so I can fix this in our projectatomic/docker repo with a reproducer and expected behavior?

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 21, 2016

@mtrmac could you open a bugzilla so I can fix this in our projectatomic/docker repo with a reproducer and expected behavior?

Filed as https://bugzilla.redhat.com/show_bug.cgi?id=1397198 , feel free to reassign to a different version or product. Now searching in bugzilla, it seems there may be other concerns / risks as well, but those are up to projectatomic/docker; as far as docker-daemon: is concerned, I think I know what to do.

mtrmac added a commit to mtrmac/image that referenced this issue Nov 21, 2016
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/image that referenced this issue Nov 22, 2016
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/image that referenced this issue Nov 22, 2016
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 26, 2016

For containers/image, this was fixed by #165 . The bugzilla above tracks projectatomic/docker.

@mtrmac mtrmac closed this as completed Nov 26, 2016
Jamesjaj2dc6j added a commit to Jamesjaj2dc6j/lucascalsilvaa that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mrhyperbit23z0d added a commit to mrhyperbit23z0d/thomasdarimont that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
easyriderk0c9g added a commit to easyriderk0c9g/rnin9 that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
oguzturker8ijm8l added a commit to oguzturker8ijm8l/diwir that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
wesnowm added a commit to wesnowm/MatfOOP- that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
gaveen3 added a commit to gaveen3/KimJeongHoon3x that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
gaveen3 added a commit to gaveen3/KimJeongHoon3x that referenced this issue Jun 6, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Manuel-Suarez-Abascal80n9y added a commit to Manuel-Suarez-Abascal80n9y/knimeu that referenced this issue Oct 7, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
straiforos8bsh5n added a commit to straiforos8bsh5n/tokingsq that referenced this issue Oct 7, 2022
…load)

This makes the tagging behavior consistent with (docker pull) for
projectatomic/docker, and does not change behavior for docker/docker.

See the added comment, and containers/image#72 ,
for more explanation.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants