Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(registry): image name parsing behavior #1526

Merged
merged 7 commits into from
Apr 12, 2023

Conversation

Pwuts
Copy link
Contributor

@Pwuts Pwuts commented Jan 17, 2023

What this PR contributes

It replaces Watchtower's own image ref manipulation logic (wherever possible) by docker's own implementations as imported from docker/distribution/reference.

As a result, this PR:

  • solves failing HEAD requests for images with a single-part path
  • solves other off-spec behavior in some cases (such as references where the registry host has a port)
  • reduces the amount of code to be maintained in this repository
  • reduces the complexity of some parts of the code
  • resolves this todo
  • Closes Cannot do a head request for private registry #1525

Changes are limited to pkg/registry, and deprecate pkg/registry/helpers.

How/what/why this works

In reference/normalize.go, ParseNormalizedNamed and ParseDockerRef call splitDockerDomain which normalizes the registry domain from index.docker.io to docker.io if applicable. This allows us to directly use the properties of the Named/NamedTagged/NamedDigested that is returned by these (and other) functions.

  • helpers.NormalizeRegistry(normalizedName.String()) -> reference.Domain(normalizedName)
  • ExtractImageAndTag([...]) -> reference.Path(normalizedRef), reference.Tag(normalizedRef) (if container image is not pinned)
  • auth.GetScopeFromImageName(img, host) -> reference.Path(normalizedName)
  • etc...

Notable choices made along the way

  • Defaulting to docker.io credentials (stored in the config under index.docker.io or https://index.docker.io/v1/) when an image name does not contain a registry host. This was an outstanding todo, and can be done safely under the right conditions.
  • Using index.docker.io to specify an image on Docker Hub is legacy, so I have replaced index.docker.io with docker.io in all relevant places, or removed it altogether, as with this PR, Watchtower should use the right Docker Hub domain by default when no registry is specified in an image name.
  • registry-1.docker.io is only supposed to be used in setting up a pull-through cache registry, not for pulling images directly. Hence, I think it's reasonable to drop "special support" for this case (and disable the corresponding test).
    If anyone really wants to use that domain, they can still specify the image as registry-1.docker.io/library/image, just like they already need to do when using a docker hub mirror.
  • Disabling this test, which tests for something that should never occur: a container with a familiar name (e.g. ubuntu for docker.io/library/ubuntu) that is not hosted on Docker Hub. Containers pulled from registries other than Docker Hub will always have the registry domain in container.ImageName().

Tests

After my updates, all tests succeed except for the two mentioned above, which I have commented for now.

I tried to improve test coverage as best as possible for the files that I changed, including a test in pkg/registry/manifest/manifest_test.go for the specific issue that triggered me to do this work :)

Updates to the documentation

As mentioned above, I have replaced index.docker.io by docker.io wherever relevant and appropriate.
Also added a heads-up + examples of using private registries on local and remote hosts.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congratulations on opening your first pull request! We'll get back to you as soon as possible. In the meantime, please make sure you've updated the documentation to reflect your changes and have added test automation as needed. Thanks! 🙏🏼

@piksel
Copy link
Member

piksel commented Jan 18, 2023

Finally, I'm not sure if WarnOnAPIConsumption will still be necessary after these modifications.

I'm not sure why that would be the case. Watchtower warns the user that the HEAD request failed, and docker pull is used instead. But since this might not be supported on all repositories, we only warn for those that we know support it (when set to Auto), to avoid spamming the logs (as most users don't have any way to fix it, nor gain that much benefit from the HEAD requests since they probably aren't rate-limited).

@piksel
Copy link
Member

piksel commented Jan 18, 2023

index.docker.io is legacy and no longer the primary domain for the Docker Hub registry, so I have swapped index.docker.io and docker.io in all relevant places

Unfortunately, I think this will break users configs, since they have been explicitly told to use the full index.docker.io domain in their docker.configs. But I think we can make a backwards compatibility fix for this...

@codecov
Copy link

codecov bot commented Jan 18, 2023

Codecov Report

Patch coverage: 73.17% and project coverage change: +0.10 🎉

Comparison is base (cfcbcac) 66.87% compared to head (e3333ad) 66.98%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1526      +/-   ##
==========================================
+ Coverage   66.87%   66.98%   +0.10%     
==========================================
  Files          25       25              
  Lines        2373     2311      -62     
==========================================
- Hits         1587     1548      -39     
+ Misses        685      667      -18     
+ Partials      101       96       -5     
Impacted Files Coverage Δ
pkg/registry/auth/auth.go 35.65% <42.85%> (-6.50%) ⬇️
pkg/registry/trust.go 42.37% <70.00%> (-5.32%) ⬇️
pkg/registry/helpers/helpers.go 100.00% <100.00%> (+40.90%) ⬆️
pkg/registry/manifest/manifest.go 78.57% <100.00%> (+5.65%) ⬆️
pkg/registry/registry.go 26.66% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 18, 2023

Unfortunately, I think this will break users configs

Users who have followed the instructions and specified the registry domain explicitly in their docker-compose config or docker run command should not run into problems, because GetToken eventually calls reference.Parse(), not reference.ParseNormalizedNamed(), meaning any specified domain will not be normalized or altered before using it to fetch credentials from the config file.

However, I see there's a todo (from 2019) to elaborate ParseServerAddress() so that it assumes docker.io when no domain or registry host is specified. Is this still a todo or desired behavior? Because currently there is a test that specifies "parse server address_ should return the organization part if passed an image name missing server name". They can't both be true :)

Edit: I think it's a bad idea to have any kind of fallback behavior with regards to credentials when there is any uncertainty about the registry host, because falling back to credentials for docker.io means submitting credentials for docker.io to any registry for which credentials are not set. This is a security and credential leak risk -> very undesirable.

Edit 2: Apparently, docker's native behavior is to only parse hosts that satisfy strings.ContainsAny(host, ".:") || host == 'localhost', meaning it is always possible to distinguish between an organization name and a host.

@piksel
Copy link
Member

piksel commented Jan 18, 2023

Yeah, the fallback to docker.io creds should only apply when the registry hostname is updated to docker.io (which is the normal case for when the ref lacks any repository host).

@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 18, 2023

Implemented in 0671ee8 (here).

@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 18, 2023

0671ee8 also improves test coverage

@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 20, 2023

@piksel could you approve another codecov run? I think it would be more positive this time. :)

Also, I'm not sure how to proceed. I have made all the semantic changes I intended, but now there's a certain amount of dead code. Should I wait with removing that until after a first round of CR?

@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 20, 2023

Thanks! Looks good, it's only complaining about auth.go because the test for GetToken is skipped for lack of GHCR credentials, and the coverage diff for manifest.go should be positive after cleaning up the now-dead code.

The security vulnerability ("password logging") that the CodeQL check warns for was already there when I got here, and is only in a Tracef call ¯\_(ツ)_/¯

@Pwuts Pwuts changed the title fix: use native docker image ref manipulations fix: improve image name parsing behavior Jan 20, 2023
@Pwuts Pwuts force-pushed the fix/ref-processing branch 4 times, most recently from 675509e to 0a2d4f9 Compare January 20, 2023 17:06
@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 20, 2023

There we go:

  • GetRegistryAddress gives index.docker.io for images specified with registry docker.io, index.docker.io, or without a registry domain
  • EncodedConfigAuth will also try credStore.Get("https://index.docker.io/v1/") if index.docker.io is not found in the config file, since docker login without saves the credentials under that key by default
  • Updated the docs with a heads-up + examples for using local registry hosts

I think that was the last to be done here without further expanding the scope of this PR. :)

@Pwuts Pwuts changed the title fix: improve image name parsing behavior Fix image name parsing behavior Jan 21, 2023
@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 21, 2023

@simskij can I ask for your judgment on the semantic changes so far? :)

@piksel
Copy link
Member

piksel commented Jan 22, 2023

The security vulnerability ("password logging") that the CodeQL check warns for was already there when I got here, and is only in a Tracef call ¯_(ツ)_/¯

Yeah, but there really isn't any need for it anymore. I created a PR (#1534) to remove all the logging that has no clear usage scenario and could potentially leak credentials.


!!! info "Using private images on Docker Hub"
To access private repositories on Docker Hub,
`<REGISTRY_NAME>` should be `index.docker.io`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`<REGISTRY_NAME>` should be `index.docker.io`.
`<REGISTRY_NAME>` should be `docker.io`.

If using docker.io is fully supported, we might as well put it in the docs for new users, no?

Copy link
Contributor Author

@Pwuts Pwuts Jan 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but it is a bit more nuanced than that. Docker uses docker.io internally as the default registry domain, but the Docker Registry API for Docker Hub is hosted on index.docker.io, and so docker libraries expect config.json to contain credentials for index.docker.io.

Worse, the Docker CLI does not recognize any other key than https://index.docker.io/v1/, and docker/cli/cli/config/credentials (that supplies credStore.Get()) applies some backwards compatibility magic to match this key when requesting credStore.Get('index.docker.io'). A call to docker pull PRIVATE_IMAGE (with PRIVATE_IMAGE an image on Docker Hub) fails with config.json only containing an entry for index.docker.io.

Copy link
Contributor Author

@Pwuts Pwuts Jan 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have amended my last 2 commits to "comply" with Docker's own behavior and reflect this in the documentation.

It took a few iterations to get it right and I didn't want to pollute the git history, hence the multiple force-pushes.

Copy link
Contributor Author

@Pwuts Pwuts Jan 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the CLI treats index.docker.io as an exception to the rule of how credentials are stored in config.json, and a comment in docker/cli remarks that this is legacy.

Would you agree that we should play it safe and ensure that a config.json manually generated according to the Watchtower docs should also work with the Docker CLI, even though this means we tell users to go along with this legacy, inconsistent behavior?

@Pwuts Pwuts force-pushed the fix/ref-processing branch 4 times, most recently from 5241c26 to 6a6c7ae Compare January 22, 2023 12:57
@Pwuts
Copy link
Contributor Author

Pwuts commented Jan 22, 2023

Repackaged the changes to reduce diff with upstream and make it easier to CR (per commit)

@Pwuts
Copy link
Contributor Author

Pwuts commented Mar 13, 2023

@piksel any timeline for review/merge?

@piksel
Copy link
Member

piksel commented Mar 14, 2023

No, sorry. But I will take a look as soon as I get some free time.

@piksel piksel self-assigned this Mar 14, 2023
@piksel piksel changed the title Fix image name parsing behavior fix(registry): image name parsing behavior Apr 12, 2023
@piksel piksel merged commit 25fdb40 into containrrr:main Apr 12, 2023
@Pwuts
Copy link
Contributor Author

Pwuts commented Apr 12, 2023

Thanks a lot! :)

@Pwuts Pwuts deleted the fix/ref-processing branch April 12, 2023 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot do a head request for private registry
2 participants