docs/policy.json.md: Separate transports from names #166

wking
Contributor

@wking wking commented Nov 19, 2016

Spun off from the discussion in #59, this commit decouples image names from the transport used to fetch them. That makes for easier mirroring (no more need for dockerRepository, etc.).

This commit also decouples by-name reference from by-digest reference, since in the digest case you have a cryptographic hash on the images. The only remaining concern for such references is whether the transport can be trusted to not contain malicious images (protecting the naive user from requesting a malicious image by digest), and nameOnly allows you to block digests over transports you don't trust users to pull from responsibly.

If folks feel like this approach has legs, I'm happy to work up the backing Go as well.

@wking wking force-pushed the transport-agnostic-signature-verification branch from be3fe94 to 7acd885 Compare November 19, 2016 04:25
Spun off from the discussion starting in [1], this commit decouples
image names from the transport used to fetch them.  That makes for
easier mirroring (no more need for dockerRepository).

This commit also decouples by-name reference from by-digest reference,
since in the digest case you have a cryptographic hash on the images.
The only remaining concern for such references is whether the
transport can be trusted to not contain malicious images (protecting
the naive user from requesting a malicious image by digest), and
nameOnly allows you to block digests over transports you don't trust
users to pull from responsibly.

[1]: containers#59 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking wking force-pushed the transport-agnostic-signature-verification branch from 7acd885 to 4ed95ec Compare November 19, 2016 04:26
@wking
Contributor Author

wking commented Nov 19, 2016

Related discussion (with lots of complications due to packing transport/namespace information into image names) in #72.

@mtrmac
Collaborator

mtrmac commented Nov 21, 2016

Sorry, no.

As we have discussed elsewhere, names must have semantics, and for the purpose of identifying the appropriate set of policy requirements the namespace of possible names must have an allocation mechanism which prevents conflicts. This PR seems to invent a new category of names, with entirely new semantics (assuming a structure for tags which makes “prefixes” meaningful, or is that supposed to be a literal string prefix? That doesn’t work for version numbers at all), and gives up on the allocation mechanism.

containers/image creates a unified namespace by imposing the existence of top-level containers/image/types.ImageTransport implementations, which primarily exist to govern their own namespaces of image references (types.ImageReference implementations):

Note that ImageTransport is based on "ways the users refer to image storage", not necessarily on the underlying physical transport.

There is no assumption that image identities are reusable across transports; in fact the dir: images as implemented have absolutely no identity other than the path name!

(Yes, then there is a concept that any such types.ImageReference may have a “docker reference” associated, but it is optional, and anyway irrelevant to the scopes used to decide on a set of policy requirements in policy.json: policy.json uses a types.ImageReference to decide on a set of policy requirements, and those may (as in the case of signedBy) require an associated “docker reference”.)

So, if you want to invent some new kind of names, that would be a new ImageTransport implementation and a new set of transport-specific configurations, not a new top-level structure of policy.json.

Meanwhile, as #59 (comment) argues, for usability it is very desirable to base the policy on names which the user is already using.

> That makes for easier mirroring (no more need for dockerRepository, etc.).

It doesn’t. When the user says $tool pull sometransport:mylocalmirror/something/which/contains/a/copy@of!busybox, which is expected to be a mirror of docker://docker.io/busybox, something somewhere needs to be configured to expect the docker://docker.io/busybox identity. Just throwing away the signedIdentity field does not solve this problem; at best it requires the caller of the policy engine to specify this mapping, but that just moves the configuration to a different configuration file.

> This commit also decouples by-name reference from by-digest reference, since in the digest case you have a cryptographic hash on the images.

When the system administrator says that all images from $transport:$hostname are rejected, it would be pretty bad to allow the users to pull any of them simply by using a digest. Sure, in some cases the user may be expected to know a trusted digest, and to exercise due care when obtaining it, but the organization may simply reject $hostname for trust/code quality reasons, or have a policy that there must be a verifiable end-to-end signature for every pulled image (to be recorded in a log, perhaps), whether or not the user used a digest.

> Related discussion (with lots of complications due to packing transport/namespace information into image names) in #72.

In fact #72 is all about image names having transport-dependent semantics: various ways to refer to the same image must always result in the same policy scope search, and for every two distinct images it must be possible to define different policies. How various transports canonicalize names is inherently transport-specific (dir: will not be adding/removing library/ anywhere).
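
To make the point about transport-specific canonicalization concrete, here is a minimal Go sketch. It is not the containers/image implementation: `canonicalDocker`, `canonicalDir` and `canonicalize` are hypothetical helpers, the Docker rule is a simplified version of the usual "default to docker.io and library/" normalization, and the `dir:` rule (lexical path cleaning only) is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// canonicalDocker is a simplified, hypothetical stand-in for Docker-style
// name normalization: a missing registry host defaults to docker.io, and
// single-component docker.io names gain the "library/" namespace.
func canonicalDocker(name string) string {
	first, rest, hasSlash := strings.Cut(name, "/")
	// Treat the first component as a host only if it looks like one.
	if !hasSlash || (!strings.ContainsAny(first, ".:") && first != "localhost") {
		first, rest = "docker.io", name
	}
	if first == "docker.io" && !strings.Contains(rest, "/") {
		rest = "library/" + rest
	}
	return first + "/" + rest
}

// canonicalDir assumes a dir: image is identified by its path alone, so the
// only normalization is lexical cleaning; nothing like "library/" is added.
func canonicalDir(p string) string {
	return path.Clean(p)
}

// canonicalize dispatches on the transport name, because each transport owns
// the rules for its own namespace.
func canonicalize(transport, ref string) (string, error) {
	switch transport {
	case "docker":
		return canonicalDocker(ref), nil
	case "dir":
		return canonicalDir(ref), nil
	default:
		return "", fmt.Errorf("unknown transport %q", transport)
	}
}

func main() {
	for _, in := range [][2]string{
		{"docker", "busybox:1.25.1"},
		{"docker", "library/busybox:1.25.1"},
		{"dir", "/var/lib/oci/image/busybox/"},
	} {
		out, err := canonicalize(in[0], in[1])
		fmt.Println(in[0], in[1], "->", out, err)
	}
}
```

Under these assumptions the two Docker spellings collapse to the same scope, while the `dir:` path is only cleaned, which is the asymmetry described above.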


The PR also makes quite a few undocumented changes, like combining the transport name and path into a single string; from a quick look most of them seem not to be an improvement, but I may just have missed some of them. Anyway, giving up on namespace semantics (or assuming that there is a single one which will fit everyone) is a show-stopper to me.


@runcom Any comments before I close this?

@runcom
Member

runcom commented Nov 21, 2016

> @runcom Any comments before I close this?

nope, closing :)

@runcom runcom closed this Nov 21, 2016
@wking
Contributor Author

wking commented Nov 22, 2016

On Mon, Nov 21, 2016 at 07:20:04AM -0800, Miloslav Trmač wrote:

> … for the purpose of identifying the appropriate set of policy
> requirements the namespace of possible names must have an
> allocation mechanism which prevents conflicts.

I don't think this is true. If Alice and Bob both name busybox:1.25.1
and I trust both of them to name ‘busybox:…’, I can trust images named
by either of them. And if I only trust Alice to name ‘busybox:…’, I'm
not going to trust Bob's name assertion even in the absence of
transport-based namespacing.

> This PR seems to invent a new category of names, with entirely
> new semantics (assuming a structure for tags which makes “prefixes”
> meaningful, or is that supposed to be a literal string prefix? That
> doesn’t work for version numbers at all)…

It is supposed to be literal string prefixing. How does it break down
for versions? A ‘busybox:’ prefix policy works for all version names,
a ‘busybox:1.25.’ prefix policy works for all 1.25 patch releases, and
a ‘busybox:1.25.1’ name policy works only for that exact name.
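
As a rough illustration of the literal prefix matching described here, a minimal Go sketch follows. The `Policy` type and `Lookup` method are hypothetical, and the precedence rule (exact name first, then the longest matching prefix) is an assumption; the PR text does not specify how overlapping prefixes are resolved.

```go
package main

import (
	"fmt"
	"strings"
)

// Policy is a hypothetical, pared-down view of the proposed policy.json:
// exact-name entries plus literal string prefixes, each mapping to labels
// standing in for the requirements that would apply.
type Policy struct {
	Names    map[string][]string
	Prefixes map[string][]string
}

// Lookup returns the requirements for an image name. An exact name wins;
// otherwise the longest matching literal prefix is used (an assumed rule).
// The boolean is false when nothing matches.
func (p Policy) Lookup(name string) ([]string, bool) {
	if reqs, ok := p.Names[name]; ok {
		return reqs, true
	}
	best, found := "", false
	for prefix := range p.Prefixes {
		if strings.HasPrefix(name, prefix) && (!found || len(prefix) > len(best)) {
			best, found = prefix, true
		}
	}
	if !found {
		return nil, false
	}
	return p.Prefixes[best], true
}

func main() {
	p := Policy{
		Names: map[string][]string{"busybox:1.25.1": {"exact"}},
		Prefixes: map[string][]string{
			"busybox:":      {"any-version"},
			"busybox:1.25.": {"1.25-patches"},
		},
	}
	for _, name := range []string{"busybox:1.25.1", "busybox:1.25.0", "busybox:1.24.2", "alpine:3.4"} {
		reqs, ok := p.Lookup(name)
		fmt.Println(name, reqs, ok)
	}
}
```

Because the match is purely textual, the trailing dot matters: a ‘busybox:1.2’ prefix would also match ‘busybox:1.25.1’, whereas ‘busybox:1.25.’ only matches names that continue past the dot.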

> There is no assumption that image identities are reusable across
> transports; in fact the dir: images as implemented have absolutely
> no identity other than the path name!

It's even worse than that. As the master branch of this repository
stands, someone attempting to distribute a signed OCI image via an
image-layout tarball has to include instructions like “before you
verify this image, you need to place it at
/var/lib/oci/image/busybox”. If you happen to be on Windows trying to
validate an image before you launch a Linux VM based on it, you're
presumably out of luck. dockerReference and dockerRepository in the
current master branch provide some relief from this problem, but
they're one-off fixes that don't address mirroring for non-Docker
images.

With this PR, there are no such restrictions. You can just name the
image without worrying about where your validators are going to be
storing it. And if the verifier wants a transport-linked validation
policy, that's completely up to them.

> So, if you want to invent some new kind of names, that would be a
> new ImageTransport implementation and a new set of
> transport-specific configurations, not a new top-level structure of
> policy.json.

I can submit a transport-agnostic transport wrapper if it would help
make these ideas clearer ;).

> When the user says $tool pull sometransport:mylocalmirror/something/which/contains/a/copy@of!busybox,
> which is expected to be a mirror of docker://docker.io/busybox,
> something somewhere needs to be configured to expect the
> docker://docker.io/busybox identity.

To make that explicit, I'd expect this to be:

$ tool pull --transport sometransport:mylocalmirror/something/which/contains/a/copy@of!busybox --name busybox:latest

with a policy like:

{
  "prefixes": {
    "busybox:": [
      {
        "type": "signedBy",
        "keyType": "tuf-RSASSA-PSS",
        "keyPath": "/path/to/official-root-keys.json",
        "signedIdentity": {
          "type": "matchRepository"
        }
      }
    ]
  }
}

and matchRepository would

where the pubkeys are from [1] (or wherever Notary serves the
root.json from [2]).

> When the system administrator says that all images from
> $transport:$hostname are rejected, it would be pretty bad to allow
> the users to pull any of them simply by using a digest.

Agreed, which is why you want a global ‘reject’ and per-transport
‘nameOnly’ for any user/transport pair you don't trust to handle
digests safely.
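
A minimal Go sketch of that combination, a default ‘reject’ plus a per-transport ‘nameOnly’ switch, might look like the following. Everything here is hypothetical: `TransportPolicy`, `checkReference` and the map layout are illustration only, and digest detection is reduced to looking for an `@sha256:` suffix.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// TransportPolicy is a hypothetical per-transport setting; NameOnly mirrors
// the nameOnly idea from the PR description.
type TransportPolicy struct {
	NameOnly bool
}

var errDigestNotAllowed = errors.New("by-digest references are not allowed over this transport")

// isDigestReference reports whether ref pins content by digest. Only the
// "@sha256:" form is recognized, which is a simplification.
func isDigestReference(ref string) bool {
	i := strings.LastIndex(ref, "@")
	return i >= 0 && strings.HasPrefix(ref[i+1:], "sha256:")
}

// checkReference applies a default-reject rule: transports missing from the
// map are refused outright, and nameOnly transports refuse digests.
func checkReference(policies map[string]TransportPolicy, transport, ref string) error {
	tp, ok := policies[transport]
	if !ok {
		return fmt.Errorf("transport %q is rejected by default", transport)
	}
	if tp.NameOnly && isDigestReference(ref) {
		return errDigestNotAllowed
	}
	return nil
}

func main() {
	policies := map[string]TransportPolicy{
		"docker": {NameOnly: true},
		"dir":    {NameOnly: false},
	}
	// The all-zero digest is a placeholder, not a real image digest.
	byDigest := "busybox@sha256:" + strings.Repeat("0", 64)
	fmt.Println(checkReference(policies, "docker", "busybox:1.25.1"))
	fmt.Println(checkReference(policies, "docker", byDigest))
	fmt.Println(checkReference(policies, "dir", byDigest))
	fmt.Println(checkReference(policies, "atomic", "busybox:1.25.1"))
}
```

A nil result here only means the reference form is acceptable for that transport; by-name references would still have to pass whatever signature requirements the name policy imposes.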

> … various ways to refer to the same image must always result in the
> same policy scope search, and for every two distinct images it must
> be possible to define different policies.

I'd rather punt that to the user/publishers to sort out. String
(prefix) matching gets you the [i,i] entries in [3]. If the publisher
wants to name the image ‘busybox:1.25.1’, you need to ask for
‘busybox:1.25.1’ (unless you're using matchRepository). If the
publisher thinks it makes more sense to call the image
‘docker.io/library/busybox:1.25.1’ [4], then that's what you have to
ask for.

If you need more flexibility, you should be able to configure
it with basic regular expressions [5]:

{
  "prefixes": {
    "busybox:": [
      {
        "type": "signedBy",
        "keyType": "tuf-RSASSA-PSS",
        "keyPath": "/path/to/official-root-keys.json",
        "nameRegex": ["^(docker\\.io/){0,1}(library/){0,1}(.*)$", "\\3"]
      }
    ]
  }
}

But it seems overly magical to hard-code too much of that sort of thing
in the library itself.
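
For illustration, the nameRegex rewrite above could be applied in Go roughly as follows. This is a sketch, not proposed library code: `normalizeName` is a hypothetical helper, and Go's regexp package writes the third capture group as `${3}` rather than `\3`.

```go
package main

import (
	"fmt"
	"regexp"
)

// nameRegex is the pattern from the policy example, with the dot in
// "docker.io" escaped so that it matches only a literal dot.
var nameRegex = regexp.MustCompile(`^(docker\.io/){0,1}(library/){0,1}(.*)$`)

// normalizeName rewrites a name to the third capture group, dropping an
// optional "docker.io/" host and an optional "library/" namespace.
func normalizeName(name string) string {
	return nameRegex.ReplaceAllString(name, "${3}")
}

func main() {
	for _, name := range []string{
		"docker.io/library/busybox:1.25.1",
		"library/busybox:1.25.1",
		"busybox:1.25.1",
		"quay.io/example/busybox:1.25.1", // hypothetical name on another host
	} {
		fmt.Println(name, "->", normalizeName(name))
	}
}
```

The first three names all normalize to ‘busybox:1.25.1’, so a single ‘busybox:’ prefix entry would cover them; the last one is left unchanged because neither optional group matches.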

giuseppe pushed a commit to giuseppe/image that referenced this pull request Jan 24, 2017