Import OCI (base) images as-is #51

Closed
tianon opened this issue Sep 19, 2022 · 2 comments · Fixed by #61
Labels
enhancement New feature or request

Comments

tianon commented Sep 19, 2022

In the case of base images (debian, alpine, ubuntu, etc), using a Dockerfile as our method of ingestion doesn't really buy us very much. It made sense at the time it was implemented ("all Dockerfile, all the time"), but at this point they're all some variation on FROM scratch \n ADD foo.tar.xz / \n CMD ["/bin/some-shell"], and cannot reasonably be "rebuilt" when their base image changes (which is one of the key functions of the official images) since they are the base images in question.

Functionally, consuming a tarball in this way isn't that much different from consuming a raw tarball that's part of, say, an OCI image layout (https://github.com/opencontainers/image-spec/blob/v1.0.2/image-layout.md) -- it's some tarball plus some metadata about what to do with it.
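To make the "tarball plus metadata" comparison concrete, here is a minimal sketch of how a blob could be pulled out of an OCI image layout and checked against its digest. It relies only on the layout convention that blobs live at `blobs/<algorithm>/<encoded>`; the function name `read_blob` is hypothetical, and supporting only `sha256` is a simplifying assumption.

```python
import hashlib
from pathlib import Path


def read_blob(layout_dir, digest):
    """Read a blob from an OCI image layout and verify it against its digest."""
    algorithm, _, encoded = digest.partition(":")
    # Assumption for this sketch: only sha256 digests are handled.
    if algorithm != "sha256" or not encoded:
        raise ValueError(f"unsupported digest: {digest!r}")
    # The image layout stores each blob at blobs/<algorithm>/<encoded>.
    data = (Path(layout_dir) / "blobs" / algorithm / encoded).read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    if actual != encoded:
        raise ValueError(f"digest mismatch: expected {encoded}, got {actual}")
    return data
```

Because the blob is addressed by its content, the rootfs tarball is stored byte-for-byte, which is the property the reproducibility argument below depends on.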

For less trivial images, there's a significant difference (and I'm not proposing to use this for anything beyond simple one-layer base images), but for a single layer this would be basically identical.

As a more specific use case, the Debian rootfs.tar.xz files are currently 100% reproducible. Unfortunately, some of that gets lost when it gets imported into Docker, and thus it takes some additional effort to get from the Docker-generated rootfs back to the original debuerreotype-generated file (see debuerreotype/docker-debian-artifacts#147 (comment) for an example where I've done so).

With the ability to consume an OCI image directly, I would be able to use something like debuerreotype/debuerreotype#108 to go even further and have a 100% fully reproducible image digest as well, and it would be easier to trace a given published image back to the reproducible source generated by the upstream tooling.

tianon added the enhancement label Sep 19, 2022

tianon commented Sep 19, 2022

My current plan for implementation is to piggy-back on #43 in order to introduce a new oci-image-layout builder (name still TBD) where Directory: is assumed to point to an OCI image layout (https://github.com/opencontainers/image-spec/blob/v1.0.2/image-layout.md -- the blobs subdirectory being the important bit) and File: is assumed to point to either a JSON file with a full "OCI content descriptor" in it or an image-layout-index.json that contains a single item (which is then the relevant OCI content descriptor).
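As a rough illustration of the two accepted shapes for File:, the sketch below takes the JSON text and returns a single descriptor, whether the file is a bare descriptor or an index.json-style document with exactly one entry. The function name `load_descriptor` is hypothetical and this is not the bashbrew implementation; it only assumes the `manifests` array of an OCI image index and the three required descriptor fields (`mediaType`, `digest`, `size`).

```python
import json


def load_descriptor(text):
    """Return the single OCI content descriptor described by a File: payload."""
    doc = json.loads(text)
    if "manifests" in doc:
        # index.json-style document: accept it only if it lists exactly one item.
        manifests = doc["manifests"]
        if len(manifests) != 1:
            raise ValueError(f"expected exactly 1 manifest, found {len(manifests)}")
        doc = manifests[0]
    # A descriptor must carry mediaType, digest, and size.
    for field in ("mediaType", "digest", "size"):
        if field not in doc:
            raise ValueError(f"descriptor is missing required field {field!r}")
    return doc
```

Rejecting an index with more than one entry keeps the lookup unambiguous, which matches the "single item" restriction described above.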

For now, I'm thinking we will restrict the mediaType field to only allow/accept application/vnd.oci.image.manifest.v1+json.
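A minimal sketch of that restriction, assuming the descriptor has already been parsed into a dictionary. The names `ALLOWED_MEDIA_TYPES` and `check_media_type` are hypothetical; keeping the allowed types in a set is just one way to leave room for relaxing the rule later.

```python
# Only plain OCI image manifests are accepted for now.
ALLOWED_MEDIA_TYPES = frozenset({"application/vnd.oci.image.manifest.v1+json"})


def check_media_type(descriptor):
    """Raise ValueError unless the descriptor points at an accepted media type."""
    media_type = descriptor.get("mediaType")
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unsupported mediaType: {media_type!r}")
    return media_type
```

With this check, a descriptor pointing at an image index (`application/vnd.oci.image.index.v1+json`) would be rejected rather than silently followed.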

Technically, a descriptor is allowed ("OPTIONAL" in the specification) to include urls and annotations, but my initial implementation will likely ignore those. I haven't decided yet whether to ignore platform, or to allow it but also validate that the specified object matches the one we would generate from the relevant Architectures: value for the image. I guess if we might want that validation eventually, we should probably start with it now instead of making a breaking change later?
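The "allow it but validate it" option could look roughly like the sketch below. The `EXPECTED_PLATFORMS` table is an illustrative assumption covering only a few architecture names, not the mapping bashbrew actually uses, and `check_platform` is a hypothetical helper. A missing `platform` is accepted because the field is optional; a present one must agree with what the architecture implies.

```python
# Hypothetical, partial mapping from an Architectures: value to an OCI platform.
EXPECTED_PLATFORMS = {
    "amd64": {"os": "linux", "architecture": "amd64"},
    "arm32v7": {"os": "linux", "architecture": "arm", "variant": "v7"},
    "arm64v8": {"os": "linux", "architecture": "arm64", "variant": "v8"},
}


def check_platform(descriptor, arch):
    """Raise ValueError if the descriptor's platform contradicts the architecture."""
    expected = EXPECTED_PLATFORMS[arch]
    platform = descriptor.get("platform")
    if platform is None:
        return  # platform is OPTIONAL in a descriptor, so there is nothing to compare
    for key, value in expected.items():
        if platform.get(key) != value:
            raise ValueError(
                f"platform {key}={platform.get(key)!r} does not match {value!r} for {arch}"
            )
```

Extra keys in the descriptor's platform (for example `os.version`) are ignored here; whether they should be is part of the open question.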

This will also have implications for scripts like diff-pr.sh, the way we calculate dependent images (these images will be implicitly "FROM scratch"), etc, but I think those are all cases of relevant technical debt that we'd need to update one way or another eventually anyways.

For PoC purposes, I plan to build something as a standalone tool for now, but with the explicit intent to make this part of bashbrew itself in the future -- maybe that's not actually the best plan, and a standalone tool has value too? I'm hoping writing the PoC will help me validate that assumption.

tianon commented Dec 3, 2022

Initial PR: #61 👀
