Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream processors not supported by differ #4263

Open
tonistiigi opened this issue May 17, 2020 · 1 comment
Open

Stream processors not supported by differ #4263

tonistiigi opened this issue May 17, 2020 · 1 comment

Comments

@tonistiigi
Copy link
Member

Despite the fact that stream processors are defined in the diff package they are not supported by the differ and only by implementations of Apply() function.

The differ continues to use hardcoded compression step, only supporting uncompressed and gzip. From the docs and the design intention, it seems quite logical that stream processors should be supported on both sides.

Looking at the code, I don't see much issues with supporting the external binary logic via toml configuration. But the Go API for stream processors seems very much designed with only the Apply use-case in mind.

The first issue I see is that return media type is only accessible after processor has been initialized(and might already read data). That means that when finding a processor, only the accept-mediatype can be used. The return mediatype should also be available before initialization in the handler. All the implementations already use it this way, they just currently pass and store the static mediatype from the handler level to the processor implementation. Because of this, the current processor lookup is also buggy in cases where multiple processors would chain together.

The other and much more complicated issue is that the Go API is solely based on ReadCloser, while differ implementations are usually based on writers. One can, of course, use an extra goroutine and io.Pipe to turn a readcloser into a writecloser, but that would be quite inefficient. For example in default gzip case gzip.NewWriter would turn into a ReadCloser with io.Pipe and then when differ writes to archive.WriteDiff that ReadCloser would be turned back to Writer with another io.Pipe. A solution for this would be to allow both reader and writer based implementations as a processor provider and both types of processor chains. That should make sure that default cases don't do io.Pipe and complex cases do at most one. Another option would be to already define a reader/writer type in the registration/config level. This looks quite ugly from API but would work because in reality we only have processors that either convert to uncompressed layer mediatype or from uncompressed layer mediatype. A cleaner API would then have been to instead of trying to define a generic conversion between mediatypes to define compressors/decompressors for layers.

If there is no plan to support stream processors for this, that would be fine for my use case as well. As I mostly just use libraries, I could just add a callback option to differ where I can pass the WriteCloser for compression.

@fuweid @cpuguy83

@fuweid fuweid added this to the 1.5 milestone May 18, 2020
@fuweid
Copy link
Member

fuweid commented May 18, 2020

SGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants