-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Input digest tagger #5192
Input digest tagger #5192
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5192 +/- ##
==========================================
- Coverage 70.68% 70.51% -0.17%
==========================================
Files 408 410 +2
Lines 15581 15624 +43
==========================================
+ Hits 11014 11018 +4
- Misses 3757 3792 +35
- Partials 810 814 +4
Continue to review full report at Codecov.
|
Sorry for the delay in reviewing this @piotr-szybicki. I'll try to look today — this particular issue has been on my wishlist for a bit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @piotr-szybicki — I must apologize profusely for taking so long to review this PR. It generally looks great and is exactly what I was hoping for.
The one big quibble is the name of pkg/skaffold/build/dep
. The team talked about this and we'd like to move this package up in the hierarchy and call it graph
(so pkg/skaffold/graph
).
If you're sick of this PR, I'm happy to take this on.
return "", err | ||
} | ||
h.Write([]byte(fi.Mode().String())) | ||
h.Write([]byte(fi.Name())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this should be a file-path relative to the artifact workspace, and it should use filepath.ToPath()
to ensure it's encoded consistently between Windows and *nix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is no filepath.ToPath()
function I used filepath.Clean()
that seems to do the job. Also, Do you think I should change it in the hash.fileHasher()
? For consistency.
if err != nil { | ||
return "", err | ||
} | ||
h.Write([]byte(fi.Mode().String())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're hashing inputs, I don't think the mode matters since that will be affected by my OS and local umask settings. And the builder itself may change these modes as it's writing files into the layers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absolutely. Do you think I should also remove it from the hash.fileHasher()? Feels like the internal scaffold cache and the tagger should be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehh, never mind, they will never be consistent. I just took a look at the hash.go
code and see that you are also hashing the build arguments. Totally make sense if you consider the development life cycle. Something that I don't think is appropriate to be a part of the image hash (Might be wrong on that one). I mean the way I see it we want to be close to what docker is doing with it's layer.DiffID
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @piotr-szybicki — I must apologize profusely for taking so long to review this PR. It generally looks great and is exactly what I was hoping for.
The one big quibble is the name of
pkg/skaffold/build/dep
. The team talked about this and we'd like to move this package up in the hierarchy and call itgraph
(sopkg/skaffold/graph
).If you're sick of this PR, I'm happy to take this on.
Hi, no problem has been busy with other stuff myself and didn't chase. I looked over the comments and they make total sense I will implement all the changes you requested. I will also add some documentation as generally understanding how this works is not that trivial. As I found out. It's different for docker comparing to Bazel or jib.
Makefile
Outdated
@@ -14,7 +14,7 @@ | |||
GOPATH ?= $(shell go env GOPATH) | |||
GOOS ?= $(shell go env GOOS) | |||
GOARCH ?= $(shell go env GOARCH) | |||
BUILD_DIR ?= ./out | |||
BUILD_DIR ?= /home/piotrszybicki/work/cube |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this, it's removed in the following commit.
663ebbe
to
bb37e68
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @piotr-szybicki! I've put in a few comments. If we can get these addressed, then it'll be ready for commit.
#5492 (and #5504) brought to light that we need to incorporate more than just the inputs — and this input tagger should use the artifact-hashing mechanism that's used elsewhere in Skaffold. Unfortunately this artifact-hasher is not yet exposed in an easy-to-consume manner (mechanism, lister). This is a bigger change and can be done separately if you don't want to take it on (just let me know).
inputs = append(inputs, h) | ||
} | ||
|
||
return encode(inputs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just inline the sha256 encoding of the file path and file contents directly into the for{}
loop? inputs
isn't used for any purpose and just adds memory pressure. Some artifacts — with their dependencies — may have hundreds or thousands of files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have very limited understanding of how this encoder works, but the json.Encoder
that's being used takes the entire set of files and returns a single encoding based on that set. is it possible to encode each file individually and then combine the encodings to achieve the same result?
it's worth noting that the elements of inputs
are the hashes of each file and its contents, not the entire file contents themselves. so while there could be thousands of files, this would be a set of thousands of hashes, each of which is only a few bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the input slice contains only a file paths, I iterate over them in the encode method hashing as you said content and relative path. Maybe the name is misleading.
bb37e68
to
fca78f9
Compare
@piotr-szybicki I took the liberty of rebasing your PR (I made some changes that made the merge conflicts quite annoying for you) - I hope this is ok! I also addressed a few of @briandealwis's last comments - mostly formatting and cleaning suggestions, though I made one change to the arguments to |
@nkubala this is absolutely fine. Thanks for doing it. I was about to do this myself but my full time job is killing me lately. |
@briandealwis Sorry for the late response. But a year of my work is coming to a culmination release over coming weekend. I actually knew about this issue If you take a look at the original comment I made when creating this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once the InputDigest type is fixed up
19ff291
to
75e30c9
Compare
I fix all the comments, and tests. I of the Travis CI Jobs has flaked out on me. Can anyone rerun? |
2951d84
to
2c44e0e
Compare
2c44e0e
to
2388896
Compare
@piotr-szybicki sorry for the delay. do you have some cycles to rebase your PR. I would love to merge this. |
@tejal29 i'm on it :) |
…orArtifact to a separate package to avoid cyclic dependecies problem
…over the file name with the full relative path. Remove mode from the hash. Fix some typos
2388896
to
7099959
Compare
7099959
to
4a6b4b2
Compare
all merge conflicts resolved and CI finally showing all green, so I'm going to merge this! @piotr-szybicki thanks so much for the contribution and all of your patience! |
@nkubala As I see apiVersion: skaffold/v2beta14
kind: Config
profiles:
- name: test-custom-template
build:
tagPolicy:
customTemplate:
template: "release-aaa-{{.DIGEST}}"
components:
- name: DIGEST
inputDigest: {}
artifacts:
- image: quay.io/my/project
context: ../../..
bazel:
target: //project:docker.tar $ skaffold build -p test-custom-template
creating runner: creating tagger: creating tagger: creating components: unknown component for custom template: DIGEST {GitTagger:<nil> ShaTagger:<nil> EnvTemplateTagger:<nil> DateTimeTagger:<nil> CustomTemplateTagger:<nil> InputDigest:0x7a479b8}
. If above error is unexpected, please open an issue https://github.com/GoogleContainerTools/skaffold/issues/new to report this error. Is it by design? Should I file a bug? |
I just reread |
Implementation of inputDigest tagger from the Improve taggers proposal This PR is much bigger than I originally intended. But whichever way I looked I was hitting the circular dependencies. So the first 2 commits are about extracting
build.ArtifactGraph
andbuild.Artifact
into its own package.It does mostly what internal skaffold hasher is doing a minus couple of things:
Some areas to think of (maybe in this or in future PR's):