-
Notifications
You must be signed in to change notification settings - Fork 159
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: aaron-prindle If they are not already assigned, you can assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Super exciting to see this work happening, thanks for kicking it off. Would you mind talking through this at tomorrow's Build WG meeting? Also, this needs test updates, and probably some kind of integration test that build step containers happen in order, since we used to just trust in init containers to do that. |
80d210c
to
d7d8331
Compare
Sure, I can go through the motivations and design of this change at the meeting tomorrow! I've updated the unit tests but am having some trouble with the integration test: TestBuildLowTimeout |
d7d8331
to
6cc5a35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great @aaron-prindle ! I think the really important bit to review is probably in the image itself, to understand the logic that we're using to get the containers to wait before they execute. Before we merge this we'll need to take a look at it - maybe it can be added to this PR? (I think this PR was mostly to get feedback so sorry if I'm rushing you! :D I'm mostly very excited about maybe getting this done before tektoncd/pipeline#252)
shouldWaitForPrevStep, i, i+1)}, | ||
) | ||
c.want.Containers[i].Args = []string{} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im guessing that writing this block of code made it easier to modify the expected test inputs and outputs without having to actually change the test cases - but i think for future readability and maintenance of these tests it'll be clearer to change the test cases themselves
@@ -277,12 +281,26 @@ func MakePod(build *v1alpha1.Build, kubeclient kubernetes.Interface) (*corev1.Po | |||
return nil, err | |||
} | |||
// Prepend the custom container to the steps, to be augmented later with env, volume mounts, etc. | |||
|
|||
// initContainers = append(initContainers, *cust) | |||
// TODO(aaron-prindle) not if I correctly handled this piece.... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at the Git
and GCS
cases, I don't think this block needs to change? I think this is just for a custom method of retrieving source data to operate on
JSONConfigEnvVar = "ENTRYPOINT_OPTIONS" | ||
InitContainerName = "place-tools" | ||
// TODO(aaron-prindle) change this to wherever is sensible | ||
DefaultEntrypointImage = "gcr.io/aprindle-vm-test/entrypoint:latest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm guessing this can get added to the existing release logic
I think I'd actually prefer to get the forklift done first, since after #464 it should be much easier. I agree though, whichever goes first, we should make sure the other work knows it's second, or we're going to waste effort. I think this work is great, but I'd like to start with a design (even just an informal tech talk, possibly at the WG meeting Wednesday) to walk through how it works. And any PR that adds this behavior also needs e2e tests for step ordering, since we can't rely on k8s to order them anymore. For those reasons I think the forklift is going to be quicker. |
Sounds like a good plan to me @imjasonh ! Maybe we can let this PR be more of a POC in that case then. |
2b81f15
to
6d3f04d
Compare
7d792b4
to
024a28f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive question : any reason to use an environment variable instead of /my/wrapper --flag1 --flag2 -- /original/entrypoint --and command
?
pkg/entrypoint/options.go
Outdated
|
||
// Options exposes the configuration necessary | ||
// for defining the process being watched and | ||
// where in GCS an upload will land. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCS stands for Google Cloud Storage right ? We may want to be less specific than that here 👼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry this is fixed now! It is originally adapted from: https://github.com/kubernetes/test-infra/blob/master/prow/entrypoint/options.go#L35-L37
7d7752a
to
8cd9714
Compare
@vdemeester There shouldn't be any particular reason for using ENV vars vs flags. ENV vars are used here as it is based off of the entrypoint image in: https://github.com/kubernetes/test-infra/blob/master/prow/cmd/entrypoint/main.go |
e40f689
to
271a145
Compare
`test-infra` is updated to the latest version to include the required infrastructure. Bonuses: * documented the contents of the `hack` directory; * documented how to use the `release.sh` script;
fd7d566
to
4f56053
Compare
Paired on this w/ @imjasonh today, got integration tests passing! Here were some other work in progress pieces that we discussed:
Edit: regarding the last point, it is my understanding that go-containerregistry uses the correct authentication from the image directly. I am marking this complete, please comment if I am misunderstanding. Edit 2: I've put all these points up top |
The following is the coverage report on pkg/.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So cool to see this moving along! 🎉
My main feedback is that I think we need to:
- break up some of the longer functions into smaller pieces with unit test coverage (this will help address the -25.5 coverage on
pod.go
) - make sure we need all of the functionality in
entrypoint
and make sure entrypoint itself has good unit test coverage
A few more requests:
- Can we add a README for the entrypoint binary?
- In preparation for porting this to build-pipeline, could you include a markdown file that describes how the steps are implemented now? Something we could add to our deep dive docs that @shashwathi started
- Could you squash the commits together into one that explains why we are switching away from init containers, and maybe gives a shout-out to https://github.com/kubernetes/test-infra/tree/master/prow/entrypoint which some of our
entrypoint
binary was based on? - Was this commit included on purpose? 7a0c771
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you include in the comments here some docs on what this binary is and what it does? e.g. like https://github.com/knative/build-pipeline/blob/master/cmd/bash/main.go#L17
ArtifactDir string `json:"artifact_dir,omitempty"` | ||
|
||
*wrapper.Options | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems super similar to https://github.com/kubernetes/test-infra/blob/master/prow/entrypoint/options.go - we should at least give a shout-out to it probably.
also are we 100% sure we need all the same options? (e.g. we don't do automatic artifact uploading afaik, do we need ArtifactDir
?)
pkg/entrypoint/options.go
Outdated
flags.StringVar(&o.PreRunFile, "prerun-file", | ||
DefaultPreRunFile, "The prerun file to wait for.") | ||
flags.StringVar(&o.PostRunFile, "postrun-file", | ||
DefaultPostRunFile, "If postrun file to write.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain more about what "file to wait for" means here? (e.g. the process will wait for this file to exist before xyz)
loader.Complete(fs.Args()) | ||
|
||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add test coverage for these functions + files we're adding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/entrypoint/run.go
Outdated
} | ||
} | ||
return returnCode, commandErr | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id really like to see this function broken up a bit with unit test coverage
pkg/entrypoint/run_test.go
Outdated
// } | ||
// if string(data) != expected { | ||
// t.Errorf("%s: expected contents: %q, got %q", name, expected, data) | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was it intentional to comment this out? if so, let's delete it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was unintentional, the tests are added back now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this in the latest commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be there now
) | ||
|
||
func main() { | ||
fmt.Println("Welcome to succeed test!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is "succeed test"? can you add some docs to this main.go explaining what this binary is for?
@@ -491,3 +525,397 @@ func getFailureMessage(pod *corev1.Pod) string { | |||
// Lastly fall back on a generic error message. | |||
return "build failed for unspecified reasons." | |||
} | |||
|
|||
// from build-pipeline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be broken up into multiple files? also i think this comment needs more details
TokenURI string `json:"token_uri"` | ||
AuthProviderX509CertURL string `json:"auth_provider_x509_cert_url"` | ||
ClientX509CertURL string `json:"client_x509_cert_url"` | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a docker package with this logic in it, with unit test coverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code and all specific code related to docker has been removed thanks to k8schain!
return nil, fmt.Errorf("couldn't get config for image %s: %v", image, err) | ||
} | ||
cache.set(image, cfg.Config.Entrypoint) | ||
return cfg.Config.Entrypoint, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this function needs to get broken up into a few functions (with unit test coverage)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments re: consuming ggcr. I haven't looked at the rest of the PR but figured some pointers would help.
cc @mattmoor we could probably make this more discoverable or easier to use if we added better examples to the READMEs
c.mtx.Unlock() | ||
} | ||
|
||
type authToken struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned this before, but I think like 95% of the added code in this file could go away if you do something like:
kc, _ := k8schain.New(kubeclient, k8schain.Options{
ImagePullSecrets: secret.Name,
})
img, _ := remote.Image(ref, remote.WithAuthFromKeychain(kc))
Obviously handle errors too, but what you're doing here is basically the point of k8schain
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, this is working for me now! Note: I have to use go-contaierregistry w/ this change: google/go-containerregistry#328 due to some k8s version import mismatch. More info in that PR
ClientX509CertURL string `json:"client_x509_cert_url"` | ||
} | ||
|
||
func getGCRAuthorizationKey() ([]authToken, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think k8schain
should be sufficient for you, but I already implemented this in a way that's consumable by ggcr.
Here's an implementation as a keychain:
https://github.com/google/go-containerregistry/blob/e0f538712eb0767ae6959531febe6041fa466602/pkg/v1/google/keychain.go
Here's an authn.Authenticator
implementation that uses the same DefaultTokenSource, you can use it with the remote.WithAuth
option:
https://github.com/google/go-containerregistry/blob/e0f538712eb0767ae6959531febe6041fa466602/pkg/v1/google/auth.go#L45
If you want to get fancy, you could combine your k8schain
with the google.Keychain
using authn.NewMultiKeychain
:
https://github.com/google/go-containerregistry/blob/e0f538712eb0767ae6959531febe6041fa466602/pkg/authn/multikeychain.go
Something like:
// this will first try to authenticate using the k8schain,
// then fall back to the google keychain,
// then fall back to anonymous
mkc := authn.NewMultiKeychain(kc, google.Keychain)
img, _ := remote.Image(ref, remote.WithAuthFromKeychain(kc))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really awesome, this appears to be doing exactly what I wanted for authentication. Thanks for your help @jonjohnsonjr!!
b9e7928
to
bb879ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really love to see this change include a README describing how ordering is maintained without init containers. It's great having tests to prove it works, but anybody looking into it to fix bugs in the future is going to appreciate documentation to back it up.
JSONConfigEnvVar = "ENTRYPOINT_OPTIONS" | ||
InitContainerName = "place-tools" | ||
// TODO(aaron-prindle) change this to wherever is sensible | ||
DefaultEntrypointImage = "gcr.io/aprindle-vm-test/entrypoint:latest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make this a flag, like *credsImage
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
MountPoint = "/tools" | ||
BinaryLocation = MountPoint + "/entrypoint" | ||
JSONConfigEnvVar = "ENTRYPOINT_OPTIONS" | ||
InitContainerName = "place-tools" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be exported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are used in pod_test.go which is why they are exported
pkg/entrypoint/options_test.go
Outdated
name string | ||
input Options | ||
expectedErr bool | ||
}{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: collapse {{
here (and }}
at the end) to save vertical and horizontal space. Put },\n{
on one line like }, {
for the same reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/entrypoint/options_test.go
Outdated
MarkerFile: "marker.txt", | ||
}, | ||
}, | ||
expectedErr: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This can be omitted since it's false by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/entrypoint/options_test.go
Outdated
} | ||
|
||
for _, testCase := range testCases { | ||
err := testCase.input.Validate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This can be an if err := ...
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/entrypoint/run.go
Outdated
done <- command.Wait() | ||
}() | ||
select { | ||
case err := <-done: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto about case commandErr = <-done:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/entrypoint/run_test.go
Outdated
// } | ||
// if string(data) != expected { | ||
// t.Errorf("%s: expected contents: %q, got %q", name, expected, data) | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this in the latest commit.
name string | ||
input Options | ||
expectedErr bool | ||
}{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto about collapsing whitespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change should be in the commit here now
@@ -43,7 +43,7 @@ func TestValidateBuild(t *testing.T) { | |||
Template: &v1alpha1.TemplateInstantiationSpec{ | |||
Arguments: []v1alpha1.ArgumentSpec{{ | |||
Name: "foo", | |||
Value: "hello", | |||
Value: "hello-world", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes don't seem necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a WIP but these changes are unfortunately necessary for (until I change some pieces around) as these tests hit MakePod and MakePod now calls out to the internet to fetch image metadata. This is a temporary hack as "hello-world" is a docker image but this is obvi not great as now unit tests have an external image dependence and require internet access. I am working on fixing this.
test/serviceaccount/secret.yaml
Outdated
@@ -15,6 +15,7 @@ apiVersion: v1 | |||
kind: Secret | |||
metadata: | |||
name: test-readonly-credentials | |||
# type: kubernetes.io/dockerconfigjson |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why comment this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing originally, i had a setup that only worked with the dockerconfigjson format so I modified this secret to be of that type for my tests. Now it works for either type (dockercfg or dockerconfigjson), I will remove these comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bb879ac
to
b13327b
Compare
df2f486
to
2fa7aab
Compare
@aaron-prindle: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
What is the problem being solved? This PR adds an entrypoint image which is capable of running another command with some pre-run/post-run hooks. The goal of this is to allow for the the entrypoint of a container in a Build Step to have it's entrypoint rewritten by mounting this entrypoint image into the container provided in a Step, rewriting the entrypoint, then running the user container w/ the additional pre-run/post-run hooks. The addition of this image enables the removal of init containers as per knative#521. This was originally part of knative#470 but is now split out. What future work remains to be done? This PR adds the entrypoint image and release step. The future work involved is modifying Build to stop using init container and use this entrypoint image as a way of making that possible.
What is the problem being solved? This PR adds an entrypoint image which is capable of running another command with some pre-run/post-run hooks. The goal of this is to allow for the the entrypoint of a container in a Build Step to have it's entrypoint rewritten by mounting this entrypoint image into the container provided in a Step, rewriting the entrypoint, then running the user container w/ the additional pre-run/post-run hooks. The addition of this image enables the removal of init containers as per knative#521. This was originally part of knative#470 but is now split out. What future work remains to be done? This PR adds the entrypoint image and release step. The future work involved is modifying Build to stop using init container and use this entrypoint image as a way of making that possible.
What is the problem being solved? This PR adds an entrypoint image which is capable of running another command with some pre-run/post-run hooks. The goal of this is to allow for the the entrypoint of a container in a Build Step to have it's entrypoint rewritten by mounting this entrypoint image into the container provided in a Step, rewriting the entrypoint, then running the user container w/ the additional pre-run/post-run hooks. The addition of this image enables the removal of init containers as per knative#521. This was originally part of knative#470 but is now split out. What future work remains to be done? This PR adds the entrypoint image and release step. The future work involved is modifying Build to stop using init container and use this entrypoint image as a way of making that possible.
Note this is now getting broken up into separate PRs, starting with #522 |
It looks like #522 was closed without merging. Is the removal of init containers dead? |
Nope, just set back somewhat due to staffing, and moving the meat of the work to the Pipelines repo since that's where new development is happening. tektoncd/pipeline#448 is part of this work. |
@evankanderson tektoncd/pipeline#224 is probably the issue to watch if you're interested :D |
Fixes #9
Proposed Changes
This PR modifies the PodSpec generated by build to run Steps as normal containers and not initcontainers. As Build requires Steps to run in a linear fashion and normal containers would typically all run at the same time, entrypoint rewriting is used in order to enforce the run order of the containers. This should fix some of the issues described in #9 as normal container logs are not cleared as aggressively as initContainer logs. This change change will also allow Build to use sidecar containers in the future.
Known Issues (These need to be fixed):
12/19/2018 issue update:
As entrypoint is rewritten for these containers, containers relying on runningEntrypoint
and notCommand
will not work as expected. This can be resolved by fetching and storing the container entrypoint information directly as done here: Load entrypoint from remote tektoncd/pipeline#183Support .dockercfg secrets (not just .dockerconfigjson) for entrypoint rewritingHandle multiple registries (gcr.io, us.gcr.io, other registries, etc better for metadata server solution) (should prob just use the prefix from the image here)Add a test, test-succeed that verifies that ko built, private images are pullable/runnableAdd a test, test-step-ordering that verifies that steps are run in the correct orderUpdate pod ResourceRequirements rewriting to use the max requirement from the set of containersEnsure that the correct registry is used when attempting to auth with secrets, see: https://github.com/knative/build/blob/master/test/docker-basic/0-secret.yamlEdit: regarding the last bullet, it is my understanding that go-containerregistry uses the correct authentication from the image directly. I am marking this complete, please comment if I am misunderstanding.
Additional points: