Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: incorrect templating not producing error #765

Closed

Conversation

mikeykhalil
Copy link
Contributor

@mikeykhalil mikeykhalil commented Apr 17, 2019

Changes

Fixes issue #627.

This PR allows tasks to be more properly validated when being created. This allows users to catch templating errors earlier on.

This PR also addresses an issue where resource variable validation would basically maintain a global list of variables. However, I believe we really want to create 2 list of variables. One list for input resource variables and one list for output resource variables.

I am still not 100% convinced this is the best way to solve the issue, but it does seem to do the trick. Will sleep on it, but happy to address feedback on this PR or entertain other ideas on how to resolve this issue.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes tests (if functionality changed/added)
  • Includes docs (if user facing) (note: I didn't write any docs for this, seems unnecessary).
  • Commit messages follow commit message best practices

See the contribution guide
for more details.

Release Notes

Return error to user for incorrect templating of resource variables.

@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mikeykhalil
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: dlorenc

If they are not already assigned, you can assign the PR to them by writing /assign @dlorenc in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 17, 2019
@googlebot googlebot added the cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit label Apr 17, 2019
@tekton-robot tekton-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 17, 2019
return validateVariables(steps, "resources", "outputs.", outputVars)
}

func getInputResourceVariables(inputs *Inputs) (map[string]struct{}, *apis.FieldError) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why you chose to use two separate functions for input & output here - it seems like it would be possible to write a single function to handle both with a signature like:

func getResourceVariables(resources []TaskResource, pathPrefix string)

I might be misreading but it looks like the only line of difference between getInputResourceVariables and getOutputResourceVariables is the path string in the returned apis.FieldError, which could be resolved using a pathPrefix arg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're totally right! There was plenty of duplicate code there. I went ahead and made these changes. Thanks for the suggestion.

// foo.bar -> foo
// foo.bar.baz -> foo
vars[i] = strings.SplitN(groups["var"], ".", 2)[0]
vars[i] = groups["var"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a hard time figuring out if the logic has remained the same here or has been changed to make an intentional difference to behaviour. Would you mind just quickly describing what this change is?

Copy link
Contributor Author

@mikeykhalil mikeykhalil Apr 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I intentionally made this change.

I added a new test to the templating package which hopefully provides additional insight as to why this was needed. This new test would fail without this change.

strings.SplitN(groups["var"], ".", 2)[0] was just returning the first split of a variable. So if groups["var"] was set to foo.bar.baz, it would just return foo.

Since the vars map passed into ValidateVariable can contain nested variables now, like fooImage.url, I believe we want to return the entire variable string to validate against.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK got it, thankyou, I understand the original issue and this fix more clearly now!

@ghost
Copy link

ghost commented Apr 17, 2019

I pulled down the branch and tried out the validations by breaking and un-breaking a variable in the resource list of an example Task. Worked well!

@bobcatfish
Copy link
Collaborator

/ok-to-test

@tekton-robot tekton-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 17, 2019
Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @mikeykhalil !! I have a few comments and also:

Your description of the changes in the PR is pretty detailed but the commit message itself (8f5dcbc) is pretty brief - can you add the detail from the PR description to the commit message? (https://github.com/tektoncd/pipeline/blob/master/CONTRIBUTING.md#commit-messages can provide some guidance if needed)

I don't think release notes are necessary for this, but happy to write a description if someone thinks otherwise.

We should be more clear about guidelines here! I think any change that potentially would be visible to a user should get a release note - which would include a bug fix like this one :D I'm happy to fill it in if you'd rather not tho

return vars, nil
}

func getOutputResourceVariables(outputs *Outputs) (map[string]struct{}, *apis.FieldError) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's worth having unit tests for getOutputResourceVariables and getInputResourceVariables specifically - that might also help address @sbwsg 's feedback about knowing exactly what changed, b/c we'd be able to see it in the test cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed slightly and has been reduced to one function, getResourceVariables. Your comment is still completely valid though. I have an idea for a test that could be constructive for getResourceVariables, but I want to put a bit more thought into it so that it's not tightly coupled with the current structure of a given resource type. Will circle back to this tomorrow.

@mikeykhalil
Copy link
Contributor Author

Thanks for the initial reviews! Sorry for some of the sloppiness in the original PR. Ended up working on that past my bedtime and it showed 🤣.

Anyways, I think I addressed most of the comments.

2 items that still need to be addressed are:

  1. Fix commit message to meet contributing standards. I was planning on squashing all of the commits and then writing a detailed commit message. Let me know if this is okay, otherwise, I can add details to each of the commit messages.
  2. Add tests for getResourceVariables.


// AttributesFromType returns a list of available attributes that can be replaced
// through templating.
func AttributesFromType(prt PipelineResourceType) ([]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could have some unit tests for AttributesFromType too :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you recommended this! Ended up running into an issue where this function was not working for one of the types (PipelineResourceTypeStorage). I included a fairly kludgy solution and a comment to help illustrate the problem.

We have run into this unfortunate event where technically we cannot determine what attributes an object has strictly from the PipelineResourceType alone. This is because of the subtypes that exists for the PipelineResourceTypeStorage type. At the moment, both of the subtypes happen to have equivalent keys in their Replacements(). So technically we can just pick one of the subtypes and still return the proper attributes. However, we may want to reconsider the approach on how we obtain or store these attributes.

I do not have more time to look at this at the moment, but will find time soon to try to come up with a more elegant solution.

@bobcatfish
Copy link
Collaborator

Fix commit message to meet contributing standards. I was planning on squashing all of the commits and then writing a detailed commit message. Let me know if this is okay, otherwise, I can add details to each of the commit messages.

Sounds good!

Add tests for getResourceVariables.

And AttributesFromType looks like another good candidate :D

But generally I think this is nearly good to go @mikeykhalil , looking great! :D 🎉 🎉 🎉


{
Name: "type",
Value: string(PipelineResourceTypeGCS),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PipelineResourceTypeStorage could be something else than PipelineResourceTypeGCS though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I think this is the blocking issue at the moment. I left a comment above in the code which goes into more detail.

We have run into this unfortunate event where technically we cannot determine what attributes an object has strictly from the PipelineResourceType alone. This is because of the subtypes that exists for the PipelineResourceTypeStorage type. At the moment, both of the subtypes happen to have equivalent keys in their Replacements(). So technically we can just pick one of the subtypes and still return the proper attributes. However, we may want to reconsider the approach on how we obtain or store these attributes.

Still not exactly sure how to go about solving this problem, but I will think about it some more and update this PR with a better solution once I find the time.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like there isn't enough information in a TaskResource struct to correctly infer the type of a PipelineResourceTypeStorage. The exact storage type only gets determined when a TaskRun is created I think, right? It doesn't seem to me like there's a way to validate storage when the Task is applied since its type parameter wont be declared until later.

Options I can see right now are:

  1. Leave the code as it is :- make an assumption about the storage type. Future storage types will need to include these keys to validate or will need to change this validation code.
  2. Augment the TaskResource with enough info to infer the correct storage type to validate against. This seems counter to the design of PipelineResourceTypeStorage - the dynamic binding of the storage's exact type is intentionally left until the TaskRun to allow different sources to feed through a pipeline, I think?
  3. Agree on a set of templatable params that all storage types must implement.
  4. Iterate over all known storage types and check whether the templated fields match at least one of those storage kinds.
  5. Abandon validating storage params in a Task template. Maybe do it in TaskRuns instead.

Have I misunderstood the implementation of TaskResource / TaskResourceBinding? Are there other options available to us here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your analysis seems correct to me and I was thinking about a few of those options as well.

I agree that option 2 seems to counter the current design, though it's possible I have a misunderstanding.

I think the rest of the options you mentioned are solid approaches.

In my mind, option 3 sounds like an ideal approach and a convenient characteristic for all types of PipelineResourceTypeStorage to share. However, this might not be realistic depending on the types of PipelineResourceTypeStorage we plan to support in the future.

Option 4 seems like something that could be built now with not much effort and provide immediate benefit. There could be some cases in the future where different
types of PipelineResourceTypeStorage don't share the same templatable params, in which case we would only be able to catch certain errors at TaskRun execution time.

Option 5 seems reasonable and provides similar benefits as option 4. We will catch most kinds of templating errors earlier on but still have certain errors at TaskRun execution time.


After some thought, if we feel we could reach a consensus on option 3 then that seems like the most robust approach.

If that's not something we think we can enforce, then option 4 or 5 sound great to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 3 seems like a design that would be worth raising with the community - I doubt we could confidently come to a widely agreeable consensus in this github thread alone. So the process for that would be to create a github issue describing the problem along with a suggested resolution and design doc if the problem is gnarly enough (this one is, I think).

It may also be that we can push this specific problem back until there is more movement on Pipeline Resource Extensibility, which may end up setting the "validatable surface area" simply as a byproduct of the interfaces that it defines for extensions.

Having said all that it seems to me that we could move ahead with options 4 or 5 in this PR and work on option 3 out-of-band. If it were me I'd probably skip validating storage types completely, leave a comment explaining why, and create a github issue to describe the problem and possible solutions in detail.

Copy link

@ghost ghost May 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually thinking about this some more, I reckon simply pushing the validation to the TaskRun (if at all possible, otherwise skipping for now) is totally fine. The whole point of the PipelineResourceTypeStorage is that it's a placeholder to be filled in later. The Task shouldn't need to really worry about validating it deeply. At least from my POV.

Copy link

@ghost ghost May 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now realizing that option 5, skipping storage params, is kind of tricky. It's not possible to easily express "ignore variables of this type" when doing the template validation atm. 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, if the task shouldn't have to worry about validating it deeply, then it's possible that we do not really want this kind of patch and may just want to revisit this in a more broad way with a github issue and a design doc at some point.

Otherwise, it might be possible to go with a hybrid approach where we deeply validate all types of variables besides storage. Then for storage params, we just validate the top level variable (similar to how the validation works currently). But yeah, not sure if there is an easy way to express that as you said.

Been busy so I haven't had time to take a look, but I can take a look this weekend to see if i can come up with a nice way to solve it.

@dlorenc
Copy link
Contributor

dlorenc commented May 25, 2019

It looks like this needs a small rebase now, are you still interested in getting this finished up @mikeykhalil ?

@tekton-robot
Copy link
Collaborator

@mikeykhalil: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-tekton-pipeline-go-coverage 43eaa8b link /test pull-tekton-pipeline-go-coverage
pull-tekton-pipeline-build-tests 43eaa8b link /test pull-tekton-pipeline-build-tests
pull-tekton-pipeline-unit-tests 43eaa8b link /test tekton-pipeline-unit-tests
pull-tekton-pipeline-integration-tests 43eaa8b link /test pull-tekton-pipeline-integration-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mikeykhalil
Copy link
Contributor Author

It looks like this needs a small rebase now, are you still interested in getting this finished up @mikeykhalil ?

Hey, sorry for the lack of activity from me. A few things came up and I was out of town for a bit.

It seems like there is not a clear way to deeply validate all kinds of variables except for those of type storage. And even if there was, the behavior would be inconsistent and possibly confusing for users.

For now, I think the consensus is that we are better off closing this PR and maybe addressing this issue once #238 gains more traction.

Thanks for all of the help, everyone! It's a bummer we couldn't get something merged to improve the validation at task creation time, but I think we at least have some more context for how we may want to address the issue in the future. Feel free to reopen or ping me if I can help in some way.

@bobcatfish
Copy link
Collaborator

Thanks for all your hard work @mikeykhalil !!! Thanks for helping us investigate and understand this problem better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants