Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to use same Pipeline with different Resources #200

Closed
bobcatfish opened this issue Oct 30, 2018 · 10 comments · Fixed by #248
Closed

Ability to use same Pipeline with different Resources #200

bobcatfish opened this issue Oct 30, 2018 · 10 comments · Fixed by #248
Assignees
Labels
design This task is about creating and discussing a design

Comments

@bobcatfish
Copy link
Collaborator

bobcatfish commented Oct 30, 2018

(Discovered by @pivotal-nader-ziada while working on #177)

This issue is about deciding on the solution we prefer (see "possible solutions" below) and implementing the change.

Expected Behavior

A user should be able to take an existing Pipeline and run it against their own setup without
modifying the Pipeline itself, meaning using their own:

  1. GitHub fork / branch (e.g. for PR testing)
  2. Image registry
  3. k8s cluster (to deploy to)

Actual Behavior

In the current implementation, to change (3), a user would create their own PipelineParams, so they would not need to change the Pipeline.

But to change (1) or (2), they would have to create new Resources with their specific configuration. Since Resources are bound in the Pipeline, this means they would need to change the Pipeline as well.

We are considering moving clusters out of PipelineParams and into a Resource type (#177), but this would mean that for a user to use their own cluster, they would again need to change the Pipeline.

Possible solutions

@pivotal-nader-ziada and I have prototyped 7 possible solutions in yaml in this branch.

  1. We expand PipelineParams to include a generic list of keys and values, which can be templated
    inside all types (e.g. Pipelines, Resources, etc.).

    See example_option_1:

    • pipelines/kritis-resources.yaml contains templating for values like URL and revision
      in the Resources
apiVersion: pipeline.knative.dev/v1alpha1
kind: PipelineResource
metadata:
 name: kritis-resources-git
 namespace: default
spec:
 type: git
 params:
 - name: url
   value: ${url}
 - name: revision
   value: ${revision}
  • pipelineparams.yaml contains the values for these params
    params:
    - name: resources.kritis-resources-git.url
      value: https://github.com/grafeas/kritis
    - name: resources.kritis-resources-git.revision
      value: master
    - name: resources.kritis-resources-test-git.url
      value: https://github.com/grafeas/kritis-test
    - name: resources.kritis-resources-image.url
      value: gcr.io/staging-image
    - name: resoruces.kritis-test-cluster.endpoint
      value: 'https://prod.gke.corp.com'

_This is very similar to option 3, the difference is that in this version we still have
Resource CRDs, which would need to be updated on any changes to PipelineParams.

  1. PipelineParams can override values inside of Resources.

    This would be basically the same as option 1, except limited to Resources instead of working
    for any CRD.

  2. Instead of declaring Resources as separate CRDs, we define them all inside PipelineParams

    See example_option_3:

    • pipelineparams.yaml now contains all Resource definitions
    resources:
      - name: kritis-resources-git
        spec:
          type: git
          params:
          - name: url
            value: https://github.com/grafeas/kritis
          - name: revision
            value: master
      - name: kritis-resources-test-git
        spec:    
          type: git
          params:
          - name: revision
            value: master
          - name: url
            value: https://github.com/grafeas/kritis-test  
      - name: kritis-resources-image
        spec:  
          type: image
          params:
          - name: url
            value: gcr.io/staging-image
      - name: kritis-test-cluster
        spec:  
          type: cluster
          params:
          - name: endpoint
            value: 'https://prod.gke.corp.com' 
  • Resources are no longer their own type of CRD
  1. Instead of binding Resources in Pipeline, we bind them in PipelineRun (and TaskRun)

    See example_option_4:

    • kritis-pipeline-run.yaml now contains all resource bindings
    taskSourceBindings:
    - name: unit-test-kritis
      inputSourceBindings:
      - key: workspace # bind to the name in the task
        resourceRef:
          name: kritis-resources-git
    - name: push-kritis
      inputSourceBindings:
        - key: workspace # bind to the name in the task
          resourceRef:
            name: kritis-resources-git
      outputSourceBindings:
        - key: builtImage # bind to the name in the task
          resourceRef:
            name: kritis-resources-image
  • pipeline/kritis.yaml still contains the passedConstraints so that it
    can express the order
    tasks:
      - name: unit-test-kritis          # 1.  Run unit Tests
        taskRef:
          name: make
        params:
          - name: makeTarget
            value: test
      - name: push-kritis               # 2.  Build And Push Tests
        taskRef:
          name: build-push
        inputSourceBindings:
          - key: workspace
            passedConstraints: [unit-test-kritis]
        params:
          - name: pathToDockerfile
            value: deploy/Dockerfile
      - name: deploy-test-env           # 3. Finally Deploy to Test environment
        taskRef:
          name: deploy-with-helm
        inputSourceBindings:
          - key: workspace
            passedConstraints: [push-kritis]

Cons:

  • Resources are now handled in both the Pipeline and the PipelineRun
  1. A combination of (4) + (2): if you want to override a value, you can do that by binding an overridden resource in (4).

    We ruled this one out because we feel it is too confusing.

  2. No PipelineParams; people need to change the Pipeline. Move the ServiceAccount and Result store information into the Pipeline itself.

    See example_option_6:

    • The PipelineParams are gone, values are moved into pipeline/kritis.yaml
spec:
    tasks:
    ....
    serviceAccount: 'demoServiceAccount'
    results:
        runs:
          name: 'runsBucket'
          type: 'gcs'
          url: 'gcs://somebucket/results/runs'
        logs:
          name: 'logBucket'
          type: 'gcs'
          url: 'gcs://somebucket/results/logs'
        tests:
          name: 'testBucket'
          type: 'gcs'
          url: 'gcs://somebucket/results/tests'        
  • The PipelineRun in invocations/kritis-pipeline-run.yaml now has very
    little in it.
apiVersion: pipeline.knative.dev/v1alpha1
kind: PipelineRun
metadata:
  name: kritis-pipeline-run-12321312984
  namespace: default
spec:
    pipelineRef:
        name: kritis-pipeline
    triggerRef:
        type: manual

Cons:

  • Still need to change 2 things: create a Resource with the configuration you
    want, and change the Pipeline to use that Resoruce
  1. Either (1) or (2) but instead of using PipelineParams, use PipelineRun
    and TaskRun.
@bobcatfish bobcatfish added the design This task is about creating and discussing a design label Oct 30, 2018
@bobcatfish
Copy link
Collaborator Author

I am currently leaning toward option (3).

@tejal29
Copy link
Contributor

tejal29 commented Oct 30, 2018

Wov! Thanks for all the detailed examples and description.
That said, in my head the problem is:
We want to create multiple instances of same resource/artifact with one more different values.
Right now, in the CRD we are embedding the object instance information as well (in case of Git resource it is version, in case of Image its digest)

One approach would be to create CRD with revision being auto-populated by the system.

type Resource struct {
	Name string               `json:"name"`
	Type PipelineResourceType `json:"type"`
        URL string `json:"url"`
        // Revision is a version representing this object.
	// Populated by the system.
	// Read-only.
	// +optional
      Revision string
}
type GitResource struct {
    *Resource
}

And then, PipelineParameters will essentially be a list of Resource Versions used in the pipeline and few more special concepts like clusters etc.
I think this is essentially the first solution which you described.

But all that said, i am inclined to number 3. All parameters will be handled in the same way.

@bobcatfish
Copy link
Collaborator Author

Right now, in the CRD we are embedding the object instance information as well (in case of Git resource it is version, in case of Image its digest)

This applies to all of the fields in each Resource, so for a Git Resource we are looking at:

type GitResource struct {
	Name string               `json:"name"`
	Type PipelineResourceType `json:"type"`
	URL  string               `json:"url"`
	// Git revision (branch, tag, commit SHA or ref) to clone.  See
	// https://git-scm.com/docs/gitrevisions#_specifying_revisions for more
	// information.
	Revision string `json:"revision"`
}

Particularly the URL would be of interest, e.g. for using forks (like github.com/knative/build-pipeline vs github.com/bobcatfish/build-pipeline)

And for an image we're looking at:

// ImageResource defines an endpoint where artifacts can be stored, such as images.
type ImageResource struct {
	Name   string               `json:"name"`
	Type   PipelineResourceType `json:"type"`
	URL    string               `json:"url"`
	Digest string               `json:"digest"`
}

Where again URL is a value likely to need to change, if I wanted to run against my own registry.

But all that said, i am inclined to number 3. All parameters will be handled in the same way.

kk I like 3 as well, I think we should go for it :D

@bobcatfish bobcatfish self-assigned this Nov 2, 2018
@bobcatfish
Copy link
Collaborator Author

I wonder if option 3 makes it harder for a user to extend w/ custom resources

If Resources were separate CRDs, it's possible that a user could add their own controllers for their specific types of reosurces 🤔

@bobcatfish bobcatfish added this to the 0.0.1 Alpha release milestone Nov 3, 2018
@bobcatfish
Copy link
Collaborator Author

If Resources were separate CRDs, it's possible that a user could add their own controllers for their specific types of reosurces 🤔

I'm really fixated on this idea and plan to write up another issue to discuss it as soon as i can! Puts a bit of wrench in option (3) tho :J

@nader-ziada
Copy link
Member

The idea of allowing users to extend the system by adding their own resources is important, this flexibility can help drive adoption of our pipeline significantly. I do think it will requires more work to enable this anyways.

so now I'm leaning towards a combination of (4) and (6).

Moving the binding of the resources and the results to the pipelineRun will make it consistent with TaskRun

The user should flow the same constructs when running one task with a TaskRun or a pipeline with a PipelineRun, and we don't have any similar construct to PipelineParams for tasks

and since we already changed the Cluster to be a resource, the only thing in PipelineParams is the results, which I think should also move to the PipelineRun to be consistent with the TaskRun

Pros:

  • Pipelines and Tasks are consistent
  • one less CRD to worry about it
  • Pipeline, Task and Resource are for definition while PipelineRun and TaskRun are for running the tasks with the appropriate version

Cons:

  • user has to add the resource in both Pipeline and PipelineRun

@bobcatfish
Copy link
Collaborator Author

(p.s. wrote up #238 about extensibility via Resources)

@bobcatfish
Copy link
Collaborator Author

bobcatfish commented Nov 10, 2018

@pivotal-nader-ziada I'm also leaning toward (4) now that we talk about it more

Do you have any thoughts on how we can improve this situation where we now have to talk about Resources in both the Pipeline (b/c we have to express order) AND the PipelineRun?

For example in the Pipeline:

      - name: push-kritis               # 2.  Build And Push Tests
        taskRef:
          name: build-push
         inputSourceBindings:
          - key: workspace
            passedConstraints: [unit-test-kritis]
        params:
          - name: pathToDockerfile
            value: deploy/Dockerfile

And in the PipelineRun:

 - name: push-kritis
      inputSourceBindings:
        - key: workspace # bind to the name in the task
          resourceRef:
            name: kritis-resources-git
          version: d8d8dd8
      outputSourceBindings:
        - key: builtImage # bind to the name in the task
          resourceRef:
            name: kritis-resources-image

My gut reaction is to suggest that we kill two birds with one stone, and tackle #137 by making the passedConstraint an attribute of the Task instead of the Resource, e.g. the first snippet in the above example would become:

      - name: push-kritis               # 2.  Build And Push Tests
        taskRef:
          name: build-push
          passedConstraints: [unit-test-kritis] 
        params:
          - name: pathToDockerfile
            value: deploy/Dockerfile

This way the Pipeline itself only expresses the ordering.

Not sure if the Params should stay in the Pipeline in this design - @dlorenc has pointed out that params are looking more and more like Resources also.

@nader-ziada
Copy link
Member

nader-ziada commented Nov 12, 2018

The one issue I can think of regarding making passedConstraint on the task level is what if the one task has two inputs each one of them from a different previous task? I think the granularity of the passedConstraint on the resource makes building pipeline more powerful

but how about the following idea:

  • rename passedConstraint to dependsOn

pipeline

     - name: push-kritis                     
        taskRef:
          name: build-push
         inputSourceBindings:
          - key: workspace
             resourceRef: myGitRepo
            dependsOn: [unit-test-kritis]      # renamed from passedConstraint
        params:
          - name: pathToDockerfile
            value: deploy/Dockerfile

pipelineRun

- name: push-kritis
      inputSourceBindings:
        - key: workspace                        # bind to the name in the task
           version: dc6758e 
      outputSourceBindings:
        - key: builtImage                       # bind to the name in the task
          # no version needed for output since it will still be created

not sure if outputSourceBinding needs to be part of pipelineRun at all?

@bobcatfish
Copy link
Collaborator Author

@pivotal-nader-ziada + @shashwathi and I talked some of this through today and came to some conclusions:

  • The version field would only let a user override some values in a Resource, not all of them, for example for a git resource you could change the commitish but not the repo (e.g. to point at a fork), and it's hard to say where to draw the line
  • We like the idea of having a user create a new Resource when they want to change a Resource
  • Due to the work that @shashwathi is doing on Link outputSourceBindings from one PipelineTask to inputSourceBindings of next task #148 it looks like we won't need to use the version field to pass metadata between Tasks when a Task depends on the output of another Task

Therefore we'd like to try out this design:

  1. Resources instances are referred to in PipelineRuns (not Pipelines)
  2. Pipelines still link Resources to tasks via passedConstraint (new name coming soon see passedConstraints is confusing and non intuitive way of defining task dependency. #137)
  3. We remove the version field

bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 21, 2018
This change makes it possible to reuse a Pipeline with different
Resources without having to change the Pipeline itself. It does this by
moving the linking of the Resources a Task requires from the Pipeline
into the PipelineRun.

The relationship between Resources and the Tasks that are expected to be
executed on them is still expressed in the Pipeline (via `providedBy`).

ResourceBindings move from Pipeline to PipelineRun and become
ResourceDependencies. No longer calling these "bindings" in the API or
using the term "source" fixes + the additional docs in using.md fixes tektoncd#139.

The most difficult part of this change was updating
validatePipelineTaskAndTask - hoping to refactor this in a follow up
(which would also address tektoncd#213).

This interface is still probably not in its final form and hopefully we
can iterate on it!

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 21, 2018
We had been hoping to use `version` to cover two cases:
1. To allow a user to override the version of the Resource they are using
   (e.g. for a particular git resource, use a different revision)
2.  For the case where a Task modifies a resource then provides it to the
   next Task, we wanted to express the modification in the version

In the previous commit, we made it so that a user can change a Resource
by creating a new one and specifying the new one in teh PipelineRun,
and they can do the same thing via a TaskRun.

For the second case, in #124 @shashwathi is designing how the output of
one task will be passed to the next, and it is likely going to be via
PVCs and will not need the version field.

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 21, 2018
This change makes it possible to reuse a Pipeline with different
Resources without having to change the Pipeline itself. It does this by
moving the linking of the Resources a Task requires from the Pipeline
into the PipelineRun.

The relationship between Resources and the Tasks that are expected to be
executed on them is still expressed in the Pipeline (via `providedBy`).

ResourceBindings move from Pipeline to PipelineRun and become
ResourceDependencies. No longer calling these "bindings" in the API or
using the term "source" fixes + the additional docs in using.md fixes tektoncd#139.

The most difficult part of this change was updating
validatePipelineTaskAndTask - hoping to refactor this in a follow up
(which would also address tektoncd#213).

This interface is still probably not in its final form and hopefully we
can iterate on it!

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 21, 2018
We had been hoping to use `version` to cover two cases:
1. To allow a user to override the version of the Resource they are using
   (e.g. for a particular git resource, use a different revision)
2.  For the case where a Task modifies a resource then provides it to the
   next Task, we wanted to express the modification in the version

In the previous commit, we made it so that a user can change a Resource
by creating a new one and specifying the new one in teh PipelineRun,
and they can do the same thing via a TaskRun.

For the second case, in #124 @shashwathi is designing how the output of
one task will be passed to the next, and it is likely going to be via
PVCs and will not need the version field.

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 21, 2018
We had been hoping to use `version` to cover two cases:
1. To allow a user to override the version of the Resource they are using
   (e.g. for a particular git resource, use a different revision)
2.  For the case where a Task modifies a resource then provides it to the
   next Task, we wanted to express the modification in the version

In the previous commit, we made it so that a user can change a Resource
by creating a new one and specifying the new one in teh PipelineRun,
and they can do the same thing via a TaskRun.

For the second case, in #124 @shashwathi is designing how the output of
one task will be passed to the next, and it is likely going to be via
PVCs and will not need the version field.

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 22, 2018
This change makes it possible to reuse a Pipeline with different
Resources without having to change the Pipeline itself. It does this by
moving the linking of the Resources a Task requires from the Pipeline
into the PipelineRun.

The relationship between Resources and the Tasks that are expected to be
executed on them is still expressed in the Pipeline (via `providedBy`).

ResourceBindings move from Pipeline to PipelineRun and become
ResourceDependencies. No longer calling these "bindings" in the API or
using the term "source" fixes + the additional docs in using.md fixes tektoncd#139.

The most difficult part of this change was updating
validatePipelineTaskAndTask - hoping to refactor this in a follow up
(which would also address tektoncd#213).

This interface is still probably not in its final form and hopefully we
can iterate on it!

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 22, 2018
We had been hoping to use `version` to cover two cases:
1. To allow a user to override the version of the Resource they are using
   (e.g. for a particular git resource, use a different revision)
2.  For the case where a Task modifies a resource then provides it to the
   next Task, we wanted to express the modification in the version

In the previous commit, we made it so that a user can change a Resource
by creating a new one and specifying the new one in teh PipelineRun,
and they can do the same thing via a TaskRun.

For the second case, in #124 @shashwathi is designing how the output of
one task will be passed to the next, and it is likely going to be via
PVCs and will not need the version field.

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 22, 2018
This change makes it possible to reuse a Pipeline with different
Resources without having to change the Pipeline itself. It does this by
moving the linking of the Resources a Task requires from the Pipeline
into the PipelineRun.

The relationship between Resources and the Tasks that are expected to be
executed on them is still expressed in the Pipeline (via `providedBy`).

ResourceBindings move from Pipeline to PipelineRun and become
ResourceDependencies. No longer calling these "bindings" in the API or
using the term "source" fixes + the additional docs in using.md fixes tektoncd#139.

The most difficult part of this change was updating
validatePipelineTaskAndTask - hoping to refactor this in a follow up
(which would also address tektoncd#213).

This interface is still probably not in its final form and hopefully we
can iterate on it!

Fixes tektoncd#200
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Nov 22, 2018
We had been hoping to use `version` to cover two cases:
1. To allow a user to override the version of the Resource they are using
   (e.g. for a particular git resource, use a different revision)
2.  For the case where a Task modifies a resource then provides it to the
   next Task, we wanted to express the modification in the version

In the previous commit, we made it so that a user can change a Resource
by creating a new one and specifying the new one in teh PipelineRun,
and they can do the same thing via a TaskRun.

For the second case, in #124 @shashwathi is designing how the output of
one task will be passed to the next, and it is likely going to be via
PVCs and will not need the version field.

Fixes tektoncd#200
knative-prow-robot pushed a commit that referenced this issue Nov 22, 2018
This change makes it possible to reuse a Pipeline with different
Resources without having to change the Pipeline itself. It does this by
moving the linking of the Resources a Task requires from the Pipeline
into the PipelineRun.

The relationship between Resources and the Tasks that are expected to be
executed on them is still expressed in the Pipeline (via `providedBy`).

ResourceBindings move from Pipeline to PipelineRun and become
ResourceDependencies. No longer calling these "bindings" in the API or
using the term "source" fixes + the additional docs in using.md fixes #139.

The most difficult part of this change was updating
validatePipelineTaskAndTask - hoping to refactor this in a follow up
(which would also address #213).

This interface is still probably not in its final form and hopefully we
can iterate on it!

Fixes #200
knative-prow-robot pushed a commit that referenced this issue Nov 22, 2018
We had been hoping to use `version` to cover two cases:
1. To allow a user to override the version of the Resource they are using
   (e.g. for a particular git resource, use a different revision)
2.  For the case where a Task modifies a resource then provides it to the
   next Task, we wanted to express the modification in the version

In the previous commit, we made it so that a user can change a Resource
by creating a new one and specifying the new one in teh PipelineRun,
and they can do the same thing via a TaskRun.

For the second case, in #124 @shashwathi is designing how the output of
one task will be passed to the next, and it is likely going to be via
PVCs and will not need the version field.

Fixes #200
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design This task is about creating and discussing a design
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants