Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid unecessary work within reconciliation loop - LimitRanges #2666

Closed
jlpettersson opened this issue May 21, 2020 · 9 comments
Closed

Avoid unecessary work within reconciliation loop - LimitRanges #2666

jlpettersson opened this issue May 21, 2020 · 9 comments
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@jlpettersson
Copy link
Member

Expected Behavior

In a larger organization with e.g. 400+ apps using Tekton CI/CD pipelines, and where developers usually do git-push to trigger new PipelineRuns with typically 10-20 tasks, several times a day, many TaskRuns is created.

I expect that Tekton is an efficient CI/CD platform and also is sound with the requests to the Kubernetes control plane, especially etcd that is a critical component in any Kubernetes cluster.

Actual Behavior

A typical git-to-deploy Pipeline will probably consist of 10-20 TaskRun pods to be created.
For every TaskRun to be created we do (at least) these requests to the ApiServer — this is synchronous request-response pairs that also delay TaskRun creation:

  1. Lookup things for artifact-store — this is mostly related to PipelineResources and pre-workspaces - we may rethink this.
  2. Lookup serviceAccounts for creds-init — this may be redesigned with Improve UX of getting credentials into Tasks #2343
  3. Lookup feature configMap — this will be improved by Introduce config map watcher for feature flags #2637 (could be more static as well?)
  4. Lookup LimitRanges that is configured for the namespace.

Looking up LimitRanges is a bit strange responsibility for the TaskRun-controller within the reconciliation loop. There is a few use-cases:

  1. The Namespace does not have an LimitRange specified (I assume this is the most common case)

    In this case a LIST operation against ApiServer and etcd is not needed for every TaskRun

  2. The Namespace does have a LimitRange specified, and it contains Default values. (I assume this to be the second most common case)

    A LimitRange provides constraints that can:

    • Set default request/limit for compute resources in a namespace and automatically inject them to Containers at runtime.

    This is fine, and this should work without that Tekton need to do any extra job. We even have an example of this.

    In this case a LIST operation against ApiServer and etcd is not needed for every TaskRun

  3. The Namespace does have a LimitRange specified, but without Default values. (I assume this to be the rarest case).

    This is the only time our extra request for every TaskRun provide any value. But is this the responsibility of the TaskRun-controller within the reconciliation loop?

    In an environment where operations or platform-team has configured it like this, it probably means that they want the users to specify resource limit explicitly for their workload.

    • A step may declare custom resource limits.
    • A Task may declare a Step template to be used for its steps.
    • There could be a global default configuration for default Step Template with this, as pointed out in Add global stepTemplate config #2600
    • There could also be a per PipelineRun default Step template - specified in a TriggerTemplate.

All above is good alternatives to lookup a namespace configuration for every TaskRun creation (a very common operation when using Tekton).

@dibyom dibyom added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label May 27, 2020
@afrittoli
Copy link
Member

afrittoli commented Jul 22, 2020

1. Lookup things for artifact-store — this is mostly related to `PipelineResources` and pre-workspaces - we may rethink this.

This artifact store lookup is addressed by #2947

@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 20, 2020
@vdemeester
Copy link
Member

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 16, 2020
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 14, 2021
@jlpettersson
Copy link
Member Author

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 14, 2021
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 15, 2021
@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 14, 2021
@tekton-robot
Copy link
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Copy link
Collaborator

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
Projects
None yet
Development

No branches or pull requests

5 participants