Avoid unecessary work within reconciliation loop - LimitRanges #2666

jlpettersson · 2020-05-21T14:35:34Z

Expected Behavior

In a larger organization with e.g. 400+ apps using Tekton CI/CD pipelines, and where developers usually do git-push to trigger new PipelineRuns with typically 10-20 tasks, several times a day, many TaskRuns is created.

I expect that Tekton is an efficient CI/CD platform and also is sound with the requests to the Kubernetes control plane, especially etcd that is a critical component in any Kubernetes cluster.

Actual Behavior

A typical git-to-deploy Pipeline will probably consist of 10-20 TaskRun pods to be created.
For every TaskRun to be created we do (at least) these requests to the ApiServer — this is synchronous request-response pairs that also delay TaskRun creation:

Lookup things for artifact-store — this is mostly related to PipelineResources and pre-workspaces - we may rethink this.
Lookup serviceAccounts for creds-init — this may be redesigned with Improve UX of getting credentials into Tasks #2343
Lookup feature configMap — this will be improved by Introduce config map watcher for feature flags #2637 (could be more static as well?)
Lookup LimitRanges that is configured for the namespace.

Looking up LimitRanges is a bit strange responsibility for the TaskRun-controller within the reconciliation loop. There is a few use-cases:

The Namespace does not have an LimitRange specified (I assume this is the most common case)

In this case a LIST operation against ApiServer and etcd is not needed for every TaskRun
The Namespace does have a LimitRange specified, and it contains Default values. (I assume this to be the second most common case)
A LimitRange provides constraints that can:
- Set default request/limit for compute resources in a namespace and automatically inject them to Containers at runtime.
This is fine, and this should work without that Tekton need to do any extra job. We even have an example of this.

In this case a LIST operation against ApiServer and etcd is not needed for every TaskRun
The Namespace does have a LimitRange specified, but without Default values. (I assume this to be the rarest case).

This is the only time our extra request for every TaskRun provide any value. But is this the responsibility of the TaskRun-controller within the reconciliation loop?

In an environment where operations or platform-team has configured it like this, it probably means that they want the users to specify resource limit explicitly for their workload.
- A step may declare custom resource limits.
- A Task may declare a Step template to be used for its steps.
- There could be a global default configuration for default Step Template with this, as pointed out in Add global stepTemplate config #2600
- There could also be a per PipelineRun default Step template - specified in a TriggerTemplate.

All above is good alternatives to lookup a namespace configuration for every TaskRun creation (a very common operation when using Tekton).

The text was updated successfully, but these errors were encountered:

afrittoli · 2020-07-22T08:18:28Z

1. Lookup things for artifact-store — this is mostly related to `PipelineResources` and pre-workspaces - we may rethink this.

This artifact store lookup is addressed by #2947

tekton-robot · 2020-10-20T09:05:06Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

vdemeester · 2020-11-16T07:50:36Z

/remove-lifecycle stale

tekton-robot · 2021-02-14T08:42:12Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

jlpettersson · 2021-02-14T17:41:09Z

/remove-lifecycle stale

tekton-robot · 2021-05-15T17:55:09Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot · 2021-06-14T18:40:09Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot · 2021-07-14T18:51:28Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot · 2021-07-14T18:51:29Z

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

dibyom added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label May 27, 2020

jlpettersson mentioned this issue Jul 22, 2020

taskrun pod of resource.request settings is not good enough for customization #2984

Closed

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 20, 2020

tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 16, 2020

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 14, 2021

tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 14, 2021

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 15, 2021

tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 14, 2021

tekton-robot closed this as completed Jul 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid unecessary work within reconciliation loop - LimitRanges #2666

Avoid unecessary work within reconciliation loop - LimitRanges #2666

jlpettersson commented May 21, 2020

afrittoli commented Jul 22, 2020 •

edited

Loading

tekton-robot commented Oct 20, 2020

vdemeester commented Nov 16, 2020

tekton-robot commented Feb 14, 2021

jlpettersson commented Feb 14, 2021

tekton-robot commented May 15, 2021

tekton-robot commented Jun 14, 2021

tekton-robot commented Jul 14, 2021

tekton-robot commented Jul 14, 2021

Avoid unecessary work within reconciliation loop - LimitRanges #2666

Avoid unecessary work within reconciliation loop - LimitRanges #2666

Comments

jlpettersson commented May 21, 2020

Expected Behavior

Actual Behavior

afrittoli commented Jul 22, 2020 • edited Loading

tekton-robot commented Oct 20, 2020

vdemeester commented Nov 16, 2020

tekton-robot commented Feb 14, 2021

jlpettersson commented Feb 14, 2021

tekton-robot commented May 15, 2021

tekton-robot commented Jun 14, 2021

tekton-robot commented Jul 14, 2021

tekton-robot commented Jul 14, 2021

afrittoli commented Jul 22, 2020 •

edited

Loading