FR: Task-level (and maybe Pipeline-level) resource requests and limits #4470

lbernick · 2022-01-12T16:15:04Z

Feature request

Currently, users can set resource requests and limits at the Step level, which are summed to determine the resource requests of the pod that runs a TaskRun. However, some users want to be able to directly specify the resource requests for a Task, rather than per Step.
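To make the summing behavior concrete, here is a minimal sketch. It is not Tekton's or Kubernetes' actual code, and the step names and values are made up for illustration. It adds up per-Step CPU requests, expressed in millicores, to get the pod-level request:

```python
# Hypothetical per-Step CPU requests in millicores; names and values are illustrative only.
step_cpu_requests_m = {"clone": 500, "build": 2000, "test": 1000}


def pod_cpu_request_m(step_requests):
    """Sum the requests of all Step containers to get the pod-level request."""
    return sum(step_requests.values())


print(pod_cpu_request_m(step_cpu_requests_m))  # 3500
```

Because Steps run one after another, this pod would never need more than 2000m at once, yet it asks the scheduler for 3500m.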

Related issues

- Improve UX of Step Resource Requests #2986: an FR for Task-level resource requests (however, the Step resource request behavior described is no longer true)
- Specify resource quota for a Pipeline #4271: FR for Pipeline-level resource requests
- resource hoarding for steps of a taskrun #4347: Confusion that step-level resource requests are summed

lbernick · 2022-03-09T21:30:39Z

/kind design
/assign

lbernick · 2022-03-10T20:16:06Z

There's a bit of a wrinkle here. Let's say we want to allow people to specify the total resource requests and limits that should be used by all steps in their task:

- k8s states that containers without resource limits are considered to have higher limits than those with limits configured (https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resources).
- If the task resource limit is applied to only one container, the pod will therefore not have an effective limit.
- If the task limit is applied to each container, the pod has a much higher limit than desired. This is especially problematic if requests are not set, because the request will then automatically be set to the same value as the limit, and the pod may have difficulty being scheduled.
- If the task limit is spread out among containers, a task where one step is more resource-intensive than all the others could get OOM-killed or throttled.
- Resource requirements can't be updated, so we can't dynamically adjust them as steps run.
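The three ways of applying a task-level limit described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, limits are plain integers in MiB, and `None` stands for "no limit configured".

```python
def apply_to_one(task_limit, n_steps):
    """Put the whole task limit on the first container; None means no limit is set."""
    return [task_limit] + [None] * (n_steps - 1)


def apply_to_each(task_limit, n_steps):
    """Give every container the full task limit."""
    return [task_limit] * n_steps


def spread_evenly(task_limit, n_steps):
    """Split the task limit evenly across containers (integer division)."""
    return [task_limit // n_steps] * n_steps


def summed_limit(container_limits):
    """Sum the container limits; None (no effective limit) if any container is unlimited."""
    if any(limit is None for limit in container_limits):
        return None
    return sum(container_limits)


task_limit_mib, n_steps = 4096, 4  # illustrative values
print(summed_limit(apply_to_one(task_limit_mib, n_steps)))   # None
print(summed_limit(apply_to_each(task_limit_mib, n_steps)))  # 16384
print(spread_evenly(task_limit_mib, n_steps))                # [1024, 1024, 1024, 1024]
```

In the last case, a step that peaks at, say, 3000 MiB would exceed its 1024 MiB share even though the task as a whole stays under 4096 MiB.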

One way around this is to support only Task-level resource requests, not limits.

Another option is to run Steps as init containers rather than containers (instead of supporting this feature). As a result, k8s will correctly determine the effective pod resource requirements. We would have to rework our entrypoint but I imagine it would be doable. However, we would have no way to support Sidecars.
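As a rough illustration of why init containers would give the desired result, the sketch below follows the rule from the Kubernetes init container docs: the effective pod request is the larger of the highest init container request and the sum of the app container requests. The function name and the values are assumptions made for this example.

```python
def effective_pod_request(init_requests, app_requests):
    """Effective request = max(largest init-container request, sum of app-container requests)."""
    return max(max(init_requests, default=0), sum(app_requests))


step_requests_m = [500, 2000, 1000]  # illustrative CPU requests in millicores

print(effective_pod_request([], step_requests_m))  # Steps as regular containers: 3500
print(effective_pod_request(step_requests_m, []))  # Steps as init containers: 2000
```

With Steps as init containers, the pod requests only what its most demanding Step needs.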

dibyom · 2022-03-14T16:42:10Z

> Another option is to run Steps as init containers rather than containers (instead of supporting this feature). As a result, k8s will correctly determine the effective pod resource requirements. We would have to rework our entrypoint but I imagine it would be doable. However, we would have no way to support Sidecars.

We used to run Tekton in init containers. Init containers have some nice characteristics but also come with a number of drawbacks; see #224.

joshuasimon-taulia · 2022-03-16T22:47:42Z

imo, #4176 made defining cpu/mem requests much less intuitive. assuming steps in a task will never run in parallel, task-level resource definition would definitely alleviate some of that pain.

lbernick · 2022-03-21T17:19:27Z

design doc

Must join tekton-dev or tekton-users to view/comment

lbernick · 2022-03-28T16:35:19Z

/kind tep

lbernick · 2022-04-08T14:24:36Z

Opened TEP-0104 to propose this feature.

austinzhao-go · 2022-05-09T19:46:08Z

Hi @lbernick, perhaps I could pick this issue up?

I'm digesting the required logic from the TEP and aim to raise a draft PR this week.

lbernick · 2022-05-09T19:51:29Z

/assign @austinzhao-go

Thanks Austin!

lbernick · 2022-05-11T13:53:37Z

I want to amend what I said in #4470 (comment); since the container limits are enforced on individual containers by the container runtime, and pod effective limits seem less relevant than pod effective requests, I've updated the design in tektoncd/community#703 to propose support for Task-level limits as well.

austinzhao-go · 2022-05-11T16:00:27Z

Thanks for this update, @lbernick.

Just to confirm my understanding of the update:

- A task-level limit field will be added and written to the step level (the pod will pass the scheduler check with a higher total amount, but limits will still be enforced at the container level at runtime).
- A resources field (requests + limits) will be added in 2 places:
  - Task.spec.resources
  - PipelineRun.TaskRunSpecs.resources (I think this will override the Task.* one if both are specified, since the runtime value takes precedence)
- Sidecar resource requirements will be specified separately from the task level, per sidecar (so this stays as it is now, and there will NOT be a resources field under sidecars that applies to all sidecars).

Not changed:

- The final requirements (after applying the task-level resource requirements) will be written to the last step to take effect.

lbernick · 2022-05-11T17:57:02Z

Yup that's correct @austinzhao-go !

tekton-robot · 2022-08-09T17:59:15Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

joshuasimon-taulia · 2022-08-09T18:07:17Z

/remove-lifecycle stale
we really need an intuitive way to set requests/limits for the entire pod/task

lbernick · 2022-08-09T19:01:45Z

/lifecycle frozen

This should be ready soon :)

lbernick · 2022-08-16T14:22:46Z

This feature is available on main now and will be out in the next release. We won't be implementing Pipeline-level resource requirements for the foreseeable future (it's hard to have a non-confusing API for this when some components are running in parallel and some sequentially), although if there's a strong use case for it we can reconsider.

lbernick added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 12, 2022

This was referenced Jan 12, 2022

Improve UX of Step Resource Requests #2986

Closed

resource hoarding for steps of a taskrun #4347

Closed

tekton-robot assigned lbernick Mar 9, 2022

tekton-robot added the kind/design Categorizes issue or PR as related to design. label Mar 9, 2022

tekton-robot added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Mar 28, 2022

lbernick mentioned this issue Apr 11, 2022

[TEP-0104]: Task-level resource requests tektoncd/community#673

Merged

dibyom added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Apr 25, 2022

tekton-robot assigned austinzhao-go May 9, 2022

austinzhao-go mentioned this issue May 11, 2022

[TEP-0104]: Support Task-level resource limits tektoncd/community#703

Merged

austinzhao-go mentioned this issue May 17, 2022

[TEP-0104] Support Task-level Resource Requirements for TaskRun: Part #1 Fields Addition & Validation w/ Docs Updates #4877

Merged

This was referenced Jun 29, 2022

[TEP-0104] Add Validation for Step-level Resource Requirements #5054

Closed

[TEP-0104] Update Pod with Task-level Resource Requirements #5082

Merged

This was referenced Jul 7, 2022

Allow to use variable replacement when defining resource limits and requests #4080

Open

[FR] Specifying compute resources on Pipeline Tasks #5110

Closed

austinzhao-go mentioned this issue Jul 26, 2022

[TEP-0104] Populate Task-level Resource Requirements from PipelineRun to TaskRun #5212

Merged

austinzhao-go mentioned this issue Aug 3, 2022

Fix Existing Requests and Limits with LimitRange #5269

Closed

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 9, 2022

tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 9, 2022

tekton-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Aug 9, 2022

tekton-robot closed this as completed in #5082 Aug 16, 2022
