Revamp how pipeline supports Limitranges #4176
Conversation
```diff
@@ -680,7 +681,7 @@ func (c *Reconciler) createPod(ctx context.Context, tr *v1beta1.TaskRun, rtr *re
 		EntrypointCache: c.entrypointCache,
 		OverrideHomeEnv: shouldOverrideHomeEnv,
 	}
-	pod, err := podbuilder.Build(ctx, tr, *ts)
+	pod, err := podbuilder.Build(ctx, tr, *ts, limitrange.NewTransformer(ctx, tr.Namespace, c.KubeClientSet))
```
The idea here is to later refactor other "pod" transformations that do not need Task information so they happen outside of the Build function.
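A minimal sketch of what that transformer pattern could look like, assuming a signature along these lines (`Transformer` and `applyTransformers` are illustrative names, not the actual API):

```go
package pod

import corev1 "k8s.io/api/core/v1"

// Transformer mutates a Pod after it has been assembled from the TaskRun,
// so tweaks that don't need Task information can live outside Build itself.
type Transformer func(*corev1.Pod) (*corev1.Pod, error)

// applyTransformers chains transformers in order over the built Pod.
func applyTransformers(pod *corev1.Pod, transformers ...Transformer) (*corev1.Pod, error) {
	var err error
	for _, t := range transformers {
		if pod, err = t(pod); err != nil {
			return nil, err
		}
	}
	return pod, nil
}
```

With a shape like this, `limitrange.NewTransformer` would just contribute one more function to the chain.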
Can we pass through a LimitRange lister here instead, so this doesn't end up hitting the real API server on each call?
(Not necessarily blocking, especially since this has been open for so long and it's nearing a release-cut, but something I'd like to push for in a quick followup if it doesn't make it into this PR)
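For reference, a hedged sketch of what a cache-backed lister could look like with plain client-go (the factory setup here is illustrative; the project's real wiring would go through its own informer plumbing):

```go
package limitrange

import (
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	corev1listers "k8s.io/client-go/listers/core/v1"
)

// newLimitRangeLister wires a cache-backed lister: once the informer cache is
// synced, List/Get calls are served locally instead of hitting the API server.
func newLimitRangeLister(client kubernetes.Interface, stopCh <-chan struct{}) corev1listers.LimitRangeLister {
	factory := informers.NewSharedInformerFactory(client, 0)
	lister := factory.Core().V1().LimitRanges().Lister() // registers the informer
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
	return lister
}
```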
The following is the coverage report on the affected files.
This looks good to me (save for the TODOs that still need to be filled in) and I like the large body of test cases you've added. The docs would definitely help (you could probably copy/paste your PR description into a doc and it would already be a big improvement over what we have now :D) One code nit: I think we should either stick with a "stateful builder" pattern (adding a
Force-pushed from fa9e223 to da1c09d
The following is the coverage report on the affected files.
Force-pushed from da1c09d to ccc99be
The following is the coverage report on the affected files.
I tend to agree, but at the same time, I didn't want to do the full refactoring in the same PR to keep it at a decent size. Would it work if we document the follow-up in an issue to tackle later (just after this one gets merged), or should I go ahead and do the refactoring here as well (in a different commit)?
Yeah I think making it a follow-up is totally fine. No need to do a rewrite as part of this PR.
I'll mark this as reviewable (pending adding some documentation and fixing the CI). The main reason is that 90% of cases should be covered with just that set of changes. Without having a "code" support of
Force-pushed from ccc99be to cc5a703
The following is the coverage report on the affected files.
Interesting, the unit tests are green locally.. what is happening 🤔
I'm able to reproduce the failures after pulling the PR. I wonder if there are files that still need to be pushed from your local? The tests are failing for me with diffs like this:
Here's the complete list of errored UTs I'm seeing:
go version just in case it's relevant:
@sbwsg yeah I am able to reproduce now.. for some reason, my working copy was in a weird state.. all fixed now, gotta fix the tests 😛
Thanks for taking the time to gather a lot of these LimitRange cases for Tekton. This lgtm as far as the changes go.
pkg/pod/limitrange/transformer.go
Outdated
"k8s.io/client-go/kubernetes" | ||
) | ||
|
||
// var zeroQty = resource.MustParse("0") |
Can remove this comment.
Force-pushed from dafd77c to 25ec928
The following is the coverage report on the affected files.
ping @tektoncd/core-maintainers
Great docs!!! Thanks for walking through the problem so thoroughly - I had a bit of feedback on some parts that confused me.
What do you think about including an end to end test for this? Maybe it's overkill - but it seems doable so maybe it's worth it? If not an end to end test, maybe a reconciler test?
Ultra nit-pick: i think there's a hanging `)` in the release notes 🙏
/approve
docs/limitrange.md
Outdated
- Ephemeral storage

The next question is : how pods with resource and limits are run/scheduled ?
The scheduler *computes* the amount of CPA and memory requests (using **Requests**) and tries to find a node to schedule it.
typo maybe? CPA -> CPU
docs/limitrange.md
Outdated
The way Limits and Requests works in Kubernetes is because it is assumed that all containers run in parallel, and init container run before, each one after the others.

That assumption — containers running in parallel — is not true in Tekton. They do all start together (because there is no way around this) **but** the *entrypoint hack* is making sure they actually run in sequence and thus there is always only one container that is actually consuming some resource at the same time.
i wonder if we could link to some docs on the entrypoint hack - i'm surprised our developer docs don't have much on this but maybe we could link to https://github.com/tektoncd/pipeline/tree/main/cmd/entrypoint#entrypoint ?
Indeed, I think it would be worth enhancing the docs around what the entrypoint does 👼🏼
💯
docs/limitrange.md
Outdated
That assumption — containers running in parallel — is not true in Tekton. They do all start together (because there is no way around this) **but** the *entrypoint hack* is making sure they actually run in sequence and thus there is always only one container that is actually consuming some resource at the same time.

This means, we need to handle limits, request and LimitRanges in a *non-standard* way. Let's try to define that. Tekton needs to take into account all the aspect of the LimitRange : the min/max as well as the default. If there is no default, but there is min/max, Tekton need then to **set** a default value that is between the min/max. If we set the value too low, the Pod won't be able to be created, similar if we set the value too high. **But** those values are set on **containers**, so we **have to** do our own computation to know what request to put on each containers.
I got a bit confused by this part - specifically why we need to set the defaults.
After reading through everything I think the overall reason is that we can't rely on the existing LimitRange mutating webhook, so we pretty much need to explicitly set everything ourselves, is that right?
Maybe you could emphasize that a bit here, something like:
This means, we need to handle limits, request and LimitRanges in a *non-standard* way. Since the existing LimitRange mutating webhook won't take Tekton's requirements into account, Tekton needs to fully control all the values the LimitRange webhook might set...
Not sure if that's clearer, feel free to ignore 🙏
> After reading through everything I think the overall reason is that we can't rely on the existing LimitRange mutating webhook, so we pretty much need to explicitly set everything ourselves, is that right?

Yes indeed. If we want to allocate resources in a relatively optimal manner, we cannot rely on the LimitRange mutating webhook.
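To make that concrete, a minimal sketch of the "set everything ourselves" idea (`fillResources` is an illustrative helper, not code from this PR): once both fields are populated on every container, the defaulting webhook has nothing left to mutate.

```go
package limitrange

import corev1 "k8s.io/api/core/v1"

// fillResources populates Requests and Limits with Tekton's computed values,
// so the LimitRange defaulting webhook finds nothing left to default.
func fillResources(c *corev1.Container, request, limit corev1.ResourceList) {
	if c.Resources.Requests == nil {
		c.Resources.Requests = request
	}
	if c.Resources.Limits == nil {
		c.Resources.Limits = limit
	}
}
```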
docs/limitrange.md
Outdated
  default. I think that's also what k8s does, at least in our computation.
- **Default value:** we need to "try" to respect that as much as possible.
  - `defaultLimit` but no `defaultRequest`, then we set `defaultRequest` to be same as `defaultLimit`.
  - `defaultRequest` but no `defaultlimit`, then we use the `min` limit as the `defaultLimit`
curious why we wouldn't use the `max` as the `defaultLimit`?
Wait, I think that's a typo / wrong thing here, I'll fix.
Because, in a gist, Requests are used for scheduling (so we need the "minimum") and Limits are used for "killing" the running process (so we kinda need the "maximum")
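A sketch of that rationale in code, using the upstream `LimitRangeItem` fields (the helper and its fallback order are illustrative, not the PR's exact logic): default requests as low as validation allows, default limits as high as validation allows.

```go
package limitrange

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// defaultsFrom picks defaults for one resource name: requests drive
// scheduling (keep them minimal), limits drive killing (keep them maximal).
func defaultsFrom(item corev1.LimitRangeItem, res corev1.ResourceName) (request, limit *resource.Quantity) {
	if q, ok := item.DefaultRequest[res]; ok {
		request = &q
	} else if q, ok := item.Min[res]; ok {
		request = &q // smallest value that still passes validation
	}
	if q, ok := item.Default[res]; ok {
		limit = &q
	} else if q, ok := item.Max[res]; ok {
		limit = &q // largest value that still passes validation
	}
	return request, limit
}
```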
- **containers:** those will be summed at the end, so it gets a bit complex
  - a container needs to have request and limits at least at the `min`
  - the sum of the container request/limits **should be** as small as possible. This should be
    ensured by using the "smallest" possible request on it.
maybe this answers my question above about why the `min` is being used as the `defaultLimit` - i think it would help me understand better if you explained near the beginning of the section that we want the sum to be as small as possible and why
i think this was updated - and/or my understanding is just better now, but anyway lgtm 👍 👍 👍
docs/pipelineruns.md
Outdated
```diff
-time from the invoked `Task`, Tekton only requests the *maximum* values for CPU, memory, and ephemeral
-storage from within each `Step`. This is sufficient as `Steps` only execute one at a time in the `Pod`.
-Requests other than the maximum values are set to zero.
+time from the invoked `Task`, Tekton will requests the compute values for CPU, memory, and ephemeral
```
typo: tekton will requests -> tekton will request
(in taskruns.md as well)
pkg/pod/limitrange/doc.go
Outdated
```go
limitations under the License.
*/

// Package limitrange defines logic for supporting Kubernetes LimitRange for the specific Tekton use cases
package limitrange
```
nice, i like the new package :D
random question and maybe off topic: would it make sense to put this in `internal/`? I'm wondering if we want to start putting as much code as possible in `internal/`, i.e. being careful about what code other projects might start depending on?
I am a fervent defender of the `internal` package, so yes, I think we should put it under internal to start with, and we can move it outside at some point if need be.
Only "open question" is pkg/internal/limitrange
or internal/pkg/limitrange
?
We don't need to get access to this package in the e2e tests or cmd
packages, so I'd rather have pkg/internal
but 😛
either is fine by me! :D 💃 `pkg/internal` sgtm - thanks @vdemeester !
pkg/pod/limitrange/transformer.go
Outdated
```go
// We are adding +1 to the number of containers to take into account the init containers.
// The reason to use +1 only is that k8s treats requests on all init containers as one (taking the max of all).
nbContainers := len(p.Spec.Containers) + 1
// FIXME(vdemeester) maxLimitRequestRatio to support later
```
could we link to an issue vs just your username? (otherwise this kinda thing is hard to follow up on later on)
Yes, kept this here but indeed I intended to create an issue about this 👼🏼
#4230 👼🏼
Force-pushed from d49fc81 to e7cac97
The following is the coverage report on the affected files.
That assumption — containers running in parallel — is not true in Tekton. They do all start together (because there is no way around this) **but** the *[entrypoint](https://github.com/tektoncd/pipeline/tree/main/cmd/entrypoint#entrypoint) hack* is making sure they actually run in sequence and thus there is always only one container that is actually consuming some resource at the same time.

This means, we need to handle limits, request and LimitRanges in a *non-standard* way. Since the existing LimitRange mutating webhook won't take Tekton's requirements into account, Tekton needs to fully control all the values the LimitRange webhook might set. Let's try to define that. Tekton needs to take into account all the aspect of the LimitRange : the min/max as well as the default. If there is no default, but there is min/max, Tekton need then to **set** a default value that is between the min/max. If we set the value too low, the Pod won't be able to be created, similar if we set the value too high. **But** those values are set on **containers**, so we **have to** do our own computation to know what request to put on each containers.
thanks for updating this! it's much more clear to me now 🙇♀️
hm maybe a flakey test
/test pull-tekton-pipeline-alpha-integration-tests
This merged, but I wanted to reiterate my review comment from above for posterity:
Using a LimitRange lister here instead, so this doesn't end up hitting the real API server on each call. Taking into account a review : #4176 (comment). Signed-off-by: Vincent Demeester <vdemeest@redhat.com>
Changes
Up until now, pipeline support for LimitRange is rather limited and confusing and can lead to inconsistency:

- It is not applied to `InitContainers`
- It zeroes out user requests to keep the max and assign it to one step, which can lead to an invalid `Pod` definition
- It uses only `Min` and never reads `Default` and `DefaultRequest`
- It also doesn't support `MaxLimitRequestRatio`

This commit aims to fix that by adding more support for LimitRange. Note that, to understand some of the choices, some assumptions about how LimitRange works are required.

On `Pod` containers:

- *Requests* are not enforced. If the node has more resources available than the request, the container can use them.
- *Limits*, on the other hand, are a hard stop. A container going over the limit will be killed.

It is thus important to get `Requests` right, to allow scheduling.

Requests and limits can come from both containers and init containers:

- For init containers, the max of each resource type is taken
- For containers, it sums all requests/limits of each container

This means, if you have the following:

- initContainer1: 1 CPU, 100m memory
- initContainer2: 2 CPU, 200m memory
- container1: 1 CPU, 50m memory
- container2: 2 CPU, 250m memory
- container3: 3 CPU, 500m memory

The computation will be:

- CPU: 2 (max of init containers) + 6 (sum of containers) = 8 CPU
- Memory: 200m (max of init containers) + 800m (sum of containers) = 1000m (1G)
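A small sketch of that computation over a pod spec (illustrative helper, not the PR's code):

```go
package limitrange

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// effectiveRequest reproduces the scheduler's math described above:
// max over init containers, plus the sum over regular containers.
func effectiveRequest(pod *corev1.Pod, res corev1.ResourceName) resource.Quantity {
	max := resource.Quantity{}
	for _, c := range pod.Spec.InitContainers {
		if q, ok := c.Resources.Requests[res]; ok && q.Cmp(max) > 0 {
			max = q // init containers run sequentially, so only the max counts
		}
	}
	total := max.DeepCopy()
	for _, c := range pod.Spec.Containers {
		if q, ok := c.Resources.Requests[res]; ok {
			total.Add(q) // regular containers run in parallel, so they are summed
		}
	}
	return total
}
```

Plugging in the example above gives 2 + (1 + 2 + 3) = 8 CPU and 200m + (50m + 250m + 500m) = 1000m memory.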
LimitRange enforces (mutates the `Pod`) some Limits and Requests (using `Default` and `DefaultRequest`) and validates them (`Min`, `Max` and `MaxLimitRequestRatio`). LimitRanges are applied per namespace, and it is possible to have multiple `LimitRange` objects in a namespace.

The way Limits and Requests work in Kubernetes assumes that all containers run in parallel (which they do — except in Tekton, with some hack), and that init containers run before them, one after the other.

That assumption — running in parallel — is not really true in Tekton. They do all start together (because there is no way around this) *but* the *entrypoint hack* makes sure they actually run in sequence, and thus there is always only one container actually consuming resources at any given time.
This means we need to handle limits, requests and LimitRanges in a *non-standard* way. Let's try to define that. Tekton needs to take into account all the aspects of the LimitRange: the min/max as well as the defaults. If there is no default, but there is a min/max, Tekton then needs to *set* a default value that is between the min/max. If we set the value too low, the Pod won't be able to be created; similarly if we set the value too high. *But* those values are set on *containers*, so we *have to* do our own computation to know what request to put on each container. To add to the complexity here, we also need to support `MaxLimitRequestRatio`, which is just adding complexity on top of something complex. That said, ideally, if we pick the defaults correctly, we should be able to get support for `MaxLimitRequestRatio` for free.
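For reference, a hedged sketch of the check `MaxLimitRequestRatio` implies, mirroring what the upstream LimitRanger admission plugin validates (illustrative helper, not this PR's code):

```go
package limitrange

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// checkRatio verifies that, for each named resource, limit/request does not
// exceed the configured MaxLimitRequestRatio on the LimitRange item.
func checkRatio(item corev1.LimitRangeItem, c corev1.Container) error {
	for res, maxRatio := range item.MaxLimitRequestRatio {
		limit, hasLimit := c.Resources.Limits[res]
		request, hasRequest := c.Resources.Requests[res]
		if !hasLimit || !hasRequest || request.IsZero() {
			continue // the ratio is only defined when both values are set
		}
		ratio := float64(limit.MilliValue()) / float64(request.MilliValue())
		if ratio > float64(maxRatio.MilliValue())/1000 {
			return fmt.Errorf("%s limit/request ratio %.2f exceeds MaxLimitRequestRatio %s",
				res, ratio, maxRatio.String())
		}
	}
	return nil
}
```

Picking the default request as low and the default limit as high as the LimitRange allows is what makes this ratio constraint mostly fall out "for free".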
for free.This commits tries to add support for this, by computing the minimum
request to apply that satisfy the
LimitRange
(s), applying them toContainers
as wellInitContainers
.Note: If there is multiple
LimitRange
in the namespace, Tekton triesto make the best out of it but if they are conflicting with each
other (a
Max
on one that is smaller than theMin
on the other),its the user responsability.
Signed-off-by: Vincent Demeester vdemeest@redhat.com
/kind feature
Opening this early to get feedback on the approach as this is kind of a breaking change, but it fixes all the problems users are facing today without adding new problems 😛
What is still to do:

- Handle multiple `LimitRange` in the namespace (by creating a *virtual* `LimitRange`)
- Support `MaxLimitRequestRatio` in *computations*
- Document `LimitRange` (most likely with a whole new file just about that)

/cc @tektoncd/core-maintainers @danielhelfand
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
- Meets the Tekton contributor standards (including functionality, content, code)
Release Notes