
Implement adding priority to ResourceQuota #48648

Closed
bsalamat opened this issue Jul 7, 2017 · 16 comments · Fixed by #57963
Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

Comments

@bsalamat
Member

bsalamat commented Jul 7, 2017

Is this a BUG REPORT or FEATURE REQUEST?: Feature request

/kind feature

@kubernetes/sig-scheduling-feature-requests

ref: #47604

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels Jul 7, 2017
@resouer
Contributor

resouer commented Aug 3, 2017

Talked with @bsalamat offline; I will take this one.

/assign resouer

@bsalamat
Member Author

bsalamat commented Aug 3, 2017

A high level idea of how priority should be added to ResourceQuota is to add a field to resource quota that specifies a priority class name. When the field is set, it indicates that the quota applies to that priority class. If the field is not present, quota applies to all priority classes.
It is worth noting that quota is assigned per namespace, not per user, in Kubernetes. Adding priority to ResourceQuota will not change anything with respect to the scope of ResourceQuota. In other words, when priority is defined for a ResourceQuota object, the quota is applied to all pods at that priority in the specified namespace.
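A minimal sketch of what such an object might look like, assuming a hypothetical `priorityClassName` field on the ResourceQuota spec (the field name and shape here are illustrative of the proposal only, not a final API):

```yaml
# Hypothetical shape only -- field name is illustrative, not the merged API.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-quota
  namespace: xyz
spec:
  hard:
    memory: 1Gi
  # Proposed: the quota applies only to pods of this priority class.
  # Omitting the field would make the quota apply to all priority classes.
  priorityClassName: critical
```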

@davidopp
Member

davidopp commented Aug 3, 2017

Yes, that sounds reasonable. Here are a few other observations (in the below points, "exists" means "exists in the same namespace")

  • resourcequota admission controller should reject creation or update of ResourceQuota object if new ResourceQuota names a PriorityClass that does not exist, i.e. is not a pre-defined or user-specified PriorityClass (but empty is allowed)
  • resourcequota admission controller should reject creation of ResourceQuota object if another ResourceQuota object with same PriorityClass already exists, or if a ResourceQuota object with no PriorityClass exists
  • resourcequota admission controller should reject update of ResourceQuota object if the new (updated) PriorityClass matches that of another ResourceQuota object, or if another ResourceQuota object has no PriorityClass
  • priority admission controller should reject deletion of a PriorityClass object if it is named in any ResourceQuota object in any namespace

One last comment: I wonder if we should have some way to say "this ResourceQuota should be associated with the current default PriorityClass." We can't use "PriorityClass field not present" to mean this since we're already using that to mean something else in the approach @bsalamat described above. I guess we could use "PriorityClass present but empty" to mean "whatever the current default PriorityClass is"...

@bsalamat
Member Author

bsalamat commented Aug 3, 2017

@davidopp I think we should use a ResourceQuota with no PriorityClass to mean default quota for all pods and then we should allow another ResourceQuota to co-exist in that namespace that specifies quota for a particular priority.
Here is an example:

```yaml
ResourceQuota1:
  Namespace: xyz
  Memory: 10GB
ResourceQuota2:
  Namespace: xyz
  Memory: 1GB
  PriorityClass: critical
```

In the above example, the quota for all pods except those created with the "critical" priority is 10GB of memory, and the quota for pods created with the "critical" priority is 1GB.
I think the above rule is enough to cover most of the use cases, and we do not need to link ResourceQuota to the default priority.
WDYT?

@davidopp
Member

davidopp commented Aug 3, 2017

I think if we're going to allow the kind of thing you described, it is simpler to say the non-priority quota applies to everything. In other words, in your example the sum of resources across all pods in the namespace cannot exceed 10GB, and the sum of resources across critical pods in the namespace cannot exceed 1GB.

But I actually don't think we should allow a namespace to have both kinds of quota. I think there are probably only two common use cases: the "legacy" use case where the users of the cluster are not going to use the priority feature, and the use case where priority is used. Allowing a non-priority quota in a namespace is needed for the first case, but it seems harmful in the second case, since it gives the namespace some amount of quota at the highest priority. So I don't really see people using both of these together, unless there's something I'm not thinking of.

@bsalamat
Member Author

bsalamat commented Aug 3, 2017

> I think if we're going to allow the kind of thing you described, it is simpler to say the non-priority quota applies to everything. In other words, in your example the sum of resources across all pods in the namespace cannot exceed 10GB, and the sum of resources across critical pods in the namespace cannot exceed 1GB.

Yes, I agree.

> But I actually don't think we should allow a namespace to have both kinds of quota. I think there are probably only two common use cases: the "legacy" use case where the users of the cluster are not going to use the priority feature, and the use case where priority is used. Allowing a non-priority quota in a namespace is needed for the first case, but it seems harmful in the second case, since it gives the namespace some amount of quota at the highest priority. So I don't really see people using both of these together, unless there's something I'm not thinking of.

In the second case, how should one give a large quota to all priorities but a small quota to high-priority pods? If we don't support co-existence of quota with no priority and quota with a priority, then the cluster admin has to create one ResourceQuota for each PriorityClass in each namespace. They must also add a new ResourceQuota every time a new PriorityClass is added.
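Under that alternative, the per-namespace objects might look something like this (one quota object per priority class; the `priorityClassName` field and the class names are illustrative of this discussion, not the merged API):

```yaml
# Illustrative only: one ResourceQuota per PriorityClass in each namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota-high
  namespace: xyz
spec:
  hard:
    memory: 2Gi
  priorityClassName: high   # hypothetical field from this discussion
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota-low
  namespace: xyz
spec:
  hard:
    memory: 8Gi
  priorityClassName: low    # hypothetical field from this discussion
```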

@davidopp
Member

davidopp commented Aug 4, 2017

> If we don't support co-existence of quota with no priority and quota with priority, then the cluster admin has to create one ResourceQuota for each PriorityClass in each namespace. They must also add a new ResourceQuota every time a new PriorityClass is added.

Yes, this is what I had in mind.

But I don't think this is so bad, as cluster admins are unlikely to add new PriorityClasses often.

I imagine the workflow they want is to create the new PriorityClass, then give people access to it. In the approach you described, everyone with a non-priority quota would immediately get access to a new PriorityClass, unless the admin added a quota for the new PriorityClass before adding the PriorityClass, which would mean we would have to allow quotas that name non-existent PriorityClasses.

We do need a better way to do default creation of objects in namespaces (in this case a ResourceQuota), via addon manager or upstreaming OpenShift project templates or something like that. But we need that for other reasons anyway.

@bsalamat
Member Author

bsalamat commented Aug 4, 2017

> I imagine the workflow they want is to create the new PriorityClass, then give people access to it. In the approach you described, everyone with a non-priority quota would immediately get access to a new PriorityClass, unless the admin added a quota for the new PriorityClass before adding the PriorityClass, which would mean we would have to allow quotas that name non-existent PriorityClasses.

That's a fair point. It is worth noting that this works because the absence of a quota means "no quota". I think we already have ways to set this quota policy, so your design looks fine to me.

@k82cn
Member

k82cn commented Aug 4, 2017

I also agree with @davidopp's design; it covers all the use cases I have in mind. And --

> I think if we're going to allow the kind of thing you described, it is simpler to say the non-priority quota applies to everything. In other words, in your example the sum of resources across all pods in the namespace cannot exceed 10GB, and the sum of resources across critical pods in the namespace cannot exceed 1GB.

This looks like a kind of hierarchical quota to me; I would prefer to handle it in kubernetes/enhancements#269 or an incubator project. For the priority/preemption work, a flat quota is better.

@resouer
Contributor

resouer commented Aug 4, 2017

Thanks for the insights; personally, I also vote for keeping the design simple in the early phase.

Will send out PR next week.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 3, 2018
@bsalamat
Member Author

bsalamat commented Jan 3, 2018

/remove-lifecycle stale

@vikaschoudhary16
Contributor

/assign

@fejta-bot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 29, 2018
@bsalamat
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 29, 2018
k8s-github-robot pushed a commit that referenced this issue Jun 4, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Introduce priority class in the resource quota

**What this PR does / why we need it**:
Implements kubernetes/community#933
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #48648 

**Special notes for your reviewer**:
Test cases are still to be covered; I am opening this PR to make discussion convenient with code references. Test cases will be updated only after the design PR has been merged.

**Release note**:

```release-note
Ability to quota resources by priority
```
/kind feature
/priority important-soon
/sig scheduling
/sig node
/cc @resouer @derekwaynecarr @sjenning @bsalamat @timstclair @aveshagarwal @ravisantoshgudimetla
@bsalamat
Member Author

@vikaschoudhary16 Could you please update the ResourceQuota documentation with the addition of pod priority?
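For reference, the feature as it eventually merged (#57963, release note "Ability to quota resources by priority") expresses priority-based quota through a `scopeSelector` rather than a dedicated priority-class field on the spec; to the best of my understanding the merged shape looks roughly like this:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-pods
  namespace: xyz
spec:
  hard:
    memory: 1Gi
  # The quota applies only to pods whose priorityClassName is "critical".
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values: ["critical"]
```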
