-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add defaults configurable runtimeClassName #15271
feat: add defaults configurable runtimeClassName #15271
Conversation
Some other things I can add, please advise your thoughts:
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #15271 +/- ##
==========================================
- Coverage 84.79% 84.58% -0.22%
==========================================
Files 218 219 +1
Lines 13504 13584 +80
==========================================
+ Hits 11451 11490 +39
- Misses 1687 1727 +40
- Partials 366 367 +1 ☔ View full report in Codecov by Sentry. |
400da99
to
dd292b5
Compare
I think we should follow what we've been doing with podAffinity here - #15250 Specifically defaulting
|
Currently, how does setting runtime class surface up through a pod up to a deployment?
I think a unit test is fine - given that it's simply a passthrough to the PodSpec vs. something more complex |
dd292b5
to
f63e7f9
Compare
@dprotaso, thank you kindly for your review.
This makes a lot more sense and I've updated my changes accordingly.
Currently, that's defaulted to a non-accessible field in the Service spec, which makes sense.
With a Pod, it is immediately rejected. With a Deployment, there's a condition like the following: - lastTransitionTime: "2024-06-05T19:53:46Z"
lastUpdateTime: "2024-06-05T19:53:46Z"
message: 'pods "nginx-7c4446d454-" is forbidden: pod rejected: RuntimeClass "kata"
not found'
reason: FailedCreate
status: "True"
type: ReplicaFailure
Yeah this is reasonable. |
f63e7f9
to
9924b19
Compare
9924b19
to
136d743
Compare
ad884ce
to
e03d298
Compare
18b6943
to
d639d68
Compare
if runtimeClassName == "" { | ||
return nil | ||
} | ||
return ptr.String(runtimeClassName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to validate with https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/validation/validation.go#L314 and fail fast?
For example right now random names pass eg. "runtime-class-name": "_ : {}",
although they should print:
error : [a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')]
Tested with
{
name: "runtime class name should fail",
wantErr: false,
data: map[string]string{
QueueSidecarImageKey: defaultSidecarImage,
"runtime-class-name": "_ : {}",
},
wantConfig: &Config{
DigestResolutionTimeout: digestResolutionTimeoutDefault,
ProgressDeadline: ProgressDeadlineDefault,
QueueSidecarCPURequest: &QueueSidecarCPURequestDefault,
QueueSidecarImage: defaultSidecarImage,
QueueSidecarTokenAudiences: sets.New(""),
RegistriesSkippingTagResolving: sets.New("kind.local", "ko.local", "dev.local"),
RuntimeClassNames: map[string]RuntimeClassNameLabelSelector{"_": RuntimeClassNameLabelSelector{}},
DefaultAffinityType: defaultAffinityTypeValue,
},
},
in config_test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto, thank you for your thoughts. More validation is essential and will make it more robust.
I've tried adding in
+ if err := apivalidation.ValidateRuntimeClassName(class, field.NewPath(".data."+RuntimeClassNameKey)); err != nil {
+ return nil, fmt.Errorf("%v invalid: %v", RuntimeClassNameKey, err)
+ }
but now have an import failure. Haven't gotten one of these errors in a bit
$ go mod tidy
go: finding module for package k8s.io/kubernetes/pkg/apis/core/validation
go: found k8s.io/kubernetes/pkg/apis/core/validation in k8s.io/kubernetes v1.30.2
go: downloading k8s.io/component-helpers v0.0.0
go: downloading k8s.io/kubelet v0.0.0
go: downloading k8s.io/cloud-provider v0.0.0
go: finding module for package k8s.io/client-go/features
go: found k8s.io/client-go/features in k8s.io/client-go v0.30.2
go: knative.dev/serving/pkg/deployment imports
k8s.io/kubernetes/pkg/apis/core/validation imports
k8s.io/component-helpers/node/util/sysctl: reading k8s.io/component-helpers/go.mod at revision v0.0.0: unknown revision v0.0.0
go: knative.dev/serving/pkg/deployment imports
k8s.io/kubernetes/pkg/apis/core/validation imports
k8s.io/component-helpers/scheduling/corev1: reading k8s.io/component-helpers/go.mod at revision v0.0.0: unknown revision v0.0.0
go: knative.dev/serving/pkg/deployment imports
k8s.io/kubernetes/pkg/apis/core/validation imports
k8s.io/kubelet/pkg/apis: reading k8s.io/kubelet/go.mod at revision v0.0.0: unknown revision v0.0.0
go: knative.dev/serving/pkg/deployment imports
k8s.io/kubernetes/pkg/apis/core/validation imports
k8s.io/kubernetes/pkg/cluster/ports imports
k8s.io/cloud-provider/options: reading k8s.io/cloud-provider/go.mod at revision v0.0.0: unknown revision v0.0.0
from adding the dependencies
"k8s.io/apimachinery/pkg/util/validation/field"
apivalidation "k8s.io/kubernetes/pkg/apis/core/validation"
I suspect it is something with version pinning and will continue investigating further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just import apimachineryvalidation.NameIsDNSSubdomain(name, false)
and it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated in 4adbced
pkg/deployment/config_test.go
Outdated
wantErr: true, | ||
data: map[string]string{ | ||
QueueSidecarImageKey: defaultSidecarImage, | ||
"runtime-class-name": ` ???; 231424 `, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear the error here is captured due to yaml unmarshaling:
error unmarshaling JSON: while decoding JSON: json: cannot unmarshal string into Go value of type map[string]deployment.RuntimeClassNameLabelSelector
and does not cover the K8s validation aspects see comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to implement the use of ValidateRuntimeClassName as suggested.
If the YAML cannot be parsed, how should the validation be handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to emphasize that some K8s validation might be needed. After yaml is parsed we can check for a valid name according to K8s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, it makes a lot of sense to have it.
if !v.Matches(lbs) || v.specificity() < specificity { | ||
continue | ||
} | ||
if v.specificity() > specificity || strings.Compare(k, runtimeClassName) < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is string comparison a rule to select a class when they have equal specificity?
Looks a bit ad hoc imho. Btw could you add a test for this and some scenario with multiple runtime classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation comes from the existing domain selection code for a Service
serving/pkg/reconciler/route/config/domain.go
Lines 95 to 146 in 4fe029f
func NewDomainFromConfigMap(configMap *corev1.ConfigMap) (*Domain, error) { | |
c := Domain{Domains: map[string]DomainConfig{}} | |
hasDefault := false | |
for k, v := range configMap.Data { | |
if k == configmap.ExampleKey { | |
continue | |
} | |
internalConfig := domainInternalConfig{} | |
err := yaml.Unmarshal([]byte(v), &internalConfig) | |
if err != nil { | |
return nil, err | |
} | |
if len(internalConfig.Selector) == 0 { | |
hasDefault = true | |
internalConfig.Type = DomainTypeWildcard | |
} | |
c.Domains[k] = DomainConfig{ | |
Selector: &LabelSelector{Selector: internalConfig.Selector}, | |
Type: internalConfig.Type, | |
} | |
} | |
if !hasDefault { | |
c.Domains[DefaultDomain] = DomainConfig{Selector: &LabelSelector{}, Type: DomainTypeWildcard} | |
} | |
return &c, nil | |
} | |
// LookupDomainForLabels returns a domain given a set of labels. | |
// Since we reject configuration without a default domain, this should | |
// always return a value. | |
func (c *Domain) LookupDomainForLabels(labels map[string]string) string { | |
domain := "" | |
specificity := -1 | |
// If we see VisibilityLabelKey sets with VisibilityClusterLocal, that | |
// will take precedence and the route will get a Cluster's Domain Name. | |
if labels[networking.VisibilityLabelKey] == serving.VisibilityClusterLocal { | |
return "svc." + network.GetClusterDomainName() | |
} | |
for k, v := range c.Domains { | |
// Ignore if selector doesn't match, or decrease the specificity. | |
if !v.Selector.Matches(labels) || v.Selector.specificity() < specificity { | |
continue | |
} | |
if v.Selector.specificity() > specificity || strings.Compare(k, domain) < 0 { | |
domain = k | |
specificity = v.Selector.specificity() | |
} | |
} | |
return domain | |
} |
TLDR; the one with most matching labels wins.
Good thinking. I've added two tests for this behaviour in 4adbced#diff-2a14f9c49eb40c3a55958c8e2d57523cd618232821a9d310c1ddd7cc64d91ef9R476-R517
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR; the one with most matching labels wins.
Also a classname can't have a label (key, value) that does not exist in the service labels.
On top of that, if you have two class names that have the same specificity, lexicographic order in class names is what matters:
("three" < "two"):
{
name: "priority with multiple label selectors and three labels set",
serviceLabels: map[string]string{
"needs-two": "yes",
"needs-three": "yes",
"needs-four": "yes",
},
runtimeClassNames: map[string]RuntimeClassNameLabelSelector{
"one": {},
"two": {
Selector: map[string]string{
"needs-two": "yes",
"needs-three": "yes",
},
},
"three": {
Selector: map[string]string{
"needs-two": "yes",
"needs-three": "yes",
},
},
},
want: ptr.String("three"),
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a classname can't have a label (key, value) that does not exist in the service labels.
I'm not sure I understand. Are you meaning to validate a selector against the labels of Services that exist?
If so, that wouldn't make a lot of sense to me.
On top of that, if you have two class names that have the same specificity, lexicographic order in class names is what matters:
("three" < "two"):
Yes that is true, but I would think an uncommon practical issue.
Do you have any alternative patterns for selected which I may consider?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand.
serviceLabels: map[string]string{
"needs-two": "yes",
"needs-three": "yes",
"needs-four": "yes",
},
runtimeClassNames: map[string]RuntimeClassNameLabelSelector{
"one": {},
"two": {
Selector: map[string]string{
"needs-two": "yes",
"needs-five": "yes",
},
},
two is ignored due to the v.MatchLabels condition. Matches has this loop that checks against the service labels:
for label, expectedValue := range s.Selector {
value, ok := labels[label]
if !ok || expectedValue != value {
return false
}
}
I am trying to summarize the key points for the logic. It is not just "the one with most matching labels wins."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any alternative patterns for selected which I may consider?
I am ok as long as we document it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That appears to be correct behaviour, the service labels in the above sample don't include "need-five".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto, wonderful!
I've added a line here to explain the behavior: https://github.com/knative/docs/pull/6022/files#diff-55b5f9ddd150abde7c42e2c06362452f26f4d9ce5d902ca5db8c0e27eef518e2R141
The feature needs proper documentation as well. Pls create a PR against knative/docs after this is merged. |
4adbced
to
3726752
Compare
func (d Config) PodRuntimeClassName(lbs map[string]string) *string { | ||
runtimeClassName := "" | ||
specificity := -1 | ||
for k, v := range d.RuntimeClassNames { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create a func for calculating specificity and also unify types: LabelSelector (domain) and RuntimeClassNameLabelSelector to avoid duplication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say do it in a follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto, apologies since I missed this message. Agree, that would be helpful!
@dprotaso gentle ping for review. |
pkg/deployment/config.go
Outdated
if slices.Contains(apimachineryvalidation.NameIsDNSSubdomain(class, false), class) { | ||
return nil, fmt.Errorf("%v %v selector not valid DNSSubdomain", RuntimeClassNameKey, class) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing and it doesn't seem right.
Looking at the code any string returned by apimachineryvalidation.NameIsDNSSubdomain
implies there's an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having some weird results with the function.
package main
import (
"fmt"
apimachineryvalidation "k8s.io/apimachinery/pkg/api/validation"
)
func main() {
for _, v := range []string{"a", "gvisor", "kata"} {
fmt.Println(v, apimachineryvalidation.NameIsDNSSubdomain(v, false))
}
}
results in
🐚(2) go run .
a []
gvisor []
kata []
looking at the tests, a
should succeed
https://github.com/kubernetes/apimachinery/blob/master/pkg/util/validation/validation_test.go#L55
I set it how it is currently around the tests, as the tests seems to match up right.
Happy to update and understand what's happening here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty array means it's valid - so I think we should just error out if there is a non-empty array returned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, revisited it again and according to the tests the behaviour should be in place now.
Updated in 6b6ed60
3726752
to
6b6ed60
Compare
pkg/deployment/config.go
Outdated
@@ -175,6 +228,19 @@ func NewConfigFromMap(configMap map[string]string) (*Config, error) { | |||
return nil, fmt.Errorf("unsupported %s value: %q", defaultAffinityTypeKey, affinity) | |||
} | |||
} | |||
if err := yaml.Unmarshal([]byte(runtimeClassNames), &nc.RuntimeClassNames); err != nil { | |||
return nil, fmt.Errorf("%v cannot be parsed, please check the format", RuntimeClassNameKey) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should include the error messages in the error we are returning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fa369bb
to
0cc8edc
Compare
allow the setting of Pod RuntimeClassName via defaults
0cc8edc
to
f0832f4
Compare
/lgtm thanks for all the changes! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: BobyMCbobs, dprotaso The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
allow the setting of Pod RuntimeClassName via defaults
Fixes #12729
Proposed Changes
Release Note
inspired by
serving/pkg/reconciler/route/config/domain.go
Lines 95 to 146 in 4fe029f