Use omitempty for optional fields in Job Pod Failure Policy #126046
9fcb840 to f523be6
@alculquicondor please add to kubernetes/enhancements#3329
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project.
/label api-review
I've checked that the only thing that matters is which version of the k8s API is used to build the JobSet webhook. In particular, I was able to successfully create JobSets on k8s 1.26 and 1.27 clusters using JobSet built against a patched version of the k8s API (for 1.29).
Thinking through our serialization issues
On balance, I think it's better to let a consumer handle the empty array to nil shift than to have weird embedding problems. /lgtm
LGTM label has been added. Git tree hash: 3c37175e4d9e62b18975b5ffa9caefeb956f6c20
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, mimowo The full list of commands accepted by this bot can be found here. The pull request process is described here
Thanks, I will prepare cherry-picks down to 1.28 (https://kubernetes.io/releases/). I think it will be particularly useful to cherry-pick to 1.29 or 1.30 for JobSet. cc @danielvegamyhre
One more thing I tested: I compared the response patches produced by the webhook when built against k8s.io/api before and after the change. To see them, I enabled log level 10 in the API server. Before:
after:
So it is surprising that in the broken flow API server receives "null" for
I've opened the cherry-pick PR. PTAL
Awesome thanks for the quick fix on this! Taking a look now.
 // Represents the requirement on the pod conditions. The requirement is represented
 // as a list of pod condition patterns. The requirement is satisfied if at
 // least one pattern matches an actual pod condition. At most 20 elements are allowed.
 // +listType=atomic
 // +optional
-OnPodConditions []PodFailurePolicyOnPodConditionsPattern `json:"onPodConditions" protobuf:"bytes,3,opt,name=onPodConditions"`
+OnPodConditions []PodFailurePolicyOnPodConditionsPattern `json:"onPodConditions,omitempty" protobuf:"bytes,3,opt,name=onPodConditions"`
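The effect of this tag change can be seen with the standard library alone. The following is a minimal sketch, not the real API types: `ruleBefore` and `ruleAfter` are hypothetical stand-ins that use `[]string` in place of `[]PodFailurePolicyOnPodConditionsPattern`. It shows that a nil slice without `omitempty` serializes as `null`, while with `omitempty` both nil and empty slices are dropped from the output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ruleBefore is a hypothetical stand-in for the old field tag (no omitempty).
type ruleBefore struct {
	Action          string   `json:"action"`
	OnPodConditions []string `json:"onPodConditions"`
}

// ruleAfter is a hypothetical stand-in for the fixed field tag (with omitempty).
type ruleAfter struct {
	Action          string   `json:"action"`
	OnPodConditions []string `json:"onPodConditions,omitempty"`
}

// mustMarshal returns the JSON encoding of v and panics on error.
func mustMarshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Nil slice, old tag: the key is emitted with a null value.
	fmt.Println(mustMarshal(ruleBefore{Action: "Ignore"}))
	// Empty (non-nil) slice, old tag: the key is emitted as an empty array.
	fmt.Println(mustMarshal(ruleBefore{Action: "Ignore", OnPodConditions: []string{}}))
	// Nil slice, new tag: the key is omitted.
	fmt.Println(mustMarshal(ruleAfter{Action: "Ignore"}))
	// Empty slice, new tag: the key is omitted as well.
	fmt.Println(mustMarshal(ruleAfter{Action: "Ignore", OnPodConditions: []string{}}))
}
```

The last case is the trade-off raised in review: with `omitempty`, a client that sent an empty array reads back a missing field, so consumers have to treat empty and nil as equivalent.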
where we're validating immutable jobs on update:
allErrs = append(allErrs, apivalidation.ValidateImmutableField(spec.PodFailurePolicy, oldSpec.PodFailurePolicy, fldPath.Child("podFailurePolicy"))...)
would an empty array vs nil cause a validation failure, or are we collapsing those with apiequality.Semantic.DeepEqual?
I added to validation_test.go the following test case and it passes:
"update pod failure policy - nil to empty": {
	old: batch.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "abc", Namespace: metav1.NamespaceDefault},
		Spec: batch.JobSpec{
			Selector: validGeneratedSelector,
			Template: validPodTemplateSpecForGeneratedRestartPolicyNever,
			PodFailurePolicy: &batch.PodFailurePolicy{
				Rules: []batch.PodFailurePolicyRule{{
					Action:          batch.PodFailurePolicyActionIgnore,
					OnPodConditions: []batch.PodFailurePolicyOnPodConditionsPattern{},
					OnExitCodes: &batch.PodFailurePolicyOnExitCodesRequirement{
						Operator: batch.PodFailurePolicyOnExitCodesOpIn,
						Values:   []int32{42},
					},
				}},
			},
		},
	},
	update: func(job *batch.Job) {
		job.Spec.PodFailurePolicy.Rules[0].OnPodConditions = nil
	},
},
(I could post a follow-up PR.) Yes, we use apiequality.Semantic.DeepEqual underneath, by calling:
func ValidateImmutableField(newVal, oldVal interface{}, fldPath *field.Path) field.ErrorList {
Do you have the jobset CRD that gets produced prior to this PR? Those validation errors when submitting the jobset CR object look to be coming from the server, not client-side validation. I'm surprised the null values aren't getting pruned out by the server by PruneNonNullableNullsWithoutDefaults when submitting the object.
I mostly agree that the peer However, I don't understand why a CRD produced from this is complaining about null values for these fields... the server explicitly strips out null values for non-nullable fields without default values in the schema, so it should not have been complaining about these fields. Seeing the JobSet CRD schema would help in understanding why this is causing an error.
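The pruning behavior described here can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the apiextensions-apiserver code: the `schema` type, `pruneNulls`, `pruneJSON`, and `exampleSchema` are all hypothetical names invented for this example. The idea is that a `null` value is dropped when the schema for that property is neither nullable nor defaulted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schema is a hypothetical, heavily simplified stand-in for a structural schema.
type schema struct {
	Nullable   bool
	HasDefault bool
	Properties map[string]*schema
	Items      *schema
}

// pruneNulls deletes object keys whose value is null when the matching property
// schema is neither nullable nor defaulted, recursing into objects and arrays.
func pruneNulls(x interface{}, s *schema) {
	if s == nil {
		return
	}
	switch v := x.(type) {
	case map[string]interface{}:
		for k, val := range v {
			ps := s.Properties[k]
			if ps == nil {
				continue // unknown field: left alone in this sketch
			}
			if val == nil {
				if !ps.Nullable && !ps.HasDefault {
					delete(v, k)
				}
				continue
			}
			pruneNulls(val, ps)
		}
	case []interface{}:
		for _, item := range v {
			pruneNulls(item, s.Items)
		}
	}
}

// pruneJSON decodes doc, prunes it against s, and re-encodes it.
func pruneJSON(doc string, s *schema) (string, error) {
	var obj interface{}
	if err := json.Unmarshal([]byte(doc), &obj); err != nil {
		return "", err
	}
	pruneNulls(obj, s)
	out, err := json.Marshal(obj)
	return string(out), err
}

// exampleSchema models rules[].{action,onPodConditions} with no nullable fields.
var exampleSchema = &schema{Properties: map[string]*schema{
	"rules": {Items: &schema{Properties: map[string]*schema{
		"action":          {},
		"onPodConditions": {Items: &schema{}},
	}}},
}}

func main() {
	out, err := pruneJSON(`{"rules":[{"action":"Ignore","onPodConditions":null}]}`, exampleSchema)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

If a step like this runs on every decode path, a `null` for `onPodConditions` never reaches validation, which is why the reported errors are surprising.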
Here is a link to the relevant section of the base CRD in the JobSet repo. Note the CRD is quite large since it has a batchv1.JobTemplate embedded in it.
A guess: maybe there is a different code path in the API server, because the webhook actually returns the base64-encoded patch rather than the full object (as in the #126046 (comment))
there are custom steps done after applying the patch from the webhook to decode and apply defaulting: kubernetes/staging/src/k8s.io/apiserver/pkg/admission/plugin/webhook/mutating/dispatcher.go Lines 381 to 390 in 8ba158c
it appears that not all of the special things added into CR decoding are being done there kubernetes/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go Line 1099 in 8ba158c
kubernetes/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go Lines 1226 to 1251 in 8ba158c
kubernetes/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go Lines 1311 to 1376 in 8ba158c
…46-upstream-release-1.29 Automated cherry pick of #126046: Use omitempty for optional fields in Job Pod Failure Policy
…46-upstream-release-1.28 Automated cherry pick of #126046: Use omitempty for optional fields in Job Pod Failure Policy
…46-upstream-release-1.30 Automated cherry pick of #126046: Use omitempty for optional fields in Job Pod Failure Policy
What type of PR is this?
/kind bug
/kind api-change
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #126040
Special notes for your reviewer:
My procedure for testing (repro):
0. Repro by creating a JobSet with a Job template embedding the snippet:
It results in the following errors:
k8s.io/api: https://github.com/kubernetes/api into the JobSet repo under bin/k8s_api/api, and check out the release-1.29 branch
go.mod with replace k8s.io/api => ./bin/k8s_api/api to use the code, and adjust its Dockerfile with new line replace k8s.io/api => ./bin/k8s_api/api, then go mod tidy
make kind-image-build
kind load docker-image gcr.io/k8s-staging-jobset/jobset:0b8c19c-dirty --name kind
kubectl edit deploy/jobset-controller-manager -njobset-system
kubectl get job -oyaml returns the correct snippet: (same snippet is created when creating the Job directly, without JobSet)
I have also confirmed that the only thing that matters is which version of the k8s API is used to build JobSet. In particular, when the fixed version was used to build JobSet, I was able to create JobSets on 1.26 and 1.27 k8s clusters.
I see we did similar fixes in the past like here: #121000
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: