
What is the proper way to compare resource object? #592

Closed
chenwng opened this issue Feb 6, 2019 · 26 comments
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@chenwng

chenwng commented Feb 6, 2019

In the generated code, reflect.DeepEqual is used to check whether the resource has changed. I created a deployment and then used reflect.DeepEqual to compare it with the one fetched from k8s. The issue is that the fetched deployment was changed (I assume by the deployment controller) to fill in default values, so it differs from the deployment I constructed. This triggers an update to the deployment, and it seems to happen continuously. Is this expected, or am I missing something?

Thanks.

@Adirio
Contributor

Adirio commented Feb 20, 2019

/triage support
Only pass the Deployment.Spec to reflect.DeepEqual. Deployment.Status is expected to change.
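
A minimal sketch of that suggestion (the desired/found names are hypothetical; later comments explain why even a spec-only comparison can misfire once the API server fills in defaults):

```go
import (
	"reflect"

	appsv1 "k8s.io/api/apps/v1"
)

// specChanged compares only the spec; the status is written by the
// deployment controller and is expected to keep changing.
func specChanged(desired, found *appsv1.Deployment) bool {
	return !reflect.DeepEqual(desired.Spec, found.Spec)
}
```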

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Feb 20, 2019
@chenwng
Author

chenwng commented Feb 22, 2019

Thanks for your reply. I did some further tests. As you said, Deployment.Status kept being updated.
I also dumped the deployment spec fetched from the API server. It looks like it was updated by the deployment controller to fill in default values, but that was not the reason for the update events; I think I misunderstood where those events came from.
If Deployment.Status is something I don't care about, how can I ignore those update events?

@Adirio
Contributor

Adirio commented Feb 22, 2019

Just don't take any action.
You could probably use a predicate to filter those events too, but that will make little difference; see the sketch below.
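
For the predicate route, recent controller-runtime versions ship predicate.GenerationChangedPredicate, which drops update events where metadata.generation did not change; status-only updates do not bump the generation. A sketch, with a hypothetical custom resource and reconciler:

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/predicate"

	myv1 "example.com/project/api/v1" // hypothetical CRD package
)

func (r *MyAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&myv1.MyApp{}).
		// status-only Deployment updates are filtered out here
		Owns(&appsv1.Deployment{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).
		Complete(r)
}
```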

@nrfox

nrfox commented Feb 22, 2019

I have run into this issue as well. I think doing reflect.DeepEqual(generatedSpec, existingSpec) on the whole spec can be problematic for a couple of reasons.

  1. If the apiserver does any defaulting of the resource object, then the generated spec from the controller needs to be filled out entirely to ensure that it matches the existing spec.
  2. The check reflect.DeepEqual(generatedSpec, existingSpec) will always fail for certain resource types. Services of type ClusterIP that do not have their ClusterIP field set will get a ClusterIP assigned by the apiserver (or something internal). The DeepEqual check will fail, and the controller will try to Update the service spec with the empty ClusterIP field and get an error back, since this field cannot be updated once it has been set.

Maybe it’d be better if the scaffolding only compared fields that the controller should act on or that are part of the CRD abstraction, e.g. the Replicas field on the deployment should be updated if it doesn’t match the Replicas field on the CRD, instead of doing a blanket DeepEqual on the spec; a sketch of that follows.
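
A minimal sketch of that narrower, field-by-field comparison (the replicas parameter standing in for the CRD field is hypothetical):

```go
import appsv1 "k8s.io/api/apps/v1"

// reconcileReplicas updates only the one field this controller owns and
// reports whether the caller should issue an Update.
func reconcileReplicas(replicas int32, found *appsv1.Deployment) bool {
	if found.Spec.Replicas == nil || *found.Spec.Replicas != replicas {
		found.Spec.Replicas = &replicas
		return true
	}
	return false
}
```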

@schweikert

It also bothered me that too many unnecessary updates were done because of the DeepEqual, so I implemented this approach:

  • Use hashstructure to calculate a hash of the deployment that I want to have
  • Store that hash under an annotation ("mydomain/last-applied-hash")
  • In the reconcile function, compare the stored hash with the newly computed hash, and only update if they differ

This way, I don't need to do a deep compare each time the reconcile function is triggered, and I also push fewer changes to kubernetes.
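
A sketch of that pattern, assuming the hashstructure v1 API (hashstructure.Hash(v, nil)) and the annotation key from the list above:

```go
import (
	"fmt"

	"github.com/mitchellh/hashstructure"
	appsv1 "k8s.io/api/apps/v1"
)

const lastAppliedHash = "mydomain/last-applied-hash"

// specHash hashes the spec of the deployment we want to have.
func specHash(d *appsv1.Deployment) (string, error) {
	h, err := hashstructure.Hash(d.Spec, nil)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%d", h), nil
}

// needsUpdate compares the hash stored on the live object with the hash of
// the desired spec; after a successful Update, store the new hash back into
// the annotation.
func needsUpdate(desired, found *appsv1.Deployment) (bool, error) {
	h, err := specHash(desired)
	if err != nil {
		return false, err
	}
	return found.Annotations[lastAppliedHash] != h, nil
}
```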

@daxmc99

daxmc99 commented May 29, 2019

https://github.com/banzaicloud/k8s-objectmatcher might be a possible solution to this?

@DirectXMan12
Contributor

Server-side apply is the ultimate solution, when it eventually lands as beta.

@pepov

pepov commented Aug 2, 2019

@DirectXMan12 will the server-side apply solution work without contacting the API server? Isn't a dry-run call against the API necessary in that case as well?

@pepov

pepov commented Aug 2, 2019

Ideally (for example with https://github.com/banzaicloud/k8s-objectmatcher and controller-runtime of course) we get some (or most) of our managed objects from the cache and we can decide whether there are any changes or not locally.

I understand server-side apply is more elegant, but if it involves a call to the API server then it's not exactly a solution to the above problem.

@DirectXMan12
Contributor

@pepov with server-side apply, you do need to contact the api server, but you don't need to compare objects. Instead, you explicitly set all the fields you care about on an empty object (not one you've gotten from the cache) and then submit that to the API server. The API server takes care of figuring out the difference. That means that you always submit against the API server, but that the submission doesn't always change things, without having to care about local comparison.
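
A sketch of that flow with controller-runtime's client (the field-owner string is a hypothetical name, and the cluster must have server-side apply enabled):

```go
import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func applyDeployment(ctx context.Context, c client.Client, desired *appsv1.Deployment) error {
	// desired is built from scratch with only the fields this controller
	// cares about; an apply patch needs TypeMeta set explicitly
	desired.TypeMeta = metav1.TypeMeta{APIVersion: "apps/v1", Kind: "Deployment"}
	return c.Patch(ctx, desired, client.Apply,
		client.FieldOwner("my-operator"), // hypothetical field manager name
		client.ForceOwnership)
}
```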

@DirectXMan12
Contributor

(whoops, didn't mean to close)

@DirectXMan12 DirectXMan12 reopened this Aug 5, 2019
@pepov

pepov commented Aug 6, 2019

@DirectXMan12 the issue in our case was that we hit the API server too hard with too many requests, and the overall time for an operator cycle also increased because of this.

@chenwng
Author

chenwng commented Aug 6, 2019

@pepov with server-side apply, you do need to contact the api server, but you don't need to compare objects. Instead, you explicitly set all the fields you care about on an empty object (not one you've gotten from the cache) and then submit that to the API server. The API server takes care of figuring out the difference. That means that you always submit against the API server, but that the submission doesn't always change things, without having to care about local comparison.

In this case, when should we submit? If we submit on every reconciliation, will that cause too much load on the API server in the worst case?
Btw, is this server-side apply available now?

@DirectXMan12
Contributor

@ChenDoRo ideally, no, it should not cause too much load, but it depends on your use case. It's in alpha now, and will most likely be beta in the next Kubernetes release.

@pepov was it actually a server-side issue? You can also hit client-side rate limiting pretty easily with the default values (see the sketch below). At any rate, I'd be curious to see your numbers on that.
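
For reference on the client-side limits: client-go defaults to QPS 5 / Burst 10, which a busy operator can exhaust quickly. A sketch of raising them (the values are illustrative):

```go
import (
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
)

func restConfig() *rest.Config {
	cfg := ctrl.GetConfigOrDie()
	// lift the client-side throttle before handing cfg to the manager
	cfg.QPS = 50
	cfg.Burst = 100
	return cfg
}
```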

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 4, 2019
@JeremyMarshall

Maybe late to the party, but I raised this in kubernetes and they pointed me at kubernetes/apimachinery#75:

apiequality.Semantic.DeepEqual (see https://godoc.org/k8s.io/apimachinery/pkg/api/equality)
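
The difference from plain reflect.DeepEqual is that Semantic registers custom equality functions, e.g. for resource.Quantity, so semantically equal values compare equal. A runnable sketch:

```go
package main

import (
	"fmt"
	"reflect"

	apiequality "k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	a := resource.MustParse("1")
	b := resource.MustParse("1000m")
	fmt.Println(apiequality.Semantic.DeepEqual(a, b)) // true: compared numerically
	fmt.Println(reflect.DeepEqual(a, b))              // false: the string representations differ
}
```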

@caarlos0
Contributor

apiequality.Semantic.DeepEqual (see https://godoc.org/k8s.io/apimachinery/pkg/api/equality)

this still won't work in some cases, like when the default for a field is not its zero value.

I ended up doing the same thing @schweikert recommended.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 27, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@feloy

feloy commented May 8, 2020

I also have this problem using reflect.DeepEqual or equality.Semantic.DeepEqual, because some fields are set to non-zero default values by some controller (like ImagePullPolicy, RestartPolicy, and so on) when they are not set by the operator.

I found this utility function that does the job for me: it compares only the fields that are non-zero in the expected struct:

```go
import "k8s.io/apimachinery/pkg/api/equality"

if !equality.Semantic.DeepDerivative(expected.Spec, found.Spec) {
	// some field set by the operator has changed
}
```
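
For context, DeepDerivative treats zero-valued fields in the expected (first) argument as wildcards, which is why server-side defaulting no longer trips the comparison. A runnable sketch:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apiequality "k8s.io/apimachinery/pkg/api/equality"
)

func main() {
	expected := corev1.Container{Name: "app"}                                  // ImagePullPolicy left unset by the operator
	found := corev1.Container{Name: "app", ImagePullPolicy: corev1.PullAlways} // defaulted server-side
	fmt.Println(apiequality.Semantic.DeepDerivative(expected, found))          // true: unset fields are ignored
}
```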

@mjaow

mjaow commented Jan 27, 2021

I also have this problem using reflect.DeepEqual or equality.Semantic.DeepEqual, because some fields are set to non-zero default values by some controller (like ImagePullPolicy, RestartPolicy, and so on) when they are not set by the operator.

I found this utility function that does the job for me: it compares only the fields that are non-zero in the expected struct:

```go
import "k8s.io/apimachinery/pkg/api/equality"

if !equality.Semantic.DeepDerivative(expected.Spec, found.Spec) {
	// some field set by the operator has changed
}
```

It will ignore some updates, though: maybe I'm trying to add a field to the deployment (or delete some of its fields), but DeepDerivative will ignore that.

@devlifealways

Same here, Semantic.DeepDerivative ignores a couple of fields, which causes the test to fail each time.

@rda3mon

rda3mon commented Apr 11, 2021

It also bothered me that too many unnecessary updates were done because of the DeepEqual, so I implemented this approach:

  • Use hashstructure to calculate a hash of the deployment that I want to have
  • Store that hash under an annotation ("mydomain/last-applied-hash")
  • In the reconcile function, compare the stored hash with the newly computed hash, and only update if they differ

This way, I don't need to do a deep compare each time the reconcile function is triggered, and I also push fewer changes to kubernetes.

This works very well, but you can avoid one more dependency by using crypto/sha256 on the marshalled spec instead of hashstructure to get a hash. Here is an example:

```go
// note: the hash store is in memory only, so it is lost on operator restart
var hashStore = make(map[string]string)

// asSha256 hashes the marshalled object.
func asSha256(b []byte) string {
	h := sha256.New()
	h.Write(b)
	return fmt.Sprintf("%x", h.Sum(nil))
}

// inside Reconcile:
newSS := r.buildStatefulSet(m, d)
newSSMarshal, err := json.Marshal(newSS)
if err != nil {
	return ctrl.Result{}, err
}
if hash := asSha256(newSSMarshal); hash != hashStore[newSS.Name] {
	log.Info("Updating StatefulSet", "StatefulSet.Namespace", newSS.Namespace, "StatefulSet.Name", newSS.Name)
	if err := r.Update(ctx, newSS); err != nil {
		log.Error(err, "Failed to update StatefulSet", "StatefulSet.Namespace", newSS.Namespace, "StatefulSet.Name", newSS.Name)
		return ctrl.Result{}, err
	}
	hashStore[newSS.Name] = hash
	return ctrl.Result{Requeue: true}, nil
}
```

Chrisbattarbee added a commit to palantir/k8s-spark-scheduler that referenced this issue Oct 21, 2021
k8s sets fields when we submit our CRD to the api, this causes issues when we come to compare local versions to it later. kubernetes-sigs/kubebuilder#592
Chrisbattarbee added a commit to palantir/k8s-spark-scheduler that referenced this issue Oct 21, 2021
* Loosen equality on crds.

k8s sets fields when we submit our CRD to the api, this causes issues when we come to compare local versions to it later. kubernetes-sigs/kubebuilder#592

* Remove redundant check

* Remove redundant word

* Address comments
@aweis89

aweis89 commented Jun 27, 2022

It also bothered me that too many unnecessary updates were done because of the DeepEqual, so I implemented this approach:

  • Use hashstructure to calculate a hash of the deployment that I want to have
  • Store that hash under an annotation ("mydomain/last-applied-hash")
  • In the reconcile function, compare the stored hash with the newly computed hash, and only update if they differ

This way, I don't need to do a deep compare each time the reconcile function is triggered, and I also push fewer changes to kubernetes.

One thing to be mindful of with this approach is that it doesn't eliminate configuration drift originating outside the controller in question. If someone updates the controlled object's state, say using kubectl, so that the actual state goes out of sync with the desired state, the generated hash still won't differ and the controller won't revert that change (until the reconciliation happens to produce a different result from its last update). This is often not how controllers are meant to behave.

The controllerutil lib uses the equality.Semantic.DeepEqual(existing, obj) check, which I think is generally what folks should use, for the reason above; a sketch follows.
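
A sketch of that controllerutil pattern (the reconciler type and the set of copied fields are hypothetical); CreateOrUpdate fetches the object, runs the mutate function, and only issues an Update when equality.Semantic.DeepEqual reports a change:

```go
import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

func (r *MyReconciler) reconcileDeployment(ctx context.Context, desired *appsv1.Deployment) error {
	dep := &appsv1.Deployment{ObjectMeta: metav1.ObjectMeta{
		Name:      desired.Name,
		Namespace: desired.Namespace,
	}}
	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, dep, func() error {
		// copy only the fields this controller owns onto the fetched object,
		// so drift introduced via kubectl is overwritten on the next reconcile
		dep.Spec.Replicas = desired.Spec.Replicas
		dep.Spec.Template = desired.Spec.Template
		return nil
	})
	return err
}
```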

@Madongming

In the generated code, reflect.DeepEqual is used to check whether the resource has changed. I created a deployment and then used reflect.DeepEqual to compare it with the one fetched from k8s. The issue is that the fetched deployment was changed (I assume by the deployment controller) to fill in default values, so it differs from the deployment I constructed. This triggers an update to the deployment, and it seems to happen continuously. Is this expected, or am I missing something?

Thanks.

I also encountered a similar problem. The solution is to do the update first with the client.DryRunAll option, and then use reflect.DeepEqual to compare the Specs to achieve the goal. Code reference:

```go
// client package is "sigs.k8s.io/controller-runtime/pkg/client"
// newObject is the object we built; curObject is the one currently in the cluster.
// The dry-run update fills newObject in with server-side defaults without persisting anything.
if err := r.Client.Update(ctx, newObject, client.DryRunAll); err != nil {
	return err
}

if reflect.DeepEqual(curObject.Spec, newObject.Spec) {
	logger.Info("Object is not changed, skip update")
	return nil
}
if err := r.Client.Update(ctx, newObject); err != nil {
	return err
}
```

It works now.
