-
Notifications
You must be signed in to change notification settings - Fork 258
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Proposal for handling late-initialization in ACK.
- Loading branch information
Showing
3 changed files
with
278 additions
and
0 deletions.
There are no files selected for viewing
1 change: 1 addition & 0 deletions
1
docs/design/proposals/late-initialization/images/late_initialization.drawio
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<mxfile host="Electron" modified="2021-06-23T20:57:36.076Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/12.9.9 Chrome/80.0.3987.163 Electron/8.2.1 Safari/537.36" etag="88_X_xZvr2moexoWVV37" version="12.9.9" type="device"><diagram name="Page-1" id="e7e014a7-5840-1c2e-5031-d8a46d1fe8dd">7Vxbc6M2FP41nmkf4kEIcXm0nezutNvpTtLObh8JKLa6GCjIcdxfXwkkAxKOWcfYNHYmk6CDEOIcfecqGMHZ8uVj5qeL35IQRyPTCF9G8HZkmo5rs7+csCkJyDVKwjwjYUkCFeGB/IsFUXZbkRDnjY40SSJK0iYxSOIYB7RB87MsWTe7PSVR866pP8ca4SHwI536lYR0IajA9qoTnzCZL8StXVM88KMffJ9nySoW9xuZ8Kn4KU8vfTmWeNB84YfJukaCdyM4y5KElkfLlxmOOGsl28rrPuw4u513hmPa5YJfbsB6Yjt/W+7kc7iZm5NPd96NJYZ59qMVls9RzJZuJIfYxFN+uFpGn8kTjkjMWlN2Y0oCkvrF3dm5u5gSuuFncEaWmOKM0SNxwZeKNl0vCMUPqR/wQddsUTHagi4j1gLskAma+uySbNuOIj/NyWMxH4NRMhysspw843ucl+uJU58xn5AfTSIyjxmNJnzgnN2HxPM/eOMW2owS+Y84mm6FN0uiJCseVIoPTpMV5ZOebZccH1/nt+QduzF+qZEE/z/ihD1ytmFd5FlHrAWBFSia62rhWYK0qC05JPv5Yq3PtyNX8mYHQuQ/In5TE/+HaPXCKLNbbR2wh2QTmvqCvwHjQyHPJyawB9ENtAhiScKQn5Sy+Iyf+IPBinIvnrUg0Sz5jqVU4qRYbFlCfepXKyBNSEwLZqAp+2XsmRljNEJsmjPWBlWb/fLuGWXCZEOzhcVvhP2crnFOjyRW1BSrpUnVbZGq1CPHFyq8Yvp0mIaK8KHdDdNylRxf/JYmfkyD8ArnzhKFTYl6Z4azfYXz6eCMLNSEM+oIZ9ST+FEH8ddE04TSrSJrBpxFMk9iP6pLW2fb6wvxBwxjg5eOzkvQwksA+uKl7bwbXppmN2Z6vfHS3c9LpgOecSi4ieNwwuM1bmsiP89J0FQq+IXQb/x4bIrWXxzWY9OzRfv2ReC8aGxqjZoEdiiCV6WQJ6sswB0WDg4bwaQurJo0UJuaELQMRz5l2rGhpNpEJO7wha/Eai0oOmrbliOUzyMuqoeEyjim0xzHUcahfjbHVBuHidHf1LoJnHSeLkKvzspGrbOqFmt5/2rpbvl/+Gp29UDo/6oZoHduNWvpanYy+5UHlcxvyJivwPhx9Ua7OiRmU7xn9kalBnsNKPs0/U6u7NXDMl1VKqbXOsJBKWzgNkMKAJSMTleNbSmxCbCVgY6ksrdOrjLhXpUw0pORYPzr6pE5+PxaP02jzS7FUVterSB/IlGkkDRds1O3tIVDRWRSODdHijy2iWIJdD05CIw2qPeVHEQdbOLBUB8IMKGSkQUqN7sCEyFvjF4fagc0j4YePe1njgsOsXA9IExi7xY6ljE06Og5OI3715zNscTvKEbx/DkbpIl/8vVhF/6uPq8mUKDgWZPnaX3eDmmjgw3hXldW6pKBWEykWky1ONnVYjqqi6kO1Le91BNYkNvLWYZ9yoylUUCWsUtEJe/WetqK9dyayrr5bFOg/ZlPr0fA7Q0yBwY4qMSO5qGxo4bcjrHjsQAnb1eTqlUB7l7g7PcVTVf0gtBmnRttdtseIDuipTsRN+Rg/7Pi+5Wm1eanm6Bk94Sv9PnjT0V2l/sP7G/9GIGfq8vZ0Zz/R1z+f6ZhqXDvt6rWkBN4zGRXn11uPKQ4aD37WJxlPs0q56xkfGD+bkhoErd2D3j3SRxzL4jwPmUHdln5xGWv97sI1YBpqwrOtwiPkGyQFaQbYwwbJSTHsY5QQupsP+xB2Q9keWNHKakYB5oQ6Bhjc89YfVsRPc1hX26SsG0L4WlTHXaHVEd/wdHQ8vyeIh50KNCgmk7UhuobZ3oOw7nQdKLZZh1PC7IOm4DeWyp+m+N5eypeG6pv7LTFr6fxqN3Co85JPG94uHl5SVArwBsh93aWJMbcZaYLnHHHm/BGnPC+Cz+eF6S4PD+qJ0MuzGFGrrLJoi1HclKVINFyLTGcZJf3wCoMjh6zfyS0iKHTZBcMr4WGrViHUVhwjhD07uTCft95WPZfeY/CO9D6A7V2dFrT7+jhqcFNconOdJUv3q+RtJucd85uIvUQRmO+sIklx+ULoOB1dkeFGtyrLUvLdQO66k9VVRbm7kuSkyJFCW+zsm/b5tN+y7FKzgG2hENOm950+5KrHg15KsaMGK8bDquRF7njC8EePD/4+iyaD9NkWYemVYGrRqzaUH2bLb0YDowLSqsCVZLw3PDpswQ+EPio2VLLhIfhB9oaftShesaPNH91/IBLSZeq4DHPDR63w1ct9hYTYzaTb4JLRaOsJQLDkoSqmFi0NvXW8cqJssSxN56T7zANBN3QRmqt0LIOBDhCllrD1MbqG+F6YAfMxvaFS9kvhlxL1bZWC+ItJHudBvN91iEHAil1B5dlH4gnx9IMpjpU33DSY3MAKzhdym4wpLpAbXFb24rqD0cXUGp0rOaLz9ahyUYNkCfONrp6kA2soeype6+YVbcHoLPvl3Y7fIuh6+Y5uR+u9HY9G41Ot3NuYB4skq/USUFbyhDd41N7bKtjMeZ6p1UWekrhWjM84GtvCvpbsuEnLSJ6eurh+g2/N3/D7/xi1XMY18+4veUzbueXqKlJ9PpZlIMDJ3No0tXzQ9d3ft/wzm+PAmXN6rPLpXdVfdoa3v0H</diagram></mxfile> |
Binary file added
BIN
+73.6 KB
docs/design/proposals/late-initialization/images/late_initialization.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
277 changes: 277 additions & 0 deletions
277
docs/design/proposals/late-initialization/late-initialization.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,277 @@ | ||
# ACK Handling Server Side Defaults | ||
|
||
## Problem | ||
|
||
During reconciliation, ACK service controller consumes the custom resource spec (i.e. desired state) from “etcd” and | ||
then creates/updates the AWS resource to match that desired state. ACK service controller uses the custom resource spec | ||
fields to construct Create/Update AWS API request. | ||
|
||
However, some AWS API’s behavior does not completely rely on the fields present in Create/Update AWS API request alone. | ||
In the backend, these AWS APIs either add new fields with default values or update the value of field present in request | ||
with newly calculated value which represents the latest state of the resource in AWS. AWS APIs return these default or | ||
updated values with Create/Update API output and Read/List/Describe API output after resource creation. | ||
|
||
There are two main problems arise due to this behavior of AWS APIs, | ||
|
||
a) When an AWS API does not strictly use the property value as is in the backend, but updates it, the desired spec of | ||
the K8s resource in the “etcd” does not correctly represent the actual state of resource in AWS. This is a problem because | ||
for K8s user, the spec of custom resource is the source of truth for state of K8s resource. | ||
|
||
b) When a default field is populated during AWS resource creation, then during reconciliation, there will always be delta | ||
present between desired and latest K8s representation of the custom resource and an Update AWS API call be made during | ||
reconciliation. This update call may or may not have any side-effects on AWS resource and currently service teams either | ||
place custom code to avoid this OR let the calls happen if it does not alter state of the resource. | ||
|
||
> Problem (b) can be viewed as a variation of Problem (a), where server changes the property value from “null” to something. | ||
This is worth noting because solving Problem (a) in a generic and consistent manner will provide the solution for | ||
Problem (b) as well. | ||
|
||
|
||
## Solutions | ||
|
||
To compare the solutions for above problem, we will use the following facets: | ||
|
||
1. Consistency with K8s conventions | ||
2. Consistency across ACK service controllers | ||
3. GitOps compatibility | ||
4. User experience for custom resource lifecycle | ||
|
||
✅ → doable with no/some efforts | ||
|
||
❌ → not achievable | ||
|
||
### Solution 1 | ||
|
||
In this solution, the status fields will include all the fields which are CreateOutput shape members. Current status | ||
fields are the CreateOutput shape members which are not present in CreateInput shape (spec fields). | ||
|
||
During reconciliation, the Read(One|Many),Update request inputs will be created from desired state but giving precedence | ||
to status fields over spec in case of conflict. | ||
|
||
|
||
1. *Consistency with K8s conventions* ❌ | ||
1. In K8s convention, the spec fields are not duplicated into status fields. This will confuse k8s users when they | ||
describe their custom resources. | ||
2. *Consistency across ACK service controllers* ✅ | ||
1. There should be no complexity in following this approach across all ACK service controllers. | ||
3. *GitOps compatibility* ✅ | ||
1. There should be no complexity in implementing GitOps for ACK custom resources. | ||
2. Approach mentioned in low level design below should work. | ||
4. *User experience for custom resource lifecycle* ✅ | ||
1. Applying updates to custom resource can become confusing for the user. | ||
2. To update custom resource spec, K8s user will have to check resource status because status will have most updated | ||
value of all the fields. | ||
|
||
Other Cons : | ||
|
||
* The number of status fields will be very large and there will be a lot of duplication in custom resource description | ||
|
||
|
||
### Solution 2 | ||
|
||
This solution builds on top of “Solution 1”, with the assumption that “Problem” mentioned above is not very common | ||
amongst AWS APIs. | ||
|
||
In this solution, service teams can specify the spec fields which will be duplicated in both spec and status of custom | ||
resource using generator.yaml | ||
|
||
|
||
1. *Consistency with K8s conventions* ❌ | ||
1. In K8s convention, the spec fields are not duplicated into status fields. This will confuse k8s users when they | ||
describe their custom resources. | ||
2. *Consistency across ACK service controllers* ❌ | ||
1. Since all AWS APIs do not have this “Problem”, behavior across different service controllers will not seem consistent. | ||
3. *GitOps compatibility* ✅ | ||
1. There should be no complexity in implementing GitOps for ACK custom resources. | ||
2. Approach mentioned in low level design below should work. | ||
4. *User experience for custom resource lifecycle* ✅ | ||
1. Applying updates to custom resource can become confusing for the user. | ||
2. To update custom resource spec, K8s user will have to check resource status because status will have most updated | ||
value for some of the fields. | ||
|
||
|
||
|
||
### Solution 3 (Preferred Solution) | ||
|
||
In this solution, If a default value is added or existing value is updated as part of Create/Update API output, ACK service controller will update the spec/status of the custom resource accordingly. These updates will be persisted in custom resource’s Annotation to refer in future reconciliations. | ||
|
||
|
||
1. *Consistency with K8s conventions* ✅ | ||
1. In K8s convention, the spec fields are not duplicated into status fields. | ||
2. However we will be updating the desired state of custom resource from ACK service controller, which is not really the recommended k8s way. | ||
2. *Consistency across ACK service controllers* ✅ | ||
1. All ACK service controllers will be consistent. Spec fields in each custom resource will represent the desired state of the cluster. | ||
3. *GitOps compatibility* ✅ | ||
1. There should be no complexity in implementing GitOps for ACK custom resources. | ||
2. Approach mentioned in low level design below should work. | ||
4. *User experience for custom resource lifecycle* ✅ | ||
1. Applying updates to custom resource will not be confusing for the user. Users will update the spec and service controller will create Update request based on new spec. | ||
|
||
|
||
> I personally prefer “Solution 3” provided it gives a consistent GitOps behavior. | ||
> The biggest selling point for “Solution 3” is that it is consistent with the K8s convention where desired state is | ||
> defined by spec of the custom resource and also custom resource spec in “etcd” is consistent with AWS resource. | ||
> To make the Update resource user experience better, In addition to the Status’s condition, we will also add annotation | ||
> capturing the changes which were made to desired spec. [More details in Low Level Design] | ||
|
||
## High Level Design | ||
|
||
![High Level Design](./images/late_initialization.png) | ||
[source](./images/late_initialization.drawio) | ||
|
||
|
||
## POC | ||
|
||
I did a small POC to determine how GitOps(Fluxcd) behaves when desired spec is updated by the controller. | ||
|
||
Setup: | ||
|
||
* A basic “Guestbook” [operator](https://github.com/vijtrip2/guestbook/blob/main/controllers/guestbook_controller.go) from | ||
kubebuilder getting started guide. | ||
* “Guestbook” [manifest](https://github.com/vijtrip2/flux-get-started/blob/master/guestbook/guestbook-sample.yaml) being | ||
deployed using fluxcd. | ||
* Controller adds a defaultValue in spec on each run, updates an existing value in spec and updates the annotation on each | ||
reconcile loop. | ||
|
||
Observations: | ||
|
||
* During the start of each reconciliation, the spec and status are reset to values present in the git repository. | ||
* Annotations are not reset, if a value is updated in annotations, it is present for future reconciliations. | ||
* Changes made after reconciliation are not reflected back into the git repository. “Fluxcd” does not support this. | ||
|
||
The Low level design below is determined from the POC done above. | ||
|
||
## Low Level Design | ||
|
||
To implement the flow described in high level design, we will walk through reconciliation of an example resource through | ||
following steps and layout in detail how these steps will be achieved by ACK controller. | ||
We assume that Fluxcd is used to apply desired state of custom resource. | ||
|
||
1. Assume there is a custom resource with following spec and status fields. | ||
|
||
“fieldA” and “fieldD” are typical spec and status field respectively. “fieldA” is provided by K8s user and “fieldD” | ||
is updated in status based on latest observation. | ||
|
||
“defaultFieldB” if not provided by K8s user, gets updated to “valueB” by reconciler. | ||
|
||
overrideFieldC‘s is modified by reconciler whenever provided by the k8s user. Ex: “-xyz” gets appended to it’s value. | ||
|
||
```yaml | ||
Spec: | ||
fieldA string | ||
defaultFieldB string | ||
overrideFieldC string | ||
Status: | ||
fieldD string | ||
``` | ||
and the manifest file(desired state) looks like, | ||
```yaml | ||
Spec: | ||
fieldA: valueA | ||
overrideFieldC: valueC | ||
``` | ||
2. When this resource is reconciled by the ACK service controller, we expect the CreateOutput to have following value | ||
for these fields. fieldA → “valueA“, defaultFieldB → “valueB“, overrideFieldC → “valueC-xyz“, fieldD → ”valueD“ | ||
Based on this result, ACK service controller will persist the following object in etcd: | ||
Annotations: | ||
```yaml | ||
ack/spec-overrides: | ||
“{ | ||
\“defaultFieldB\”: | ||
{ | ||
\“serverOverride\”:\“valueB\” | ||
}, | ||
\“overrideFieldC\”: | ||
{ | ||
\“userInput\”:\“valueC\”, | ||
\“serverOverride\”:\“valueC-xyz\” | ||
} | ||
}” | ||
Spec: | ||
fieldA: valueA | ||
defaultFieldB: valueB | ||
overrideFieldC: valueC-xyz | ||
Status: | ||
fieldD: valueD | ||
``` | ||
> Note: the annotation above has single string value. It is been broken into multiple lines to showcase the structure better. | ||
> Note: Fields whose value are defaulted, do not contain “userInput” and that is how they are distinguished. | ||
> Note: The change will be made in the sdkCreate and sdkUpdate methods, in the code-generator repository. Whenever there is an update required for spec field, based on the Create/Update output shape, we will record it in the Annotations, so that during next reconciliation, this context is preserved. | ||
> Note: sdkCreate will marshall the spec field and use string comparison to determine whether spec needs update or not. This will prevent from runtime panic of incomparable fields while using “==” or DeepEqual() method. | ||
3. During the next reconciliation of manifest mentioned in #1, the desired state will be constructed not just from spec | ||
in etcd but also the Annotations. and the desired object will look like the object persisted in etcd in #2 | ||
After reading the desired state from etcd, the reconciler code will go through all the spec fields and see whether | ||
they exist in the spec-override annotation map. If a corresponding entry exists, reconciler will marshall current | ||
desired value of spec field and compare with “userInput” in annotation override, | ||
* If there is no difference between “userInput” and marshalled desiredValue, reconciler will update the desired state | ||
with serverOverride from spec-override annotation. | ||
* If the marshalled string does not match “userInput”, it means there was some change in manifest by the user, and | ||
serverOverride will from spec-override map will not be used for desired state. | ||
In this case, There will be no delta between desired and latest object, And no AWS updates will be triggered. | ||
> Note: The change will be made again in code-generator, inside “ResourceFromRuntimeObject” method of resourceDescriptor | ||
> to take into account the Annotations when constructing the spec. | ||
4. Now let’s assume that User updates the manifest and provides the value for both “defaultFieldB” and “overrideFieldC”. | ||
The manifest will look like | ||
```yaml | ||
Spec: | ||
fieldA: valueA | ||
defaultFieldB: userValueB | ||
overrideFieldC: userValueC | ||
``` | ||
Now, during reconciliation, when the desired object is being constructed by parsing Annotations, reconciler will see | ||
that defaultFieldB is not absent and remove the annotation for defaultFieldB. Reconciler will do the same for “overrideFieldC” | ||
since userValueC is not equal to userInput i.e. “valueC” | ||
5. At this time, the desired object looks different from latest object in AWS and an Update will happen. The result of | ||
UpdateOutput will contain following fields, fieldA → “valueA”, defaultFieldB → “userValueB” and overrideFieldC → “userValueC-xyz” | ||
During sdkUpdate, Annotations will be updated again but only for the overrideFieldC since that is only field different | ||
from current state in AWS. | ||
The object persisted in etcd will look like | ||
```yaml | ||
Annotations: | ||
ack/spec-overrides: | ||
“{ | ||
\“overrideFieldC\”: | ||
{ | ||
\“userInput\”:\“userValueC\”, | ||
\“serverOverride\”:\“userValueC-xyz\” | ||
} | ||
}” | ||
Spec: | ||
fieldA: valueA | ||
defaultFieldB: userValueB | ||
overrideFieldC: userValueC-xyz | ||
Status: | ||
fieldD: valueD | ||
``` | ||
And that is how reconciler will handle simple case of default values and server overrides of desired state. | ||
## Discussion Points | ||
* Whenever “ack/spec-overrides” Annotation is present, should we also update the status to have a condition that indicated that ACK controller has overriden some spec fields? |