fix #3587 #3472 adding more control over indexing, key function, storage #3943
Conversation
@attilapiros @metacosm For caching my thoughts are along the lines of:

```java
// some fancy caffeine cache or similar; for the purposes of this example
// it just needs to provide a few map operations
Map<String, HasMetadata> cache = ... ;

// it doesn't really matter which resource, just an informer keyed by uid
var resources = client.pods();
var informer = resources
    .withStoreValueFields("metadata.ownerReferences")
    .inform()
    .removeNamespaceIndex();

// add just an event handler dealing with the cache;
// in general there would be others dealing with the creation of the events to process
informer.addEventHandler(new ResourceEventHandler<HasMetadata>() {
  @Override
  public void onAdd(HasMetadata obj) {
    onUpdate(null, obj);
  }

  @Override
  public void onDelete(HasMetadata obj, boolean deletedFinalStateUnknown) {
    cache.remove(Cache.metaUidKeyFunc(obj));
  }

  @Override
  public void onUpdate(HasMetadata oldObj, HasMetadata newObj) {
    cache.compute(Cache.metaUidKeyFunc(newObj), (k, v) -> {
      if (v == null || Long.compare(Long.parseLong(v.getMetadata().getResourceVersion()),
          Long.parseLong(newObj.getMetadata().getResourceVersion())) < 0) {
        return newObj;
      }
      return v;
    });
  }
});
```

It's safe to always remove an object from the application-level cache because we shouldn't be reusing uids. However the add/update is where things get trickier. With the assumption that updates to the cache are only made with responses from the server:

```java
cache.put(key, client...edit(...));
```

I think it's then safe, as shown above, to compare the resourceVersions. Any other usage model of the cache that involves putting in working state (especially values without resourceVersions) would not be safe.

The informer store / indexes can still be used in this scenario for quick existence checks when a value is not in cache. That's not fool-proof, as the pathological case is that the resource was dropped from cache before it was ever added to the informer store. Any full object not found in the cache would require a lookup again from the api server:

```java
cache.computeIfAbsent(key, k -> resources.withField("metadata.uid", k)
    .list().getItems().stream().findFirst().orElse(null));
```

If we only consider this cache application, and not additional handlers for level/edge triggers, the features of an informer beyond a watch that are being used here are:
With a cache keyed by uid you may not care about the latter - things that have been deleted will just roll out of the cache eventually under some eviction policy. So really only a Watch with more reconnection logic may be required to do the above. However, my understanding is that you want to provide something for both caching and eventing - and those events could either be edge triggered (requiring the entire prior state) or level triggered (requiring only a small amount of prior state).
A couple of follow-on thoughts. Even `cache.put(key, client...edit(...));` is not safe from a concurrency perspective. I think all cache modifications would have to check the resourceVersion.
I haven't checked whether the uid field is indexed (there's an old reference in a 2016 post that it is not). If it isn't, then either the caller would also have to supply those fields, or would use the informer to also save the name/namespace, so that when an item is not found in cache you can look up the name/namespace in the informer store to query the api server.
What is shown in #3943 (comment) is essentially https://github.com/kubernetes/client-go/blob/f6ce18ae578c8cca64d14ab9687824d9e1305a67/tools/cache/mutation_cache.go - we could easily create a built-in class for that. Another update needed on this PR is to expose a method on the Store to get the key for an object, so that if the key function is changeable, callers will have a better place to reference it. @attilapiros @metacosm can we get a full understanding of what your use case is? That will help determine if we need to further refine this or break it apart.
This is very much needed, +1. @shawkins do you think it would be possible to have the ability to attach arbitrary metadata to a key? As an example, in my application I need to track the time of the last update (without having to add it to the object itself). Right now I'm keeping a separate map which is fed by a dedicated event handler.
@lburgazzoli With the current implementation attaching to the key would be hard; it's just flattened to a string.
If I understand correctly you are maintaining a separate map of key -> update time, but you want to put that into the informer store. Is this because you see the informer as the system of record for the update time? That is, are you only determining the update time from when the informer events are processed? How much benefit is there to having this in the informer store vs maintaining a separate map?
Sorry, I meant to maintain some metadata along with the value associated to a key.
Yeah, I'm not really interested in the effective time at which a resource was updated, just when the object reached the informer. About maintaining an external cache, that's perfectly fine and that's what I'm doing, but I was thinking that since the informer has to maintain a cache anyway, it could also include some metadata. But if this is outside the scope of the informer, I'm happy with my cache.
Ok, let's go with that for now. Of the other things that were proposed in this PR which have not been committed, it seems like you have interest in using uid as a key and in having a reduced state store. Is that correct? My hesitancy in committing those was that there wasn't confirmation that those proposals would work for the operator sdk scenarios. Can you confirm how these will be used?
This is not in the context of the operator sdk but for another application where I need to periodically push the status of some resources to an external system. In that case, to avoid flooding the external system, I use a very basic mechanic based on the last update time, and once in a while a full re-sync is performed. Having an informer where I can maintain some basic data (not the whole resource) and some metadata would simplify my code a lot.
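The external-map approach described above can be sketched with a small wrapper that records when each object was last seen by the informer. All names here are illustrative, not part of the kubernetes-client API:

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical application-level cache that keeps a "received at" timestamp
// alongside each (possibly reduced) value, fed by an informer event handler.
public class TimestampedCache<V> {

  public static final class Entry<V> {
    public final V value;
    public final Instant receivedAt;

    Entry(V value, Instant receivedAt) {
      this.value = value;
      this.receivedAt = receivedAt;
    }
  }

  private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();

  // called from the informer's event handler on add/update
  public void onEvent(String key, V value) {
    entries.put(key, new Entry<>(value, Instant.now()));
  }

  public void onDelete(String key) {
    entries.remove(key);
  }

  public Entry<V> get(String key) {
    return entries.get(key);
  }

  // e.g. used to decide whether the external system needs another push
  public boolean updatedSince(String key, Instant threshold) {
    Entry<V> e = entries.get(key);
    return e != null && e.receivedAt.isAfter(threshold);
  }
}
```

The periodic re-sync would simply iterate the entries and compare `receivedAt` against the last push time.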
@lburgazzoli and to double check: you want the primary key to be uid instead of namespace/name?
Now that the major conflicting work has been committed, I'll revive this PR next week based upon @lburgazzoli's stated needs.
Updated the description to match the updated PR. An alternative approach to hide the Basic/ReducedState stores from the api would be to have higher-level methods on the SharedIndexInformer - SharedIndexInformer.itemStore(keyFunction) rather than SharedIndexInformer.itemStore(new BasicItemStore(keyFunction)), etc.
@shawkins sorry, "attilapiros" is another Attila I guess; you meant me.
Ideally the informers will be wrapped in our InformerEventSource, and the underlying implementation would be configurable, but at the end it will be hidden.
It is for both.
Yes, that is true, now it's done by … Sorry for the late response, will dig deeper and try to explain one additional use case soon, which might be useful in general.
Not sure if I understand this. Let's assume we have this cache where we just put item changes from the informer (using the event handler only). Even in that case, would it not be safe to say that we store the latest version without comparing the resources? (AFAIK only the latest event should be received, in a single thread, so that should be ok.) What we do now, in case we want to make sure that our own update is visible: the problem this solves is that the informer cache may not yet contain it. So we check first if the resource is in this cache, and if not then check the informer's cache. (The resource is added explicitly after an add or update operation.) Similarly, as you say, we compare the resource version (just an equality check is enough in this case). The resource is removed from the temp cache if any event with the same resourceVersion arrives. So basically the point is that the resource version is not checked on the received event, but by the other party that updates the cache. Not sure if something like this would make sense to put into the client directly; it solves our particular use case.
Made some comments that are basically clarifications. LGTM. We can always add our implementation of the item store mentioned there.
It would be simpler from the perspective of the store - that is, you only have to prune on the way in. But it's not as cross-cutting a solution. Per type you would need to use a builder to explicitly set things like the spec and status to null. It also doesn't necessarily play nicely with extracting the additional fields from the key.
I should clarify what I mean by safe - if you always overwrite with what comes from the informer, it will be eventually consistent, but it will not be guaranteed to represent the latest state known to the application. We see situations where modifications happen very close together in time, typically from status updates. Consider the following: there's an incoming resource modification event for version v1. The application reacts to this and performs some other update of the resource; the returned resource is actually v3, and that is put into your application-level cache. Almost immediately the application then sees the resource modification event for v2. As a side-effect of the event processing of v2, it will flip your application-level cache from reporting the latest as v3 to v2. Depending on the details of the processing of the event, it will temporarily stay at v2, or become v3 again. It will then definitely flip back to v3 once that event is processed.
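That flip can be simulated with plain maps. This is only a sketch of the two update strategies, not client API; resourceVersions are treated as numeric purely for illustration (formally they are opaque strings):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Contrasts blindly overwriting the cache with informer events against a
// resourceVersion-comparing update. The cached value here is just the
// resourceVersion string itself, which is all the demo needs.
public class StaleWriteDemo {

  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final boolean compareVersions;

  public StaleWriteDemo(boolean compareVersions) {
    this.compareVersions = compareVersions;
  }

  public void put(String key, String resourceVersion) {
    if (compareVersions) {
      // keep whichever version is numerically newer
      cache.compute(key, (k, v) ->
          v == null || Long.parseLong(v) < Long.parseLong(resourceVersion)
              ? resourceVersion : v);
    } else {
      // naive: last write wins, regardless of version
      cache.put(key, resourceVersion);
    }
  }

  public String get(String key) {
    return cache.get(key);
  }
}
```

Replaying the sequence from the comment above - the application's own update returns v3, then the older v2 informer event arrives - the naive cache temporarily regresses to v2 while the comparing cache stays at v3.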
Yes that helps prevent stale writes into the cache, but won't guarantee that you will always see the latest.
I think we're still avoiding claiming any support for updates to the store once the informer is running.
Actually I think it does; I probably did not describe clearly what happens there.
We always see the latest (first checking the temporary cache and, if not present, then the informer cache) in terms of what we can see from our update / what is received by the process.
This cannot happen because we know there is no event in between.
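The temporary-cache scheme described in this thread can be sketched roughly as follows; every name here is illustrative (not kubernetes-client or operator-sdk API), and the informer store and resourceVersion extraction are modeled as plain functions:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// After an explicit (optimistically locked) update, the returned resource is
// parked in a temp cache and served from there until the informer's own
// store catches up.
public class TempCache<V> {

  private final Map<String, V> temp = new ConcurrentHashMap<>();
  private final Function<String, V> informerStore;   // lookup in the informer's cache
  private final Function<V, String> resourceVersion; // extract metadata.resourceVersion

  public TempCache(Function<String, V> informerStore, Function<V, String> resourceVersion) {
    this.informerStore = informerStore;
    this.resourceVersion = resourceVersion;
  }

  // called with the response of our own create/update
  public void onSelfUpdate(String key, V value) {
    temp.put(key, value);
  }

  // called from the informer event handler for every received event; once an
  // event with the same resourceVersion arrives, the informer store is caught
  // up - with optimistic locking there can be no event in between
  public void onEvent(String key, V received) {
    temp.computeIfPresent(key, (k, parked) ->
        resourceVersion.apply(parked).equals(resourceVersion.apply(received)) ? null : parked);
  }

  // first the temporary cache, then the informer cache
  public Optional<V> latest(String key) {
    V parked = temp.get(key);
    return Optional.ofNullable(parked != null ? parked : informerStore.apply(key));
  }
}
```

Note that the equality check on the resourceVersion is exactly the scheme described above: correctness relies on the updates being optimistically locked.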
There is one more feature that we support: not propagating an event that was the result of our own update. For example, if I update an ingress in the reconciler and I have an informer registered for that ingress, we will not propagate an event for this update from the informer (InformerEventSource), so no additional reconciliation happens on our own update. This would probably make sense to put into the informer; the problem is that it is quite complex code, also from the user's perspective. We can go into details in another issue maybe, if there is a consensus that this might be useful in general.
Yes, if you are doing optimistic locking that will indeed prevent the effect that I'm talking about, because you are guaranteeing that you are just incrementing off a known version. Sorry, I had mentioned this on other issues/PRs, but did not bring it up here. Since those updates are out of the control of the informer, it's hard to make assumptions about whether they are enforced correctly - so I didn't think it was worth designing the store around. If you wanted to base a custom store off of the reduced state store, one modification that would be useful is retrieving the item's resourceVersion without reconstituting the full item.
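That last suggestion could look something like the following sketch of a reduced state store; this is a hypothetical class, not the PR's actual implementation, and it keeps only selected field values per key:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Keeps metadata.resourceVersion (and any other saved fields) per key so
// the version can be read cheaply, without rebuilding the full item.
public class ReducedStateSketch {

  // per key: [resourceVersion, ...other saved field values]
  private final Map<String, String[]> fields = new ConcurrentHashMap<>();

  public void put(String key, String resourceVersion, String... otherFields) {
    String[] values = new String[otherFields.length + 1];
    values[0] = resourceVersion;
    System.arraycopy(otherFields, 0, values, 1, otherFields.length);
    fields.put(key, values);
  }

  // cheap resourceVersion lookup - no full object reconstitution needed
  public String resourceVersion(String key) {
    String[] values = fields.get(key);
    return values == null ? null : values[0];
  }
}
```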
With the current PR, store modifications sit below the other cache functions (indexing and event notification). So for example if you update the store to the next state and then the modification event comes in, it will produce a sync event - which will only be seen if you have resync turned on and are in a sync period. However, that's not guaranteed: from the time you perform an update until the time you update the store, it's possible that the event has already been processed. Preventing that would require more locking.
Description
replaces #3479 and #3616
cc @attilapiros @lburgazzoli
Instead of adding additional Informable dsl methods, this has been implemented as logic on the SharedIndexInformer. It removes the separate SharedInformer interface so that the methods can act more like a builder, without having to create additional overrides. While not as simple as top-level dsl methods, it's still pretty straightforward - e.g. SharedIndexInformer.itemStore(new BasicItemStore(keyFunction)).
This allows for the full ItemStore to be supplied by the user as well.
Both itemStore and initialState can only be called before the informer is running, so you have to use runnableInformer or the SharedInformerFactory to get a reference to the SharedIndexInformer before it is started. An alternative to this is to differentiate informers by whether they are running - SharedIndexInformer (presumed running) vs RunnableSharedIndexInformer (not yet running, which also exposes these additional methods).
I would also like to deprecate Informable.withIndexers, as I've changed the logic to support add/remove of indexes directly from SharedIndexInformer, even while the informer is running.
From here the operator sdk or other users would introduce a fronting cache with whatever locking and retention semantics are required. We can talk about making that more of a built-in feature in later versions (it's unfortunate that the store and cache terms are already used interchangeably for informers - Store, Cache/CacheImpl, and now ItemStore - it would be nice if they were distinct concepts). The division of responsibility roughly envisioned is:
"Fronting Cache" - maintains a configurable cache of key to full object. This is only needed when using something other than the BasicItemStore (or when the user wants to modify the state directly), so it could eventually be coupled with the ItemStore. On a cache miss, it would check for existence in the store (likely a reduced state store) and perform a get if needed. It could be writable by the user-level logic, but would default to being overwritten by events from the informer. It could take a merging function if it were desirable to override that behavior. If this were directly used by the Informer, it could use items from the cache in events - but only when the cache entries are immutable and match the expected resourceVersion.
CacheImpl - maintains the (optional) secondary in-memory indexes, fronts the ItemStore.
ItemStore - Stores all the items, indexed by primary key.
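A minimal sketch of the fronting cache behavior just described, with the miss path (store existence check plus api-server get) collapsed into a single loader function; every name here is hypothetical, not part of this PR:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// key -> full-object map that is overwritten by informer events by default,
// but accepts a merging function to override that behavior, and falls back
// to a loader on a cache miss.
public class FrontingCache<V> {

  private final Map<String, V> cache = new ConcurrentHashMap<>();
  private final BinaryOperator<V> merge;      // (existing, incoming) -> value to keep
  private final Function<String, V> loader;   // store existence check + api-server get

  public FrontingCache(BinaryOperator<V> merge, Function<String, V> loader) {
    this.merge = merge;
    this.loader = loader;
  }

  // wired to the informer's event handler; default merge is "overwrite"
  public void onEvent(String key, V incoming) {
    cache.merge(key, incoming, merge);
  }

  public void onDelete(String key) {
    cache.remove(key);
  }

  // cache hit, else load (and remember) the full object
  public V get(String key) {
    return cache.computeIfAbsent(key, loader);
  }
}
```

A resourceVersion-comparing merge function slots in as the `merge` parameter, which is where the "writable by user-level logic" variant would hook in as well.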