Control recycling of objects stored from sessions #3

Closed
sageserpent-open opened this issue May 4, 2023 · 5 comments

sageserpent-open commented May 4, 2023

This is a performance bug, and also a feature request.

Running ImmutableObjectStorageMeetsMap in this repository (along with another benchmark elsewhere) shows that Curium holds on to objects that were created in a session via session code and then stored (as opposed to being retrieved from Curium). This is because tranches defined in a session are cached - requests to retrieve objects are then trivially resolved against those cached tranches, and yield exactly the same object that was stored in the first place.

This is great for performance, but it means that a tranche is very rarely loaded. Because of this, the objects in play in sessions rarely, if ever, pick up proxied substructure - unless a subsequent session causes a miss in the tranches cache, it will pick up the original object stored in a previous session, which in turn will have been built from other original objects from previous sessions, and so on.

Curium does support proxying of objects when it resolves inter-tranche references as tranches are loaded, but this won't happen here: requests for objects directly via their tranche id always use the top-level object associated with a tranche, so there is no need to load the tranche and thereby proxy the substructure of its top-level object - the original stored one is yielded instead.

Some experiments have been done on this; what is apparent is:

  1. In order for the tests on object identity within a session to pass, the identity of an object that is stored and then retrieved in the same session must be respected within that session. So we do need to cache the tranche load data from all the store operations performed in a session - which we currently get for free via caching.
  2. However, that same caching causes trouble in subsequent sessions, because it defeats proxying by recycling the original stored objects.
  3. Now, it is perfectly feasible to add an extra cache for tranche load data created by store operations, cleared at the end of each session - a dynamic variable can be given a new scope for the session and used by sessionInterpreter to access the cache (see the sketch after this list). This tranche load data would not go into the existing tranches cache, which is reserved for loaded tranches and therefore causes proxying.
  4. There is some fancy footwork required, in that retrieving a top-level object via its tranche id needs to check for a corresponding tranche in both the session-level cache and the longer-lived cache, giving preference to the former.
  5. There is a heavy performance price to pay when tranches have to be reloaded in subsequent sessions!
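
Here is a minimal sketch of the session-scoped cache from point 3, assuming hypothetical names throughout - `TrancheId`, `TrancheLoadData`, `runSession`, `noteStore` and the rest are stand-ins for Curium's real machinery:

```scala
import scala.collection.mutable
import scala.util.DynamicVariable

object SessionScopedTrancheCache {
  // Stand-ins for Curium's real types.
  type TrancheId = Int
  final case class TrancheLoadData(topLevelObject: AnyRef)

  // Each session gets a fresh, empty cache; it is discarded when the
  // dynamic variable's scope closes at the end of the session.
  private val sessionCache =
    new DynamicVariable[mutable.Map[TrancheId, TrancheLoadData]](mutable.Map.empty)

  // Run the session body inside a new cache scope, so tranche load data
  // from store operations is recycled *within* the session (preserving
  // object identity there) but never leaks into subsequent sessions.
  def runSession[Result](sessionBody: => Result): Result =
    sessionCache.withValue(mutable.Map.empty)(sessionBody)

  // Called by the hypothetical session interpreter on each store operation.
  def noteStore(trancheId: TrancheId, data: TrancheLoadData): Unit =
    sessionCache.value += trancheId -> data

  def lookup(trancheId: TrancheId): Option[TrancheLoadData] =
    sessionCache.value.get(trancheId)
}
```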

It is worth revisiting this, as the original experimentation was done without a clear understanding of the problem of recycling objects from sessions (in general, recycling objects that are loaded, and thus contain proxies, is a good thing). Perhaps more attention to detail is all that is required.

Failing that, there might be some value in allowing a blend of recycled objects created by sessions and ones loaded with proxied substructure - if we can detect whether a tranche was created by a store operation as opposed to a retrieval, we can tailor an ejection policy that favours dropping tranches from sessions, thus causing a gradual invasion of objects with proxies.

Another approach is to bar tranches from sessions from resolving requests to retrieve top-level objects, while still allowing such tranches to participate in resolving inter-tranche references when a proxy needs its underlying object - this breaks the chain of a recycled session object being composed solely of other recycled session objects, but hopefully still avoids rampant reloading of tranches and the consequent performance overhead.
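
A minimal sketch of that resolution policy, reusing the hypothetical `SessionScopedTrancheCache` from the earlier sketch (the long-lived cache and `loadTranche` below are also stand-ins, not Curium's actual API):

```scala
object TwoTierResolution {
  import scala.collection.mutable
  import SessionScopedTrancheCache.{TrancheId, TrancheLoadData, lookup}

  private val longLivedTranchesCache =
    mutable.Map.empty[TrancheId, TrancheLoadData]

  // Stand-in for the real tranche load, which would deserialise via Kryo
  // and proxy the substructure of the top-level object.
  private def loadTranche(trancheId: TrancheId): TrancheLoadData = ???

  // Top-level retrieval deliberately bypasses the session cache, so a
  // retrieved object picks up proxied substructure from a genuine load.
  def retrieveTopLevel(trancheId: TrancheId): TrancheLoadData =
    longLivedTranchesCache.getOrElseUpdate(trancheId, loadTranche(trancheId))

  // Inter-tranche references *may* be satisfied by the session cache: this
  // breaks the chain of recycled session objects composed solely of other
  // recycled session objects, while still avoiding rampant reloading.
  def resolveInterTrancheReference(trancheId: TrancheId): TrancheLoadData =
    lookup(trancheId).getOrElse(retrieveTopLevel(trancheId))
}
```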

@sageserpent-open sageserpent-open added bug Something isn't working enhancement New feature or request labels May 4, 2023
@sageserpent-open sageserpent-open changed the title Control recycling of objects stored from sessions. Control recycling of objects stored from sessions May 4, 2023
sageserpent-open commented

The feature request aspect is simply to extend the configuration to support some kind of recycling of objects stored from a session.

@sageserpent-open sageserpent-open self-assigned this May 5, 2023

sageserpent-open commented May 5, 2023

An observation on the performance hit seen when having to load tranches: in Benchmark (in another repository) and in ImmutableObjectStorageMeetsMap, it is almost always due to having to load a tranche that contains the underlying object needed by a proxy that is computing its own hash code.

Maybe all of this effort could be avoided by caching the hash code, as it should be invariant for any given immutable object?
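
A hedged sketch of what that could look like: if the hash code is recorded when the object is first stored (it is invariant for an immutable object), a proxy can answer `hashCode` without ever loading its underlying object. All the names here are hypothetical:

```scala
// `loadUnderlying` stands in for the proxy's real lazy tranche load.
final class HashCachingProxy(precomputedHashCode: Int,
                             loadUnderlying: () => AnyRef) {
  // Only forced when something other than the hash code is needed.
  private lazy val underlying: AnyRef = loadUnderlying()

  // Answered without touching `underlying`, so set and map operations that
  // merely hash the proxy no longer trigger a tranche load.
  override def hashCode(): Int = precomputedHashCode

  // Equality still has to fall back on the real object.
  override def equals(other: Any): Boolean = underlying == other
}
```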

Actually, before doing that, I need to be sure that these hash computations are simply due to set and map operations as those structures are changed by the benchmark - that is to be expected, and as long as they do not cause cascades of tranche reloads for each hash computation, then that at least is OK.

The other thing that comes to mind is: where exactly is the major source of performance loss when loading tranches? Is it deserialisation in Kryo code? RocksDB fetching tranches? Kryo extensions in Curium? Proxy support? Or simply loading far too many tranches in a session, implying that proxies are being forced to load their underlying objects eagerly?
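
One way to apportion the cost, as a hedged sketch: wrap each suspected phase of a tranche load in coarse timing instrumentation. The phase names in the usage comment mirror the suspects above; the bookkeeping itself is entirely generic and none of it is Curium's actual code:

```scala
object TranchePhaseTimings {
  import scala.collection.mutable

  private val totalsByPhase =
    mutable.Map.empty[String, Long].withDefaultValue(0L)

  // Accumulate wall-clock time against a named phase.
  def timed[X](phase: String)(body: => X): X = {
    val start = System.nanoTime()
    try body
    finally totalsByPhase(phase) += System.nanoTime() - start
  }

  // Print phases in descending order of total cost.
  def report(): Unit =
    totalsByPhase.toSeq.sortBy(-_._2).foreach { case (phase, nanos) =>
      println(f"$phase%-20s ${nanos / 1e6}%10.2f ms")
    }
}

// Usage, wrapping hypothetical load phases:
//   val bytes   = TranchePhaseTimings.timed("rocksDbFetch") { fetchFromRocksDb(id) }
//   val tranche = TranchePhaseTimings.timed("kryoDeserialise") { deserialise(bytes) }
```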

sageserpent-open commented May 7, 2023

As of commit SHA 1dcef1a we have the latest and greatest: there is a configuration option that enables recycling of objects stored in a session for use by retrievals in subsequent sessions.

As discussed above, this gives good performance, because it completely avoids the need to reload tranches.

Disabling such recycling yields pretty dismal performance in the two benchmarks - looking at the number of tranche reloads per session, the average increases from a reassuring figure of around 1 (because the very first retrieval of a top-level object needs to load its tranche) to around 40 tranches per session. This is due to cache thrashing once the benchmarks have progressed far enough to fill up the cache - inter-tranche references needed to satisfy hash computations cause misses in the tranches cache.

Giving a much more generous size to the tranches cache does help, but this reveals another problem. When objects are recycled, there is no need to load tranches, so although the entire object graph ends up in memory, memory usage doesn't spiral out of control - as long as the application doesn't just keep growing that object graph.

When recycling is disabled and we have large tranches (because of batching in the sessions), fulfilling a single inter-tranche reference pulls in a whole swathe of unrelated bloat. If the inter-tranche references tend to bunch up into just a few tranches, or if the sessions are small, this isn't a problem, but both benchmarks pack 100 updates into each session and jump around all over the object graph.

So it might be the case that an application that makes lots of distinct retrieve and store operations in the same session (as opposed to a single retrieve of a cosmic application object, one hundred updates all over that object's graph, then a final store) could scale well.
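
A hedged sketch contrasting the two session shapes just described - `Session`, `retrieve` and `store` are placeholder names, not Curium's actual API:

```scala
object SessionShapes {
  type TrancheId = Int
  type State     = Map[Int, String] // hypothetical immutable application state

  // Placeholder for Curium's session operations.
  trait Session {
    def retrieve(id: TrancheId): State
    def store(state: State): TrancheId
  }

  // The shape both benchmarks use, which scales poorly here: one retrieval
  // of a cosmic root object, a hundred updates all over its graph, then a
  // single final store.
  def monolithic(session: Session, rootId: TrancheId,
                 updates: Seq[State => State]): TrancheId = {
    val root = session.retrieve(rootId)
    session.store(updates.foldLeft(root)((state, update) => update(state)))
  }

  // The shape that might scale better: many distinct, smaller
  // retrieve/store pairs within the same session.
  def fineGrained(session: Session, ids: Seq[TrancheId],
                  update: State => State): Seq[TrancheId] =
    ids.map(id => session.store(update(session.retrieve(id))))
}
```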

The other option is just to accept that recycling is a necessary evil, but use tranche loading and proxies as the application restarts to bring things in, then carry on in recycling mode - this is what happens by default.

It is also possible to manually clear the caches via ImmutableObjectStorage.clear every now and then, dropping all recycled objects and thus yielding an object graph with just a few proxies here and there. This approach doesn't really work with the two benchmarks being discussed here, though.
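
A hedged sketch of such periodic clearing - the `Storage` trait stands in for Curium's ImmutableObjectStorage, and the real `clear` signature may differ:

```scala
object PeriodicClearing {
  // Stand-in for Curium's ImmutableObjectStorage; only `clear` matters here.
  trait Storage { def clear(): Unit }

  def runSessions(storage: Storage, clearEvery: Int)
                 (sessions: Iterator[() => Unit]): Unit = {
    var sinceClear = 0
    for (session <- sessions) {
      session()
      sinceClear += 1
      if (sinceClear >= clearEvery) {
        // Drop all recycled objects; subsequent retrievals reload tranches
        // and so pick up proxied substructure again.
        storage.clear()
        sinceClear = 0
      }
    }
  }
}
```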

sageserpent-open commented

Adding some rather hokey diagnostics to count the number of tranches loaded in a session, and hacking ImmutableObjectStorageMeetsMap to use a fixed-size tranches cache sized as the look-back limit divided by the number of steps in a session batch, what we see is:

[Screenshot: 2023-05-08 at 15:31:47 - diagnostic output]

A graph of the minimum, average and maximum number of tranche loads per session against the number of steps executed:

[Screenshot: 2023-05-08 at 15:33:11]

Prior to the 100 000 step mark, the tranches cache is still growing towards its limit of 1000 (the look-back is set to 100 000 here).

sageserpent-open commented

It might be the case that the number of fetches is settling down - certainly reducing the look-back to around 1000 yields great performance, although this might gradually creep up over time.

What was very obvious from experimenting with the look-back size was just how much memory is grabbed by those tranches - as the batches are large, an awful lot of irrelevant objects can get pulled in to satisfy a single inter-tranche reference. This doesn't happen of course when recycling is enabled.

Observe the considerable GC churn as well - but while increasing the tranches cache size buys respite from both the large number of tranche loads and the GC churn, it requires a huge amount of memory - 16G won't cut it.

Swings and roundabouts....
