xs performance investigation (array, Map/Set optimizations) #3012
Change the allocation/GC policy to keep growing memory up to a GC threshold, and only past that trigger a GC.

```diff
index 70a9288b..267db5ba 100644
--- a/xs/sources/xsMemory.c
+++ b/xs/sources/xsMemory.c
@@ -1175,7 +1175,8 @@ again:
}
aBlock = aBlock->nextBlock;
}
- if (once) {
+ txSize gcThreshold = the->allocationLimit * 0.7;
+ if (once && the->allocatedSpace > gcThreshold) {
fxCollect(the, 1);
once = 0;
}
@@ -1210,7 +1211,8 @@ again:
the->peakHeapCount = the->currentHeapCount;
return aSlot;
}
- if (once) {
+ txSize gcThreshold = the->allocationLimit * 0.7;
+ if (once && the->allocatedSpace > gcThreshold) {
txBoolean wasThrashing = ((the->collectFlag & XS_TRASHING_FLAG) != 0), isThrashing;
 		fxCollect(the, 0);
```

This should work well with the strategy of manually triggering GC at the end of cranks.
Measurement tasks we're interested in:
Mitigations we're looking at:
Given
The submodule part might need some manual intervention, but... https://github.com/Agoric/agoric-sdk/tree/3012-xsnap-meters
https://github.com/agoric-labs/moddable/tree/xsnap-gc-count 6291555
We are seeing O(n^2) behavior in some collections in XS. The implementations are deliberately simple because that is well-suited to small memory; in some cases we may need a different algorithm or allocation policy. We have done some investigation into Arrays and WeakMaps. Array usage (e.g., push) appears to only extend the array storage by 1 (via
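To see why +1 growth is quadratic, here is a toy simulation of the suspected storage policy (JS for illustration only; this is not the XS source, and `growBy` is a made-up knob):

```js
// Appending n items when the backing store grows by a fixed amount vs.
// doubling. Each reallocation copies every existing slot, so +1 growth
// costs O(n^2) total copies while doubling amortizes to O(n).
function simulateAppends(n, growBy) {
  let capacity = 0;
  let length = 0;
  let slotsCopied = 0; // proxy for time spent copying during realloc
  for (let i = 0; i < n; i += 1) {
    if (length === capacity) {
      capacity = growBy === 'double' ? Math.max(1, capacity * 2) : capacity + growBy;
      slotsCopied += length; // reallocation copies all existing slots
    }
    length += 1;
  }
  return slotsCopied;
}

console.log(simulateAppends(10000, 1));        // 49,995,000 copies (~n^2/2)
console.log(simulateAppends(10000, 'double')); // 16,383 copies (< 2n)
```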
Hi @phoddie @patrick-soquet, are XS class private instance fields constant time, even for return override? I ask in order to get a sense of whether endojs/endo#704 might help as a short-term kludge, to be retired ASAP of course. But is it worth trying?
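For context, a minimal sketch of the private-fields-as-WeakSet kludge that endojs/endo#704 contemplates, using the return-override trick (illustrative only, not the endo implementation):

```js
// Base constructor returns an existing object ("return override"), so the
// subclass's private field initializer runs against that object rather
// than a fresh one.
class Returner {
  constructor(obj) {
    return obj;
  }
}

class Stamper extends Returner {
  #present = true; // stamped onto whatever super() returned
  constructor(obj) {
    super(obj);
  }
  static has(obj) {
    return #present in obj; // ergonomic brand check syntax (ES2022)
  }
}

const o = {};
new Stamper(o);               // "adds" o to the pseudo-WeakSet
console.log(Stamper.has(o));  // true
console.log(Stamper.has({})); // false
```

The question above is exactly whether XS makes stamping and checking these private fields constant time.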
@warner: additions and removals from each WeakMap share an implementation with (strong) Map and Set; I hope counting all of them suffices:
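The actual counters live in the C implementation; as an assumption-laden JS-level stand-in, one could wrap the collection class and count traffic from vat code directly:

```js
// Count set/delete traffic on WeakMap from JS. This only sees explicit
// calls, not XS-internal churn, but gives a cheap order-of-magnitude
// census of how often code under test touches these collections.
const weakMapOps = { set: 0, delete: 0 };

class CountingWeakMap extends WeakMap {
  set(key, value) {
    weakMapOps.set += 1;
    return super.set(key, value);
  }
  delete(key) {
    weakMapOps.delete += 1;
    return super.delete(key);
  }
}

// usage: swap `new WeakMap()` for `new CountingWeakMap()` in the code
// under test, then dump `weakMapOps` at the end of each crank.
```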
Once we're in the bad performance regime, if we snapshot and then restore from snapshot, does the restored one perform as badly as just continuing would have? I would suspect that snapshot and restore at least defragments memory.
Recent measurements:
My current theory is that we're creating a lot more objects than I expected, and that the lack of kernel-level GC (most immediately represented by the liveslots "safety pins") is keeping a higher fraction of them alive than I expected. We're thus subjecting XS to an unreasonable number of objects, so XS GC takes a long time (and doesn't reclaim much). To gather evidence for/against that theory, I'd like to know:
Current mitigation ideas:
Deeper fixes that we can't implement in time but should be prioritised:
some slog analysis using pandas and such... https://gist.github.com/dckc/975d57a93a52f3f1f8485f38a87b5c5f
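For anyone without a pandas setup, a minimal Node sketch of the same kind of analysis (assuming the slogfile is newline-delimited JSON whose `deliver`/`deliver-result` entries carry a `time` field in seconds):

```js
// Summarize per-delivery durations from a slogfile: one JSON object per
// line, pairing each `deliver` entry with the following `deliver-result`.
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

async function deliveryDurations(path) {
  const lines = createInterface({ input: createReadStream(path) });
  const durations = [];
  let start;
  for await (const line of lines) {
    const entry = JSON.parse(line);
    if (entry.type === 'deliver') start = entry.time;
    else if (entry.type === 'deliver-result' && start !== undefined) {
      durations.push(entry.time - start);
      start = undefined;
    }
  }
  return durations;
}

const d = await deliveryDurations(process.argv[2]);
console.log('deliveries:', d.length, 'max seconds:', Math.max(...d));
```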
Additional mitigation ideas:
One problem identified: I think XS is not allocating the heap in large chunks as it was supposed to. I filed Moddable-OpenSource/moddable#636 to address it. Fixing that should reduce our GC calls to one per 256kB chunk allocation, which I think will mean once every 100-ish loadgen cycles. At 1800 loadgen cycles, I estimate each GC call takes about 450ms, so the fix should let us get to 20k cycles before GC pauses start delaying blocks. Fixing that in XS is an easier/more-direct/more-correct mitigation than any other changes we might make to the GC policy.
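As a back-of-envelope check on those numbers (the inputs are the estimates quoted above; the linear-growth assumption is mine):

```js
// Rough extrapolation: if per-GC time grows ~linearly with cycle count,
// scale the 450ms estimate at 1800 cycles out to other cycle counts.
const gcMsAt1800 = 450; // estimated GC pause at 1800 loadgen cycles
const gcMsAt = (cycles) => gcMsAt1800 * (cycles / 1800);

console.log(gcMsAt(20000)); // 5000ms: multi-second pauses, roughly where block delays begin
```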
I'm inclined to investigate. Side note: using
I tried this out; performance looked ~2x slower:
the numbers above are the duration between "start hardening keys" and "done hardening":

```js
function bench() {
  // 12 iterations, 65,536 objects, ~never finishes
  for (let iter = 0; iter < 11; iter += 1) {
    const size = 1024 * (1 << Math.round(iter / 2));
    print('iter', iter, size);
    event('prep iter', iter, size);
    const keys = new Array(size);
    for (let ix = 0; ix < size; ix += 1) {
      keys[ix] = {};
    }
    event('start loop over keys', iter, size);
    for (const _key of keys) {
      // how long does looping take?
    }
    event('start hardening keys', iter, size);
    for (const key of keys) {
      harden(key);
    }
    event('done hardening', iter, size);
  }
}
```
brainstorm / plan
brainstorm hypothesis: problem is overlap... trend line... ~1GB to O(10k)
@warner concurs: we'll track remaining work in future milestones.
On second thought, we're actively monitoring this issue in phase 3, so let's track it there for a while longer...
@erights for WeakMap performance testing, the
@michaelfig don't wait for XS array growth optimizations for today's release. P.S. from Dean (Fri, Jun 11, 1:05 PM KC time):
Based on our July 12 discussion, I expect this to go in our upcoming metering milestone. A peek at the xs commit log shows no news just yet (last change: the July 10 bigint fix).
Looks like the latest from moddable (14e252e8) has relevant stuff... in particular, in `void fxResizeEntries(txMachine* the, txSlot* table, txSlot* list)`.
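A toy illustration of why entry resizing matters (a plain chained hash table in JS, not the XS data structure): with a fixed bucket count, chains grow with n and every operation degrades toward O(n), while resizing keeps the load factor bounded.

```js
// Toy chained hash table with optional resizing; numeric keys only.
class ToyMap {
  constructor(resize = true) {
    this.buckets = Array.from({ length: 8 }, () => []);
    this.size = 0;
    this.resize = resize;
  }
  bucketFor(key) {
    return this.buckets[(key >>> 0) % this.buckets.length];
  }
  set(key, value) {
    const bucket = this.bucketFor(key);
    for (const pair of bucket) {
      if (pair[0] === key) { pair[1] = value; return this; }
    }
    bucket.push([key, value]);
    this.size += 1;
    // Without this step, average chain length (and thus cost per get/set)
    // grows linearly with size.
    if (this.resize && this.size > this.buckets.length * 4) this.grow();
    return this;
  }
  grow() {
    const entries = this.buckets.flat();
    this.buckets = Array.from({ length: this.buckets.length * 2 }, () => []);
    for (const [k, v] of entries) this.bucketFor(k).push([k, v]);
  }
  get(key) {
    for (const [k, v] of this.bucketFor(key)) if (k === key) return v;
    return undefined;
  }
}
```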
XS implementation now has a conventional O(n log(n)) algorithm.

- xs-meter-9 represents XS Map/Set optimizations
- style: update indentation in test-xs-perf.js

fixes #3012
*BREAKING CHANGE:* XS Map/Set optimizations lead to compute meter changes; hence `METER_TYPE = 'xs-meter-9'`

fixes #3012
We're looking at nonlinear performance characteristics of the core datatypes (specifically Array, Map, Set) in XS.
We originally focused on WeakMap, but that part has been addressed.
When we run our load-generator test, we see the zoe vat taking longer and longer with each load cycle. Our current implementation has the following properties:

- Zoe keeps long-lived state in a number of WeakMaps, and (as far as we know) never calls `wm.delete()` on them
- `harden` maintains its own WeakMap of everything it has processed (specifically to track which objects have already been frozen or not). Since our programming/safety style is to `harden()` everything, this WeakMap will include everything that Zoe puts in the WeakMaps listed above, plus far more. Many of these objects will go away (they are not held by liveslots), but everything that liveslots is holding will probably be included in the harden WeakMap (a minimal sketch of this bookkeeping follows below).

We don't specifically know that WeakMap is involved, but given how Zoe works, it's a reasonable hypothesis for something that might grow over time.
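For concreteness, a minimal sketch of how a `harden`-style transitive freeze accumulates tracking entries (using a WeakSet; set-vs-map makes no difference for growth, and the real SES implementation is more careful about errors and commit semantics):

```js
// Every object ever hardened gets (and keeps) an entry in this WeakSet,
// so it grows with everything the vat hardens.
const alreadyHardened = new WeakSet();

function sketchHarden(root) {
  const toFreeze = [root];
  while (toFreeze.length > 0) {
    const obj = toFreeze.pop();
    if (Object(obj) !== obj || alreadyHardened.has(obj)) continue;
    Object.freeze(obj);
    alreadyHardened.add(obj); // one entry per object, kept while reachable
    toFreeze.push(Object.getPrototypeOf(obj));
    for (const name of Reflect.ownKeys(obj)) {
      const desc = Object.getOwnPropertyDescriptor(obj, name);
      if ('value' in desc) toFreeze.push(desc.value);
      if (desc.get !== undefined) toFreeze.push(desc.get);
      if (desc.set !== undefined) toFreeze.push(desc.set);
    }
  }
  return root;
}
```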
Our current working hypotheses are:

- each new object is assembled from slots obtained via `fxNewSlot`; as long as free slots are available (`the->freeHeap` holds a linked-list of free slots), no allocations are necessary
- when free slots run out, `fxNewSlot` might trigger GC (`fxCollect`) to make some available, and/or it might call `fxGrowSlots` to allocate more by calling `malloc()`
- we call `fxNewSlot` a lot

Together, this would explain a linear slowdown in the time each load-generator cycle takes (a toy model of this allocation path is sketched below). It would not explain the superlinear slowdown that we've measured.
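A toy JS model of that allocation path (names mirror the XS functions for readability; the real logic in xsMemory.c is of course more involved):

```js
// fxNewSlot-like behavior: pop from a free list when possible; otherwise
// collect, and if that doesn't help, grow the heap with a fresh chunk.
class ToyHeap {
  constructor(chunkSlots = 1024) {
    this.freeHeap = [];       // stands in for the->freeHeap linked list
    this.gcCount = 0;
    this.chunkSlots = chunkSlots;
  }
  newSlot() {                 // fxNewSlot
    if (this.freeHeap.length === 0) {
      this.collect();         // fxCollect: mark/sweep returns slots to freeHeap
      if (this.freeHeap.length === 0) {
        this.growSlots();     // fxGrowSlots: malloc() another chunk
      }
    }
    return this.freeHeap.pop();
  }
  collect() {
    this.gcCount += 1;        // in the real system, cost ~ number of live slots
  }
  growSlots() {
    for (let i = 0; i < this.chunkSlots; i += 1) this.freeHeap.push({});
  }
}

// If GC cost scales with live slots, and live slots grow each load cycle,
// each allocation-triggered collect() gets slower: a linear per-cycle
// slowdown, matching only the linear part of what we've measured.
```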