-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix double accounting of memory when spilling aggregation is used #413
Fix double accounting of memory when spilling aggregation is used #413
Conversation
4bd7cc7
to
f10654b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job @sopel39
(everything except last commit) lgtm % minor comments
@@ -272,6 +273,7 @@ public OperatorFactory duplicate() | |||
private final HashCollisionsCounter hashCollisionsCounter; | |||
|
|||
private HashAggregationBuilder aggregationBuilder; | |||
private LocalMemoryContext aggregationBuilderMemoryContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memoryContext
?
@@ -368,6 +370,10 @@ public void addInput(Page page) | |||
if (aggregationBuilder == null) { | |||
// TODO: We ignore spillEnabled here if any aggregate has ORDER BY clause or DISTINCT because they are not yet implemented for spilling. | |||
if (step.isOutputPartial() || !spillEnabled || hasOrderBy() || hasDistinct()) { | |||
aggregationBuilderMemoryContext = operatorContext.localUserMemoryContext(); | |||
if (useSystemMemory) { | |||
aggregationBuilderMemoryContext = operatorContext.localSystemMemoryContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not pass MemoryContext into InMemoryHashAggregationBuilder
?
Or, maybe better, why not have this assignment in the constructor?
As it is now, it looks like creating a Mem Context multiple times (which i think is not really the case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not pass MemoryContext into InMemoryHashAggregationBuilder?
I tried that approach, but since InMemoryHashAggregationBuilder
is used within SpillableHashAggregationBuilder
it cannot account memory on it's own. This is because SpillableHashAggregationBuilder
changes memory pools from revocable to user and InMemoryHashAggregationBuilder
doesn't (and shouldn't) know about this flip.
Ideally (TODO), SpillableHashAggregationBuilder
should use underlying accumulators and GroupByHash
directly and omit InMemoryHashAggregationBuilder
.
Or, maybe better, why not have this assignment in the constructor?
That makes sense.
@@ -510,6 +522,10 @@ private void closeAggregationBuilder() | |||
// aggregationBuilder.close() will release all memory reserved in memory accounting. | |||
// The reference must be set to null afterwards to avoid unaccounted memory. | |||
aggregationBuilder = null; | |||
if (aggregationBuilderMemoryContext != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can't be null at this point
true, | ||
useSystemMemory); | ||
() -> { | ||
aggregationBuilderMemoryContext.setBytes(((InMemoryHashAggregationBuilder) aggregationBuilder).getSizeInMemory()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InMemoryHashAggregationBuilder
instance gets a callback in its constructor, but the callback refers to that instance. Should InMemoryHashAggregationBuilder
use this callback in the ctor to update initial memory occupancy, this would fail.
To fix this, UpdateMemory
should take long bytes
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fix this, UpdateMemory should take long bytes parameter.
GroupByHash
would need to be passed LocalMemoryContext
as an argument then. I also tried this approach, but it explodes the changes and still doesn't nicely solve problem of flipping memory from revocable to user (e.g: GroupByHash
would need to keep getEstimatedSize
method, but also accept LocalMemoryContext
as arg).
Objects should either:
- expose it's memory usage via
getEstimatedSize
method (if they don't "own" the memory) - account for memory themselves (via
LocalMemoryContext
). In this caseAggregatedMemoryContext
should be passed to constructor of such objects.
Mixing both usage patterns is confusing.
@@ -510,6 +522,10 @@ private void closeAggregationBuilder() | |||
// aggregationBuilder.close() will release all memory reserved in memory accounting. | |||
// The reference must be set to null afterwards to avoid unaccounted memory. | |||
aggregationBuilder = null; | |||
if (aggregationBuilderMemoryContext != null) { | |||
aggregationBuilderMemoryContext.setBytes(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're zeroing memory contexts at the end of this method.
Either add operatorContext.localSystemMemoryContext().setBytes(0);
there (instead of this line), or remove the all other .setBytes(0)
calls there. Or explain :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Didn't notice that.
Just memoryContext.setBytes(0);
at end of this method is enough
...n/src/main/java/io/prestosql/operator/aggregation/builder/MergingHashAggregationBuilder.java
Show resolved
Hide resolved
...src/main/java/io/prestosql/operator/aggregation/builder/SpillableHashAggregationBuilder.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/prestosql/operator/aggregation/builder/SpillableHashAggregationBuilder.java
Outdated
Show resolved
Hide resolved
// if before building result from hashAggregationBuilder we would convert it to "read only" version. | ||
// Read only version of GroupByHash from hashAggregationBuilder could be compacted by dropping | ||
// most of it's field, freeing up some memory that could be used for sorting. | ||
return hashAggregationBuilder.getSizeInMemory() + hashAggregationBuilder.getGroupIdsSortingSize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why hashAggregationBuilder.getSizeInMemory()
does not include hashAggregationBuilder.getGroupIdsSortingSize()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hashAggregationBuilder.getGroupIdsSortingSize()
part is only materialized in memory when buildHashSortedResult
is called. This happens only when unspilling. Therefore getSizeInMemoryWhenUnspilling
is used to estimate aggregation size when unspilling only.
@@ -219,6 +210,15 @@ private boolean shouldMergeWithMemory(long memorySize) | |||
} | |||
} | |||
|
|||
private long getSizeInMemoryWhenUnspilling() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep the old method name getSizeInMemory
.
- the fact that it's only invoked after spiller is created seems not relevant to semantics of the method
- the method is invoked before spilling (ie before
getFutureValue(spillToDisk());
), so it's not "when unspilling"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the method is invoked before spilling (ie before getFutureValue(spillToDisk());), so it's not "when unspilling"
It is used to estimate size of memory during unspilling phase and to decide if we should merge with mem or not. This method has very limited use case and the previous name was confusing
InMemoryHashAggregationBuilder was doing it's own memory accounting for the purpose of GroupByHash yielding. When SpillableHashAggregationBuilder is used it also accounts InMemoryHashAggregationBuilder as a revocable memory. Therefore memory was accounted twice. This commit removes memory accoutning from InMemoryHashAggregationBuilder.
f10654b
to
84aae4d
Compare
84aae4d
to
0b20a46
Compare
No description provided.