colexecargs: fix recent memory leak #64453
Conversation
Force-pushed from f8d1665 to e05af1c.
@nvanbenschoten @RaduBerinde I wonder if you could scrutinize the third commit a bit and possibly come up with an explanation for why it reduces the RAM usage noticeably. Some background info is that without that commit the RAM usage during […]. Unfortunately, the heap profiling hasn't been very helpful in tracking that down. I only have guesses for why the third commit improves things (#62320 (comment)).
The first 2 commits LGTM; the 3rd I will review again a bit later.
Reviewed 2 of 2 files at r1, 1 of 1 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)
pkg/sql/colfetcher/cfetcher.go, line 1499 at r1 (raw file):
// We need to set all values in "not needed" vectors to nulls because if the
// batch is materialized (i.e. values are converted to datums), the
// conversion of unset values might encounter an error.
bummer... can we instead teach the materializer to notice this case?
pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 713 at r1 (raw file):
└ Node 1
  └ *rowexec.noopProcessor
    └ *colfetcher.ColBatchScan
wrong commit i think
Force-pushed from e05af1c to c455ceb.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)
pkg/sql/colfetcher/cfetcher.go, line 1499 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
bummer... can we instead teach the materializer to notice this case?
It'll be a bit harder and less clean IMO - the materializer has the built-in assumption that all vectors need to be converted, and ColBatchScan is the only operator that might produce batches with unneeded vectors. We would need to examine the post-processing spec at the materializer level to learn which vectors are not needed, or we would have to plumb this information from the ColBatchScan somehow. This comes from the fact that only the post-processing spec knows about projections and renders (the TableReader core is an exception) which encode this "unneediness" information.
I prefer the current approach because we're pushing this exceptional logic into the component that creates "exceptional" batches, that component already knows about not needed columns, and setting all unneeded vectors to all nulls works well for the conversion in the materializer.
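For illustration, here is a minimal self-contained sketch of this approach, using hypothetical stand-in types (the real code operates on coldata vectors with null bitmaps inside the cFetcher):

```go
package main

import "fmt"

// vec is a hypothetical stand-in for a column vector with a null bitmap.
type vec struct {
	data  []int64
	nulls []bool // nulls[i] == true means row i is NULL
}

// batch is a hypothetical stand-in for a coldata.Batch: one vector per
// table column, whether or not the query needs that column.
type batch struct {
	vecs []vec
}

// setUnneededToNulls marks every row of every unneeded vector as NULL so
// that a later materialization (conversion to datums) never interprets
// bytes that the fetcher never decoded.
func setUnneededToNulls(b *batch, needed map[int]bool, n int) {
	for col := range b.vecs {
		if needed[col] {
			continue // this column was decoded normally
		}
		for i := 0; i < n; i++ {
			b.vecs[col].nulls[i] = true
		}
	}
}

func main() {
	b := &batch{vecs: []vec{
		{data: make([]int64, 3), nulls: make([]bool, 3)},
		{data: make([]int64, 3), nulls: make([]bool, 3)},
	}}
	setUnneededToNulls(b, map[int]bool{0: true}, 3)
	fmt.Println(b.vecs[1].nulls) // [true true true]
}
```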
pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 713 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
wrong commit i think
Yeah, I noticed it too - fixed.
Changed the third commit to a smaller, more targeted fix, but I think the first two commits are still worth keeping.
Force-pushed from 1f20f91 to b121d57.
Added another commit auditing our slice reuse in components implementing the `execinfra.Releasable` interface.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)
pkg/sql/colconv/vec_to_datum_tmpl.go, line 103 at r6 (raw file):
for _, vec := range c.convertedVecs {
    if len(vec) > 0 {
        _ = vec[len(vec)-1]

What are we trying to do here? Why would `for i := range vec` have bound checks in the first place? By the way, I believe that `for i := range vec { vec[i] = nil }` has a fast path in Go and gets converted to a memset.
pkg/sql/colexec/colexecargs/op_creation.go, line 180 at r6 (raw file):
// objects are still referenced by the corresponding sync.Pools, so the
// references in r.Releasables will not be the reason for the objects to
// not be garbage-collected.
Each sync.Pool grows and shrinks as necessary; it is possible that those other pools would shrink and leave this as the only reference.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RaduBerinde)
pkg/sql/colconv/vec_to_datum_tmpl.go, line 103 at r6 (raw file):
Previously, RaduBerinde wrote…

What are we trying to do here? Why would `for i := range vec` have bound checks in the first place? By the way, I believe that `for i := range vec { vec[i] = nil }` has a fast path in Go and gets converted to a memset.
Good point - I was trying to eliminate bounds checks and blindly followed the way we usually do it in other places. The crucial difference is that here `vec` is a local variable, so the compiler can prove that the loop `for i := range vec` is always in bounds, whereas usually in other places the thing to iterate over is a field in the struct. I confirmed this here: https://godbolt.org/z/KevehETje. It does use `memclrHasPointers`, which is likely what you were referring to as an optimization.
(A side thought is that possibly in other places we should follow the same pattern of declaring a local variable to access the field in the struct, and then iterate over the local variable, rather than using our somewhat ugly `[len(s)-1]` accesses outside of the for loop.)
Refactored.
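For concreteness, here is a minimal sketch of the two patterns under discussion (illustrative names, not the PR's exact code); per the godbolt link above, the local-variable version lowers the nil-ing loop to `memclrHasPointers`:

```go
package main

type converter struct {
	vecs []interface{}
}

// resetFieldWithHint iterates over the struct field directly; the
// `_ = s[len(s)-1]` access is the usual idiom for hinting bounds-check
// elimination, since every c.vecs[i] access re-reads the field.
func (c *converter) resetFieldWithHint() {
	if len(c.vecs) > 0 {
		_ = c.vecs[len(c.vecs)-1]
	}
	for i := 0; i < len(c.vecs); i++ {
		c.vecs[i] = nil
	}
}

// resetViaLocal copies the field into a local variable first; the compiler
// can prove the range loop is always in bounds and lowers the nil-ing loop
// to memclrHasPointers, so no hint is needed.
func (c *converter) resetViaLocal() {
	vecs := c.vecs
	for i := range vecs {
		vecs[i] = nil
	}
}

func main() {
	c := &converter{vecs: make([]interface{}, 4)}
	c.resetFieldWithHint()
	c.resetViaLocal()
}
```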
pkg/sql/colexec/colexecargs/op_creation.go, line 180 at r6 (raw file):
Previously, RaduBerinde wrote…
Each sync.Pool grows and shrinks as necessary; it is possible that those other pools would shrink and leave this as the only reference.
Makes sense, fixed.
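To make the leak pattern concrete, here is a minimal sketch with hypothetical names (the real pooled struct is the op-creation result in colexecargs): releasing to a sync.Pool with a shallow reslice keeps the old references alive, while a deep reset drops them:

```go
package main

import "sync"

// closer stands in for execinfra.Releasable-style objects held by the result.
type closer interface{ Close() }

// opResult is a hypothetical stand-in for the pooled op-creation result.
type opResult struct {
	toClose []closer
}

var opResultPool = sync.Pool{
	New: func() interface{} { return &opResult{} },
}

// releaseLeaky reslices to length 0 before pooling. The backing array still
// holds pointers to the old objects, so they cannot be garbage-collected for
// as long as the pooled opResult is alive: this is the leak.
func (r *opResult) releaseLeaky() {
	r.toClose = r.toClose[:0]
	opResultPool.Put(r)
}

// releaseFixed nils out every slot first (a "deep reset"), dropping the
// references so the old objects become collectible.
func (r *opResult) releaseFixed() {
	for i := range r.toClose {
		r.toClose[i] = nil
	}
	r.toClose = r.toClose[:0]
	opResultPool.Put(r)
}

func main() {
	r := opResultPool.Get().(*opResult)
	r.toClose = append(r.toClose, nil) // imagine a real object here
	r.releaseFixed()

	r = opResultPool.Get().(*opResult)
	r.releaseLeaky() // shown only for contrast
}
```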
@jordanlewis @RaduBerinde I believe that both of you are on board with this PR; can someone give it another look and stamp if satisfied?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, and @yuzefovich)
pkg/sql/colflow/vectorized_flow.go, line 1264 at r7 (raw file):
// slice). Unset the slot so that we don't keep the reference to the old
// materializer.
if len(r.processors) == 1 {
If this comment is true, why do we keep this as a slice at all? Should it just be a pointer?
TFTRs!
bors r+
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @RaduBerinde)
pkg/sql/colflow/vectorized_flow.go, line 1264 at r7 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
If this comment is true, why do we keep this as a slice at all? Should it just be a pointer?
It's because row-based flows can have multiple processors, so we abstracted out the `SetProcessors` method that takes in a pointer. We do pool the slice as part of `vectorizedFlowCreatorHelper` pooling and instantiate it with capacity 1.
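A small sketch of that arrangement, with assumed names modeled on the comment above (the real helper is `vectorizedFlowCreatorHelper`):

```go
package main

// processor stands in for execinfra.Processor.
type processor interface{ Run() }

// creatorHelper is a hypothetical stand-in: it keeps a slice because the
// shared SetProcessors-style interface must support row-based flows with
// many processors, but a vectorized flow stores at most one materializer.
type creatorHelper struct {
	processors []processor
}

// newCreatorHelper instantiates the slice with capacity 1, matching how the
// pooled helper is set up.
func newCreatorHelper() *creatorHelper {
	return &creatorHelper{processors: make([]processor, 0, 1)}
}

func (h *creatorHelper) setProcessors(procs []processor) {
	h.processors = append(h.processors[:0], procs...)
}

// release unsets the single slot so the helper does not keep a reference to
// the old materializer while it sits in a pool.
func (h *creatorHelper) release() {
	if len(h.processors) == 1 {
		h.processors[0] = nil
	}
	h.processors = h.processors[:0]
}

func main() {
	h := newCreatorHelper()
	h.setProcessors([]processor{nil})
	h.release()
}
```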
Build failed.
Rebased on top of master.

bors r+
Build succeeded.
colfetcher: set all unneeded vectors to all nulls
The cFetcher creates a batch with the same schema as the table it is
reading from. In many cases not all columns from the table are needed
for the query, so as an important performance optimization the cFetcher
doesn't decode the data for those columns. As a result, such unneeded
columns are left "unset". This works OK in most cases; however, if we
attempt to materialize such a batch with unset vectors, the conversion
to datums might encounter errors (e.g. UUID values must either be NULL
or be 16 bytes long).

This commit improves this situation slightly by tracking the set of
unneeded columns and setting those vectors to all NULL values. This
will allow us to simplify the planning code a bit in the follow-up
commit.

Release note: None
colbuilder: remove unnecessary complication when wrapping table reader
Previously, we had some custom code for the case when we supported the
table reader core but not the post-processing spec - we attempted to
revert to the core-pre-planning state and plan the whole table reader
with render expressions on top.
Given the previous commit, I think this is no longer necessary, so this
commit removes that special code in favor of the general handling of
only the post-processing spec via a noop processor. This commit was
prompted by complications that this old code caused for the follow-up
commit.
Release note: None
colexecargs: fix recent memory leak
In c3b1617 we introduced a new utility
struct that keeps information about the meta objects in the operator
tree. Those meta objects are tracked by several slices which are
resliced to be of length 0 when the "outer" object is released back to
the corresponding pool. However, the slices still end up holding
references to the old meta objects, preventing those from being GCed.
Such behavior results in a memory leak. This commit fixes the issue by
explicitly resetting the slices for reuse.
Fixes: #62320.
Fixes: #64093.
Release note: None
sql: audit implementations of Releasable interface for slice reuse
This commit performs an audit of all slices that are kept by
components implementing the `execinfra.Releasable` interface to make
sure that the slices that might be referencing large objects are deeply
reset. (By deep reset I mean all slots are set to `nil` so that the
possibly large objects can be garbage-collected.) This was prompted by
the previous commit, which fixed a recent regression, but this commit
seems like a good idea on its own, and it might be worth backporting it
too.

Release note: None
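As a closing illustration (not from the PR), this tiny program demonstrates why reslicing to length 0 is not a deep reset: a finalizer on the element can only run once the slot itself is nil'ed. Finalizer timing is up to the runtime, so the output is indicative rather than guaranteed:

```go
package main

import (
	"fmt"
	"runtime"
)

type big struct{ payload [1 << 20]byte }

func main() {
	s := make([]*big, 1)
	s[0] = &big{}
	runtime.SetFinalizer(s[0], func(*big) { fmt.Println("collected") })

	s = s[:0] // shallow reset: the backing array still references the object
	runtime.GC()
	runtime.GC() // "collected" is not printed: the object is still reachable

	s = s[:1]
	s[0] = nil // deep reset: drop the reference
	runtime.GC()
	runtime.GC() // the finalizer can now run and print "collected"
	fmt.Println("done")
}
```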