release-22.2: sql: reduce the overhead of EXPLAIN ANALYZE #91208
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @yuzefovich)
In order to propagate the execution stats across the distributed query plan we use the tracing infrastructure, where each stats object is added as "structured metadata" to the trace. Thus, whenever we're collecting the exec stats for a statement, we must enable tracing. Previously, in many cases we would enable it at the highest verbosity level, which has non-trivial overhead. In some cases this was overkill (e.g. in `EXPLAIN ANALYZE` we don't really care about the trace containing all of the gory details - we won't expose it anyway), so this is now fixed by using the less verbose "structured" verbosity level. As a concrete example of the difference: for a stmt that without `EXPLAIN ANALYZE` takes around 190ms, with `EXPLAIN ANALYZE` it would previously run for about 1.8s and now it takes around 210ms.
This required some minor changes to the row-by-row outbox and router setups to collect stats even if the recording is not verbose.
Release note (performance improvement): The overhead of running `EXPLAIN ANALYZE` and `EXPLAIN ANALYZE (DISTSQL)` has been significantly reduced. The overhead of `EXPLAIN ANALYZE (DEBUG)` didn't change.
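To make the distinction between the "structured" and verbose recording levels concrete, here is a minimal, self-contained sketch. It is a toy model, not the actual CockroachDB tracing API: `RecordingLevel`, `Span`, `ExecStats`, `RecordStructured`, `Logf`, and `runProcessor` are all hypothetical names chosen for illustration. The assumption it encodes is the one stated in the commit message: structured payloads such as execution stats are kept at any enabled level, while free-form log messages are only kept (and only paid for) at the verbose level.

```go
package main

import "fmt"

// RecordingLevel is a hypothetical stand-in for a tracing verbosity level.
type RecordingLevel int

const (
	RecordingOff        RecordingLevel = iota // nothing is recorded
	RecordingStructured                       // only structured payloads are recorded
	RecordingVerbose                          // structured payloads and log messages are recorded
)

// ExecStats is a hypothetical structured payload attached to a trace.
type ExecStats struct {
	Processor string
	RowsRead  int
}

// Span is a toy trace span that stores whatever its level allows.
type Span struct {
	level      RecordingLevel
	structured []ExecStats
	logs       []string
}

// NewSpan creates a span recording at the given level.
func NewSpan(level RecordingLevel) *Span { return &Span{level: level} }

// RecordStructured keeps the payload whenever recording is enabled at all.
func (s *Span) RecordStructured(stats ExecStats) {
	if s.level >= RecordingStructured {
		s.structured = append(s.structured, stats)
	}
}

// Logf keeps free-form messages only at the verbose level, so the
// formatting and storage cost is skipped at the structured level.
func (s *Span) Logf(format string, args ...interface{}) {
	if s.level == RecordingVerbose {
		s.logs = append(s.logs, fmt.Sprintf(format, args...))
	}
}

// runProcessor simulates a processor that logs per row and emits its
// execution stats once at the end.
func runProcessor(s *Span, rows int) {
	for i := 0; i < rows; i++ {
		s.Logf("processed row %d", i)
	}
	s.RecordStructured(ExecStats{Processor: "scan", RowsRead: rows})
}

func main() {
	levels := []RecordingLevel{RecordingOff, RecordingStructured, RecordingVerbose}
	for _, level := range levels {
		s := NewSpan(level)
		runProcessor(s, 3)
		fmt.Printf("level=%d stats=%d logs=%d\n", level, len(s.structured), len(s.logs))
	}
}
```

In this model the structured level still yields the one stats object per processor, which is all a stats-only consumer needs, while the per-row messages are never formatted or stored.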
d13917e to 7c79f05
Hm, we got some failures here because some of the execution stats are getting dropped from the trace. On master things work well due to @andreimatei's recent work. In particular, once I revert 2d29175 on master I get a failure in [...]
Andrei, do you think it's feasible that we'll backport #89785 (which I assume will also pull in #88414) to 22.2? Those changes on their own would help with getting better tracing on 22.2 and would also unblock backporting the current change of lowering the overhead of EXPLAIN ANALYZE.
@andreimatei do you have thoughts on #91208 (comment)?
I'm reluctant to backport those patches; they're a bit big and I'm not all that confident that they don't have unintended consequences.
It's not a regression, but I'd argue that the fact that [...]
I'll play around with the constant limits and see whether at least simple tests in CI pass.
I don't think I'll get to spending more time on this.
Backport 1/1 commits from #91117.
/cc @cockroachdb/release
In order to propagate the execution stats across the distributed query plan we use the tracing infrastructure, where each stats object is added as "structured metadata" to the trace. Thus, whenever we're collecting the exec stats for a statement, we must enable tracing. Previously, in many cases we would enable it at the highest verbosity level, which has non-trivial overhead. In some cases this was overkill (e.g. in `EXPLAIN ANALYZE` we don't really care about the trace containing all of the gory details - we won't expose it anyway), so this is now fixed by using the less verbose "structured" verbosity level. As a concrete example of the difference: for a stmt that without `EXPLAIN ANALYZE` takes around 190ms, with `EXPLAIN ANALYZE` it would previously run for about 1.8s and now it takes around 210ms.
This required some minor changes to the row-by-row outbox and router setups to collect stats even if the recording is not verbose.
Addresses: #90739.
Epic: None
Release note (performance improvement): The overhead of running `EXPLAIN ANALYZE` and `EXPLAIN ANALYZE (DISTSQL)` has been significantly reduced. The overhead of `EXPLAIN ANALYZE (DEBUG)` didn't change.
Release justification: performance fix.
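For a sense of scale, the approximate timings quoted in the description (190ms without `EXPLAIN ANALYZE`, about 1.8s before the change, about 210ms after) can be turned into slowdown factors. The sketch below only does that arithmetic; the `overhead` helper is a hypothetical name, and the inputs are the rounded figures from the description rather than new measurements.

```go
package main

import "fmt"

// overhead returns the slowdown factor and the extra time in milliseconds
// of a run with EXPLAIN ANALYZE relative to the plain statement.
func overhead(baselineMs, withExplainMs float64) (factor, extraMs float64) {
	return withExplainMs / baselineMs, withExplainMs - baselineMs
}

func main() {
	const baselineMs = 190.0 // statement without EXPLAIN ANALYZE (approximate)

	cases := []struct {
		name string
		ms   float64
	}{
		{"before the change", 1800}, // approximate, from the description
		{"after the change", 210},   // approximate, from the description
	}
	for _, c := range cases {
		factor, extra := overhead(baselineMs, c.ms)
		fmt.Printf("%s: %.2fx the baseline, +%.0fms\n", c.name, factor, extra)
	}
}
```

With these rounded inputs the added cost drops from roughly 1.6s on top of the statement to roughly 20ms.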