Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: improve tracing in the vectorized engine #55821

Open
yuzefovich opened this issue Oct 21, 2020 · 1 comment
Open

colexec: improve tracing in the vectorized engine #55821

yuzefovich opened this issue Oct 21, 2020 · 1 comment
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Oct 21, 2020

All processors in the row-by-row engine have decent tracing in place (ProcessorBase.StartInternal creates a span for the processor in Start method and then when moving to trailing meta state GetTraceData collects that information and propagates it as the metadata). We currently don't have anything like this for each operator, and it'd be nice to add that.

Jira issue: CRDB-3626

@yuzefovich yuzefovich added the C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. label Oct 21, 2020
craig bot pushed a commit that referenced this issue Apr 23, 2021
61223: colexecop: change Operator interface r=yuzefovich a=yuzefovich

This commit moves the context argument from `Next()` method into
`Init()`. The idea is that now the `Operator` interface looks similar to
`RowSource` with the following context management:
- in `Init`, all operators are expected to capture the passed-in
context. They could optionally derive their own context if necessary and
pass that derived context into its inputs. The contract of `Init` method
now is that all calls except for the first one are noops.
- any work performed in `Next` that needs a context should use the one
captured by the operator.

The follow-up commit will remove the context argument from `DrainMeta`
signature.

The motivation for this change is that we want to improve tracing in the
vectorized engine. In the row-by-row engine we create a tracing span for
every processor which allows us to more precisely attribute the time
spent during the execution to each of the processors. However, in the
vectorized engine we currently only create spans at root components
(either root materializer or outboxes) as well as in ColBatchScans (in
order to disambiguate the contention events coming from different
components). After this change we will be able to easily start tracing
spans in `Init` of the operators.

In order to simplify the handling of the context this commit introduces
several utility structs into `colexecop` package. Many operators that
don't need to perform any work in their `Init` method other than calling
`Init` on its single input can now embed `OneInputHelper` and "inherit"
the default implementation of `Init`.

A notable change is to an Inbox - we now remove `flowCtx` which was
captured temporarily until this change (of adding context argument to
`Init`) is made.

Addresses: #55821.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@github-actions
Copy link

github-actions bot commented Sep 6, 2023

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

2 participants