[dnm] tracing: investigate perf regression under sql.txn_stats.sample_rate=1 #61328
Commits on Mar 4, 2021
kv: permit merge transactions to refresh after subsumption
Fixes cockroachdb#59308.

This commit adds support for range merge transactions to refresh. This is necessary to allow the merge transaction to have its write timestamp bumped and still commit without retrying. In such cases, the transaction must refresh its reads, including its original read of the RHS's local range descriptor. If we were to block this refresh on the frozen RHS range, the merge would deadlock.

On the surface, it seems unsafe to permit Refresh requests on an already subsumed RHS range, because the refresh's effect on the timestamp cache will never make it to the LHS leaseholder. This risks the future joint range serving a write that invalidates the refresh. However, in this specific situation we can be sure that no such serializability violation will occur, because the range merge also writes to (deletes) this key. This means that if the range merge transaction commits, its intent on the key will be resolved to the timestamp of the refresh, and no future write will ever be able to violate the refresh. Conversely, if the range merge transaction does not commit, then the merge will fail and the update to the RHS's timestamp cache will not be lost (not that this particularly matters for aborted transactions).

The same line of reasoning has motivated us to explore removing keys from a transaction's refresh spans when they are written to by the transaction, since the intents written by the transaction act as a form of pessimistic lock that obviates the need for the optimistic refresh. Such an improvement would eliminate the need for this special case, but until we generalize the mechanism to prune refresh spans based on intent spans, we're forced to live with it.

See cockroachdb#59308 (comment) for why the original fix, which attempted to manually refresh the range merge transaction before it entered its critical phase, was not sufficient.

Release justification: needed for new functionality.
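The pruning idea described above — dropping refresh spans that are fully covered by the transaction's own intent spans — can be sketched generically. All names and types here are hypothetical illustrations, not CockroachDB APIs:

```go
package main

import "fmt"

// span is a simplified key span covering [Key, EndKey); an empty EndKey
// means a point key.
type span struct{ Key, EndKey string }

// covers reports whether s fully contains r, comparing keys lexically.
func (s span) covers(r span) bool {
	rEnd := r.EndKey
	if rEnd == "" {
		rEnd = r.Key + "\x00" // treat a point key as a tiny range
	}
	sEnd := s.EndKey
	if sEnd == "" {
		sEnd = s.Key + "\x00"
	}
	return s.Key <= r.Key && rEnd <= sEnd
}

// pruneRefreshSpans drops refresh spans that are fully covered by an
// intent span: the intent acts as a pessimistic lock, so no later writer
// can invalidate the read and the optimistic refresh is unnecessary.
func pruneRefreshSpans(refresh, intents []span) []span {
	var kept []span
outer:
	for _, r := range refresh {
		for _, in := range intents {
			if in.covers(r) {
				continue outer
			}
		}
		kept = append(kept, r)
	}
	return kept
}

func main() {
	refresh := []span{{Key: "a"}, {Key: "c"}, {Key: "m", EndKey: "p"}}
	intents := []span{{Key: "a"}, {Key: "l", EndKey: "q"}}
	// "a" and ["m","p") are written by the transaction itself,
	// so only the read of "c" still needs a refresh.
	fmt.Println(pruneRefreshSpans(refresh, intents))
}
```

This is only a sketch of the invariant; the real mechanism would have to handle resolved timestamps and span merging, which is why the commit describes it as future work.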
SHA: 7d22d00
streamclient: add client for sinkless CREATE REPLICATION STREAM
This commit adds support for a stream client that can connect to a cluster and start replicating changes that happen on that stream.

Release note: None

Release justification: low-risk, high-benefit change
SHA: 42cbdee
sql: disallow creation of interleaved partitioned indexes
Previously, creating an interleaved partitioned index panicked. This commit disallows their creation to prevent the panic.

Fixes cockroachdb#60699

Release justification: This is a low-risk change that prevents panics when attempting to create interleaved partitioned tables.

Release note (bug fix): Creating interleaved partitioned indexes is now disallowed. Previously, the database would crash when trying to create one.
SHA: 1118904
execinfra: clean up ProcessorBase
This commit removes the redundant context argument from `trailingMetaCallback` in favor of using `ProcessorBase.Ctx` when needed. This was effectively already the case, since the callback was always called with `ProcessorBase.Ctx`.

We also had the wrong sequence of actions when collecting trailing metadata in the inverted joiner, join reader, and zigzag joiner: we first called `InternalClose` and then generated the metadata. That is wrong because `InternalClose` replaces the context used throughout the operation of a processor with the "original" context (the one passed into `StartInternal`). This is now fixed.

Additionally, a few unnecessary method overrides are removed, and `ProcessorBase` now provides the default implementation of `ConsumerClosed` (which simply calls `InternalClose`). This commit also makes a slight adjustment to the context management of the change aggregator.

Release justification: low-risk cleanup.

Release note: None
SHA: ba35f1b
server: remove unnecessary % escape
As of 1982047, this log line is formatted with Go's templating system, which doesn't require that % be escaped.

Before:

```
I210303 09:56:17.459802 311 2@server/status/runtime.go:553 ⋮ [n1] 76  runtime stats: 106 MiB RSS, 261 goroutines (stacks: 5.0 MiB), 29 MiB/63 MiB Go alloc/total (heap fragmentation: 6.8 MiB, heap reserved: 11 MiB, heap released: 12 MiB), 7.6 MiB/13 MiB CGO alloc/total (0.6 CGO/sec), 1.8/1.2 %%(u/s)time, 0.0 %%gc (0x ), 17 KiB/22 KiB (r/w)net
```

After:

```
I210303 09:59:17.607021 59 2@server/status/runtime.go:553 ⋮ [n1] 71  runtime stats: 92 MiB RSS, 255 goroutines (stacks: 3.7 MiB), 27 MiB/48 MiB Go alloc/total (heap fragmentation: 4.8 MiB, heap reserved: 1.3 MiB, heap released: 27 MiB), 10 MiB/15 MiB CGO alloc/total (0.0 CGO/sec), 0.0/0.0 %(u/s)time, 0.0 %gc (0x), 30 KiB/34 KiB (r/w)net
```

Release justification: Negligible risk.

Release note: None
SHA: f85ff2a
closedts/sidetransport: fix goroutine leak
Fix a goroutine leak caused by gRPC streams. It turns out that receiving an error from an RPC's stream is not enough to clean up the stream; further action is required, as described in https://pkg.go.dev/google.golang.org/grpc#ClientConn.NewStream. This patch cancels the stream's context whenever the stream is discarded.

Release note: None

Release justification: Bug fix for new functionality.
SHA: 050241d
colflow: clean up vectorized stats propagation
Previously, in order to propagate execution statistics we were creating temporary tracing spans, setting the stats on them, and finishing the spans right away. This allowed us to use (or rather abuse) the existing infrastructure. The root of the problem is that in the vectorized engine we do not start a per-operator span when stats collection is enabled at the moment, so we had to work around that limitation. However, this is not how tracing spans are intended to be used, and it creates a performance slowdown in the hot path, so this commit refactors the situation.

Now we ensure that there is always a tracing span available at the "root" components (either the root materializer or an outbox), so when the root components finish the vectorized stats collectors for their subtree of operators, there is a span to record the stats into. This required the following minor adjustments:

- In the materializer, we now delegate attaching the stats to the tracing span to the drainHelper (which does so on `ConsumerDone`). Note that the drainHelper doesn't get the recording from the span and leaves that to the materializer (this is needed in order to avoid collecting duplicate trace data).
- In the outbox, we now start a "remote child span" (if there is a span in the parent context) at the beginning of the `Run` method, and we attach the stats in `sendMetadata`.

Release justification: low-risk update to existing functionality.

Release note: None
SHA: ec7d1d6
[DNM] adjust KV benchmark to introduce sample rate
One could run the relevant benchmark with `make bench PKG=./pkg/sql/tests BENCHES=BenchmarkKV//SQL`. Release note: None
SHA: d94e463
SHA: 7d843f3
SHA: f273b30
SHA: e31d115
sql: use stmt's span for exec stats propagation
Release justification: low-risk update to new functionality. Release note: None
SHA: 95970ec
metric: break dependency on 'log' package
Release justification: low-risk change and prerequisite for metrics improvement. Release note: None
SHA: bf5d4a4
Release justification: low-risk, high-benefit change to existing functionality

Release note (ops change): Added the `trace.spans_created.count` and `trace.registry.size` metrics, which account for the number of created trace spans and tracked local root spans, respectively.
SHA: 8b7fd22
```
$ ./cmp.sh
name                                 old time/op  new time/op  delta
KV/Scan/SQL/rows=1/sample_rate=X-16  267µs ± 9%   313µs ± 8%   +17.32%  (p=0.000 n=22+22)
```
SHA: 4953761
tracing: add synthetic SpanStats benchmark
```
$ go test -run - -bench SpanStats -count 1 ./pkg/util/tracing
goos: linux
goarch: amd64
pkg: github.com/cockroachdb/cockroach/pkg/util/tracing
BenchmarkSpanStats/autoCollection=false,stats=0-16    280952    4326 ns/op
BenchmarkSpanStats/autoCollection=false,stats=1-16    158040    7901 ns/op
BenchmarkSpanStats/autoCollection=true,stats=0-16     471117    2631 ns/op
BenchmarkSpanStats/autoCollection=true,stats=1-16     242348    5919 ns/op
PASS
ok      github.com/cockroachdb/cockroach/pkg/util/tracing    8.414s
```

The case `autoCollection=false,stats=1` should roughly reflect the SQL usage in which stats are created, transported via a recording, ingested into a parent, and then retrieved and unmarshaled. The case `false,stats=0` should correspond to the non-sampled case. We are thus expecting roughly an overhead of 4.3µs/op when not sampled and 8µs/op when sampled, i.e. changing the sampling rate from 0 to 1 should add ~4µs/op. However, the macro-level benchmark shows much more:

```
KV/Scan/SQL/rows=1/sample_rate=X-16  268µs ± 9%  314µs ± 7%  +17.03%  (p=0.000 n=21+21)
```

That's an extra 46µs incurred by changing the sampling rate from zero to one, far more than can be explained by this microbenchmark.

The `autoCollection=true` numbers avoid an extra unmarshal-remarshal cycle of the recording. This is a possible optimization for SQL to investigate in the case in which the flow lives wholly on the gateway. Unfortunately, it is finicky to get right and causes lots of special casing, as SQL currently relies on all trace data arriving at the DistSQL RowReceiver via TrailingMeta (i.e. marshalled recordings).
SHA: 68eceb0
add spans/op measurement to BenchmarkKV
This shows that in this benchmark we're creating one additional span: 7 without sampling, 8 with sampling. This one additional trace span is unlikely to cost ~40µs, but avoiding it would establish more parity between the tests. Can we?
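A custom per-op metric like spans/op can be attached to a Go benchmark with `testing.B.ReportMetric`. A self-contained sketch — the workload and span counter here are hypothetical placeholders, not the BenchmarkKV code:

```go
package main

import (
	"fmt"
	"testing"
)

// countSpansPerOp runs fn as a benchmark and reports how many trace
// spans (here just a counter bumped by fn) were created per iteration.
func countSpansPerOp(fn func(created *int)) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		created := 0
		for i := 0; i < b.N; i++ {
			fn(&created)
		}
		// ReportMetric attaches a custom unit to the benchmark output,
		// shown alongside ns/op, e.g. "8.00 spans/op".
		b.ReportMetric(float64(created)/float64(b.N), "spans/op")
	})
}

func main() {
	// A stand-in workload that "creates" 8 spans per operation, matching
	// the sampled case described above.
	res := countSpansPerOp(func(created *int) { *created += 8 })
	fmt.Printf("%.0f spans/op\n", res.Extra["spans/op"])
}
```

Custom metrics reported this way land in `BenchmarkResult.Extra`, so tooling like benchstat can diff spans/op between the sampled and unsampled runs the same way it diffs ns/op.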
SHA: a0e2cf8