
server: log runtime stats as a structured event to the health channel #65008

Closed
wants to merge 377 commits into from
Conversation

cameronnunez
Contributor

Fixes #63750.

@cameronnunez cameronnunez requested a review from knz May 11, 2021 16:43
@cockroach-teamcity
Member

This change is Reviewable

pbardea and others added 28 commits May 11, 2021 13:49
This commit filters out importSpans that have no associated data.

File spans' start keys are set to be the .Next() of the EndKey of the
previous file span. This was at least the case on the backup that is
currently being run against restore2TB. This led to a lot of importSpan
entries that contained no files and would cover an empty key range. The
presence of these import spans did not affect the performance of
restores much, but they did cause unnecessary splits and scatters
which further contributed to the elevated NLHEs that were seen.
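The filtering idea above can be sketched as follows. This is a minimal illustration with invented types (`importSpan`, string keys); the real restore code uses richer structs:

```go
package main

import "fmt"

// importSpan is a simplified stand-in for restore's import spans: a key
// range plus the backup files whose data falls inside it. (Hypothetical
// types for illustration only.)
type importSpan struct {
	start, end string
	files      []string
}

// filterEmptySpans drops spans that carry no files, mirroring the idea of
// skipping import spans that would cover an empty key range and only
// cause unnecessary splits and scatters.
func filterEmptySpans(spans []importSpan) []importSpan {
	out := make([]importSpan, 0, len(spans))
	for _, s := range spans {
		if len(s.files) > 0 {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	spans := []importSpan{
		{start: "a", end: "b", files: []string{"1.sst"}},
		{start: "b", end: "c"}, // no files: covers an empty key range
		{start: "c", end: "d", files: []string{"2.sst"}},
	}
	fmt.Println(len(filterEmptySpans(spans))) // 2
}
```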

This commit generally calms down the leaseholder errors, but does not
eliminate them entirely. Future investigation is still needed. It also
does not seem to have any impact on RESTORE performance (in terms of
the speed of the job).

Release note: None
This change reworks how RESTORE splits and scatters in two ways:

1.
This commit updates the split and scatter mechanism that restore uses
to split at the key of the next chunk/entry rather than at the current
chunk/entry.

This resolves a long-standing TODO that updates the split and scattering
of RESTORE to perform a split at the key of the next chunk/entry.
Previously, we would split at the start key of the span, and then
scatter that span.

Consider a chunk with entries that we want to split and scatter. If we
split at the start of each entry and scatter the entry we just split off
(the right hand side of the split), we'll be continuously scattering the
suffix keyspace of the chunk. It's preferable to split out a prefix
first (by splitting at the key of the next chunk) and scattering the
prefix.
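The split-at-next-key idea can be sketched as follows; string keys and the `splitKeys` helper are invented for illustration:

```go
package main

import "fmt"

// splitKeys returns, for each entry in a chunk, the key at which to split:
// the start key of the *next* entry. Splitting there carves off a finished
// prefix that can be scattered immediately, instead of repeatedly
// scattering the remaining suffix of the chunk. (Sketch only; the real
// code splits at the next chunk/entry's key.)
func splitKeys(entryStarts []string) []string {
	keys := make([]string, 0, len(entryStarts))
	for i := 0; i+1 < len(entryStarts); i++ {
		keys = append(keys, entryStarts[i+1])
	}
	return keys
}

func main() {
	// Entries starting at a, b, c: split at b, then at c, scattering each
	// prefix as it is completed.
	fmt.Println(splitKeys([]string{"a", "b", "c"})) // [b c]
}
```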

We additionally refactor the split and scatter processor interface so
that it no longer scatters the individual entries, but just splits them.

2.
Individual entries are no longer scattered. They used to be scattered
with the randomizeLeases option set to false, but there was not any
indication that this was beneficial to performance, so it was removed
for simplicity.

Release note: None
When executing a split, it's surprisingly tricky to learn what the resulting
left-hand and right-hand side is. This is because when you retrieve it "after
the fact", other operations may have changed the split already (for example,
the split could have been merged, or additional splits added) and while you
would get descriptors back, they wouldn't be meaningfully connected to the
split any more in all cases.
Really one would want to return the descriptors from the split txn itself, but
AdminSplit is a no-op when the split already exists and so we would need
special logic that performs a range lookup on the left neighbor. It could all
be done, but does not seem worth it. There's still a nice simplification here
that lets us remove the ad-hoc code, and we'll just accept that when there are
concurrent splits the return values may not exactly line up with the split.

This came out of #64060 (comment).

Release note: None
Some tests, like `//pkg/server:server_test`, are sharded, meaning that
their `test.log`/`test.xml` files are split into many different folders
in the `bazel-testlogs` directory, like:

    bazel-testlogs/pkg/server/server_test/shard_1_of_16/test.log

Before this patch, `bazci` would only look in the `server_test`
directory for the `test.{log,xml}`, and would therefore miss these. Now,
`bazci` globs and mirrors the directory structure of the source test
directory.

Also add unit tests to cover this case.

Resolves #63960.

Release note: None
The node selection logic was untested. This commit adds the missing
test.

Release note: None
We have observed in practice that the majority of the size of
generated `debug zip` outputs is attributable to the discrete files
retrieved from the server:

- log files
- goroutine dumps
- heap profiles

As clusters get larger and older, these files increase in number, and
the size of the `debug zip` output grows accordingly.

Meanwhile, in practice, there are many support cases where we know
ahead of running the `debug zip` command that only certain files are
going to be useful. Therefore, we wish to be able to restrict
which files get included / excluded from the generated archive.

Hence the functional change described below.

(Note: there is another expressed wish to limit the size of generated
archives based on a *time or date range*, not just file names /
types. This will be handled in a separate commit.)

Release note (cli change): It is now possible to fine-tune which files
get retrieved from the server nodes by the `debug zip` command, using
the new flags `--include-files` and `--exclude-files`. These flags
take glob patterns that are applied to the file names server-side. For
example, to include only log files, use `--include-files='*.log'`.

The command `cockroach debug list-files` also accepts these flags and
can thus be used to experiment with the new flags before running the
`debug zip` command.
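The include/exclude semantics can be sketched as follows; `shouldInclude` is an invented helper and the actual server-side matching may differ in details:

```go
package main

import (
	"fmt"
	"path"
)

// shouldInclude sketches the --include-files / --exclude-files semantics:
// a file is retrieved if it matches some include pattern (or no include
// patterns were given) and matches no exclude pattern.
func shouldInclude(name string, includes, excludes []string) bool {
	keep := len(includes) == 0
	for _, p := range includes {
		if ok, _ := path.Match(p, name); ok {
			keep = true
		}
	}
	for _, p := range excludes {
		if ok, _ := path.Match(p, name); ok {
			return false
		}
	}
	return keep
}

func main() {
	fmt.Println(shouldInclude("cockroach.log", []string{"*.log"}, nil))      // true
	fmt.Println(shouldInclude("memprof.2021.pprof", []string{"*.log"}, nil)) // false
}
```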
Release note: None
The size of generated `debug zip` outputs is mostly attributable to
the discrete files retrieved from the server:

- log files
- goroutine dumps
- heap profiles

Even though we have various garbage collection techniques applied to
these files, they still yield very large `debug zip` outputs:

- the default retained cap for log files is 100MB per group,
  so with the default of up to 5 log groups per node, we are retaining
  up to 500MB of log files per node.
- the default retained cap for goroutine dumps, heap profiles and
  memory stats is 128MB for each, so we are retaining up to 384MB of
  these files per node.

For example, for a 10-node cluster we are looking at up to 7GB of data
retained in these files in the default configuration.

While this amount of data is relatively innocuous while it remains
in the data directories server-side, the `debug zip` behavior
which was previously to retrieve *all* of it was detrimental
to the process of troubleshooting anomalies: this is a lot
of data to move around, it cannot be e-mailed effectively, etc.

In comparison, the other items retrieved by the `debug zip`
command (range descriptors, SQL system tables) are typically just
kilobytes large. Even a large cluster with tens of thousands of ranges
and system tables typically incurs `zip` payloads no greater than a dozen
megabytes.

Hence the change described in the release note below.

Release note (cli change): The `cockroach debug zip` command now
retrieves only the log files, goroutine dumps and heap profiles
pertaining to the last 48 hours prior to the command invocation.

This behavior is supported entirely client-side, i.e. it is not
necessary to upgrade the server nodes to effect these newly
configurable limits.

The other data items retrieved by `debug zip` are not affected
by this time limit.

This behavior can be customized by the two new flags `--files-from`
and `--files-until`. Both are optional. See the command-line
help text for details.

The two new flags are also supported by `cockroach debug list-files`.
It is advised to experiment with `list-files` prior to issuing
a `debug zip` command that may retrieve a large amount of data.
We used to do this after `make buildshort` and `make generate`, but
those can take a while. Do it before so obvious Bazel linter errors
break the build sooner.

Release note: None
`tpccbench` is set up to "handle" (ignore) crashes during its line
search on the assumption that these are due to pushing CRDB into
overload territory, which at the time of writing it does not handle
gracefully.
There was a special case in which this was broken, namely that of
the line search terminating in a final step with a crash. In that
case, the cluster would be left running with one node down, which
roachtest checks and emits as an error.

Unconditionally restart the cluster after the line search (assuming
it found a passing warehouse count, i.e. didn't error out itself)
to make roachtest happy.

Closes #64187.

Release note: None
This commit makes the roachtest web ui link to respective clusters'
admin UI.

Release note: None
…tors

Also probably fixes a bug due to bad transaction retry handling in import.

Touches #64134.

Release note: None
Previously, we were only storing the `stats.json` files for the `master`
branch, meaning we'd fly blind for `release-*`. This change makes it so
that we also upload the files for the release branches. We need to use
a different artifacts dir so that roachperf does not mistakenly pick up
`release-*` results as part of its data processing for the `master`
branch[1]. At time of writing, roachperf is hard-wired to the `master`
branch, so the newly collected information for `release-21.1` will not
be picked up yet.

[1]: https://github.com/cockroachdb/roachperf/blob/719f1e432762f8800886f36f0c5dcfd282e86aeb/main.go#L94-L97

Release note: None
Previously, we used the projections map directly, which violates
abstraction boundaries, as we probably want to keep that map private. We
remedy this by moving everything behind a Projection function
abstraction which prevents this leakage.

Furthermore, Projection now returns an error directly. The error is
marked, so we are able to reword error messages in other places.

Release note: None
In debugging a few roachtest failures (#61541) and other issues
(https://github.com/cockroachlabs/support/issues/893#issuecomment-818263346),
we've gained a lot by deducing the type of lease acquisition that
resulted in a given lease.

This commit persists this information through to every replica's
`leaseHistory` and adds it to the lease history table on the range
report page in the Admin UI.

Release note (ui change): The lease history section on the range report
page now shows the type of lease acquisition event that resulted in a
given lease.
Fixes #64101

Release note (bug fix): Fix a bug where view expressions created
using an array enum, without a name for the column, could cause
failures when dropping unrelated enum values.
This is helpful for error messages, as it allows the schema name to be
included in the error message.

Release note: None
Release note (sql change): Disallow ADD/DROP REGION whilst a
REGIONAL BY ROW table has index changes underway or a table is
transitioning to or from REGIONAL BY ROW.
Release note (sql change): Prevent index modification on REGIONAL BY ROW
tables and locality to/from REGIONAL BY ROW changes during an ADD/DROP
REGION.
This patch improves the respective assertion failure message. For no
good reason we weren't including the request triggering the assertion if
it wasn't an "intent write". Also include the command's Raft term and
index.

Release note: None
This patch improves the closed timestamp regression assertion we've seen
fire in #61981 to include a tail of the Raft log.
Hopefully we never see that assertion fire again, but still I'd like to
introduce a precedent for easily printing the log programmatically.

Also, the assertion now tells people about
COCKROACH_RAFT_CLOSEDTS_ASSERTIONS_ENABLED. If the assertion fires and
crashes nodes, those nodes will continue crashing on restart as they try
to apply the same entries over and over.

Release note: None
This highlights a problem with the fast path: we use statistics that
are derived using placeholders, which are usually terrible.

Release note: None
This commit improves the FD generation for Select when we have filters
like `x = $1`. Because these filters have placeholders, they do not
generate constraints (which is the normal mechanism for detecting
constant columns).

This improves the cardinality property and row count estimate. The
estimate will be used for the placeholder fast path.

Release note: None
This change is best explained by this comment:

```
// We are dealing with a memo that still contains placeholders. The statistics
// for such a memo can be wildly overestimated. Even if our plan is correct,
// the estimated row count for a scan is passed to the execution engine which
// uses it to make sizing decisions. Passing a very high count can affect
// performance significantly (see #64214). So we only use the fast path if the
// estimated row count is small; typically this will happen when we constrain
// columns that form a key and we know there will be at most one row.
```

Fixes #64214.

Release note (bug fix): fixed a performance regression for very simple
queries.
This is a "forward-port" of regression tests from #64226. This commit
contains no code changes because the bug had already been fixed on
master by #59687.

Release note: None
adityamaru and others added 17 commits May 11, 2021 13:49
This change adds a few more errors that were exposed when stressing
a previously skipped test, that can occur when a worker node running
the import is shutdown. These errors should not cause the job to fail
but instead trigger a replan.

I ran 500+ iterations under stress of TestImportWorkerFailure.

Release note: None
This change adds a unit test that shuts down a worker to test
that the backup job retries when it sees certain retryable
errors.

This test uncovered a situation in which a backup data processor
could be moved to a state other than StatusRunning, which caused it
to exit early from the Next() method. The goroutines responsible
for exporting the spans get stuck attempting to push to the progCh
that is only drained after the above StatusRunning check. This
change drains the progCh at the beginning of the Next() method.

Release note: None
This option allows the user to specify a target size per
CSV file during an EXPORT. Once this target size is hit,
we will upload the CSV file before processing any more rows.

Release note (sql change): Add `chunk_size` option to EXPORT
CSV to control the target CSV file size.
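The size-target chunking can be sketched as follows; `chunkRows` is an invented helper, and in EXPORT each finished chunk would be uploaded before more rows are processed:

```go
package main

import "fmt"

// chunkRows groups encoded CSV rows into chunks, starting a new chunk once
// the current one reaches targetSize bytes -- a sketch of the chunk_size
// semantics described above. (Illustrative only.)
func chunkRows(rows []string, targetSize int) [][]string {
	var chunks [][]string
	var cur []string
	size := 0
	for _, r := range rows {
		cur = append(cur, r)
		size += len(r) + 1 // +1 for the trailing newline
		if size >= targetSize {
			chunks = append(chunks, cur)
			cur, size = nil, 0
		}
	}
	if len(cur) > 0 {
		chunks = append(chunks, cur) // final, possibly undersized, chunk
	}
	return chunks
}

func main() {
	rows := []string{"1,alice", "2,bob", "3,carol"}
	fmt.Println(len(chunkRows(rows, 10))) // 2
}
```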
Closes #64602.

Release note: None
After executing a subquery, the result is materialized into a single
tuple (or row, or value). In some cases that tuple can be arbitrarily
large. This commit adds after-the-fact memory accounting for the size of
the resulting datum. The burden of clearing the memory account is put
onto the caller of `PlanAndRunSubqueries` method.

Release note (bug fix): CockroachDB should now crash less often due to
OOM conditions caused by subqueries returning many rows of large
size.
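The after-the-fact accounting pattern can be sketched with a toy account; CockroachDB's real BoundAccount API is richer, and `account` here is invented for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// account is a toy memory account: after a subquery's result is
// materialized, its size is registered against a budget, and the caller
// is responsible for clearing the account later (as with the caller of
// PlanAndRunSubqueries described above).
type account struct{ used, limit int64 }

func (a *account) grow(n int64) error {
	if a.used+n > a.limit {
		return errors.New("memory budget exceeded")
	}
	a.used += n
	return nil
}

func (a *account) clear() { a.used = 0 }

func main() {
	acct := &account{limit: 1 << 20} // 1 MiB budget
	tuple := make([]byte, 512<<10)   // pretend: the materialized subquery tuple
	if err := acct.grow(int64(len(tuple))); err != nil {
		fmt.Println("subquery result too large:", err)
		return
	}
	fmt.Println("accounted bytes:", acct.used)
	acct.clear() // the caller's responsibility
}
```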
Crashes during random syntax tests now report more useful information to
the issue including the error message, the SQL that produced the error,
and the database schema.

The report will be formatted as:

> Random syntax error:
>
> ```
>     rsg_test.go:755: Crash detected: server panic: pq: internal error: something bad
> ```
> Query:
>
> ```
> 		SELECT
> 			foo
> 		FROM
> 			bar
> 		LIMIT
> 			33:::INT8;
> ```
> Schema:
>
> ```
>     rsg_test.go:575: To reproduce, use schema:
>     rsg_test.go:577:
>         	CREATE TABLE table1 (col1_0 BOOL);
>         ;
>     rsg_test.go:577:
>
>         CREATE TYPE greeting AS ENUM ('hello', 'howdy', 'hi', 'good day', 'morning');
>         ;
>     rsg_test.go:579:
>     rsg_test.go:580: -- test log scope end --
> test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestRandomSyntaxSQLSmith460792454
> --- FAIL: TestRandomSyntaxSQLSmith (300.69s)
> ```

Release note: None
Previously, the 'show x in y' collection show syntax only checked the
schema of x, which in 'x in y' is empty.

Release note: none.
Previously we had two methods that dug through ExternalStorage URI params:
One, which extracted the params to create a structured config that would
actually be used to open and interact with that storage, and a second that
looked for authorization-related params to decide if access was explicitly
or implicitly authorized (i.e. had credentials specified, or made use of the
access the node performing the action had).

This changes the latter to instead be a function of the former, so there is
only one method that extracts parameters from string URIs to produce a
structured configuration object; whether a given configuration carries
explicit authorization can then be determined from that config object.

Release note: none.
This changes the redaction of URIs from being done from a pre-defined map
of sensitive params to a runtime-populated map, in which implementations
register their sensitive params during init.

Release note: none.
Release note (sql change): Collated strings may now have a locale that
is a language tag, followed by a `-u-` suffix, followed by anything
else. For example, any locale with a prefix of `en-US-u-` is now
considered valid.
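The relaxed locale rule can be sketched as follows; `validLocale` and the `knownTags` map are invented stand-ins for real language-tag validation:

```go
package main

import (
	"fmt"
	"strings"
)

// validLocale sketches the rule described above: a locale is accepted if
// it is a known tag outright, or if everything before a "-u-" separator
// is a known tag (anything may follow the "-u-").
func validLocale(locale string, knownTags map[string]bool) bool {
	if knownTags[locale] {
		return true
	}
	if i := strings.Index(locale, "-u-"); i > 0 {
		return knownTags[locale[:i]]
	}
	return false
}

func main() {
	known := map[string]bool{"en-US": true}
	fmt.Println(validLocale("en-US-u-ks-level2", known)) // true
	fmt.Println(validLocale("xx-YY-u-anything", known))  // false
}
```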
This fixes a corner case where we request null rejection on a
non-null Scan column, which caused a stack overflow.

We also add an assertion that would have caught this (and returned an
internal error instead of crashing the node).

Fixes #64661.

Release note (bug fix): fixed a stack overflow that can happen in some
corner cases involving partial indexes with predicates containing `(x
IS NOT NULL)`.
The math functions are not included in libc on BSD.

Release note: None
intentInterleavingIter uses the *WithLimit() functions exposed
by pebble.Iterator.

This is motivated by the O(N^2) complexity we observe when doing
coordinated iteration over the lock table and MVCC keys where
all the intents in the lock table are deleted but have not yet
been compacted away. The O(N^2) complexity arises in two cases:
- Limited scans: the extreme is a limited scan with a
  limit of 1. Each successive scan traverses N, N-1, ...
  deleted locks.
- Mix of forward/reverse iteration in a single scan: A Next/SeekGE
  will incur a cost of N and the Prev/SeekLT another cost of N.
  This situation could potentially be fixed with a narrower
  solution, given the constrained way that the combination of
  forward and backward iteration is used in pebbleMVCCScanner
  (forward iteration using Next and SeekGE when doing a reverse
  scan), however the solution is likely to be fragile since it
  would require the code in intentInterleavingIter to know how
  it is used in pebbleMVCCScanner. The limited iteration mode
  solves it in a more general way.

By placing a limit when doing such coordinated iteration, we
restore iteration to O(N) for N keys.
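The limited-iteration contract can be sketched as follows; the types here are invented, and pebble's real API operates on internal keys and returns an IterValidityState:

```go
package main

import "fmt"

// iterState mimics the tri-state result of limit-aware iteration.
type iterState int

const (
	exhausted iterState = iota
	valid
	atLimit
)

// nextWithLimit sketches the *WithLimit contract: advance one key, but if
// the new position is at or past limit, report atLimit so the caller can
// stop instead of scanning arbitrarily far (e.g. across a long run of
// deleted intents).
func nextWithLimit(keys []string, i int, limit string) (int, iterState) {
	i++
	if i >= len(keys) {
		return i, exhausted
	}
	if keys[i] >= limit {
		return i, atLimit
	}
	return i, valid
}

func main() {
	keys := []string{"a", "b", "c", "d"}
	// A scan limited to keys < "c" stops after "b" rather than walking on.
	i, st := nextWithLimit(keys, 0, "c")
	fmt.Println(i, st == valid) // 1 true
	i, st = nextWithLimit(keys, i, "c")
	fmt.Println(i, st == atLimit) // 2 true
}
```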

Some preliminary benchmark numbers are included -- old is master.
Note the speedup in the ScanOneAllIntentsResolved/separated=true
cases which were taking many ms. The remaining are effectively
unchanged (these are from single runs, so some of the deltas
there may be noise).

Note that there is a shift
in where comparisons get performed in some of these benchmarks:
the comparison i.intentKey.Compare(i.iterKey.Key) in
intentInterleavingIter is replaced (when there is a live intent
for a different key) with a corresponding comparison in
pebble.Iterator's find{Next,Prev}Entry comparing with the limit
parameter. However, we are still paying the cost of extra
comparisons when there are only dead intents within the iterator
bounds: say there are N keys each with M versions, so N*M dead
intents each with a SINGLEDEL,SET pair (so 2*N*M keys in Pebble,
of which N*M are considered distinct UserKeys). There will
eventually be N calls to pebble.Iterator.findNextEntry (when
iterating forward), each of which will call Iterator.nextUserKey
M times, after each of which it will do a comparison with the
limit key. So a total of N*M extra comparisons to iterate over
these 2*N*M keys in Pebble. The cost of these seems negligible
based on the performance numbers below, which are not with
a deep mergingIter stack in Pebble -- a more realistic setting
with a deep mergingIter stack will have more comparisons for
merging across iterators, which should further decrease the
relative performance effect of the N*M comparisons we are
paying in the top-level pebble.Iterator.

name                                                                           old time/op    new time/op    delta
ScanOneAllIntentsResolved/separated=false/versions=200/percent-flushed=0-16      51.1µs ± 0%    49.9µs ± 0%    -2.26%
ScanOneAllIntentsResolved/separated=false/versions=200/percent-flushed=50-16     21.9µs ± 0%    24.2µs ± 0%   +10.52%
ScanOneAllIntentsResolved/separated=false/versions=200/percent-flushed=90-16     5.52µs ± 0%    5.50µs ± 0%    -0.36%
ScanOneAllIntentsResolved/separated=false/versions=200/percent-flushed=100-16    2.37µs ± 0%    2.15µs ± 0%    -9.04%
ScanOneAllIntentsResolved/separated=true/versions=200/percent-flushed=0-16       47.4ms ± 0%     0.1ms ± 0%   -99.88%
ScanOneAllIntentsResolved/separated=true/versions=200/percent-flushed=50-16      11.1ms ± 0%     0.0ms ± 0%   -99.81%
ScanOneAllIntentsResolved/separated=true/versions=200/percent-flushed=90-16      1.07ms ± 0%    0.01ms ± 0%   -99.48%
ScanOneAllIntentsResolved/separated=true/versions=200/percent-flushed=100-16     2.08µs ± 0%    2.04µs ± 0%    -1.88%

IntentScan/separated=false/versions=10/percent-flushed=0-16                      2.96µs ± 0%    2.80µs ± 0%    -5.67%
IntentScan/separated=false/versions=10/percent-flushed=50-16                     1.93µs ± 0%    1.80µs ± 0%    -6.78%
IntentScan/separated=false/versions=10/percent-flushed=80-16                     1.55µs ± 0%    1.46µs ± 0%    -5.73%
IntentScan/separated=false/versions=10/percent-flushed=90-16                     1.22µs ± 0%    1.20µs ± 0%    -2.05%
IntentScan/separated=false/versions=10/percent-flushed=100-16                     937ns ± 0%     864ns ± 0%    -7.79%
IntentScan/separated=false/versions=100/percent-flushed=0-16                     27.0µs ± 0%    19.8µs ± 0%   -26.65%
IntentScan/separated=false/versions=100/percent-flushed=50-16                    13.3µs ± 0%     9.2µs ± 0%   -31.13%
IntentScan/separated=false/versions=100/percent-flushed=80-16                    4.06µs ± 0%    3.71µs ± 0%    -8.72%
IntentScan/separated=false/versions=100/percent-flushed=90-16                    2.69µs ± 0%    2.59µs ± 0%    -3.93%
IntentScan/separated=false/versions=100/percent-flushed=100-16                    985ns ± 0%     931ns ± 0%    -5.48%
IntentScan/separated=false/versions=200/percent-flushed=0-16                     50.1µs ± 0%    50.9µs ± 0%    +1.76%
IntentScan/separated=false/versions=200/percent-flushed=50-16                    26.7µs ± 0%    19.7µs ± 0%   -26.19%
IntentScan/separated=false/versions=200/percent-flushed=80-16                    8.76µs ± 0%    6.30µs ± 0%   -28.06%
IntentScan/separated=false/versions=200/percent-flushed=90-16                    4.18µs ± 0%    3.83µs ± 0%    -8.33%
IntentScan/separated=false/versions=200/percent-flushed=100-16                   1.04µs ± 0%    0.95µs ± 0%    -8.47%
IntentScan/separated=false/versions=400/percent-flushed=0-16                     22.2µs ± 0%    15.5µs ± 0%   -30.26%
IntentScan/separated=false/versions=400/percent-flushed=50-16                    9.76µs ± 0%    6.91µs ± 0%   -29.22%
IntentScan/separated=false/versions=400/percent-flushed=80-16                    20.4µs ± 0%    12.3µs ± 0%   -39.94%
IntentScan/separated=false/versions=400/percent-flushed=90-16                    9.09µs ± 0%    6.29µs ± 0%   -30.87%
IntentScan/separated=false/versions=400/percent-flushed=100-16                   1.79µs ± 0%    1.68µs ± 0%    -6.35%
IntentScan/separated=true/versions=10/percent-flushed=0-16                       3.47µs ± 0%    3.33µs ± 0%    -4.21%
IntentScan/separated=true/versions=10/percent-flushed=50-16                      1.93µs ± 0%    1.83µs ± 0%    -5.17%
IntentScan/separated=true/versions=10/percent-flushed=80-16                      1.38µs ± 0%    1.49µs ± 0%    +8.15%
IntentScan/separated=true/versions=10/percent-flushed=90-16                      1.20µs ± 0%    1.36µs ± 0%   +13.36%
IntentScan/separated=true/versions=10/percent-flushed=100-16                      923ns ± 0%    1002ns ± 0%    +8.56%
IntentScan/separated=true/versions=100/percent-flushed=0-16                      25.3µs ± 0%    28.9µs ± 0%   +14.20%
IntentScan/separated=true/versions=100/percent-flushed=50-16                     6.74µs ± 0%    7.27µs ± 0%    +7.88%
IntentScan/separated=true/versions=100/percent-flushed=80-16                     3.43µs ± 0%    3.69µs ± 0%    +7.43%
IntentScan/separated=true/versions=100/percent-flushed=90-16                     2.31µs ± 0%    2.60µs ± 0%   +12.38%
IntentScan/separated=true/versions=100/percent-flushed=100-16                     973ns ± 0%    1056ns ± 0%    +8.53%
IntentScan/separated=true/versions=200/percent-flushed=0-16                      46.9µs ± 0%    51.8µs ± 0%   +10.41%
IntentScan/separated=true/versions=200/percent-flushed=50-16                     14.7µs ± 0%    13.8µs ± 0%    -6.30%
IntentScan/separated=true/versions=200/percent-flushed=80-16                     5.70µs ± 0%    6.04µs ± 0%    +6.02%
IntentScan/separated=true/versions=200/percent-flushed=90-16                     3.47µs ± 0%    3.69µs ± 0%    +6.52%
IntentScan/separated=true/versions=200/percent-flushed=100-16                    0.99µs ± 0%    1.06µs ± 0%    +6.83%
IntentScan/separated=true/versions=400/percent-flushed=0-16                      32.4µs ± 0%    26.3µs ± 0%   -18.79%
IntentScan/separated=true/versions=400/percent-flushed=50-16                     37.4µs ± 0%    38.9µs ± 0%    +4.03%
IntentScan/separated=true/versions=400/percent-flushed=80-16                     12.7µs ± 0%    11.8µs ± 0%    -6.98%
IntentScan/separated=true/versions=400/percent-flushed=90-16                     5.56µs ± 0%    6.19µs ± 0%   +11.30%
IntentScan/separated=true/versions=400/percent-flushed=100-16                    1.73µs ± 0%    1.87µs ± 0%    +8.23%
ScanAllIntentsResolved/separated=false/versions=200/percent-flushed=0-16         46.7µs ± 0%    47.9µs ± 0%    +2.62%
ScanAllIntentsResolved/separated=false/versions=200/percent-flushed=50-16        19.8µs ± 0%    19.2µs ± 0%    -3.20%
ScanAllIntentsResolved/separated=false/versions=200/percent-flushed=90-16        3.83µs ± 0%    3.89µs ± 0%    +1.62%
ScanAllIntentsResolved/separated=false/versions=200/percent-flushed=100-16        972ns ± 0%     993ns ± 0%    +2.16%
ScanAllIntentsResolved/separated=true/versions=200/percent-flushed=0-16          63.1µs ± 0%    59.6µs ± 0%    -5.55%
ScanAllIntentsResolved/separated=true/versions=200/percent-flushed=50-16         17.0µs ± 0%    15.1µs ± 0%   -10.87%
ScanAllIntentsResolved/separated=true/versions=200/percent-flushed=90-16         3.26µs ± 0%    3.61µs ± 0%   +10.73%
ScanAllIntentsResolved/separated=true/versions=200/percent-flushed=100-16         918ns ± 0%     945ns ± 0%    +2.94%
IntentResolution/separated=false/versions=10/percent-flushed=0-16                2.25µs ± 0%    2.23µs ± 0%    -1.15%
IntentResolution/separated=false/versions=10/percent-flushed=50-16               3.09µs ± 0%    2.89µs ± 0%    -6.53%
IntentResolution/separated=false/versions=10/percent-flushed=80-16               2.73µs ± 0%    2.44µs ± 0%   -10.64%
IntentResolution/separated=false/versions=10/percent-flushed=90-16               1.90µs ± 0%    1.88µs ± 0%    -0.95%
IntentResolution/separated=false/versions=10/percent-flushed=100-16              2.45µs ± 0%    2.41µs ± 0%    -1.59%
IntentResolution/separated=false/versions=100/percent-flushed=0-16               4.59µs ± 0%    4.70µs ± 0%    +2.42%
IntentResolution/separated=false/versions=100/percent-flushed=50-16              3.83µs ± 0%    3.91µs ± 0%    +2.01%
IntentResolution/separated=false/versions=100/percent-flushed=80-16              3.18µs ± 0%    3.14µs ± 0%    -1.23%
IntentResolution/separated=false/versions=100/percent-flushed=90-16              3.19µs ± 0%    3.06µs ± 0%    -4.20%
IntentResolution/separated=false/versions=100/percent-flushed=100-16             2.65µs ± 0%    2.34µs ± 0%   -11.71%
IntentResolution/separated=false/versions=200/percent-flushed=0-16               5.76µs ± 0%    5.70µs ± 0%    -1.04%
IntentResolution/separated=false/versions=200/percent-flushed=50-16              4.09µs ± 0%    4.01µs ± 0%    -2.05%
IntentResolution/separated=false/versions=200/percent-flushed=80-16              3.22µs ± 0%    3.15µs ± 0%    -2.05%
IntentResolution/separated=false/versions=200/percent-flushed=90-16              3.14µs ± 0%    3.17µs ± 0%    +0.67%
IntentResolution/separated=false/versions=200/percent-flushed=100-16             2.54µs ± 0%    2.50µs ± 0%    -1.81%
IntentResolution/separated=false/versions=400/percent-flushed=0-16               2.83µs ± 0%    2.94µs ± 0%    +3.74%
IntentResolution/separated=false/versions=400/percent-flushed=50-16              2.75µs ± 0%    2.71µs ± 0%    -1.74%
IntentResolution/separated=false/versions=400/percent-flushed=80-16              2.84µs ± 0%    2.83µs ± 0%    -0.14%
IntentResolution/separated=false/versions=400/percent-flushed=90-16              2.78µs ± 0%    2.72µs ± 0%    -2.09%
IntentResolution/separated=false/versions=400/percent-flushed=100-16             2.04µs ± 0%    2.03µs ± 0%    -0.39%
IntentResolution/separated=true/versions=10/percent-flushed=0-16                 3.03µs ± 0%    3.07µs ± 0%    +1.15%
IntentResolution/separated=true/versions=10/percent-flushed=50-16                3.21µs ± 0%    3.24µs ± 0%    +0.84%
IntentResolution/separated=true/versions=10/percent-flushed=80-16                2.88µs ± 0%    2.77µs ± 0%    -3.81%
IntentResolution/separated=true/versions=10/percent-flushed=90-16                2.67µs ± 0%    2.67µs ± 0%    -0.04%
IntentResolution/separated=true/versions=10/percent-flushed=100-16               2.21µs ± 0%    2.22µs ± 0%    +0.32%
IntentResolution/separated=true/versions=100/percent-flushed=0-16                8.16µs ± 0%    8.36µs ± 0%    +2.44%
IntentResolution/separated=true/versions=100/percent-flushed=50-16               4.18µs ± 0%    3.98µs ± 0%    -4.69%
IntentResolution/separated=true/versions=100/percent-flushed=80-16               4.03µs ± 0%    4.10µs ± 0%    +1.76%
IntentResolution/separated=true/versions=100/percent-flushed=90-16               3.74µs ± 0%    3.78µs ± 0%    +1.23%
IntentResolution/separated=true/versions=100/percent-flushed=100-16              2.20µs ± 0%    2.26µs ± 0%    +2.77%
IntentResolution/separated=true/versions=200/percent-flushed=0-16                13.2µs ± 0%    12.8µs ± 0%    -3.25%
IntentResolution/separated=true/versions=200/percent-flushed=50-16               4.37µs ± 0%    4.28µs ± 0%    -1.90%
IntentResolution/separated=true/versions=200/percent-flushed=80-16               4.23µs ± 0%    4.38µs ± 0%    +3.62%
IntentResolution/separated=true/versions=200/percent-flushed=90-16               3.98µs ± 0%    4.08µs ± 0%    +2.44%
IntentResolution/separated=true/versions=200/percent-flushed=100-16              2.35µs ± 0%    2.28µs ± 0%    -3.15%
IntentResolution/separated=true/versions=400/percent-flushed=0-16                4.64µs ± 0%    4.59µs ± 0%    -0.93%
IntentResolution/separated=true/versions=400/percent-flushed=50-16               5.06µs ± 0%    4.77µs ± 0%    -5.74%
IntentResolution/separated=true/versions=400/percent-flushed=80-16               3.91µs ± 0%    3.78µs ± 0%    -3.32%
IntentResolution/separated=true/versions=400/percent-flushed=90-16               3.54µs ± 0%    3.50µs ± 0%    -1.05%
IntentResolution/separated=true/versions=400/percent-flushed=100-16              2.24µs ± 0%    2.28µs ± 0%    +2.01%
IntentRangeResolution/separated=false/versions=10/percent-flushed=0-16            265µs ± 0%     277µs ± 0%    +4.64%
IntentRangeResolution/separated=false/versions=10/percent-flushed=50-16           211µs ± 0%     217µs ± 0%    +3.19%
IntentRangeResolution/separated=false/versions=10/percent-flushed=80-16           202µs ± 0%     209µs ± 0%    +3.61%
IntentRangeResolution/separated=false/versions=10/percent-flushed=90-16           197µs ± 0%     201µs ± 0%    +1.87%
IntentRangeResolution/separated=false/versions=10/percent-flushed=100-16          150µs ± 0%     154µs ± 0%    +3.07%
IntentRangeResolution/separated=false/versions=100/percent-flushed=0-16           528µs ± 0%     531µs ± 0%    +0.52%
IntentRangeResolution/separated=false/versions=100/percent-flushed=50-16          311µs ± 0%     302µs ± 0%    -2.88%
IntentRangeResolution/separated=false/versions=100/percent-flushed=80-16          232µs ± 0%     230µs ± 0%    -0.94%
IntentRangeResolution/separated=false/versions=100/percent-flushed=90-16          227µs ± 0%     226µs ± 0%    -0.35%
IntentRangeResolution/separated=false/versions=100/percent-flushed=100-16         164µs ± 0%     161µs ± 0%    -2.31%
IntentRangeResolution/separated=false/versions=200/percent-flushed=0-16           980µs ± 0%     616µs ± 0%   -37.09%
IntentRangeResolution/separated=false/versions=200/percent-flushed=50-16          396µs ± 0%     323µs ± 0%   -18.36%
IntentRangeResolution/separated=false/versions=200/percent-flushed=80-16          263µs ± 0%     249µs ± 0%    -5.28%
IntentRangeResolution/separated=false/versions=200/percent-flushed=90-16          287µs ± 0%     236µs ± 0%   -17.86%
IntentRangeResolution/separated=false/versions=200/percent-flushed=100-16         161µs ± 0%     164µs ± 0%    +2.08%
IntentRangeResolution/separated=false/versions=400/percent-flushed=0-16           263µs ± 0%     267µs ± 0%    +1.67%
IntentRangeResolution/separated=false/versions=400/percent-flushed=50-16          254µs ± 0%     257µs ± 0%    +1.37%
IntentRangeResolution/separated=false/versions=400/percent-flushed=80-16          257µs ± 0%     259µs ± 0%    +0.94%
IntentRangeResolution/separated=false/versions=400/percent-flushed=90-16          246µs ± 0%     256µs ± 0%    +4.22%
IntentRangeResolution/separated=false/versions=400/percent-flushed=100-16         172µs ± 0%     175µs ± 0%    +1.46%
IntentRangeResolution/separated=true/versions=10/percent-flushed=0-16             443µs ± 0%     447µs ± 0%    +0.77%
IntentRangeResolution/separated=true/versions=10/percent-flushed=50-16            295µs ± 0%     287µs ± 0%    -2.64%
IntentRangeResolution/separated=true/versions=10/percent-flushed=80-16            241µs ± 0%     254µs ± 0%    +5.27%
IntentRangeResolution/separated=true/versions=10/percent-flushed=90-16            238µs ± 0%     240µs ± 0%    +0.82%
IntentRangeResolution/separated=true/versions=10/percent-flushed=100-16           220µs ± 0%     225µs ± 0%    +2.17%
IntentRangeResolution/separated=true/versions=100/percent-flushed=0-16           1.84ms ± 0%    2.04ms ± 0%   +10.39%
IntentRangeResolution/separated=true/versions=100/percent-flushed=50-16           554µs ± 0%     583µs ± 0%    +5.11%
IntentRangeResolution/separated=true/versions=100/percent-flushed=80-16           377µs ± 0%     389µs ± 0%    +3.31%
IntentRangeResolution/separated=true/versions=100/percent-flushed=90-16           309µs ± 0%     316µs ± 0%    +2.30%
IntentRangeResolution/separated=true/versions=100/percent-flushed=100-16          225µs ± 0%     235µs ± 0%    +4.27%
IntentRangeResolution/separated=true/versions=200/percent-flushed=0-16           3.59ms ± 0%    3.84ms ± 0%    +7.21%
IntentRangeResolution/separated=true/versions=200/percent-flushed=50-16           875µs ± 0%     998µs ± 0%   +14.00%
IntentRangeResolution/separated=true/versions=200/percent-flushed=80-16           508µs ± 0%     525µs ± 0%    +3.33%
IntentRangeResolution/separated=true/versions=200/percent-flushed=90-16           382µs ± 0%     391µs ± 0%    +2.22%
IntentRangeResolution/separated=true/versions=200/percent-flushed=100-16          266µs ± 0%     240µs ± 0%    -9.75%
IntentRangeResolution/separated=true/versions=400/percent-flushed=0-16           1.54ms ± 0%    1.69ms ± 0%    +9.22%
IntentRangeResolution/separated=true/versions=400/percent-flushed=50-16          1.93ms ± 0%    2.13ms ± 0%   +10.45%
IntentRangeResolution/separated=true/versions=400/percent-flushed=80-16           769µs ± 0%     829µs ± 0%    +7.84%
IntentRangeResolution/separated=true/versions=400/percent-flushed=90-16           506µs ± 0%     535µs ± 0%    +5.61%
IntentRangeResolution/separated=true/versions=400/percent-flushed=100-16          248µs ± 0%     252µs ± 0%    +1.52%
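Tables like the one above (benchstat-style output, with the delta in the last column) can be screened mechanically rather than eyeballed. Below is a minimal, hypothetical Go sketch — not part of this PR — that parses the trailing percentage of each row and flags benchmarks whose regression exceeds a chosen threshold; the row format and the 5% cutoff are assumptions for illustration:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// deltaRe matches the trailing percentage column of a benchstat row,
// e.g. "... 2.67µs ± 0%    2.67µs ± 0%    -0.04%".
var deltaRe = regexp.MustCompile(`([+-]\d+(?:\.\d+)?)%$`)

// regressions returns the names of benchmarks whose delta exceeds the
// given threshold (in percent). Rows without a delta column are skipped.
func regressions(rows []string, threshold float64) []string {
	var out []string
	for _, row := range rows {
		row = strings.TrimSpace(row)
		m := deltaRe.FindStringSubmatch(row)
		if m == nil {
			continue
		}
		delta, err := strconv.ParseFloat(m[1], 64)
		if err != nil {
			continue
		}
		if delta > threshold {
			// The first whitespace-separated field is the benchmark name.
			out = append(out, strings.Fields(row)[0])
		}
	}
	return out
}

func main() {
	rows := []string{
		"IntentRangeResolution/separated=true/versions=100/percent-flushed=0-16  1.84ms ± 0%  2.04ms ± 0%  +10.39%",
		"IntentResolution/separated=true/versions=10/percent-flushed=90-16       2.67µs ± 0%  2.67µs ± 0%   -0.04%",
	}
	fmt.Println(regressions(rows, 5.0))
}
```

Running this over the two sample rows reports only the first benchmark, since its +10.39% delta is above the assumed 5% cutoff.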

Release justification: The change is localized to intentInterleavingIter
and pebble.Iterator. It may be risky given that it is involved in all
reads of MVCC data. One option is to adopt the change anyway, since it
only really affects separated intents (the intentIter is the only one
using limited iteration): that would unblock turning on separated
intents in a randomized manner for SQL tests and roachtests while
leaving them off by default for 21.1, which would reduce the possibility
of code rot for separated intents.

Release note: None
This reverts commit 5421a3a.

Release note (<category, see below>): <what> <show> <why>
@cameronnunez cameronnunez requested a review from a team May 11, 2021 17:51
@cameronnunez cameronnunez requested a review from a team as a code owner May 11, 2021 17:51
@cameronnunez cameronnunez requested review from a team and dt and removed request for a team May 11, 2021 17:51
@cameronnunez cameronnunez removed request for a team and dt May 11, 2021 17:54
Successfully merging this pull request may close these issues.

server: report runtime stats as a structured event