util/log,cli: channel abstraction; new configuration system #57134
Conversation
Force-pushed from ff43fdc to 4480ed0
Force-pushed from e06b4e5 to b4ddda2
Force-pushed from b4ddda2 to c1e8176
A linter change in cockroachdb#57134 made me discover this bug. Release note: None
Force-pushed from c1e8176 to 5662523
This is a massive overhaul and my experience with a lot of this code isn't all that deep, but overall it looks great save for some minor comments.
Reviewed 79 of 85 files at r1, 13 of 13 files at r2, 9 of 9 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @itsbilal, and @knz)
pkg/cli/log_flags.go, line 45 at r1 (raw file):
if cliCtx.deprecatedLogOverrides.anySet() && cliCtx.logConfigInput.isSet {
	return errors.New("--" + cliflags.Log.Name + " is incompatible with legacy discrete logging flags")
Nit: Can use `Newf` here.
pkg/cli/log_flags_test.go, line 89 at r1 (raw file):
const defaultLogDir = `PWD/cockroach-data/logs`
testData := []struct {
Did you consider doing these (and the string generation methods above) as datadriven tests? They might be slightly easier to understand and extend that way. I'm fine with them this way too, just wondering if you thought about it.
pkg/util/log/test_log_scope.go, line 92 at r2 (raw file):
// not just debugLog. This would be necessary because loggers don't
// necessarily have the same config. Until that is implemented,
// we prevent the use case altogether.
Nice
pkg/util/log/logconfig/config.go, line 135 at r2 (raw file):
NoColor bool `yaml:"no-color,omitempty"`
CommonSinkConfig `yaml:",inline"`
Shouldn't this be at the top by convention?
pkg/util/log/logconfig/config.go, line 203 at r2 (raw file):
// MaxFileSize indicates the approximate maximum size of
// individual files generated by this sinks.
s/sinks/sink/
pkg/util/log/logconfig/validate.go, line 198 at r2 (raw file):
c.Sinks.sortedFileGroupNames = fileGroupNames
return nil
You'd have to do `return resErr` here for the defer-time set to work, right?
Looks like you did a rebase push right as I submitted my review. Let me know if parts of the old review aren't valid anymore, or if there are large parts still left to review after the force-push.
Now with the post-rebase review.
Reviewed 16 of 91 files at r4, 85 of 89 files at r5, 9 of 9 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @knz)
Reviewed 1 of 13 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @itsbilal)
pkg/cli/log_flags.go, line 45 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Nit: Can use `Newf` here.
Not just a nit - in fact I have to, the linter tells me so. Done.
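For illustration, the `errors.Newf` form being referred to would look roughly like the following sketch (the helper name and signature here are hypothetical, not the actual code in the PR):

```go
package cli // hypothetical placement, mirroring pkg/cli/log_flags.go

import "github.com/cockroachdb/errors"

// checkLogFlagCompat sketches the Newf-based form of the check quoted
// above: the flag name is formatted into the message rather than
// concatenated with errors.New.
func checkLogFlagCompat(flagName string, deprecatedSet, newConfigSet bool) error {
	if deprecatedSet && newConfigSet {
		return errors.Newf("--%s is incompatible with legacy discrete logging flags", flagName)
	}
	return nil
}
```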
pkg/cli/log_flags_test.go, line 89 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Did you consider doing these (and the string generation methods above) as datadriven tests? They might be slightly easier to understand and extend that way. I'm fine with them this way too, just wondering if you thought about it.
I'll try it.
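For context, the general shape of a test using `github.com/cockroachdb/datadriven` looks roughly like this; the file path, directive name, and returned string are illustrative only, not the test that was eventually written:

```go
package cli

import (
	"testing"

	"github.com/cockroachdb/datadriven"
)

// TestLogFlagsDataDriven sketches the datadriven pattern: each directive
// in the input file describes a flag combination, and the test returns
// the resulting configuration as a string to be compared against the
// expected output recorded in the same file.
func TestLogFlagsDataDriven(t *testing.T) {
	datadriven.RunTest(t, "testdata/log_flags", func(t *testing.T, d *datadriven.TestData) string {
		switch d.Cmd {
		case "parse":
			// The real test would parse d.Input as command-line flags and
			// render the derived logging configuration here.
			return "config for: " + d.Input
		default:
			t.Fatalf("unknown directive: %s", d.Cmd)
			return ""
		}
	})
}
```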
pkg/util/log/logconfig/config.go, line 135 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Shouldn't this be at the top by convention?
Explained in comment.
pkg/util/log/logconfig/config.go, line 203 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
s/sinks/sink/
Fixed.
pkg/util/log/logconfig/validate.go, line 198 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
You'd have to do `return resErr` here for the defer-time set to work, right?
Nope: the defer is run after the assignment to the return value by the `return` statement.
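For reference, a minimal standalone sketch of the Go semantics being described (not code from this PR): `return nil` assigns the named return value first, and deferred functions run afterwards, so whatever a defer writes into `resErr` is what the caller sees.

```go
package main

import (
	"errors"
	"fmt"
)

// validate has a named return value, like the function under review.
func validate() (resErr error) {
	defer func() {
		// Runs after `return nil` below has already set resErr = nil;
		// the value assigned here is what the caller receives.
		resErr = errors.New("set at defer time")
	}()
	return nil
}

func main() {
	fmt.Println(validate()) // prints "set at defer time"
}
```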
57149: opt: don't hold on to evalCtx from detached Memo r=RaduBerinde a=RaduBerinde

This change adds more "cleanup" code when detaching a Memo (a detached memo is stored in the query cache and is reused later in a "read-only" fashion). In particular, we clear the EvalContext in logicalPropsBuilder, which could otherwise lead to inadvertently holding on to a lot of memory.

Fixes #57059.

Release note: None

57153: optbuilder: fix ambiguous column references for FK cascades r=RaduBerinde a=mgartner

This commit fixes an issue in optbuilder that caused "ambiguous column reference" errors. This error would be produced during cascading updates if a child table's reference column name was equal to the parent column name concatenated with `_new`, and the child table had a check constraint, computed column, or partial index predicate that referenced the column.

For example, the following `UPDATE` statement would produce an error. The expected behavior is a successful `UPDATE`. Notice that `p_new` of the child table references `p` of the parent table.

    CREATE TABLE parent (p INT PRIMARY KEY)
    CREATE TABLE child (
      c INT PRIMARY KEY,
      p_new INT REFERENCES parent(p) ON UPDATE CASCADE,
      CHECK (p_new > 0)
    )
    UPDATE parent SET p = p * 10 WHERE p > 1

This issue was the result of incorrect scoping while building foreign key cascading update expressions. A column with the same name and column ID was added to the update expression's input scope. Because the `mutationBuilder.disambiguateColumns` function is unable to disambiguate columns with the same name and column ID, building any expression that referenced the duplicated column would result in an error.

This commit fixes the issue by no longer duplicating columns in the update expression's input scope. `mutationBuilder.addUpdateCols` now detects the special case when the update expression is a `*scopeColumn` and avoids duplicating it in the generated projection scope.

Fixes #57148

Release note (bug fix): A bug has been fixed that caused an "ambiguous column reference" error during foreign key cascading updates. This error was incorrectly produced when the child table's reference column name was equal to the concatenation of the parent's reference column name and "_new", and when the child table had a CHECK constraint, computed column, or partial index predicate expression that referenced the column. This bug was introduced in version 20.2.

57242: kvserver: avoid serving an unsafe string to log.Fatalf r=irfansharif a=knz

A linter change in #57134 made me discover this bug.

Release note: None

57246: rowenc: de-flake TestEncodeContainingArrayInvertedIndexSpans r=rytaft a=mgartner

`TestEncodeContainingArrayInvertedIndexSpans` was failing sporadically because of randomized test cases that were incorrectly determining the expected value for the `unique` return value from `EncodeContainingInvertedIndexSpans`. The test was using the `reflect.DeepEqual` function to check for `Datum` equality, which does not return true in all cases where `Datum.Compare` returns `0`.

Fixes #57237

Release note: None

Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
(This PR chips away at #51987 to make it smaller. Still, it's a large-ish change. To ease the review, I added an "incremental changes" summary below which the reviewer can use as a reading guide.)

tldr: this patch refactors the logging code to use channels instead of secondary loggers, and applies a new YAML-based configuration system.

**Background**

Previously, log channels were defined via secondary loggers: there were separate secondary loggers for audit logs, for Pebble, etc. This infrastructure was unsatisfactory:

- Certain users have expressed the desire to refine logging streams, by splitting e.g. configuration changes and monitoring data into their dedicated logging outputs. As we foresee that the number of distinct streams may increase over time, the requirement to define a secondary logger object for each stream becomes cumbersome: we then need to hold on to each logger's state (multiple `*SecondaryLogger` pointers stored at every component that logs multiple streams) and for each make separate choices wrt logging directory, etc.

- The secondary logger abstraction excessively conflates logical event channels, which correspond to different “categories” of events, with the separation of logging outputs. In practice, we find that some users want multiple event categories to go to e.g. the same file. Later, as we introduce more varied sinks, we want to offer the option to redirect multiple event categories to the same sink.

- It was possible to define multiple loggers using the same filenames and the same directory destination, but with competing GC and rotation parameters.

**Channels**

To address these shortcomings, this patch de-publicizes the secondary logger abstraction and supersedes it with the notion of logging *channel*, which is a logical/virtual log stream. Each event in a given logger (logging output) belongs to exactly one channel. There can be events from multiple channels emitted to one logger.

In accordance with expressed user requirements, the following streams are defined here:

- DEV (current “default” log): development logging.
- STORAGE (current RocksDB/Pebble log): low-level storage events. This will include events currently sent to the "pebble" and "rocksdb" secondary loggers.
- SESSIONS: client connections and session/query cancellation. This will include events currently sent to the "auth" secondary logger.
- SENSITIVE_ACCESS: operations that access sensitive data (when enabled). This will include events currently sent to the "sql audit" secondary logger.
- SQL_EXEC: SQL execution logging (when enabled). This will include events currently sent to the "sql exec" secondary logger.
- SQL_PERF: SQL performance events (when enabled). This will include events currently sent to the "sql slow" secondary logger ("Slow query log").
- SQL_INTERNAL_PERF: ditto SQL_PERF, for internal queries.

**File output**

The file logging format now embeds the channel ID numerically as a prefix to the filename. The location of the prefix on the log line is chosen to be backward-compatible with log parsers built for previous versions of CockroachDB, which would be unable to handle additional fields on the line. For example:

```
E200727 11:57:22.907515 1392 1@sql/conn_executor.go:492 [n1] panic: woo
                             ^^ channel ID 1 = OPS
```

This prefix is omitted if the ID is 0 (DEV). Conversely, when parsing an existing log file, a missing ID is considered to be an event on the DEV channel. This makes the parser in the new version able to parse entries emitted by previous versions.

**Configuration format**

The configuration can now be specified via YAML, using a new command line flag `--log`. The format works as follows:

```
file-defaults: #optional
  dir: <path>            # output directory, defaults to first store dir
  filter: <severity>     # min severity level for file output, default INFO
  redact: <bool>         # whether to remove sensitive info, default false
  redactable: <bool>     # whether to strip redaction markers, default false
  max-file-size: <sz>    # max log file size, default 10MB
  max-group-size: <sz>   # max log file group size, default 100MB
sinks: #optional
  stderr: #optional
    filter: <severity>   # min severity level for stderr output, default NONE
    channels: <chans>    # channel selection for stderr output, default ALL
    redact: <bool>       # whether to remove sensitive info, default false
    redactable: <bool>   # whether to strip redaction markers, default false
    nocolor: <bool>      # disable VT color codes, default is to enable color
  file-groups: #optional
    <filename>:
      channels: <chans>    # channel selection for this file output, mandatory
      filter: <severity>   # defaults to file-defaults.filter
      dir: <path>          # defaults to file-defaults.dir
      redact: <bool>       # defaults to file-defaults.redact
      redactable: <bool>   # defaults to file-defaults.redactable
      max-file-size: <sz>  # defaults to file-defaults.max-file-size
      max-group-size: <sz> # defaults to file-defaults.max-group-size
    ... repeat ...
capture-stray-errors: #optional
  enable: <bool>         # whether to enable internal fd2 capture
  dir: <path>            # output directory, defaults to file-defaults.dir
```

The channel selection (`channels` in each sink) can be:

- `ALL` for all channels;
- a comma-delimited list of specific channels;
- `ALL EXCEPT <list>` for all channels except the provided channel names.

A CLI utility to validate the configuration is also provided: `cockroach debug check-log-config`.

**Incremental changes**

The change was carried out as follows (this may guide the review):

1. `logpb/log.proto`: introduce a new protobuf enum Channels
2. `log/gen.sh`: auto-generate channel aliases in new sub-package `channel`
3. `log/gen.sh`: auto-generate per-channel APIs in `log_channels.go`
4. `log`: rework `clog.go` and other files to pass the channel as argument
5. `log/logconfig`: new package for log config data structures + parser + tests
6. `log/flags.go`: apply the configuration
7. `cli/{context,flags}.go`: read log config via `--log`, mark other flags deprecated
8. `cli/logconfig.go`: new command `cockroach debug check-log-config` for demos & checks

Release note (cli change): The logging configuration can now be specified via the `--log` parameter. See the documentation for details. The flags `--log-dir`, `--log-file-max-size`, `--log-file-verbosity`, `--log-group-max-size` are now deprecated.

Release note (cli change): A new command `cockroach debug check-log-config` prints out the logging configuration that results from the provided combination of `--store`, `--log` and other logging-related flags on the command line.

Release note (doc change): A report of the possible logging severities and channels is now automatically generated in `docs/generated/logging.md`.
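To make the format concrete, here is a small hypothetical configuration following the template above; the directory, file-group name, size, and channel choices are illustrative, not taken from the PR:

```
file-defaults:
  dir: /var/log/cockroach
  filter: INFO
sinks:
  stderr:
    filter: NONE            # nothing on stderr
  file-groups:
    sql-audit:
      channels: SENSITIVE_ACCESS, SQL_EXEC
      max-file-size: 50MB
capture-stray-errors:
  enable: true
```

This sketch gives audit and SQL execution events their own `sql-audit` file group with a larger rotation size, disables stderr output, and keeps internal stray-error capture enabled.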
Force-pushed from 5662523 to 8ba3806
@itsbilal I changed the test to datadriven. Can you have another look?
Really like how the tests turned out! Thanks for all the text processing stuff there.
Reviewed 18 of 18 files at r7, 9 of 9 files at r8.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @itsbilal)
thanks :) bors r=itsbilal
Build succeeded.
As of cockroachdb#57134, passing `-logtostderr=false` as a `TESTFLAG` in benchmarks errs: `flag provided but not defined: -logtostderr`. The preferred method for suppressing logs in tests and benchmarks is to add `defer log.Scope(t).Close(t)` to the top of the test/benchmark (see cockroachdb#57979). This commit uses this new method to suppress logs in optimizer benchmarks. Release note: None
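For context, the suppression pattern that commit message describes looks like this in a benchmark (a sketch assuming CockroachDB's `util/log` test log scope; the package and benchmark names are illustrative):

```go
package opt_test

import (
	"testing"

	"github.com/cockroachdb/cockroach/pkg/util/log"
)

// BenchmarkExample shows the pattern: the test log scope directs log
// output to files scoped to this benchmark rather than stderr, replacing
// the removed -logtostderr flag.
func BenchmarkExample(b *testing.B) {
	defer log.Scope(b).Close(b)
	for i := 0; i < b.N; i++ {
		// ... benchmark body ...
	}
}
```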
58897: tracing: enable always-on tracing by default r=irfansharif a=irfansharif

This is follow-up work from #58712, where we measured the overhead for always-on tracing and found it to be minimal/acceptable. Let's switch this on by default to shake out the implications of doing so. We can reasonably expect two kinds of fallout:

1. Unexpected blow-up in memory usage due to resource leakage (which can be a problem now that we're always maintaining open spans in an internal registry, see #58721)
2. Performance degradation due to tracing overhead per-request (something #58712 was spot-checking for).

For 1 we'll introduce a future test in a separate PR. For 2, we'll monitor roachperf over the next few weeks.

---

Also moved some of the documentation for the cluster setting into a comment form above. Looking at what's rendered in our other cluster settings (`SHOW ALL CLUSTER SETTINGS`), we default to a very pithy, unwrapped description.

Release note: None

58974: opt: suppress logs in benchmarks r=rytaft a=mgartner

As of #57134, passing `-logtostderr=false` as a `TESTFLAG` in benchmarks errs: `flag provided but not defined: -logtostderr`. The preferred method for suppressing logs in tests and benchmarks is to add `defer log.Scope(t).Close(t)` to the top of the test/benchmark (see #57979). This commit uses this new method to suppress logs in optimizer benchmarks.

Release note: None

59009: kv/batcheval: only expose immutable range state to commands r=nvanbenschoten a=nvanbenschoten

The DeclareKeysFunc has always included a full RangeDescriptor, but it has never been clear which fields in this descriptor are safe to use and which are not when declaring keys for a request. The concern is that any property of the RangeDescriptor that is not immutable may change between the time that a request declares its keys to latch and the time that it evaluates, so any assumptions based on these mutable fields may not hold.

The quintessential example of a property of a Range that is not immutable is its end key. It would be incorrect to declare keys between a Range's start key and its current end key as a means of latching the entire range, because a merge of a right-hand neighbor could complete in between the time that a request declares its keys and the time that it evaluates. This could lead to a violation of the mutual exclusion that the command was expecting to have.

This commit makes these kinds of mistakes impossible by putting the RangeDescriptor behind an interface that only exposes the properties of a Range that cannot change across a Range's lifetime.

59099: kvserver: fix rangelog event logging for non-voter additions r=aayushshah15 a=aayushshah15

Before this patch, we were incorrectly logging non-voter additions as removals.

Release note: None

59142: sql: fix indentation in information_schema.columns schema r=otan a=arulajmani

Reviewable lied to me.

Release note: None

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Aayush Shah <aayush.shah15@gmail.com>
Co-authored-by: arulajmani <arulajmani@gmail.com>