-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util/log: new logging channels SQL_SCHEMA, USER_ADMIN and PRIVILEGES #51987
Conversation
c89d421
to
3d43e41
Compare
3d43e41
to
a557353
Compare
fixed the broken log tests. The issue was a missing clearing of the Next step is to fix the TCL logging tests since the file names are changing. Additioanlly the logging directory for the SQL audit log needs to be configurable. |
a557353
to
5861464
Compare
c8436a0
to
9d384e7
Compare
56336: util/log: various simplifications towards logging channels r=tbg,itsbilal a=knz These are general enhancements that I extracted from my work on #51987 into piecemeal commits. I think they are worth merging regardless of whether we adopt the channel proposal. Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
48a2b3c
to
0237cbe
Compare
0237cbe
to
23d9025
Compare
0fb2349
to
3b1fcad
Compare
2c39ba0
to
341f99c
Compare
341f99c
to
2a93c1e
Compare
I have extracted the individual bits from the original change in this PR to the following new PRs:
As a result the original PR has been greatly slimmed down, so much it is now just a draft for a follow-up project to bring |
2a93c1e
to
37098a3
Compare
57134: util/log,cli: channel abstraction; new configuration system r=itsbilal a=knz (This PR chips away at #51987 to make it smaller. Still, it's a large-ish change. To ease the review, I added an "incremental changes" summary below which the reviewer can use as reading guide.) tldr: this patch refactors the logging code to use channels instead of secondary loggers, and applies a new YAML-based configuration system. **Background** Previously, log channels were defined via secondary loggers: there were separate secondary loggers for audit logs, for Pebble, etc. This infrastructure was unsatisfactory: - certain users have expressed the desire to refine logging streams, by splitting e.g. configuration changes and monitoring data into their dedicated logging outputs. As we foresee that the number of distinct streams may increase over time, the requirement to define a secondary logger object for each stream becomes cumbersome: we then need to hold into each logger's state (multiple `*SecondaryLogger` pointers stored at every component that logs multiple streams) and for each make separate choices wrt logging directory, etc. - the secondary logger abstraction excessively conflates logical event channels, which correspond to different “categories” of events, together with the separation of logging outputs. In practice, we find that some users want multiple event categories go to e.g. the same file. Later, as we introduce more various sinks, we want to offer the option to redirect multiple event categories to the same sink. - it was possible to define multiple loggers using the same filenames and the same directory destination, but with competing GC and rotation parameters. **Channels** To address these shortcomings, this patch de-publicizes the secondary logger abstraction and supersedes it with the notion of logging *channel*, which is a logical/virtual log stream. Each event in a given logger (logging output) belongs to exactly one channel. There can be events from multiple channels emitted to one logger. In accordance with expressed user requirements, the following streams are defined here: - DEV (current “default” log): development logging. - STORAGE (current RocksDB/pebble log): low-level storage events. This will include events currently sent to the current "pebble" and "rocksdb" secondary logger. - SESSIONS: client connections and session/query cancellation. This will include events currently sent to the current "auth" secondary logger. - SENSITIVE_ACCESS: operations that access sensitive data (when enabled). This will include events currently sent to the current "sql audit" secondary logger. - SQL_EXEC: SQL execution logging (when enabled). This will include events currently sent to the current "sql exec" secondary logger. - SQL_PERF: SQL performance events (when enabled) This will include events currently sent to the current "sql slow" secondary logger ("Slow query log"). - SQL_INTERNAL_PERF: ditto SQL_PERF, for internal queries. **File output** The file logging format now embeds the channel ID numerically as a prefix to the filename. The location of the prefix on the log line is chosen to be backward-compatible with log parsers built for previous versions of CockroachDB, which would be unable to handle additional fields on the line. For example: ``` E200727 11:57:22.907515 1392 1@sql/conn_executor.go:492 [n1] panic: woo ^^ channel ID 1 = OPS ``` This prefix is omitted if the ID is 0 (DEV). Conversely, when parsing an existing log file, a missing ID is considered to be an event on the DEV channel. This makes the parser in the new version able to parse entries emitted by previous versions. **Configuration format** The configuration can now be specified via YAML, using a new command line flag `--log`. The format works as follows: ``` file-defaults: #optional dir: <path> # output directory, defaults to first store dir filter: <severity> # min severity level for file output, default INFO redact: <bool> # whether to remove sensitive info, default false redactable: <bool> # whether to strip redaction markers, default false max-file-size: <sz> # max log file size, default 10MB max-group-size: <sz> # max log file group size, default 100MB sinks: #optional stderr: #optional filter: <severity> # min severity level for stderr output, default NONE channels: <chans> # channel selection for stderr output, default ALL redact: <bool> # whether to remove sensitive info, default false redactable: <bool> # whether to strip redaction markers, default false nocolor: <bool> # disable VT color codes, default is to enable color file-groups: #optional <filename>: channels: <chans> # channel selection for this file output, mandatory filter: <severity> # defaults to file-defaults.filter dir: <path> # defaults to file-defaults.dir redact: <bool> # defaults to file-defaults.redact redactable: <bool> # defaults to file-defaults.redactable max-file-size: <sz> # defaults to file-defaults.max-file-size max-group-size: <sz> # defaults to file-defaults.max-group-size ... repeat ... capture-stray-errors: #optional enable: <bool> # whether to enable internal fd2 capture dir: <path> # output directory, defaults to file-defaults.dir ``` The channel selection (`channels` in each sink) can be: - `ALL` for all channels; - a comma-delimited list of specific channels; - `ALL EXCEPT <list>` for all channels except the provided channel names. A CLI utility to validate the configuration is also provided: `cockroach debug check-log-config`. **Incremental changes** The change was carried out as follows (this may guide the review): 1. `logpb/log.proto`: introduce a new protobuf enum Channels 2. `log/gen.sh`: auto-generate channel aliases in new sub-package `channel` 3. `log/gen.sh`: auto-generate per-channel APIs in `log_channels.go` 4. `log`: rework `clog.go` and other files to pass the channel as argument 5. `log/logconfig`: new package for log config data structures + parser + tests 6. `log/flags.go`: apply the configuration 7. `cli/{context,flags}.go`: read log config via `--log`, mark other flags deprecated 8. `cli/logconfig.go`: new command `cockroach check-log-config` for demos & checks Release note (cli change): The logging configuration can now be specified via the `--log` parameter. See the documentation for details. The flags `--log-dir`, `--log-file-max-size`, `--log-file-verbosity`, `--log-group-max-size` are now deprecated. Release note (cli change): A new command `cockroch debug check-log-config` prints out the logging configuration that results from the provided combination of `--store`, `--log` and other logging-related flags on the command line. Release note (doc change): A report of the possible logging severities and channels is now automatically generated in `docs/generated/logging.md`. Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
37098a3
to
e9c59a1
Compare
e9c59a1
to
7de6ba9
Compare
Rebased this on top of @itsbilal RFAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 16 of 16 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @knz)
pkg/util/log/logpb/log.proto, line 98 at r1 (raw file):
// This is typically configured in "audit" mode, with event // numbering and synchronous writes. SENSITIVE_ACCESS = 7;
The renumbering is unfortunate. I'm assuming this is a non-issue as log messages aren't meant to transcend nodes.
I can revert that. IT's not mandatory. My reasoning is that we have no released versions with these channel numbers embedded in log files yet, so it's ok to do this renaming now (it won't be after the next stable release), for the sake of readability. WDYT? |
Sure, that sounds good to me. We haven't "released" this so a renumbering is fine. |
Release note (cli change): Notable events that pertain to SQL schema, user and privilege changes are now sent on the new SQL_SCHEMA, USER_ADMIN and PRIVILEGES channels. These can now be redirected to different sinks from the other log entries. **SQL_SCHEMA** The SQL_SCHEMA channel is used to report changes to the SQL logical schema, excluding privilege and ownership changes (which are reported on the separate channel PRIVILEGES) and zone config changes (which go to OPS). This includes: - database/schema/table/sequence/view/type creation - adding/removing/changing table columns - changing sequence parameters etc., more generally changes to the schema that affect the functional behavior of client apps using stored objects. **USER_ADMIN** The USER_ADMIN channel is the channel used to report changes in users and roles, including: - users added/dropped. - changes to authentication credentials, incl passwords, validity etc. - role grants/revocations. - role option grants/revocations. This is typically configured in "audit" mode, with event numbering and synchronous writes. **PRIVILEGES** The PRIVILEGES channel is the channel used to report data authorization changes, including: - privilege grants/revocations on database, objects etc. - object ownership changes. This is typically configured in "audit" mode, with event numbering and synchronous writes.
7de6ba9
to
b046f5b
Compare
thanks! bors r=itsbilal |
Build succeeded: |
Release note (cli change): Notable events that pertain to SQL schema,
user and privilege changes are now sent on the new SQL_SCHEMA,
USER_ADMIN and PRIVILEGES channels. These can now be redirected to
different sinks from the other log entries.
SQL_SCHEMA
The SQL_SCHEMA channel is used to report changes to the
SQL logical schema, excluding privilege and ownership changes
(which are reported on the separate channel PRIVILEGES) and
zone config changes (which go to OPS).
This includes:
etc., more generally changes to the schema that affect the
functional behavior of client apps using stored objects.
USER_ADMIN
The USER_ADMIN channel is the channel used to report changes
in users and roles, including:
This is typically configured in "audit" mode, with event
numbering and synchronous writes.
PRIVILEGES
The PRIVILEGES channel is the channel used to report data
authorization changes, including:
This is typically configured in "audit" mode, with event
numbering and synchronous writes.