server: report runtime stats as a structured event #63750

Closed
knz opened this issue Apr 15, 2021 · 0 comments · Fixed by #65024
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

knz commented Apr 15, 2021

Every 10 seconds, the server reports health metrics in the HEALTH logging channel, for example:

I210323 20:53:53.714286 232 2@server/status/runtime.go:553 ⋮ [n1] 1  runtime stats: 106 MiB RSS, 248 goroutines (stacks: 4.3 MiB), 28 MiB/51 MiB Go alloc/total (heap fragmentation: 5.2 MiB, heap reserved: 2.5 MiB, heap released: 24 MiB), 18 MiB/30 MiB CGO alloc/total (0.0 CGO/sec), 0.0/0.0 %%(u/s)time, 0.0 %%gc (0x), 46 KiB/107 KiB (r/w)net

(this logging occurs in server/status/runtime.go)

Our users have expressed interest in making this information more reliably parsable. We should use a structured event for this purpose.
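To illustrate why the flat format is hard to consume, here is a minimal sketch of what a log consumer has to do today to pull a single value out of the line. The helper name `parseGoroutines` and the regular expression are hypothetical, written for this example only; any rewording of the message would silently break them.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// goroutinesRE matches the "<N> goroutines" fragment of the flat-text line.
// It depends on the exact wording of the message (an assumption that the
// text format never changes, which is the problem described above).
var goroutinesRE = regexp.MustCompile(`(\d+) goroutines`)

// parseGoroutines extracts the goroutine count from a flat-text
// "runtime stats" line, or returns an error if the fragment is missing.
func parseGoroutines(line string) (int, error) {
	m := goroutinesRE.FindStringSubmatch(line)
	if m == nil {
		return 0, fmt.Errorf("no goroutine count found in %q", line)
	}
	return strconv.Atoi(m[1])
}

func main() {
	// Excerpt of the sample line shown above.
	line := "runtime stats: 106 MiB RSS, 248 goroutines (stacks: 4.3 MiB)"
	n, err := parseGoroutines(line)
	if err != nil {
		panic(err)
	}
	fmt.Println("goroutines:", n)
}
```

Every additional field (RSS, heap sizes, CGO totals) would need its own pattern plus unit handling for the "MiB"/"KiB" suffixes, which is the maintenance burden a structured event avoids.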

Doing this requires approximately the following steps:

  1. define a new category for HEALTH events in pkg/util/log/eventpb, possibly in cluster_events.go
  2. define a new event type for the runtime stats. Use descriptive names for the fields and document them
  3. use log.Structured instead of log.Health.Infof in runtime.go.
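
The steps above could look roughly like the following sketch. It is not the eventpb definition: real events in pkg/util/log/eventpb are protobuf messages emitted via log.Structured, which is not reproduced here. The type `RuntimeStatsEvent`, its field names, and the helpers `encodeEvent` and `decodeEvent` are all hypothetical and use plain JSON from the standard library, only to show the shape of a documented, machine-readable payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RuntimeStatsEvent is a hypothetical payload for illustration only.
// The field names are assumptions, not the ones defined in eventpb.
type RuntimeStatsEvent struct {
	// MemRSSBytes is the process resident set size, in bytes.
	MemRSSBytes uint64 `json:"MemRSSBytes"`
	// GoroutineCount is the number of goroutines.
	GoroutineCount uint64 `json:"GoroutineCount"`
	// GoAllocBytes is the "Go alloc" figure from the text line, in bytes.
	GoAllocBytes uint64 `json:"GoAllocBytes"`
	// GoTotalBytes is the "Go total" figure from the text line, in bytes.
	GoTotalBytes uint64 `json:"GoTotalBytes"`
}

// encodeEvent renders the event as a single JSON object.
func encodeEvent(ev RuntimeStatsEvent) (string, error) {
	b, err := json.Marshal(ev)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeEvent shows the consumer side: fields are read by name,
// with no regular expressions or unit parsing.
func decodeEvent(s string) (RuntimeStatsEvent, error) {
	var ev RuntimeStatsEvent
	err := json.Unmarshal([]byte(s), &ev)
	return ev, err
}

func main() {
	ev := RuntimeStatsEvent{
		MemRSSBytes:    106 << 20, // 106 MiB, as in the sample line
		GoroutineCount: 248,
		GoAllocBytes:   28 << 20,
		GoTotalBytes:   51 << 20,
	}
	s, err := encodeEvent(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Reporting sizes as raw byte counts rather than pre-formatted "MiB" strings is a design choice in this sketch: it keeps the payload unit-free and leaves humanizing to whatever displays the event.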

knz added the C-enhancement label Apr 15, 2021
cameronnunez linked a pull request May 11, 2021 that will close this issue
craig bot pushed a commit that referenced this issue May 14, 2021
65024: server: log runtime stats as a structured event r=knz,rauchenstein a=cameronnunez

Fixes [#63750](#63750).

Previously, runtime stats were logged in a flat text format,
which is neither stable nor easy to parse.
They are now reported using structured logging.

Release note (cli change): server health metrics are now a structured event sent to
the HEALTH logging channel. Refer to the reference docs for details about the event
payload.

Co-authored-by: Cameron Nunez <cameron@cockroachlabs.com>
craig bot closed this as completed in #65024 May 14, 2021