[v15] Add a guide to metrics for monitoring Teleport #47412

Merged: 2 commits into branch/v15 on Oct 10, 2024

@ptgott commented Oct 9, 2024

Backport #46645 to branch/v15

Closes #40664

This change turns the Metrics guide in `admin-guides` into a conceptual
guide to the most important metrics for monitoring a Teleport cluster.

Because Agent metrics are not equally comprehensive across Teleport
services, and to limit the scope of this change, this guide focuses on
self-hosted clusters.

To make this a conceptual guide instead of a reference, this change
removes the reference table from the `admin-guides` metrics page. There
is already a table in the dedicated metrics reference guide.

Note that, while the new metrics guide is specific to self-hosted
clusters, this change does not move the guide to the subsection of Admin
Guides related to self-hosting Teleport. Doing this would mean having
one subsection of Admin Guides for diagnostics-related guides and one
subsection for self-hosted-specific diagnostics, which is potentially
confusing. We may also want to add Agent-specific metrics eventually.

Finally, this change does not include alert thresholds for the metrics
it describes. We can define these in a subsequent change.

- Describe `backend_write_requests_failed_precondition_total`
- Include the precondition metric in the write availability formula.
- Turn the `registered_servers` discussion into a discussion of Teleport
  instance version, since it's not possible to group this metric by
  service and subtract the count of Auth Service/Proxy Service instances
  from the count of all registered services.
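
The write availability item above can be illustrated with a minimal sketch. It assumes that precondition failures are already counted in the overall failure total and should be subtracted from it, since they typically reflect expected write conflicts rather than backend unavailability. The PR description does not spell out the formula, so the function name and the three inputs (total writes, failed writes, and the value of `backend_write_requests_failed_precondition_total`) are hypothetical stand-ins for counter deltas over some time window.

```python
def write_availability(total: float, failed: float, failed_precondition: float) -> float:
    """Return the fraction of backend writes that did not fail unexpectedly.

    Assumption (not confirmed by the PR text): precondition failures are a
    subset of `failed` and are excluded from the error count.
    """
    if total <= 0:
        # No writes observed in the window; report full availability.
        return 1.0
    # Clamp at zero in case the counters were sampled at slightly different times.
    unexpected_failures = max(failed - failed_precondition, 0.0)
    return 1.0 - unexpected_failures / total


if __name__ == "__main__":
    # Hypothetical window: 1000 writes, 50 failures, 40 of them precondition failures.
    print(f"{write_availability(1000, 50, 40):.3f}")
```

Under these assumptions, a burst of precondition failures alone would not lower the reported availability, which may or may not match the formula in the published guide.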
@ptgott added the no-changelog label Oct 9, 2024

github-actions bot commented Oct 9, 2024

🤖 Vercel preview here: https://docs-koqmsap0k-goteleport.vercel.app/docs/ver/preview

@ptgott added this pull request to the merge queue Oct 10, 2024
Merged via the queue into branch/v15 with commit 5063cbd Oct 10, 2024
36 checks passed
@ptgott deleted the bot/backport-46645-branch/v15 branch October 10, 2024 14:55
Labels: backport, documentation, no-changelog, size/md