This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

add doc
xudifsd committed Apr 28, 2019
1 parent 813547f commit 1364259
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/alerting/watchdog-metrics.md
@@ -57,6 +57,14 @@ vi watchdog-xx.log
| ---------- | ----------- |
| k8s_api_server_count | Uses the label `error` to represent status; if `error` != "ok", the k8s API server is not functioning correctly |

## K8s Resource Metrics
| Metric name| Description |
| ---------- | ----------- |
| k8s_node_gpu_total | Total GPU count |
| k8s_node_gpu_available | Total GPU count minus used GPU count |
| k8s_node_gpu_reserved | If the node is marked as unschedulable via `kubectl cordon $node`, all of its unused GPUs are counted as reserved |
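
The relationship between the three GPU metrics can be sketched as below. This is only an illustration of the descriptions in the table, not the watchdog's actual code: the `NodeGpu` type, its fields, and the `gpu_metrics` function are hypothetical names, and reporting `0` reserved GPUs for a schedulable node is an assumption the table does not state.

```python
from dataclasses import dataclass


@dataclass
class NodeGpu:
    """Hypothetical per-node GPU summary (not part of the watchdog code)."""
    total: int           # GPUs present on the node
    used: int            # GPUs currently in use
    unschedulable: bool  # True after `kubectl cordon $node`


def gpu_metrics(node: NodeGpu) -> dict:
    """Derive the three metric values from one node summary."""
    unused = node.total - node.used
    return {
        "k8s_node_gpu_total": node.total,
        # Total GPU count minus used GPU count, as described in the table.
        "k8s_node_gpu_available": unused,
        # Unused GPUs count as reserved only when the node is cordoned.
        # Assumption: a schedulable node reports 0 reserved GPUs.
        "k8s_node_gpu_reserved": unused if node.unschedulable else 0,
    }


cordoned = gpu_metrics(NodeGpu(total=4, used=1, unschedulable=True))
healthy = gpu_metrics(NodeGpu(total=4, used=1, unschedulable=False))
```

Under these assumptions, a cordoned node with 4 GPUs and 1 in use reports 3 reserved GPUs, while the same node left schedulable reports 0.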


## Other Metrics
| Metric name| Description |
| ---------- | ----------- |