Merge pull request #1661 from DSD-DBS/kube-alerts
feat: Add alerts for system administrators in Grafana
Showing 28 changed files with 311 additions and 56 deletions.
@@ -0,0 +1,30 @@
<!--
 ~ SPDX-FileCopyrightText: Copyright DB InfraGO AG and contributors
 ~ SPDX-License-Identifier: Apache-2.0
 -->

# Alerts in unexpected situations

If something doesn't work as expected, it's important that the system
administrators receive a notification.

We use the Grafana Alertmanager to send alerts for some pre-defined error
cases. If you're missing an alert rule, let us know via
[GitHub issues](https://github.com/DSD-DBS/capella-collab-manager/issues) or
open a PR and add it to the list of pre-defined rules.
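
As a rough illustration of what an additional rule might look like, the sketch
below follows the structure of the rules file added in this commit. The `uid`,
title, threshold, and summary are hypothetical. It also assumes that the
kube-state-metrics metric `kube_pod_container_status_restarts_total` is scraped
by the `prometheus_ccm` datasource, which this commit does not confirm.

```yaml
# Hypothetical entry for the `rules` list of an existing alert group.
- uid: example-container-restarts # placeholder, must be unique per rule
  title: Container restarting frequently
  condition: A
  data:
    - refId: A
      relativeTimeRange:
        from: 3600
        to: 0
      datasourceUid: prometheus_ccm
      model:
        datasource:
          type: prometheus
          uid: prometheus_ccm
        editorMode: code
        # Fires if a container restarted more than 3 times in the last hour.
        expr: increase(kube_pod_container_status_restarts_total[1h]) > 3
        instant: true
        intervalMs: 1000
        maxDataPoints: 43200
        range: false
        refId: A
  noDataState: OK
  execErrState: Error
  for: 5m
  annotations:
    summary: A container has restarted more than 3 times within one hour.
  isPaused: false
```

The threshold and the `for` duration are arbitrary starting points and would
need tuning for a real deployment.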

## Configure alerting

By default, firing alerts can only be viewed in the Grafana UI. You can
configure additional contact points depending on your needs.

The available contact points are listed in the
[official Grafana documentation](https://grafana.com/docs/grafana/latest/alerting/configure-notifications/manage-contact-points/).
The list includes chat services such as Microsoft Teams, as well as email and
webhook notifications.
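
Contact points can be created in the Grafana UI. Grafana also supports
provisioning them from a file, which might look roughly like the sketch below.
The contact point name, receiver `uid`, and address are placeholders, and
whether this deployment mounts such a provisioning file is an assumption.

```yaml
# Sketch of a Grafana contact point provisioning file (placeholder values).
apiVersion: 1
contactPoints:
  - orgId: 1
    name: admin-email # hypothetical contact point name
    receivers:
      - uid: admin-email-receiver # hypothetical, must be unique
        type: email
        settings:
          addresses: admins@example.com # placeholder recipient
```

A contact point only receives alerts once a notification policy routes to it,
so the default policy or a dedicated one has to reference it as well.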

!!! info "Configure SMTP server for email alerting"

    For email alerting, you need to configure an SMTP server in the
    `values.yaml` of the Helm chart. Have a look at the `alerting.email`
    configuration.
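
The keys below `alerting.email` are not shown in this commit, so they are not
reproduced here. For orientation only, the following sketch shows Grafana's own
SMTP settings expressed as container environment variables. The `GF_SMTP_*`
variables are Grafana's standard overrides for the `[smtp]` section of its
configuration. How the chart maps `alerting.email` onto them is an assumption,
and the host, user, and secret names are placeholders.

```yaml
# Hypothetical environment for a Grafana container (placeholder values).
env:
  - name: GF_SMTP_ENABLED
    value: 'true'
  - name: GF_SMTP_HOST
    value: smtp.example.com:587 # placeholder host and port
  - name: GF_SMTP_USER
    value: grafana-alerts # placeholder user
  - name: GF_SMTP_PASSWORD
    valueFrom:
      secretKeyRef:
        name: grafana-smtp # hypothetical Kubernetes secret
        key: password
  - name: GF_SMTP_FROM_ADDRESS
    value: grafana@example.com # placeholder sender address
```

Check the chart's `values.yaml` for the actual option names before relying on
any of these values.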
File renamed without changes.
@@ -0,0 +1,20 @@
<!--
 ~ SPDX-FileCopyrightText: Copyright DB InfraGO AG and contributors
 ~ SPDX-License-Identifier: Apache-2.0
 -->

# Grafana Dashboards

We provide a few pre-configured Grafana dashboards to monitor the sessions and
TeamForCapella licenses.

The Grafana dashboards are available to administrators and can be accessed via
the "Grafana" link in the main menu. Select "Dashboards" to see a list of
available dashboards:

![Dashboard in the main Grafana menu](./dashboards.png){:style="width:300px"}

You can add more dashboards depending on your needs. If you think a dashboard
could be helpful for others, please add it to the
[list of pre-defined dashboards](https://github.com/DSD-DBS/capella-collab-manager/tree/main/helm/config/grafana)
and [open a PR](https://github.com/DSD-DBS/capella-collab-manager/pulls).
File renamed without changes.
@@ -0,0 +1,14 @@
<!--
 ~ SPDX-FileCopyrightText: Copyright DB InfraGO AG and contributors
 ~ SPDX-License-Identifier: Apache-2.0
 -->

# Pipeline and Model Modifier Monitoring

Metrics connected to projects and registered models are available in a custom
dashboard in the frontend.

In the dashboard, you can get a general overview of the status of pipelines and
model modifiers for registered models.

You can find it by navigating to `Menu` > `Settings` > `Monitoring`.
This file was deleted.
@@ -7,3 +7,5 @@
.venv
initdb.sql
p_options.yaml
config/certs/*
!config/certs/.gitkeep
Empty file.
@@ -0,0 +1,76 @@
# SPDX-FileCopyrightText: Copyright DB InfraGO AG and contributors
# SPDX-License-Identifier: Apache-2.0

apiVersion: 1
groups:
  - orgId: 1
    name: Deployment
    folder: Alerts
    interval: 1m
    rules:
      - uid: a32cdf13-990a-438e-a451-11f9185e97b2
        title: Session container unhealthy
        condition: A
        data:
          - refId: A
            relativeTimeRange:
              from: 3600
              to: 0
            datasourceUid: prometheus_ccm
            model:
              datasource:
                type: prometheus
                uid: prometheus_ccm
              editorMode: code
              expr:
                sum by(namespace, pod, phase,
                annotation_capellacollab_session_id)
                (kube_pod_status_phase{phase=~"Pending|Unknown|Failed"} * on
                (uid) group_left kube_pod_labels{label_workload="session"} * on
                (uid) group_left (annotation_capellacollab_session_id)
                kube_pod_annotations) > 0
              instant: true
              intervalMs: 1000
              legendFormat: '{{deployment}}'
              maxDataPoints: 43200
              range: false
              refId: A
        noDataState: OK
        execErrState: Error
        for: 10m
        annotations:
          description: A session container is in an unexpected state.
          runbook_url: ''
          summary:
            Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is not
            ready for over 10 minutes.
        labels:
          '': ''
        isPaused: false
      - uid: d7609318-1289-443a-908c-bada900079cc
        title: Job has failed
        condition: A
        data:
          - refId: A
            relativeTimeRange:
              from: 86400
              to: 0
            datasourceUid: prometheus_ccm
            model:
              datasource:
                type: prometheus
                uid: prometheus_ccm
              editorMode: builder
              expr: kube_job_status_failed > 0
              hide: false
              instant: true
              intervalMs: 1000
              maxDataPoints: 43200
              range: false
              refId: A
        noDataState: OK
        execErrState: Error
        for: 5m
        annotations:
          summary: A job has failed
        isPaused: false
@@ -0,0 +1,2 @@
SPDX-FileCopyrightText: Copyright DB InfraGO AG and contributors
SPDX-License-Identifier: Apache-2.0