Commit

feat: add CPU threshold variable
Matthieu Borgognon committed Nov 17, 2023
1 parent f1bf6c3 commit 8d2de46
Showing 5 changed files with 15 additions and 4 deletions.
2 changes: 2 additions & 0 deletions .env.example
@@ -31,6 +31,8 @@ BLACKBOX_URL_TO_PROBE="http://localhost:9500/metrics, http://localhost:9560/,"
 ALERTMANAGER_REPEAT_INTERVAL=4h
 # [OPTIONAL] Set another Temperature threshold alarm [°C]
 ALERTMANAGER_TEMPERATURE_THRESHOLD=77
+# [OPTIONAL] Set another CPU threshold alarm [%]
+ALERTMANAGER_CPU_THRESHOLD=80

 # [OPTIONAL] Add supplementary services to be notified from (space separated!)
 # (see https://pingme.lmno.pk/ - except Email and Zulip NOT SUPPORTED HERE because provided by default, see above)
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ ansible/group_vars/*
 ansible/host_vars/*
 !ansible/host_vars/.gitkeep
 !.env.example
+.vscode
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -9,6 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 ## [Unreleased]
 ...

+## [5.2.0] - 2023-11-17
+### Add
+- Add possibility to adapt CPU threshold for alarms firing with ALERTMANAGER_CPU_THRESHOLD
+
 ## [5.1.4] - 2023-06-17
 ### Fix
 - Fix node_exporter goroutine error
@@ -131,7 +135,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 ### Removed
 - [WARNING] Breaking changes for script arguments

-[Unreleased]: https://github.com/matbgn/prommanager/compare/v5.1.4...HEAD
+[Unreleased]: https://github.com/matbgn/prommanager/compare/v5.2.0...HEAD
+[5.1.4]: https://github.com/matbgn/prommanager/compare/v5.1.4...v5.2.0
 [5.1.2]: https://github.com/matbgn/prommanager/compare/v5.1.2...v5.1.4
 [5.1.2]: https://github.com/matbgn/prommanager/compare/v5.1.1...v5.1.2
 [5.1.1]: https://github.com/matbgn/prommanager/compare/v5.1.0...v5.1.1
2 changes: 2 additions & 0 deletions docs/README.md
@@ -176,6 +176,8 @@ BLACKBOX_URL_TO_PROBE="http://localhost:9500/metrics, http://localhost:9560/,"
 ALERTMANAGER_REPEAT_INTERVAL=4h
 # [OPTIONAL] Set another Temperature threshold alarm [°C]
 ALERTMANAGER_TEMPERATURE_THRESHOLD=77
+# [OPTIONAL] Set another CPU threshold alarm [%]
+ALERTMANAGER_CPU_THRESHOLD=80
 ```

 ### Tweaking alert rules
7 changes: 4 additions & 3 deletions prommanager
@@ -1,5 +1,5 @@
 #!/bin/bash
-MOTD="Welcome to Prometheus manager v5.1.4 !"
+MOTD="Welcome to Prometheus manager v5.2.0 !"

 # Set default values
 SYSTEM_ARCH=amd64 # -> can be changed by script argument --arch arm64
@@ -32,6 +32,7 @@ PROMETHEUS_PORT=9590

 ALERTMANAGER_REPEAT_INTERVAL=4h
 ALERTMANAGER_TEMPERATURE_THRESHOLD=77
+ALERTMANAGER_CPU_THRESHOLD=80

 EXECUTE=false
 KILL_APPS=false
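
Review note: the new default is later pasted verbatim into a PromQL expression, so a malformed override (for example `80%` or an empty string) would produce an invalid rule. A minimal sketch of a guard is shown below. The helper names `is_valid_percent` and `cpu_threshold_or_default` are hypothetical and not part of this commit, and the accepted range of whole numbers from 0 to 100 is an assumption about what the author intends.

```shell
# Hypothetical helpers, not part of the commit.

# Succeeds only for a whole number between 0 and 100 without leading zeros.
is_valid_percent() {
    case "$1" in
        # Reject: empty, any non-digit, leading zero, more than three digits.
        ''|*[!0-9]*|0?*|????*) return 1 ;;
    esac
    [ "$1" -le 100 ]
}

# Prints the candidate when it is usable, otherwise the default of 80
# (the value set by this commit) after a warning on stderr.
cpu_threshold_or_default() {
    if is_valid_percent "$1"; then
        echo "$1"
    else
        echo "warning: invalid CPU threshold '$1', falling back to 80" >&2
        echo 80
    fi
}
```

A possible call site, again only as an illustration, would be `ALERTMANAGER_CPU_THRESHOLD="$(cpu_threshold_or_default "$ALERTMANAGER_CPU_THRESHOLD")"` placed before the alert rules are written.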
@@ -829,13 +830,13 @@ groups:
 summary: HTTP probe failed (instance {{ \$labels.instance }})
 description: "Probe failed\n VALUE = {{ \$value }}\n LABELS = {{ \$labels }}"
 - alert: HostHighCpuLoad
-expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
+expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > $ALERTMANAGER_CPU_THRESHOLD
 for: 0m
 labels:
 severity: warning
 annotations:
 summary: Host high CPU load (instance {{ \$labels.instance }})
-description: "CPU load is greater than 80%\n VALUE = {{ \$value }}\n LABELS = {{ \$labels }}"
+description: "CPU load is greater than $ALERTMANAGER_CPU_THRESHOLD%\n VALUE = {{ \$value }}\n LABELS = {{ \$labels }}"
 - alert: HostOutOfMemory
 expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 15
 for: 2m
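
Review note: the mix of escaped `\$labels` / `\$value` and the unescaped `$ALERTMANAGER_CPU_THRESHOLD` suggests the rules are emitted from an unquoted here-document, where the shell expands plain variables and leaves escaped ones for Prometheus templating. That is an inference from this hunk, not something the diff states. The sketch below reproduces the behavior in isolation; the function name `render_cpu_rule` and the YAML indentation are assumptions, and only a subset of the rule's fields is shown.

```shell
# Default taken from the commit; an existing value in the environment wins.
ALERTMANAGER_CPU_THRESHOLD="${ALERTMANAGER_CPU_THRESHOLD:-80}"

# Hypothetical function: prints one alert rule with the threshold substituted.
render_cpu_rule() {
    # Unquoted delimiter (EOF, not 'EOF'): $ALERTMANAGER_CPU_THRESHOLD is
    # expanded by the shell, \$value is emitted as a literal $value, and
    # \n stays a literal backslash-n for Prometheus to interpret.
    cat <<EOF
- alert: HostHighCpuLoad
  expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > $ALERTMANAGER_CPU_THRESHOLD
  for: 0m
  annotations:
    description: "CPU load is greater than $ALERTMANAGER_CPU_THRESHOLD%\n VALUE = {{ \$value }}"
EOF
}
```

Calling `render_cpu_rule` with `ALERTMANAGER_CPU_THRESHOLD=65` set beforehand should yield an expression ending in `> 65` and a description mentioning `65%`, while `{{ $value }}` passes through untouched.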
