
Adding Prometheus alert manager exporter #437

Merged: 5 commits merged into sodafoundation:master on Dec 23, 2020

Conversation

@NajmudheenCT (Member) commented Dec 22, 2020

What this PR does / why we need it:
New alert exporter for the Prometheus Alertmanager.
Alerts received in delfin (through trap and sync) will be pushed to Alertmanager through this exporter.
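For context, a minimal sketch of the push pattern such an exporter follows, assuming Alertmanager's v1 alerts endpoint; the field names here are illustrative, and the real mapping lives in delfin/exporter/prometheus/alert_manager.py:

```python
import json
import requests

def push_to_alert_manager(alerts, host='localhost', port='9093'):
    # Alertmanager's v1 API accepts a JSON list of alerts, each with
    # 'labels' (identity/routing) and 'annotations' (free-form detail).
    payload = [{
        'labels': {
            'alertname': a.get('alert_name', 'unknown'),
            'severity': a.get('severity', 'warning'),
            'storage_id': a.get('storage_id', ''),
        },
        'annotations': {
            'summary': a.get('description', ''),
        },
    } for a in alerts]
    url = 'http://%s:%s/api/v1/alerts' % (host, port)
    return requests.post(url, data=json.dumps(payload),
                         headers={'Content-Type': 'application/json'})
```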

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #NA

Special notes for your reviewer:
This is a rework of PR #429.
Release note:

  1. Exporter for Prometheus alert manager

Test Report:

  1. Sync alerts for storage <storage_id> (screenshot)

  2. Check task process logs (screenshot)

  3. Check the alert manager for alerts (screenshot)

@codecov (bot) commented Dec 22, 2020

Codecov Report

Merging #437 (495b31a) into master (7cbfc3b) will increase coverage by 0.10%.
The diff coverage is 39.39%.

@@            Coverage Diff             @@
##           master     #437      +/-   ##
==========================================
+ Coverage   67.80%   67.91%   +0.10%     
==========================================
  Files         103      104       +1     
  Lines        7148     7178      +30     
  Branches      795      798       +3     
==========================================
+ Hits         4847     4875      +28     
- Misses       2064     2066       +2     
  Partials      237      237              
| Impacted Files | Coverage Δ |
| --- | --- |
| delfin/drivers/fake_storage/__init__.py | 81.71% <0.00%> (-0.47%) ⬇️ |
| delfin/exporter/prometheus/exporter.py | 66.66% <33.33%> (+66.66%) ⬆️ |
| delfin/exporter/prometheus/alert_manager.py | 42.85% <42.85%> (ø) |
| delfin/exporter/prometheus/prometheus.py | 32.00% <0.00%> (+32.00%) ⬆️ |

host = alert_cfg.alert_manager_host
port = alert_cfg.alert_manager_port
for alert in alerts:
    print(alert)
Collaborator commented:

Remove this print. Or is someone listening on stdout?

Member Author replied:

Removed, thanks.
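If something genuinely needs to be emitted at that point, the idiomatic replacement would be the oslo.log logger delfin already uses elsewhere; this is a sketch for illustration, not the merged code:

```python
from oslo_log import log

LOG = log.getLogger(__name__)

for alert in alerts:
    # Route diagnostics through the configured log handlers
    # instead of writing raw output to stdout.
    LOG.debug("Pushing alert to Alertmanager: %s", alert)
```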

@kumarashit (Collaborator) previously approved these changes Dec 23, 2020:

LGTM

@@ -67,6 +69,11 @@ def show(self, req, id):
        for alert in alert_list:
            alert_util.fill_storage_attributes(alert, storage)

        try:
            self.alert_exporter.dispatch(ctx, alert_list)
Collaborator commented:

Any reason for adding the alert exporter dispatch in the WSGI server? Can we handle it in the task manager server?

Member Author replied:

Removed this. There is a sync API in alert to handle the alert pull task from the task manager; GET Alert will respond directly to the client through REST.
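A rough sketch of the resulting split, assuming the task manager owns exporting; method and attribute names below are illustrative, not copied from the PR:

```python
# API/WSGI layer: GET simply returns alerts to the REST client.
def show(self, req, id):
    alert_list = self.driver_api.list_alerts(ctx, id)
    return alert_view.build_alerts(alert_list)  # no exporter dispatch here

# Task manager: the sync task pulls alerts and dispatches them to exporters.
def sync_alerts(self, ctx, storage_id, query_para=None):
    alert_list = self.driver_api.list_alerts(ctx, storage_id, query_para)
    self.alert_exporter.dispatch(ctx, alert_list)
```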

@@ -280,7 +280,56 @@ def clear_alert(self, context, alert):
         pass

     def list_alerts(self, context, query_para=None):
-        pass
+        alert_list = [{
Collaborator commented:

Can we make some fields random?

Member Author replied:

Done, alert_id and occur_time are now random.
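A sketch of one way to randomize those fields in the fake driver; the value ranges are assumptions, not the PR's exact code:

```python
import random
import time
import uuid

# Randomize identity and timing so repeated syncs don't produce
# identical fake alerts.
fake_alert = {
    'alert_id': str(uuid.uuid4()),
    'sequence_number': random.randint(1, 10000),
    # occur_time in epoch milliseconds, within the last hour
    'occur_time': int(time.time() * 1000) - random.randint(0, 3600 * 1000),
}
```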

]

CONF.register_opts(alert_mngr_opts, "PROMETHEUS_ALERT_MANAGER_EXPORTER")
alert_cfg = CONF.PROMETHEUS_ALERT_MANAGER_EXPORTER
Collaborator commented:

Any reason to make alert_cfg global?

Member Author replied:

Values are read once and never changed after that. No other specific reason; all conf reads in delfin are global to file scope, AFAIK.
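For readers unfamiliar with the idiom: oslo.config options are typically registered once at module import and read through the global CONF object, so a module-level alert_cfg is the usual pattern. A minimal sketch, with assumed defaults:

```python
from oslo_config import cfg

CONF = cfg.CONF

alert_mngr_opts = [
    cfg.StrOpt('alert_manager_host', default='localhost',
               help='Prometheus Alertmanager host'),
    cfg.StrOpt('alert_manager_port', default='9093',
               help='Prometheus Alertmanager port'),
]

# Registered once at import time; every later read sees the parsed values.
CONF.register_opts(alert_mngr_opts, 'PROMETHEUS_ALERT_MANAGER_EXPORTER')
alert_cfg = CONF.PROMETHEUS_ALERT_MANAGER_EXPORTER
```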

dict["annotations"]["summary"] = alert.get("description")

self.alerts.append(dict)
requests.post('http://' + host + ":" + port + '/api/v1/alerts',
Collaborator commented:

Can we check the POST response status and report it?

Member Author replied:

Done.
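A sketch of the kind of status check the reviewer asked for; the logger and message are illustrative, not the merged code:

```python
import requests
from oslo_log import log

LOG = log.getLogger(__name__)

url = 'http://' + host + ':' + port + '/api/v1/alerts'
response = requests.post(url, json=self.alerts)
# Alertmanager answers 200 on success; report anything else.
if response.status_code != 200:
    LOG.error("Failed to push alerts to Alertmanager: %s %s",
              response.status_code, response.text)
```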

Additional review threads on requirements.txt, etc/delfin/delfin.conf, and delfin/exporter/prometheus/alert_manager.py were marked resolved.
@wisererik (Collaborator) left a comment:

LGTM

@kumarashit (Collaborator) left a comment:

LGTM

@joseph-v (Collaborator):

LGTM

@kumarashit kumarashit merged commit 0c967f2 into sodafoundation:master Dec 23, 2020
amanr032 pushed a commit to amanr032/delfin that referenced this pull request on Dec 5, 2022: "Get K8s Cluster IP before deploying Multi-Cloud Pods."