Email alerts coming without subject (and alert name too long) #10468

Closed
amaltaro opened this issue Apr 26, 2021 · 3 comments · Fixed by #10475

Comments


amaltaro commented Apr 26, 2021

Impact of the bug
Any system using the new AlertManager module (only MicroServices so far)

Describe the bug
Over the weekend I received 3 MSTransferor email alerts with an empty subject line (displayed as "(no subject)"). We should adopt the correct alert arguments so that alert emails are sent with a short and meaningful subject.

How to reproduce it
I guess it's just a matter of creating an alert in AlertManager.

Expected behavior
There are a couple of things to be fixed in this issue:

  • email alerts must contain a short and meaningful subject. Something like "MSTransferor: input transfer over threshold for workflow ABC..."
  • the set of rule ids should likely not be part of the alert name (otherwise it's very hard to see them in Slack)
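One possible shape for the two points above, as a minimal sketch: the alert name stays short and the rule ids move into the annotations. The function name `build_alert`, its parameters, and the exact label and annotation keys are assumptions made for illustration; this is not the WMCore AlertManager API.

```python
def build_alert(service, workflow, rule_ids, severity="high"):
    """Build an alert payload with a short name; details go to annotations.

    Hypothetical helper: names and keys are illustrative only.
    """
    # Short, human-readable name, suitable for an email subject or Slack
    alert_name = "{}: input transfer over threshold for workflow {}".format(
        service, workflow)
    # The potentially long list of rule ids is kept out of the name
    description = "Workflow {} has {} rule(s) over threshold: {}".format(
        workflow, len(rule_ids), ", ".join(sorted(rule_ids)))
    return {
        "labels": {
            "alertname": alert_name,
            "service": service,
            "severity": severity,
        },
        "annotations": {
            "summary": alert_name,
            "description": description,
        },
    }


alert = build_alert("ms-transferor", "wf_ABC", {"bbb222", "aaa111"})
print(alert["labels"]["alertname"])
print(alert["annotations"]["description"])
```

With this split, the name has the same length regardless of how many rules are involved, while the full rule list is still available in the alert body.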

Additional context and error message
Follow up of: #10308
Screenshot (cropped): http://amaltaro.web.cern.ch/amaltaro/forWMCore/Issue_10468/wrong_alert_email.png


amaltaro commented May 3, 2021

While testing it for the current testbed HG2105 validation, I noticed that emails were actually sent with a valid subject this time (this morning):

[FIRING:1] ms-transferor: Transfer over threshold: set([u'503f3d45c6474a42a8ade2f1135c58a5']) (ms-transferor high wmcore)

So, it could be that there is a limit on the email subject length, and those production alerts generated a subject that was too long.
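To illustrate how quickly such a subject grows when the rule ids are embedded in the alert name, here is a rough sketch. The subject template is copied from the example subject in this comment; the function name `subject_for` and the generated rule ids are made up, and the actual length limit (if there is one) is not known.

```python
def subject_for(rule_ids):
    """Mimic the email subject when rule ids are part of the alert name."""
    # Old style: the whole collection of rule ids is formatted into the name
    name = "Transfer over threshold: {}".format(sorted(rule_ids))
    return "[FIRING:1] ms-transferor: {} (ms-transferor high wmcore)".format(name)


# One rule id, as in the testbed alert
one = subject_for(["503f3d45c6474a42a8ade2f1135c58a5"])
# Twenty made-up 32-character rule ids, as a production alert might carry
many = subject_for(["{:032x}".format(i) for i in range(20)])

print(len(one), len(many))
```

Each additional 32-character rule id adds roughly 36 characters to the subject, so an alert covering many rules produces a subject several times longer than the single-rule case.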

Anyhow, since I'm on it now, I'm about to provide further changes to improve it.


amaltaro commented May 3, 2021

@vkuznet Valentin, just FYI, there seems to be a limit on the subject length. I ran some tests this morning with the same code that is currently in production, and the email alerts were sent with the correct subject.

There is nothing for you to work on though, since we are changing how the alert name is constructed. It's just so you know in case it happens again in the future.


vkuznet commented May 3, 2021

OK, good to know. Since it is an alerting system rather than an email one, that is a totally proper choice. All details should go into annotations and summary descriptions instead of the subject.

Please remember that AM has very flexible matching rules which can be used to route alerts to different destinations, e.g. we can apply a regexp to any attribute and route alerts differently. Therefore, I suggest that you make better use of those attributes, and if you come up with a "schema" or "tags", we can apply them to route your alerts to the appropriate channels. For instance, you may consider sending database failures to one channel, workflow failures to another, etc.
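As a toy illustration of the idea of regexp-based routing on alert attributes: the snippet below is plain Python, not Alertmanager configuration (real routing is defined in Alertmanager's own config file). The `category` label, the channel names, and the `route` function are all hypothetical.

```python
import re

# Hypothetical routing table: (label name, regexp, destination channel).
# The first matching entry wins.
ROUTES = [
    ("category", re.compile(r"^database$"), "db-channel"),
    ("category", re.compile(r"^workflow$"), "workflow-channel"),
]


def route(labels, default="default-channel"):
    """Return the channel for an alert, based on regexp matches on its labels."""
    for label, pattern, channel in ROUTES:
        if pattern.search(labels.get(label, "")):
            return channel
    # No rule matched: fall back to the default destination
    return default


print(route({"service": "ms-transferor", "category": "workflow"}))
print(route({"service": "ms-transferor"}))
```

The point is only that a consistent set of labels ("schema" or "tags") on the alert side is what makes this kind of routing possible on the AM side.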
