[Stack Monitoring] Alerting Phase -1 #42960
Comments
Pinging @elastic/stack-monitoring |
Update here. I found a couple of blockers while taking a first stab at this and raised them here: #45571 |
The effort is going well here. I don't have a PR ready yet, but I hope to have it this week. (Update: Draft PR available) Some updated notes on this effort:
|
Nice work @chrisronline 💪 Can't wait to see it!
Once "Kibana Alerting" is live are we completely deprecating/removing the current/old Alerting? I think we might still want a new index, just in case some setups still have the old
💯
I prefer in the Kibana UI, just because it's more UI friendly, and they can modify the info without restarting, but I don't mind continuing the yml trend. |
Thanks for the thoughts @igoristic!
I guess it depends on if we want a slow rollout of these migrations. If so, we will be living in a world where both are running and exist at the same time (not for the same alert check, but we'll have some watcher based cluster alerts and some kibana alerts)
You don't think we can accomplish the same UI from just using the state provided by the alerting framework? I think that's really all we need since we'll store data in there that tells us when the alert fired and if it's been resolved yet.
Yea I agree the UI route is better, but if we do a slow rollout, it might be confusing for folks who already have the |
I guess I don't know the current implementation well enough to validate my concern. My worry is that if an ES Alert is triggered, it'll be added to the index, which will then be picked up by both ES Alerts and KB Alerts, which might duplicate some actions, like sending two emails. I just think a new index can help avoid any of these issues we might not yet foresee (maybe for the same reason Metricbeat has its own This is all based on speculation though |
Ah, I see the confusion here. Part of this work involves disabling (or blacklisting per @cachedout's idea) the cluster alert when we enable the Kibana alert. We'd never have a situation (intentionally) where both the cluster alert for xpack license expiration, and the Kibana alert for xpack license expiration are running at the same time. |
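To illustrate the mutual exclusion described above, here is a minimal sketch. The function and type names (`splitOwnership`, `AlertOwnership`) are hypothetical and not part of Kibana or Watcher; only the two watch IDs come from this issue. It assumes each alert check is owned by exactly one system: once the Kibana alert for a check is enabled, the matching cluster alert goes on the blacklist.

```typescript
// Hypothetical illustration only; these names are not a Kibana or Watcher API.
interface AlertOwnership {
  watcherActive: string[]; // checks still handled by watcher-based cluster alerts
  watcherBlacklist: string[]; // cluster alerts to disable because Kibana owns the check
}

// Assign every check to exactly one owner so the same condition never fires twice.
function splitOwnership(allChecks: string[], kibanaEnabled: string[]): AlertOwnership {
  const watcherActive: string[] = [];
  const watcherBlacklist: string[] = [];
  for (const check of allChecks) {
    if (kibanaEnabled.indexOf(check) !== -1) {
      // A Kibana alert exists for this check, so the cluster alert must not run.
      watcherBlacklist.push(check);
    } else {
      watcherActive.push(check);
    }
  }
  return { watcherActive, watcherBlacklist };
}

// Example: only the license expiration check has been migrated so far.
const exampleOwnership = splitOwnership(
  ["elasticsearch_cluster_status", "xpack_license_expiration"],
  ["xpack_license_expiration"]
);
```

With a slow rollout, the second argument would grow as each alert is migrated, and the blacklist would grow with it.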
I think that gradually merging these and leaving them disabled until we are ready to switch the new alerting on in the application is the right thing to do. It gives us time to develop and test the alerts while minimizing the disruption for the user. |
I was forwarded to this issue from elastic/elasticsearch#34814 (comment). The proposal document mentioned in "Phase -1 which is outlined in the proposal document." is not linked, so I have no knowledge of it; excuse me if this is beyond the scope of "Phase 1". As an Elastic Stack admin, I feel that "Stack Monitoring" falls short compared to other monitoring systems. For example, there is no concept of hard and soft states, and I am not convinced that it would be a good idea to replicate this using Elastic Watcher (I tried for my own use and failed). See elastic/elasticsearch#34814 (comment) for more details. |
Thank you @ypid-geberit for your feedback
I think this is a good feature request, but perhaps out of scope for this ticket. @ravikesarwani Maybe this is something we can add a ticket for in the SM feature requests roadmap |
Many of the out-of-the-box stack monitoring alerts give users full flexibility to control the notifications (including which notification method to use, based on license level) and when they are generated. For example, "CPU Usage" alerts by default when CPU is over 85%, looking at the average over the last 5 minutes. Both the 85% threshold and the 5-minute duration can easily be adjusted by users. Also, with #91145 we will allow users to create multiple alerts and handle features similar to soft and hard states. For example: "Say a user wants to alert when CPU is 75% for the last 5 minutes and send an email. When it's 85% for the last 10 minutes, they want to send a PagerDuty alert." |
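The tiered setup in that example can be sketched as below. This is an illustration under stated assumptions, not the Kibana Alerting API: the `CpuSample` and `ThresholdRule` shapes and the `averageCpu` and `firingActions` functions are hypothetical, and a rule is assumed to fire when the average CPU over its trailing window is strictly above its threshold.

```typescript
// Hypothetical shapes for illustration; not the Kibana Alerting API.
interface CpuSample {
  timestampMs: number;
  cpuPercent: number;
}

interface ThresholdRule {
  thresholdPercent: number; // fire when the windowed average is above this value
  windowMinutes: number; // how far back to average
  action: string; // e.g. "email" or "pagerduty"
}

// Average CPU over the trailing window ending at nowMs; null if the window has no samples.
function averageCpu(samples: CpuSample[], nowMs: number, windowMinutes: number): number | null {
  const startMs = nowMs - windowMinutes * 60 * 1000;
  let sum = 0;
  let count = 0;
  for (const sample of samples) {
    if (sample.timestampMs > startMs && sample.timestampMs <= nowMs) {
      sum += sample.cpuPercent;
      count += 1;
    }
  }
  return count === 0 ? null : sum / count;
}

// Return the actions of every rule whose windowed average exceeds its threshold.
function firingActions(samples: CpuSample[], nowMs: number, rules: ThresholdRule[]): string[] {
  const actions: string[] = [];
  for (const rule of rules) {
    const avg = averageCpu(samples, nowMs, rule.windowMinutes);
    if (avg !== null && avg > rule.thresholdPercent) {
      actions.push(rule.action);
    }
  }
  return actions;
}

// The two tiers from the example above: a softer email tier and a harder PagerDuty tier.
const tieredRules: ThresholdRule[] = [
  { thresholdPercent: 75, windowMinutes: 5, action: "email" },
  { thresholdPercent: 85, windowMinutes: 10, action: "pagerduty" },
];
```

Because each rule is evaluated independently, a sustained 80% load would trigger only the email tier, while a sustained 90% load would trigger both.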
Sounds like what @ravikesarwani wrote addresses it. I am looking forward to it :) |
This ticket tracks the work which needs to be completed to achieve Phase -1 which is outlined in the proposal document.
To complete this phase, we need to build out the plumbing to connect the Stack Monitoring application to the Kibana Alerting Framework.
All watches need to be present and functional using the new framework:
- elasticsearch_cluster_status: [Monitoring] Cluster state watch to Kibana alerting #61685
- xpack_license_expiration.json: [Monitoring] Migrate license expiration alert to Kibana alerting #54306
- Prevent access to UI unless gold+ since that is required to make email action
  - No longer the case, since the merge of [Monitoring] Migrate data source for legacy alerts to monitoring data directly #87377