Notifications lost during restart #7086
Comments
Also, please add the logs and configs required to reproduce the problem.
@Crunsher Please also write down the ideas you've collected and why they'd break the cluster.
@dnsmichi What's your opinion about behaving as if there's a split-brain on startup until the first connection (object authority)? We could send two notifications sometimes, but IMO it's better than zero.
Imho notifications should be suppressed up until everything is running again. This involves two things:
How to reproduce
Icinga will send (and log) a problem notification, but not a recovery one, e.g.:
Steps to reproduce:
Why?
A Notification is not sent while it is paused (in an HA setup, this means another instance is responsible for it). During startup, because of the large number of objects, the HA state of the Notification in question has not been computed yet, so it still reports the default: paused. As a result, even in a single-instance setup, Icinga assumes another cluster node is taking care of the Notification and simply discards it.
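The failure mode can be illustrated with a small model. This is only a sketch in Python, not Icinga's actual C++ code: the `Notification` class, `send_notification`, and `run_object_authority` are hypothetical names, and the only behaviour taken from the description above is that objects default to "paused" until the HA state has been computed.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    name: str
    # Assumed default, mirroring the description above: until the HA state
    # has been computed, the object reports itself as paused.
    paused: bool = True


def send_notification(notification, kind, sent):
    """Append to `sent` unless the notification object is paused."""
    if notification.paused:
        # Treated as "another instance is responsible": silently discarded.
        return False
    sent.append((notification.name, kind))
    return True


def run_object_authority(notifications):
    """Stand-in for the first authority run on a single-instance setup."""
    for notification in notifications:
        notification.paused = False


sent = []
mail = Notification("mail-host")

# During startup: the authority run has not happened yet, so the
# recovery notification is dropped.
dropped = not send_notification(mail, "RECOVERY", sent)

# After the first authority run the same call goes through.
run_object_authority([mail])
delivered = send_notification(mail, "RECOVERY", sent)
```

In this model the first recovery is lost for good, because nothing retries it once the object is unpaused.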
Possible solutions:
Many of the ideas I collected have a high chance of breaking the cluster. Disregarding those, I came to the conclusion that reworking the NotificationComponent to deal with this case is the most stable way to go about this.
The NotificationComponent could keep a work queue or multi-index container for all the Notifications to be sent. Execution can then easily be tied to the HA state by waiting for the first Object Authority run. This would also require changes to the ApiListener/ObjectAuthority, and while we are at it, it could make sense to split UpdateObjectAuthority by type.
refs #5521
ref/NC/601223
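The queueing idea proposed above could look roughly like the following. Again, this is a Python sketch under stated assumptions rather than a design for the real C++ component: `DeferredNotifier`, `request`, and `on_first_authority_run` are hypothetical names, and a plain FIFO queue stands in for whatever container the NotificationComponent would actually use.

```python
from collections import deque


class DeferredNotifier:
    """Holds notification requests until the first authority run is done."""

    def __init__(self):
        self.authority_known = False  # no Object Authority run yet
        self.responsible = set()      # objects this instance is responsible for
        self.pending = deque()        # requests made before the first run
        self.sent = []

    def request(self, name, kind):
        if not self.authority_known:
            # Defer instead of discarding: the HA state is still unknown.
            self.pending.append((name, kind))
            return
        self._deliver(name, kind)

    def on_first_authority_run(self, responsible_names):
        """Hypothetical hook called once the HA state has been computed."""
        self.responsible = set(responsible_names)
        self.authority_known = True
        while self.pending:
            self._deliver(*self.pending.popleft())

    def _deliver(self, name, kind):
        # Only send what this instance is actually responsible for.
        if name in self.responsible:
            self.sent.append((name, kind))


notifier = DeferredNotifier()
notifier.request("mail-host", "RECOVERY")   # queued during startup
notifier.request("other-host", "PROBLEM")   # queued; owned by another node
notifier.on_first_authority_run({"mail-host"})
```

Deferring rather than sending immediately avoids both the lost notification and the duplicate that a "behave as split-brain on startup" approach could produce, at the cost of a short delay during startup.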