Icinga 2.11rc1 unable to be connected & crash (after upgrade) #7470
Comments
Please test that with the current snapshot packages; rc1 received quite a few fixes in this regard.
Hi @dnsmichi, as these servers don't have direct internet access and are managed via Spacewalk, can you provide the repo URLs? The package listed is not very useful. After checking http://packages.icinga.com/, I suspect the correct URL is http://packages.icinga.com/epel/$releasever/snapshot/ ? Also, if I may suggest: if your response to a bug report is just telling people that rc1 is untrustworthy/buggy, remove that package, or at least add a note informing people about it. Thanks.
Fixes for #7431 might influence this.
I have now tackled another upgrade to 2.11.1 and did not get the "uncommunicative" problem any more. So that seems to be fixed. I still got long unresponsive periods and regular crashes with 2.11.1 and 2.11.2 on random master & satellite hosts after each reload. This problem was greatly reduced after setting "log_duration=0" in all agent/client "Endpoint" definitions in the (top-down) configuration. Not sure if you are interested in chasing/debugging those crashes.
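For readers trying the same workaround, the setting might look roughly like the sketch below in a master or satellite zones.conf. The endpoint name and address are placeholders and are not taken from this setup; `log_duration` is the Endpoint attribute that controls how long the replay log is kept, and setting it to 0 disables the replay log for that endpoint.

```
// Hypothetical agent endpoint; name and host are placeholders.
object Endpoint "agent1.example.com" {
  host = "192.0.2.10"
  // Assumption: the replay log is not needed for this agent.
  log_duration = 0
}
```

This only illustrates where the attribute goes; whether disabling the replay log is acceptable depends on the environment.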
Describe the bug
After upgrading from 2.10 to 2.11rc1 the resulting server was uncommunicative on the :5665 port.
Additionally the server crashed after ~15 min of runtime.
Details
To clarify the "uncommunicative" part: the server listened on port 5665 and accepted connections, but did not respond to anything after the SSL handshake (with credentials).
Without credentials, the expected error message was given:
icinga2 console --connect https://munnvmonpmac11:5665/
behaved similarly (returning a timeout error in response to any line).
Additionally, the server crashed after some runtime, although I'm unsure whether this is related to the other issue.
To Reproduce
Happened after a yum upgrade to 2.11rc1.
Environment
~1000 Hosts with a total of 15000 Services.
~200 Hosts directly connecting to the master, the rest via satellites.
Additional context
All our agents are configured to connect to the master server. Logs for one of the hosts looked like this:
Crashlogs:
...
...