IcingaDB Check: Multiple Responsible Instances
By design, only one Icinga 2 instance should be responsible in the HA
context. If this promise is broken, the Icinga 2 IcingaDB check should
report it.

Previously, the check did not detect inconsistent data in
icingadb:telemetry:heartbeat, such as two instances both claiming
responsibility. With this change, it goes CRITICAL with a descriptive
message and reports the actual number of responsible instances as
icingadb_responsible_instances in the performance data.
oxzi committed Nov 15, 2024
1 parent 211bae8 commit 0bbe7a9
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions lib/icingadb/icingadbchecktask.cpp
@@ -227,15 +227,17 @@ void IcingadbCheckTask::ScriptFunc(const Checkable::Ptr& checkable, const CheckR
 		perfdata->Add(new PerfdataValue("icinga2_heartbeat_age", heartbeatLag, false, "seconds", heartbeatLagWarning, Empty, 0));
 	}

-	if (weResponsible) {
+	if (weResponsible && otherResponsible) {
+		critmsgs << " Both this instance and another instance are responsible!";
+	} else if (weResponsible) {
 		idbokmsgs << "\n* Responsible";
 	} else if (otherResponsible) {
 		idbokmsgs << "\n* Not responsible, but another instance is";
 	} else {
 		critmsgs << " No instance is responsible!";
 	}

-	perfdata->Add(new PerfdataValue("icingadb_responsible_instances", int(weResponsible || otherResponsible), false, "", Empty, Empty, 0, 1));
+	perfdata->Add(new PerfdataValue("icingadb_responsible_instances", int(weResponsible) + int(otherResponsible), false, "", Empty, Empty, 0, 1));

 	const auto clockDriftWarning (5);
 	const auto clockDriftCritical (30);
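
The four cases handled by the change can be isolated from the surrounding check code. The following is a minimal standalone sketch, not the Icinga 2 implementation: `ResponsibilityResult`, `EvaluateResponsibility` and `LegacyResponsibleInstances` are hypothetical names introduced for illustration, and the message strings are copied from the diff without their leading separators. It shows why the old expression `int(weResponsible || otherResponsible)` could never report more than one responsible instance, while the sum of the two flags can.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type for illustration; not part of the Icinga 2 code base.
struct ResponsibilityResult {
	int responsibleInstances = 0; // value reported as icingadb_responsible_instances
	bool critical = false;        // true if the message would go to critmsgs
	std::string message;
};

// The perfdata expression before the change: it saturates at 1,
// so two responsible instances were indistinguishable from one.
int LegacyResponsibleInstances(bool weResponsible, bool otherResponsible)
{
	return int(weResponsible || otherResponsible);
}

// The decision logic after the change, reduced to its inputs and outputs.
ResponsibilityResult EvaluateResponsibility(bool weResponsible, bool otherResponsible)
{
	ResponsibilityResult result;

	// Count each flag separately so the value can reach 2.
	result.responsibleInstances = int(weResponsible) + int(otherResponsible);

	if (weResponsible && otherResponsible) {
		result.critical = true;
		result.message = "Both this instance and another instance are responsible!";
	} else if (weResponsible) {
		result.message = "Responsible";
	} else if (otherResponsible) {
		result.message = "Not responsible, but another instance is";
	} else {
		result.critical = true;
		result.message = "No instance is responsible!";
	}

	// Exactly one responsible instance is the only non-critical state.
	assert(result.critical == (result.responsibleInstances != 1));

	return result;
}
```

Calling `EvaluateResponsibility(true, true)` yields a critical result with a count of 2, whereas `LegacyResponsibleInstances(true, true)` returns 1 for the same input.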