Director is a configuration management module for Icinga Web 2. With Director installed, Icinga Web 2 can be used to create Icinga 2 configuration.
This document summarizes our Director configuration. All of it is done in the Director section of Icinga Web 2. Because the configuration is done in a GUI, it is hard to document. Director provides export and import features through icingacli, which help the documentation process a lot. Where applicable, we also note where to find each setting in Icinga Web 2.
Using Director together with the other tools in this setup allows the configuration to be fully automated: when the source data changes, the changes propagate to the whole monitoring setup and are deployed without any human interaction.
A useful feature of this setup is that configuration management can be fully automated, yet the monitoring server administrator can still deploy configuration manually when needed. All modifications of the Director configuration are tracked in the activity log. The deployment log can show a diff between specific files of deployed configurations. This also works for fileshipper configuration files when they change, which can help diagnose problems with, for example, apply rules.
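To illustrate what such a diff conveys, here is a minimal Python sketch that compares two versions of one rendered configuration file with the standard `difflib` module. The file name and both snippets are made-up examples. Director produces its own diff in the deployment log, and this sketch does not claim to show how that is implemented.

```python
import difflib


def config_diff(old: str, new: str, name: str = "services.conf") -> list[str]:
    """Return a unified diff between two versions of one config file."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"previous/{name}",
            tofile=f"current/{name}",
            lineterm="",
        )
    )


# Hypothetical contents of the same file in two consecutive deployments.
OLD = 'apply Service "PING" {\n    check_interval = 5m\n}'
NEW = 'apply Service "PING" {\n    check_interval = 1m\n}'

if __name__ == "__main__":
    print("\n".join(config_diff(OLD, NEW)))
```

Lines prefixed with `-` exist only in the previous deployment and lines prefixed with `+` only in the current one, which is usually enough to spot an unintended change.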
Import sources are available in the Automation section of Director.
Import sources are the most important part of the Director configuration. An import source defines where the data comes from (an Icinga Web 2 resource). The import source type can be ldap, sql, core api or fileshipper. After defining an import source, define a sync rule, which is used to synchronize the data.
For our import sources we chose the sql type. At first we tried the ldap type, because our inventory uses LDAP. There were several problems with the ldap import source and data transformations, so we decided to add an extra "compatibility" layer in the form of a database. This database is filled by our synchronization scripts.
This was a good way to overcome some of the problems with data transformations. It also abstracts away the structure of our inventory, so anyone who understands the database structure can use this setup.
The import sources themselves are tightly coupled to the database structure. Please read the documentation of the database schema carefully and follow it.
Our import sources are exported in import_sources.json. You can import the exported data using:
icingacli director import importsource
There is no import source for services. It was the most problematic one: the sync rule tied to it constantly had serious problems (very long runs, errors, ...). The import source and its sync rule were replaced by the fileshipper module with its static and dynamic configuration, which does not have any of these problems.
This import source defines source data for synchronization of hosts.
A number of modifiers are used to transform some of the values into the desired data types.
The most important modifier is the one which fills the variable radius_ip.
This modifier looks up the host by name and returns its IPv4 address (see this).
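The behaviour of this modifier can be approximated by the following Python sketch. It is not Director's implementation. The function name `resolve_ipv4` and the injectable `resolver` parameter are assumptions made for illustration and for testing without network access. The point it shows is that a failed lookup yields an empty value instead of an address.

```python
import socket
from typing import Callable, Optional


def resolve_ipv4(
    name: str,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Optional[str]:
    """Return the IPv4 address for a host name, or None if lookup fails."""
    try:
        return resolver(name)
    except OSError:  # socket.gaierror and socket.herror are subclasses of OSError
        return None


if __name__ == "__main__":
    # The result depends on the local resolver configuration.
    print(resolve_ipv4("localhost"))
```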
It sometimes happens that some RADIUS host names cannot be resolved in DNS. When this happens, someone should be notified.
If this happens without any deployment, all the tests except PING keep working.
If this happens during a new configuration deployment, the modifier mentioned above sets the variable radius_ip to null. All tests which use this variable then go to the UNKNOWN state, because the variable is not available.
In both cases the PING test returns something like:
Invalid hostname/address - radius.domain.tld
To notify administrators about this, a simple script was created. It should be run at regular intervals by the cron daemon. It checks all the ping services using icingacli. If any service reports the error message mentioned above, the script notifies the administrators.
DNS failure checking is currently disabled because the ping parameters were changed: the IP address is now used instead of the DNS host name.
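The filtering step of the script described above can be sketched in Python as follows. This is not the actual script. How the ping results are obtained from icingacli and how the administrators are notified are both left out, and the names `PingResult`, `unresolvable_hosts` and `build_report` are hypothetical. Only the error text comes from the PING output quoted above.

```python
from typing import Iterable, NamedTuple, Optional

# Text reported by the PING check when the host name cannot be resolved.
ERROR_MARKER = "Invalid hostname/address"


class PingResult(NamedTuple):
    host: str    # host the PING service belongs to
    output: str  # plugin output of the PING service


def unresolvable_hosts(results: Iterable[PingResult]) -> list[str]:
    """Return the sorted, de-duplicated hosts whose PING output shows a DNS failure."""
    return sorted({r.host for r in results if ERROR_MARKER in r.output})


def build_report(hosts: list[str]) -> Optional[str]:
    """Build the notification text, or None when there is nothing to report."""
    if not hosts:
        return None
    return "Unresolvable RADIUS host names:\n" + "\n".join(f"- {h}" for h in hosts)


if __name__ == "__main__":
    sample = [
        PingResult("radius.domain.tld", "Invalid hostname/address - radius.domain.tld"),
        PingResult("radius.example.org", "PING OK - Packet loss = 0%"),
    ]
    print(build_report(unresolvable_hosts(sample)))
```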
This import source defines source data for synchronization of hostgroups and servicegroups.
This import source defines source data for synchronization of users.
Sync rules define how data from import sources is mapped to Icinga 2 objects. The mapping is done for every row of the import source data: for each row, an object of the selected type is created. Object properties are filled by mapping fields from the import source data.
Our sync rules are exported in syncrules.json.
You can import the exported data using:
icingacli director import syncrule
It is important that sync rules are run in the correct order. If they are run in the wrong order, the dependent sync rule fails with an error. The order should be clear to any experienced administrator. It is defined in the configuration of the synchronization scripts, but it is not very readable there (the numbers are the IDs of the individual rules).
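One way to make the ordering explicit instead of relying on bare IDs is sketched below in Python. The rule names and numeric IDs are hypothetical. The only dependencies encoded are the two stated in this document (hostgroups before hosts, usergroups before users). The `icingacli director syncrule run --id` invocation reflects the Director CLI as we understand it, so check it against the CLI help of your Director version before use. This is not the actual synchronization script.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping of sync rule names to their Director IDs.
RULE_IDS = {
    "hostgroups": 1,
    "hosts": 2,
    "servicegroups": 3,
    "usergroups": 4,
    "users": 5,
}

# rule -> set of rules that must run before it
DEPENDS_ON = {
    "hostgroups": set(),
    "hosts": {"hostgroups"},
    "servicegroups": set(),
    "usergroups": set(),
    "users": {"usergroups"},
}


def run_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return the rules in an order that satisfies all dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def sync_commands(order: list[str], ids: dict[str, int]) -> list[list[str]]:
    """Build one icingacli command line per sync rule, in run order."""
    return [
        ["icingacli", "director", "syncrule", "run", "--id", str(ids[name])]
        for name in order
    ]


if __name__ == "__main__":
    for command in sync_commands(run_order(DEPENDS_ON), RULE_IDS):
        print(" ".join(command))
```

Keeping the dependencies next to the names means a newly added rule only needs its prerequisites listed; the numeric order is derived.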
Uses the import source for realms. Creates hostgroups named after the realms which the servers handle. This sync rule must be run before the sync rule for hosts.
Uses import source for hosts. Creates hosts with DNS names of RADIUS servers.
Uses the import source for realms. Creates servicegroups named after the realms which the servers handle.
Uses import source for hosts. Creates usergroups with DNS names of RADIUS servers. This sync rule must be run before sync rule for users.
Uses the import source for users. Creates users with their names. Names are restricted to ASCII characters (for more see this).
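A common way to reduce names to ASCII is shown in the Python sketch below. It is an assumption about how such a conversion can be done, not a description of the transformation actually used in this setup. The function name `to_ascii` is hypothetical.

```python
import unicodedata


def to_ascii(name: str) -> str:
    """Strip diacritics by decomposing characters and dropping non-ASCII parts."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_ascii("Jiří Novák"))
```

Characters that have no ASCII base letter after decomposition (for example "ł") are dropped entirely by this approach, so a real conversion may need an explicit replacement table.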
Host templates are used to simplify configuration by extracting common configuration to template. We use just one host template for all the servers:
template Host "generic eduroam radius server" {
check_command = "dummy"
check_interval = 365d
}
This template uses a dummy check which does nothing and always returns a successful state. This is done on purpose: we want to monitor the individual services on the host, not the host itself. The check interval is set to 365 days because there is no need to run a dummy check every 5 minutes.
There is currently no way that this configuration can be exported or imported, so this has to be done manually in the director.
Service templates are assigned to service definitions. This is used in static_config, in services.conf and in director service apply rules.
Some of the templates may appear to be unused when viewed in Icinga Web 2. This may not be true, because files shipped by fileshipper may still use them. Director has no way of knowing whether any fileshipper configuration uses a template.
There is currently no way that this configuration can be exported or imported, so this has to be done manually in the director.
All the used templates are defined below.
template Service "big packet template" {
check_command = "check_rad_eap"
max_check_attempts = "3"
check_interval = 1d
retry_interval = 12h
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "chargeable user identity template" {
check_command = "check_cui"
max_check_attempts = "3"
check_interval = 2d
retry_interval = 3h
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "compromised users template" {
check_command = "check_compromised"
max_check_attempts = "3"
check_interval = 2d
retry_interval = 12h
enable_notifications = true
command_endpoint = null
}
template Service "concurrent inst template" {
check_command = "check_concurrent"
max_check_attempts = "3"
check_interval = 2d
retry_interval = 12h
enable_notifications = true
command_endpoint = null
}
template Service "cve-2017-9148 template" {
check_command = "check_cve_2017_9148"
max_check_attempts = "3"
check_interval = 2d
retry_interval = 3h
check_timeout = 75s
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "fake uid template" {
check_command = "check_fake_uid"
max_check_attempts = "3"
check_interval = 2d
retry_interval = 3h
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "home realm alive template" {
check_command = "check_home_realm_alive"
max_check_attempts = "3"
check_interval = 5m
retry_interval = 10m
}
template Service "ping template" {
check_command = "ping4"
max_check_attempts = "10"
check_interval = 5m
retry_interval = 1m
enable_notifications = true
enable_flapping = true
groups = [ "PING" ]
command_endpoint = null
}
template Service "vcelka maja template" {
check_command = "check_vcelka_maja"
max_check_attempts = "3"
check_interval = 2d
retry_interval = 3h
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "visitors template" {
check_command = "check_visitors"
max_check_attempts = "3"
check_interval = 3h
retry_interval = 2h
command_endpoint = null
vars.doc_url = "https://www.eduroam.cz/cs/spravce/monitoring/end2end_monitoring_new#visitors"
}
template Service "calling station id template" {
check_command = "check_csi"
max_check_attempts = "3"
check_interval = 1d
retry_interval = 12h
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "operator name template" {
check_command = "check_operator_name"
max_check_attempts = "3"
check_interval = 1d
retry_interval = 12h
enable_notifications = true
enable_flapping = true
command_endpoint = null
}
template Service "coverage info template" {
check_command = "check_coverage_info"
max_check_attempts = "3"
check_interval = 1d
retry_interval = 3h
command_endpoint = null
}
template Service "eduroam cat template" {
check_command = "check_eduroam_cat"
max_check_attempts = "3"
check_interval = 1d
retry_interval = 1d
command_endpoint = null
}
template Service "eap certificate template" {
check_command = "check_eap_cert"
max_check_attempts = "3"
check_interval = 2h
retry_interval = 2h
command_endpoint = null
}
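The templates above express intervals with Icinga 2 duration suffixes (`m`, `h`, `d`, and `s` in `check_timeout`). To compare them programmatically, a small Python helper such as the following sketch can convert them to seconds. The helper name `to_seconds` is hypothetical, and treating a bare number as seconds is an assumption of this sketch.

```python
import re

# Seconds per unit for the duration suffixes handled here.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}
_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(ms|s|m|h|d)?$")


def to_seconds(text: str) -> float:
    """Convert a duration literal such as '12h' or '365d' to seconds."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a duration literal: {text!r}")
    value, unit = match.groups()
    # A bare number is treated as seconds (assumption of this sketch).
    return float(value) * _UNITS[unit or "s"]


if __name__ == "__main__":
    # Values taken from "home realm alive template" above.
    check, retry = to_seconds("5m"), to_seconds("10m")
    print(f"check every {check:.0f}s, retry every {retry:.0f}s")
```

Such a conversion makes it easy to notice, for instance, that "home realm alive template" retries less often (10m) than it performs regular checks (5m).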
Most of the service groups are defined in groups.conf. Only two groups are defined in director, because service apply rules for these services are also configured in director.
Servicegroup for PING service:
object ServiceGroup "PING" {
display_name = "PING"
}
Servicegroup for VISITORS service:
object ServiceGroup "VISITORS" {
display_name = "VISITORS"
}
Our service groups are exported in servicegroups.json. Importing them is currently not possible.
Most of the service apply rules are defined in static_config.conf. Only two service apply rules are defined in director.
Service apply rule for PING:
apply Service "PING" {
import "ping template"
assign where host.name
import DirectorOverrideTemplate
}
Service apply rule for VISITORS:
apply Service "VISITORS" {
import "visitors template"
assign where host.vars.mon_realm
groups = [ "VISITORS" ]
import DirectorOverrideTemplate
}
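The effect of the two `assign where` expressions above can be modelled with the Python sketch below. It is a simplified model, not Icinga 2's rule evaluation: an expression that consists of a single attribute is treated as true when the attribute is set and non-empty. The host data is made up.

```python
from typing import Any


def lookup(obj: dict, path: str) -> Any:
    """Follow a dotted path through nested dicts; return None if any part is missing."""
    current: Any = obj
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Apply rule name -> host attribute tested by its "assign where" expression.
RULES = {"PING": "name", "VISITORS": "vars.mon_realm"}


def services_for(host: dict) -> list[str]:
    """Return the services the two apply rules would create for a host."""
    return [service for service, path in RULES.items() if lookup(host, path)]


# Hypothetical hosts: only the first one has vars.mon_realm set.
HOSTS = [
    {"name": "radius1.example.org", "vars": {"mon_realm": "example.org"}},
    {"name": "radius2.example.org", "vars": {}},
]

if __name__ == "__main__":
    for host in HOSTS:
        print(host["name"], services_for(host))
```

Under this model every host gets PING, while VISITORS is created only for hosts that have `mon_realm` set.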
There is currently no way that this configuration can be exported or imported, so this has to be done manually in the director.
Commands represent the plugins used for checks.
There is currently no way that this configuration can be exported or imported, so this has to be done manually in the director.
Some of the commands appear to be unused when viewed in Icinga Web 2. This is because they are used in Icinga 2 configuration that Director has no knowledge of.
All the used commands are defined below.
object CheckCommand "check_compromised" {
import "plugin-check-command"
command = [ PluginDir + "/compromised_users.sh" ]
arguments += {
"(no key)" = {
order = 2
required = true
skip_key = true
value = "$host.vars.all_realms$"
}
"(no key.2)" = {
order = 1
required = true
skip_key = true
value = "60"
}
}
}
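The `order` and `skip_key` attributes determine how the final command line is assembled. The Python sketch below models this for the `check_compromised` definition above. It is a simplified model rather than Icinga 2's implementation: the `PLUGIN_DIR` value is an assumed path, ties in `order` are broken by key name as an assumption, and the macro is resolved as a single string.

```python
import re

PLUGIN_DIR = "/usr/lib/nagios/plugins"  # assumed value of the PluginDir constant

# Python rendering of the check_compromised definition above.
CHECK_COMPROMISED = {
    "command": [PLUGIN_DIR + "/compromised_users.sh"],
    "arguments": {
        "(no key)": {"order": 2, "skip_key": True, "value": "$host.vars.all_realms$"},
        "(no key.2)": {"order": 1, "skip_key": True, "value": "60"},
    },
}


def expand(value: str, macros: dict[str, str]) -> str:
    """Replace every $name$ macro with its value from the macros dict."""
    return re.sub(r"\$([^$]+)\$", lambda m: str(macros[m.group(1)]), value)


def build_argv(command: dict, macros: dict[str, str]) -> list[str]:
    """Assemble the argument vector: sort by order, omit keys marked skip_key."""
    argv = list(command["command"])
    ordered = sorted(
        command["arguments"].items(),
        key=lambda item: (item[1].get("order", 0), item[0]),
    )
    for key, spec in ordered:
        if not spec.get("skip_key", False):
            argv.append(key)
        argv.append(expand(spec["value"], macros))
    return argv


if __name__ == "__main__":
    print(build_argv(CHECK_COMPROMISED, {"host.vars.all_realms": "example.cz"}))
```

Because "(no key.2)" has the lower `order`, the constant `60` is passed first and the realms second, with neither key name appearing on the command line.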
object CheckCommand "check_concurrent" {
import "plugin-check-command"
command = [ PluginDir + "/concurrent_inst.sh" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$service.vars.realm$"
}
"(no key.2)" = {
order = 2
required = true
skip_key = true
value = "20"
}
"(no key.3)" = {
order = 3
required = true
skip_key = true
value = "10"
}
"(no key.4)" = {
order = 4
required = true
skip_key = true
value = "20"
}
}
}
object CheckCommand "check_coverage_info" {
import "plugin-check-command"
command = [ PluginDir + "/coverage_info.sh" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$service.vars.realm$"
}
}
}
object CheckCommand "check_csi" {
import "plugin-check-command"
command = [ PluginDir + "/test-Calling-Station-Id-v2.pl" ]
timeout = 1m
arguments += {
"-F" = {
required = true
value = " /var/log/radius1edu-radius.ON_CSI"
}
"-H" = {
required = true
value = "$host.vars.radius_ip$"
}
}
}
object CheckCommand "check_cui" {
import "plugin-check-command"
command = [ PluginDir + "/test-Chargeable-User-Identity.pl" ]
arguments += {
"-H" = {
required = true
value = "$host.vars.radius_ip$"
}
"-M" = {
required = true
value = "$service.vars.mac_address$"
}
"-P" = {
required = true
value = "1812"
}
"-S" = "$host.vars.mon_radius_secret$"
"-p" = {
required = true
value = "$service.vars.testing_password$"
}
"-u" = {
required = true
value = "$service.vars.testing_id$"
}
}
}
object CheckCommand "check_cve_2017_9148" {
import "plugin-check-command"
command = [ PluginDir + "/tls-resume-expl" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$service.vars.testing_id$"
}
"(no key.2)" = {
order = 2
required = true
skip_key = true
value = "$host.vars.radius_ip$"
}
"(no key.3)" = {
order = 3
required = true
skip_key = true
value = "$host.vars.mon_radius_secret$"
}
"(no key.4)" = {
order = 4
required = true
skip_key = true
value = "$service.vars.mac_address1$"
}
"(no key.5)" = {
order = 5
required = true
skip_key = true
value = "$service.vars.mac_address2$"
}
}
}
object CheckCommand "check_eap_cert" {
import "plugin-check-command"
command = [ PluginDir + "/eap_cert.sh" ]
timeout = 1m
arguments += {
"(no key)" = {
order = -5
required = true
skip_key = true
value = "$service.vars.mon_realm$"
}
"(no key.2)" = {
order = -4
required = true
skip_key = true
value = "$host.name$"
}
"-H" = {
required = true
value = "$host.vars.radius_ip$"
}
"-M" = {
required = true
value = "$service.vars.mac_address$"
}
"-P" = {
required = true
value = "1812"
}
"-S" = {
required = true
value = "$host.vars.mon_radius_secret$"
}
"-X" = "14"
"-e" = {
required = true
value = "PEAP"
}
"-f" = {
set_if = "$service.vars.require_fragmented$"
}
"-i" = {
required = false
value = "$service.vars.info$"
}
"-m" = {
required = true
value = "WPA-EAP"
}
"-p" = {
required = true
value = "$service.vars.testing_password$"
}
"-t" = {
required = true
value = "50"
}
"-u" = {
required = true
value = "$service.vars.testing_id$"
}
}
}
object CheckCommand "check_eduroam_cat" {
import "plugin-check-command"
command = [ PluginDir + "/eduroam_cat.sh" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$service.vars.realm$"
}
}
}
object CheckCommand "check_fake_uid" {
import "plugin-check-command"
command = [ PluginDir + "/check-fake-id" ]
arguments += {
"-A" = {
required = true
value = "$service.vars.anon_id$"
}
"-H" = {
required = true
value = "$host.vars.radius_ip$"
}
"-M" = {
required = true
value = "$service.vars.mac_address$"
}
"-P" = {
required = true
value = "1812"
}
"-S" = {
required = true
value = "$host.vars.mon_radius_secret$"
}
"-e" = {
required = true
value = "PEAP"
}
"-m" = {
required = true
value = "WPA-EAP"
}
"-p" = {
required = true
value = "$service.vars.testing_password$"
}
"-t" = {
required = true
value = "50"
}
"-u" = {
required = true
value = "$service.vars.testing_id$"
}
}
}
object CheckCommand "check_home_realm_alive" {
import "plugin-check-command"
command = [ PluginDir + "/home_realm_alive.sh" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$service.vars.realm$"
}
"(no key.2)" = {
order = 2
required = true
skip_key = true
value = "$service.vars.home_servers$"
}
}
}
object CheckCommand "check_ipsec" {
import "plugin-check-command"
command = [ "/usr/local/bin/check_ipsec" ]
timeout = 1m
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$host.vars.radius_ip$"
}
}
}
object CheckCommand "check_operator_name" {
import "plugin-check-command"
command = [ PluginDir + "/test-Operator-Name.pl" ]
arguments += {
"-F" = {
required = true
value = "/var/log/radius1edu-radius.ON_CSI"
}
"-H" = {
required = true
value = "$host.vars.radius_ip$"
}
"-R" = {
required = true
value = "$host.vars.all_realms$"
}
}
}
The RADSEC command is configured twice: once for servers with the IdP+SP role and once for servers with the SP-only role.
object CheckCommand "check_radsec" {
import "plugin-check-command"
command = [ "/usr/local/bin/check_radsec.pl" ]
timeout = 1m
arguments += {
"-H" = {
order = 1
required = true
value = "$host.vars.radius_ip$"
}
}
}
object CheckCommand "check_radsec_sp" {
import "plugin-check-command"
command = [ "/usr/local/bin/check_radsec.pl" ]
timeout = 1m
arguments += {
"--SPonly" = {}
"-H" = {
order = 1
required = true
value = "$host.vars.radius_ip$"
}
}
}
object CheckCommand "check_rad_eap" {
import "plugin-check-command"
command = [ PluginDir + "/rad_eap_test" ]
timeout = 1m
arguments += {
"-H" = {
required = true
value = "$host.vars.radius_ip$"
}
"-M" = {
required = true
value = "$service.vars.mac_address$"
}
"-P" = {
required = true
value = "1812"
}
"-S" = {
required = true
value = "$host.vars.mon_radius_secret$"
}
"-e" = {
required = true
value = "PEAP"
}
"-i" = {
required = false
value = "$service.vars.info$"
}
"-m" = {
required = true
value = "WPA-EAP"
}
"-p" = {
required = true
value = "$service.vars.testing_password$"
}
"-t" = {
required = true
value = "50"
}
"-u" = {
required = true
value = "$service.vars.testing_id$"
}
}
}
object CheckCommand "check_vcelka_maja" {
import "plugin-check-command"
command = [ PluginDir + "/vcelka-maja" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "test001@cesnet.cz"
}
"(no key.2)" = {
order = 2
required = true
skip_key = true
value = "password" // password removed because it is sensitive information
}
"(no key.3)" = {
order = 3
required = true
skip_key = true
value = "$host.vars.radius_ip$"
}
"(no key.4)" = {
order = 4
required = true
skip_key = true
value = "$host.vars.mon_radius_secret$"
}
"(no key.5)" = {
order = 5
required = true
skip_key = true
value = "$service.vars.realm$"
}
"(no key.6)" = {
order = 6
required = true
skip_key = true
value = "$service.vars.mac_address1$"
}
"(no key.7)" = {
order = 7
required = true
skip_key = true
value = "$service.vars.mac_address2$"
}
}
}
object CheckCommand "check_visitors" {
import "plugin-check-command"
command = [ PluginDir + "/check_visitors.sh" ]
arguments += {
"(no key)" = {
order = 1
required = true
skip_key = true
value = "$host.name$"
}
}
}
We only use notifications for services. One notification template is used for most services:
template Notification "generic notification" {
command = "mail-service-notification"
interval = 1d
states = [ Critical, OK, Unknown, Warning ]
types = [ Acknowledgement, Custom, Problem, Recovery ]
}
For OPERATOR-NAME a different template is used, because notifications should not be sent for the warning state:
template Notification "operator name notification" {
command = "mail-service-notification"
interval = 1d
states = [ Critical, OK, Unknown ]
types = [ Acknowledgement, Custom, Problem, Recovery ]
}
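The only difference between the two templates is the `states` filter. The following Python sketch models just that filter, with the state lists copied from the templates above. Icinga 2 also evaluates the `types` filter, the interval and other conditions, none of which are modelled here. The function name is hypothetical.

```python
# State filters copied from the two notification templates above.
TEMPLATE_STATES = {
    "generic notification": {"Critical", "OK", "Unknown", "Warning"},
    "operator name notification": {"Critical", "OK", "Unknown"},
}


def passes_state_filter(template: str, state: str) -> bool:
    """Return True if the template's states filter lets this state through."""
    return state in TEMPLATE_STATES[template]


if __name__ == "__main__":
    for template in TEMPLATE_STATES:
        print(template, passes_state_filter(template, "Warning"))
```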
We defined two endpoints: one for the Czech top-level RADIUS server and one for the monitoring server itself:
Monitoring endpoint:
object Endpoint "ermon2.cesnet.cz" {
host = "ermon2.cesnet.cz"
port = "5665"
log_duration = 1d
}
Czech top-level RADIUS server endpoint:
object Endpoint "radius1.eduroam.cz" {
host = "radius1.eduroam.cz"
port = "5665"
}
There is currently no way that this configuration can be exported or imported, so this has to be done manually in the director.
We defined two zones: one for the Czech top-level RADIUS server and one for the monitoring server itself:
Monitoring zone:
object Zone "ermon2.cesnet.cz" {
endpoints = [ "ermon2.cesnet.cz" ]
}
Czech top-level RADIUS server zone:
object Zone "radius1.eduroam.cz" {
parent = "ermon2.cesnet.cz"
endpoints = [ "radius1.eduroam.cz" ]
}
There is currently no way that this configuration can be exported or imported, so this has to be done manually in the director.
Data fields allow Icinga objects to have custom variables. These variables can be set using Director.
Our data fields are exported in datafields.json. Importing them is currently not possible.
The national top-level RADIUS server needs to be added manually using Director, because this host is not in our inventory.
The national top-level RADIUS server acts as a client in the Icinga cluster protocol. Some of the tests must be run on the client, because the monitoring server itself does not have the data needed for them. For more information, see tests.
The host is added manually in the Director menu of Icinga Web 2. No special settings are needed.
The client can be set up using the script provided by Icinga Web 2. Download the script to the client and run it. It should do all the work, after which the master and the client should be able to communicate.