add pre-install and diagnostic checks sections #4511

Merged
merged 1 commit into from
Jun 5, 2017
75 changes: 75 additions & 0 deletions admin_guide/diagnostics_tool.adoc
@@ -115,3 +115,78 @@ A client with *cluster-admin* access available (for any user, but only the
current master) should be able to diagnose the status of infrastructure such as
nodes, registry, and router. In each case, running `oc adm diagnostics` looks
for the client configuration in its standard location and uses it if available.

[[additional-cluster-health-checks]]
== Additional Diagnostic Checks via Ansible

// TODO: add link to OCP image once it is available

Some additional diagnostic checks are available through the *openshift-ansible*
container image. See the image's link:https://github.com/openshift/openshift-ansible/blob/master/README_CONTAINER_IMAGE.md[source repository] for usage information.

The following health checks belong to a diagnostic task meant to be run against
the Ansible inventory file for a deployed {product-title} cluster. They can
report common problems for the current {product-title} installation.

[[admin-guide-diagnostics-tool-ansible-checks]]
.Diagnostic Checks
[options="header"]
|===

|Check Name |Purpose

|`ovs_version`
|This check ensures that a host has the correct version of Open vSwitch installed
for the currently deployed version of {product-title}.

|`kibana`, `curator`, `elasticsearch`, `fluentd`
|This set of checks verifies that Elasticsearch, Fluentd, and Curator pods have
been deployed and are in a `running` state, and that a connection can be
established between the control host and the exposed Kibana URL. These checks
run only if the `openshift_hosted_logging_deploy` inventory variable is set to
`true`, so that they execute only in deployments that include a logging stack.

|`etcd_imagedata_size`
|This check measures the total size of {product-title} image data in an etcd
cluster. The check fails if the calculated size exceeds a user-defined limit. If
no limit is specified, this check will fail if the size of image data amounts to
50% or more of the currently used space in the etcd cluster.

A failure from this check indicates that a significant amount of space in etcd
is being taken up by {product-title} image data, which can eventually result in
your etcd cluster crashing.

A user-defined limit may be set by passing the variable
`etcd_max_image_data_size_bytes=400000000` to the `openshift_health_checker`
role.

|`etcd_volume`
|This check ensures that the volume usage for an etcd cluster is below a maximum
user-specified threshold. If no maximum threshold is specified, it defaults to
`90%` of the total volume size.

Ticks around 90% ??


A user-defined limit may be set by passing the variable
`etcd_device_usage_threshold_percent=90` to the `openshift_health_checker` role.

|`docker_storage`
|Only runs on hosts that depend on the *docker* daemon (nodes and containerized
installations). Checks that *docker*'s total usage does not exceed a
user-defined limit. If no user-defined limit is set, *docker*'s maximum usage
threshold defaults to 90% of the total size available. The threshold
limit for total percent usage can be set with a variable in your inventory file:
`max_thinpool_data_usage_percent=90`.
|===
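
As a minimal sketch of the inventory settings mentioned in the table above, the
following fragment enables the logging checks and sets the *docker* storage
threshold. The `[OSEv3:vars]` group name and the value `85` are assumptions
chosen for illustration; adjust them to match your own inventory.

----
[OSEv3:vars]
# Assumed: a logging stack is deployed, so the logging checks are run.
openshift_hosted_logging_deploy=true
# Illustrative value; the check defaults to 90 when this is unset.
max_thinpool_data_usage_percent=85
----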

To disable specific checks, include the variable `openshift_disable_check` with
a comma-delimited list of check names in your inventory file. For example:

----
openshift_disable_check=ovs_version,etcd_volume
----

A similar set of checks that run as part of the installation process is
described in
xref:../install_config/install/advanced_install.adoc#configuring-cluster-pre-install-checks[Configuring Cluster Pre-install Checks]. Checks for certificate
expiration are described in
xref:../install_config/redeploying_certificates.adoc#install-config-redeploying-certificates[Redeploying Certificates].
90 changes: 90 additions & 0 deletions install_config/install/advanced_install.adoc
@@ -419,6 +419,96 @@ meaning that it is available for placement of new pods. See
xref:marking-masters-as-unschedulable-nodes[Configuring Schedulability on Masters].
|===

[[configuring-cluster-pre-install-checks]]
=== Configuring Cluster Pre-install Checks

Pre-install checks are a set of diagnostic tasks that run as part of the
*openshift_health_checker* Ansible role. They run prior to an Ansible
installation of {product-title}, ensure that required inventory values are set,
and identify potential issues on a host that can prevent or interfere with a
successful installation.

The following table describes available pre-install checks that will run before
every Ansible installation of {product-title}:

[[configuring-cluster-pre-install-checks-pre-install-checks]]
.Pre-install Checks
[options="header"]
|===

|Check Name |Purpose

|`memory_availability`
|This check ensures that a host has the recommended amount of memory for the
specific deployment of {product-title}. Default values have been derived from
the
xref:../../install_config/install/prerequisites.html#system-requirements[latest
installation documentation]. To override the minimum memory requirement, set
the `openshift_check_min_host_memory_gb` cluster variable in your inventory
file.

|`disk_availability`
|This check only runs on etcd, master, and node hosts. It ensures that the mount
path for an {product-title} installation has sufficient disk space remaining.
Recommended disk values are taken from the
xref:../../install_config/install/prerequisites.html#system-requirements[latest
installation documentation]. To override the minimum disk space requirement,
set the `openshift_check_min_host_disk_gb` cluster variable in your inventory
file.

|`docker_storage`
|Only runs on hosts that depend on the *docker* daemon (nodes and containerized
installations). Checks that *docker*'s total usage does not exceed a
user-defined limit. If no user-defined limit is set, *docker*'s maximum usage
threshold defaults to 90% of the total size available. To change the threshold
for total percent usage, set the `max_thinpool_data_usage_percent` cluster
variable in your inventory file, for example
`max_thinpool_data_usage_percent=90`.

|`docker_storage_driver`
|Ensures that the *docker* daemon is using a storage driver supported by
{product-title}. If the
https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver[`devicemapper`]
storage driver is being used, the check additionally ensures that a loopback
device is not being used.

|`docker_image_availability`
|Attempts to ensure that images required by an {product-title} installation are
available either locally or in at least one of the configured container image
registries on the host machine.

|`package_version`
|Runs on `yum`-based systems and determines whether multiple releases of a
required {product-title} package are available. Multiple available releases
during an `enterprise` installation of OpenShift suggest that `yum`
repositories for different releases are enabled, which may lead to installation
problems. This check is skipped if the `openshift_release` variable is not
defined in the inventory file.

|`package_availability`
|Runs prior to non-containerized installations of {product-title}. Ensures that
RPM packages required for the current installation are available.

|`package_update`
|Checks whether a `yum` update or package installation will succeed, without
actually performing it or running `yum` on the host.
|===
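
The override variables described in the table above could be combined in an
inventory file along these lines. This is a sketch only: the `[OSEv3:vars]`
group name and all numeric values are assumptions for illustration, not
recommended settings.

----
[OSEv3:vars]
# Hypothetical minimum memory, in GB, for the memory_availability check.
openshift_check_min_host_memory_gb=16
# Hypothetical minimum disk space, in GB, for the disk_availability check.
openshift_check_min_host_disk_gb=40
# Illustrative thinpool usage limit, in percent, for the docker_storage check.
max_thinpool_data_usage_percent=85
----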

To disable specific pre-install checks, include the variable
`openshift_disable_check` with a comma-delimited list of check names in your
inventory file. For example:

----
openshift_disable_check=memory_availability,disk_availability
----

A similar set of checks that run as diagnostics on existing clusters is
described in
xref:../../admin_guide/diagnostics_tool.adoc#additional-cluster-health-checks[Additional Diagnostic Checks via Ansible]. Checks for certificate
expiration are described in
xref:../../install_config/redeploying_certificates.adoc#install-config-redeploying-certificates[Redeploying Certificates].

[[advanced-install-configuring-registry-location]]
=== Configuring a Registry Location
