Overview of the Observability for Kubernetes Operator

The Observability for Kubernetes Operator deploys the agents needed to monitor your clusters and workloads in Kubernetes. The Operator is built with the Kubebuilder SDK.

Important: Logs (Beta) is enabled only for selected customers. If you’d like to participate, contact your Observability account representative.

Why Use the Observability for Kubernetes Operator?

The Operator simplifies operational aspects of managing the Kubernetes Integration for VMware Aria Operations for Applications (formerly known as Tanzu Observability by Wavefront). Here are some examples, with more to come!

Enhanced status reporting of the Kubernetes Integration so that users can ensure their cluster and Kubernetes resources are reporting data.
Kubernetes Operator features provide a declarative mechanism for deploying the necessary agents in a Kubernetes environment.
Centralized configuration.
Enhanced configuration validation to surface what needs to be corrected in order to deploy successfully.
Efficient Kubernetes resource usage supports scaling out the cluster (leader) node and worker nodes independently.

Note: The Kubernetes Metrics Collector that this Operator deploys still supports configuration through a ConfigMap. For example, Istio and MySQL metrics and Telegraf configuration are still supported. For details on the Collector, see collector.md.

Architecture

Installation

Note: The Observability for Kubernetes Operator Helm chart is deprecated and no longer supported. Use the deploy, upgrade, and removal instructions below instead.

Prerequisites

To install the integration, you must use the kubectl tool.

Deploy the Monitoring Agents with the Observability for Kubernetes Operator

Install the Observability for Kubernetes Operator into the observability-system namespace.

Note: If you already have the deprecated Kubernetes Integration installed by using Helm or manual deployment, uninstall it before you install the Operator.
```
kubectl apply -f https://raw.githubusercontent.com/wavefrontHQ/observability-for-kubernetes/main/deploy/wavefront-operator.yaml
```

Create a Kubernetes secret with your Wavefront API token. See the Managing API Tokens page.

kubectl create -n observability-system secret generic wavefront-secret --from-literal token=YOUR_WAVEFRONT_TOKEN

Create a wavefront.yaml file with your Wavefront Custom Resource configuration. The simplest configuration is:

# Need to change YOUR_CLUSTER_NAME and YOUR_WAVEFRONT_URL
apiVersion: wavefront.com/v1alpha1
kind: Wavefront
metadata:
  name: wavefront
  namespace: observability-system
spec:
  clusterName: YOUR_CLUSTER_NAME
  wavefrontUrl: YOUR_WAVEFRONT_URL
  dataCollection:
    metrics:
      enable: true
  dataExport:
    wavefrontProxy:
      enable: true

See the Configuration section below for details.
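If you prefer not to edit the placeholders by hand, the substitution can be scripted. The following is a minimal sketch, not part of the Operator tooling: `render_wavefront_cr` is a hypothetical helper name, and it assumes the cluster name and URL contain no `|` or `&` characters, which `sed` would treat specially.

```shell
# Hypothetical helper (not shipped with the Operator).
# Reads a wavefront.yaml template on stdin and fills in the two placeholders.
#   $1 = cluster name, $2 = Wavefront URL
# Assumption: neither value contains '|' or '&'.
render_wavefront_cr() {
  sed -e "s|YOUR_CLUSTER_NAME|$1|g" \
      -e "s|YOUR_WAVEFRONT_URL|$2|g"
}

# Usage (not run here):
#   render_wavefront_cr my-cluster https://example.wavefront.com \
#     < wavefront-template.yaml > wavefront.yaml
```

The placeholder comment at the top of the template is rewritten as well, which is harmless because it is only a comment.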

(Logging Beta) Optionally add the configuration for logging to the wavefront.yaml file. For example:

# Need to change YOUR_CLUSTER_NAME, YOUR_WAVEFRONT_URL accordingly
apiVersion: wavefront.com/v1alpha1
kind: Wavefront
metadata:
  name: wavefront
  namespace: observability-system
spec:
  clusterName: YOUR_CLUSTER_NAME
  wavefrontUrl: YOUR_WAVEFRONT_URL
  dataCollection:
    metrics:
      enable: true
    logging:
      enable: true
  dataExport:
    wavefrontProxy:
      enable: true

See Logs Overview (Beta) for an overview and links to further documentation on the logging beta.

See Bring Your Own Logs Shipper for an overview of how to use the Operator with your own logs shipper.

Deploy the agents with your configuration

kubectl apply -f <path_to_your_wavefront.yaml>

Run the following command to get the status of the Kubernetes integration:

kubectl get wavefront -n observability-system

The command should return a table like the following, displaying Operator instance health:

NAME        STATUS    PROXY           CLUSTER-COLLECTOR   NODE-COLLECTOR   LOGGING        AGE    MESSAGE
wavefront   Healthy   Running (1/1)   Running (1/1)       Running (3/3)    Running (3/3)  2m4s   All components are healthy

If STATUS is Unhealthy, see the troubleshooting documentation.
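In a script or CI job, the same status check can be automated. The sketch below is only an illustration: `check_wavefront_health` is a hypothetical function name, and it assumes STATUS remains the second column of the table, as in the sample output above.

```shell
# Hypothetical helper (not shipped with the Operator).
# Reads the table printed by `kubectl get wavefront` on stdin and
# succeeds only if there is at least one row and every row is Healthy.
# Assumption: STATUS is the second whitespace-separated column.
check_wavefront_health() {
  awk 'NR > 1 { rows++; if ($2 != "Healthy") bad++ }
       END { if (rows == 0 || bad > 0) exit 1 }'
}

# Usage (not run here):
#   kubectl get wavefront -n observability-system | check_wavefront_health \
#     || echo "Integration is not healthy"
```

An empty table is treated as a failure so that a missing custom resource is not mistaken for a healthy one.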

Note: For details on migrating from an existing Helm chart or manual deployment, see Migration.

Configuration

You configure the Observability for Kubernetes Operator with a custom resource file.

When you update the resource file, the Operator picks up the changes and updates the integration deployment accordingly.

To update the custom resource file:

Open the custom resource file for edit.
Change one or more options and save the file.
Run kubectl apply -f <path_to_your_config_file.yaml>.

See below for configuration options.

We have templates for common scenarios. See the comments in each file for usage instructions.

You can see all configuration options in wavefront-full-config.yaml.

Creating Alerts

We have alerts on common Kubernetes issues. For details on creating alerts, see alerts.md.

Observability Failures

Alert name	Description
Observability Status is Unhealthy	The status of the Observability for Kubernetes is unhealthy.

Pod Failures

Alert name	Description
Pod Stuck in Pending	Workload has pod stuck in pending.
Pod Stuck in Terminating	Workload has pod stuck in terminating.
Pod Backoff Event	Workload has pod with container status `ImagePullBackOff` or `CrashLoopBackOff`.
Workload Not Ready	Workload has pods that are not ready.
Pod Out-of-memory Kills	Workload has pod with container status `OOMKilled`.
Container CPU Throttling	Workload has a container with high CPU throttling.
Container CPU Overutilization	Workload has a container with high CPU utilization.
Container Memory Overutilization	Workload has a container with high memory utilization.
Missing etcd leader	etcd cannot elect a leader.

Persistent Volume Failures

Alert name	Description
Persistent Volumes No Claim	Persistent Volume has no claim.
Persistent Volumes Error	Persistent Volume has issues with provisioning.
Persistent Volume Claim Overutilization	Workload has low available disk space for a claimed Persistent Volume.

Node Failures

Alert name	Description
Node Memory Overutilization	Node has high memory utilization.
Node CPU Overutilization	Node has high CPU utilization.
Node Filesystem Overutilization	Node storage is almost full.
Node CPU-request Saturation	Node has overcommitted CPU resource requests.
Node Memory-request Saturation	Node has overcommitted memory resource requests.
Node Disk Pressure	Node has problematic `DiskPressure` condition.
Node Memory Pressure	Node has problematic `MemoryPressure` condition.
Node Condition Not Ready	Node Condition not in Ready state.

Bring Your Own Logs Shipper

The Operator deploys a data export component (wavefront-proxy) that can receive log data and forward it to the Operations for Applications service. Configure your logs shipper to send logs to this component.

Here is a Wavefront Custom Resource example config for this scenario.

To make the best use of your logging solution on Kubernetes, we recommend including the following Kubernetes log attributes:

Log attribute key	Description
`cluster`	The Kubernetes cluster name
`pod_name`	The pod name
`container_name`	The container name
`namespace_name`	The namespace name
`pod_id`	The pod ID
`container_id`	The container ID

In addition to these, here are some general log attributes to configure your logs shipper based on your use case.
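Before pointing a shipper at the proxy, it can help to confirm that its output carries the recommended Kubernetes attributes listed above. The following sketch assumes the shipper emits one JSON object per line with the attributes as top-level keys; that format, and the function name `has_recommended_attributes`, are assumptions for illustration only.

```shell
# Hypothetical check (not part of the Operator).
# Reads one JSON log record on stdin and succeeds only if all six
# recommended Kubernetes attribute keys are present.
# Assumption: attributes appear as quoted JSON keys in the record.
has_recommended_attributes() {
  line=$(cat)
  for key in cluster pod_name container_name namespace_name pod_id container_id; do
    printf '%s' "$line" | grep -q "\"$key\"[[:space:]]*:" || return 1
  done
}

# Usage (not run here):
#   head -n 1 shipper-output.json | has_recommended_attributes \
#     || echo "Record is missing a recommended attribute"
```

This is a plain text match rather than a JSON parse, so it is suitable for a quick sanity check but not for validating nested or escaped content.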

Upgrade

Upgrade the Observability for Kubernetes Operator and underlying agents to a new version by running the following command:

kubectl apply -f https://raw.githubusercontent.com/wavefrontHQ/observability-for-kubernetes/main/deploy/wavefront-operator.yaml

Note: This command will not upgrade any existing deprecated Helm or manual installations. See migration.md for migration instructions.

Downgrade

Go to Releases and find the previous release version number, for example v2.0.3. Replace PREVIOUS_VERSION in the following command with this value:


kubectl apply -f https://github.com/wavefrontHQ/observability-for-kubernetes/releases/download/PREVIOUS_VERSION/wavefront-operator.yaml
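To avoid editing the URL by hand, the version can be passed as a parameter. This is a minimal sketch: `operator_manifest_url` is a hypothetical helper, and the version tag is assumed to be copied exactly as it appears on the Releases page.

```shell
# Hypothetical helper (not shipped with the Operator).
# Prints the release manifest URL for the version tag given as $1.
# Assumption: $1 is a tag copied from the Releases page, e.g. v2.0.3.
operator_manifest_url() {
  printf 'https://github.com/wavefrontHQ/observability-for-kubernetes/releases/download/%s/wavefront-operator.yaml\n' "$1"
}

# Usage (not run here):
#   kubectl apply -f "$(operator_manifest_url v2.0.3)"
```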

Removal

To remove the Observability for Kubernetes Operator from your environment, run the following command:

kubectl delete -f https://raw.githubusercontent.com/wavefrontHQ/observability-for-kubernetes/main/deploy/wavefront-operator.yaml

Contribution

See the Contribution page.
