The Observability for Kubernetes Operator deploys the necessary agents to monitor your clusters and workloads in Kubernetes. The Operator is built with the Kubebuilder SDK.
Important: Logs (Beta) is enabled only for selected customers. If you’d like to participate, contact your Observability account representative.
The Operator simplifies operational aspects of managing the Kubernetes Integration for VMware Aria Operations for Applications (formerly known as Tanzu Observability by Wavefront). Here are some examples, with more to come!
- Enhanced status reporting of the Kubernetes Integration so that users can ensure their cluster and Kubernetes resources are reporting data.
- Kubernetes Operator features provide a declarative mechanism for deploying the necessary agents in a Kubernetes environment.
- Centralized configuration.
- Enhanced configuration validation to surface what needs to be corrected in order to deploy successfully.
- Efficient Kubernetes resource usage supports scaling out the cluster (leader) node and worker nodes independently.
Note: The Kubernetes Metrics Collector that is deployed by this Operator still supports configuration via ConfigMap. For example, Istio and MySQL metrics, Telegraf configuration, and so on are still supported. For details on the Collector, see collector.md.
Note: The Observability for Kubernetes Operator Helm chart is deprecated and no longer supported. Use the deploy, upgrade, and removal instructions below instead.
To install the integration, you must use the kubectl tool.
- Install the Observability for Kubernetes Operator into the observability-system namespace.

  Note: If you already have the deprecated Kubernetes Integration installed by using Helm or manual deployment, uninstall it before you install the Operator.

      kubectl apply -f https://raw.githubusercontent.com/wavefrontHQ/observability-for-kubernetes/main/deploy/wavefront-operator.yaml

- Create a Kubernetes secret with your Wavefront API token. See the Managing API Tokens page.

      kubectl create -n observability-system secret generic wavefront-secret --from-literal token=YOUR_WAVEFRONT_TOKEN

- Create a wavefront.yaml file with your Wavefront Custom Resource configuration. The simplest configuration is:

      # Need to change YOUR_CLUSTER_NAME and YOUR_WAVEFRONT_URL
      apiVersion: wavefront.com/v1alpha1
      kind: Wavefront
      metadata:
        name: wavefront
        namespace: observability-system
      spec:
        clusterName: YOUR_CLUSTER_NAME
        wavefrontUrl: YOUR_WAVEFRONT_URL
        dataCollection:
          metrics:
            enable: true
        dataExport:
          wavefrontProxy:
            enable: true

  See the Configuration section below for details.

- (Logging Beta) Optionally add the configuration for logging to the wavefront.yaml file. For example:

      # Need to change YOUR_CLUSTER_NAME, YOUR_WAVEFRONT_URL accordingly
      apiVersion: wavefront.com/v1alpha1
      kind: Wavefront
      metadata:
        name: wavefront
        namespace: observability-system
      spec:
        clusterName: YOUR_CLUSTER_NAME
        wavefrontUrl: YOUR_WAVEFRONT_URL
        dataCollection:
          metrics:
            enable: true
          logging:
            enable: true
        dataExport:
          wavefrontProxy:
            enable: true

  See Logs Overview (Beta) for an overview and links to more documentation about the logging beta.

  See Bring Your Own Logs Shipper for an overview of how to use the Operator with your own logs shipper.

- Deploy the agents with your configuration:

      kubectl apply -f <path_to_your_wavefront.yaml>

- Run the following command to get the status of the Kubernetes integration:

      kubectl get wavefront -n observability-system

  The command should return a table like the following, displaying Operator instance health:

      NAME        STATUS    PROXY           CLUSTER-COLLECTOR   NODE-COLLECTOR   LOGGING         AGE    MESSAGE
      wavefront   Healthy   Running (1/1)   Running (1/1)       Running (3/3)    Running (3/3)   2m4s   All components are healthy

  If STATUS is Unhealthy, check troubleshooting.

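For scripted checks (for example, in a CI job), the STATUS column from the last step can be tested rather than read by eye. The following is a minimal sketch and not part of the Operator: the function name `wavefront_is_healthy` is hypothetical, and it assumes the table layout shown above, with STATUS as the second whitespace-separated column.

```shell
# Hypothetical helper: reads `kubectl get wavefront -n observability-system`
# output on stdin and succeeds only if at least one resource is listed and
# every listed resource reports STATUS "Healthy".
wavefront_is_healthy() {
  awk '
    NR == 1 { next }                      # skip the header row
    NF == 0 { next }                      # skip blank lines
    { rows++; if ($2 != "Healthy") bad++ }
    END { exit ((rows > 0 && bad == 0) ? 0 : 1) }
  '
}

# Usage (requires kubectl and access to the cluster):
#   kubectl get wavefront -n observability-system | wavefront_is_healthy \
#     || echo "Wavefront integration is not healthy; check troubleshooting"
```

Because the check only looks at the second column, it would need adjusting if a future release changes the column order.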
Note: For details on migrating from an existing Helm chart or manual deployment, see Migration.
You configure the Observability for Kubernetes Operator with a custom resource file.
When you update the resource file, the Operator picks up the changes and updates the integration deployment accordingly.
To update the custom resource file:
- Open the custom resource file for edit.
- Change one or more options and save the file.
- Run kubectl apply -f <path_to_your_config_file.yaml>.
See below for configuration options.
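If you prefer to generate the custom resource file rather than edit it by hand, a small shell function can render it. This is a sketch under assumptions: `render_wavefront_cr` is a hypothetical helper name, and it emits only the minimal configuration from the installation steps, so any other option still has to be added to the file yourself.

```shell
# Hypothetical helper: prints a minimal Wavefront custom resource.
#   $1 = cluster name, $2 = Wavefront URL
render_wavefront_cr() {
  cluster_name=$1
  wavefront_url=$2
  cat <<EOF
apiVersion: wavefront.com/v1alpha1
kind: Wavefront
metadata:
  name: wavefront
  namespace: observability-system
spec:
  clusterName: ${cluster_name}
  wavefrontUrl: ${wavefront_url}
  dataCollection:
    metrics:
      enable: true
  dataExport:
    wavefrontProxy:
      enable: true
EOF
}

# Usage (requires kubectl and access to the cluster):
#   render_wavefront_cr YOUR_CLUSTER_NAME YOUR_WAVEFRONT_URL > wavefront.yaml
#   kubectl apply -f wavefront.yaml
```

Keeping the rendered file under version control lets you review each change before the Operator picks it up.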
We have templates for common scenarios. See the comments in each file for usage instructions.
- Using a custom private registry
- With plugin configuration in a secret
- Filtering metrics upon collection
- Disabling control plane metrics
- Collecting metrics from ETCD
- Defining Kubernetes resource limits
- Defining data collection pod tolerations
- Defining proxy pre-processor rules
- Enabling proxy histogram support
- Enabling proxy tracing support
- Using an HTTP Proxy
- Getting started with logging configuration
- Full logging configuration
- Bring your own logs shipper
You can see all configuration options in the wavefront-full-config.yaml.
We have alerts on common Kubernetes issues. For details on creating alerts, see alerts.md.
Alert name | Description |
---|---|
Observability Status is Unhealthy | The status of the Observability for Kubernetes is unhealthy. |
Alert name | Description |
---|---|
Pod Stuck in Pending | Workload has pod stuck in pending. |
Pod Stuck in Terminating | Workload has pod stuck in terminating. |
Pod Backoff Event | Workload has pod with container status ImagePullBackOff or CrashLoopBackOff . |
Workload Not Ready | Workload has pods that are not ready. |
Pod Out-of-memory Kills | Workload has pod with container status OOMKilled . |
Container CPU Throttling | Workload has a container with high CPU throttling. |
Container CPU Overutilization | Workload has a container with high CPU utilization. |
Container Memory Overutilization | Workload has a container with high memory utilization. |
Missing etcd leader | etcd cannot elect a leader. |
Alert name | Description |
---|---|
Persistent Volumes No Claim | Persistent Volume has no claim. |
Persistent Volumes Error | Persistent Volume has issues with provisioning. |
Persistent Volume Claim Overutilization | Workload has low available disk space for a claimed Persistent Volume. |
Alert name | Description |
---|---|
Node Memory Overutilization | Node has high memory utilization. |
Node CPU Overutilization | Node has high CPU utilization. |
Node Filesystem Overutilization | Node storage is almost full. |
Node CPU-request Saturation | Node has overcommitted CPU resource requests. |
Node Memory-request Saturation | Node has overcommitted memory resource requests. |
Node Disk Pressure | Node has problematic DiskPressure condition. |
Node Memory Pressure | Node has problematic MemoryPressure condition. |
Node Condition Not Ready | Node Condition not in Ready state. |
The Operator deploys a data export component (wavefront-proxy) that can receive log data and forward it to the Operations for Applications service. Configure your logs shipper to send logs to this wavefront-proxy component.
Here is a Wavefront Custom Resource example config for this scenario.
To make the best use of your logging solution on Kubernetes, we recommend setting the following Kubernetes log attributes:
Log attribute key | Description |
---|---|
cluster | The kubernetes cluster name |
pod_name | The pod name |
container_name | The container name |
namespace_name | The namespace name |
pod_id | The pod id |
container_id | The container id |
In addition to these, here are some general log attributes to configure your logs shipper based on your use case.
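To check that your logs shipper attaches the recommended Kubernetes attributes, you could inspect a sample record. The sketch below assumes the shipper emits flat JSON records and uses a naive substring match on quoted key names. The function name `missing_log_attributes` is hypothetical and is not part of the Operator or of any logs shipper.

```shell
# Hypothetical helper: prints each recommended attribute key that does not
# appear (as a quoted JSON key) in the log record passed as $1.
missing_log_attributes() {
  record=$1
  for key in cluster pod_name container_name namespace_name pod_id container_id; do
    case $record in
      *"\"$key\""*) ;;                    # key is present, nothing to report
      *) printf '%s\n' "$key" ;;          # key is missing
    esac
  done
}

# Usage: pass one JSON log record as a single argument.
#   missing_log_attributes '{"cluster":"c1","pod_name":"p1"}'
```

An empty result means all six keys were found. The match is textual, so a key name that appears only inside a string value would also count as present.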
Upgrade the Observability for Kubernetes Operator and underlying agents to a new version by running the following command:
kubectl apply -f https://raw.githubusercontent.com/wavefrontHQ/observability-for-kubernetes/main/deploy/wavefront-operator.yaml
Note: This command will not upgrade any existing deprecated Helm or manual installations. See migration.md for migration instructions.
Go to Releases, and find the previous release version number, for example v2.0.3. Use this value to replace PREVIOUS_VERSION in the following command:
kubectl apply -f https://github.com/wavefrontHQ/observability-for-kubernetes/releases/download/PREVIOUS_VERSION/wavefront-operator.yaml
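Substituting the version by hand is easy to get wrong, so the URL can be built by a small function. This is a sketch: `operator_manifest_url` is a hypothetical helper name, and the version check assumes release tags look like v2.0.3, as in the example above.

```shell
# Hypothetical helper: prints the release manifest URL for the version in $1.
# Rejects values that do not look like a vMAJOR.MINOR.PATCH tag, which catches
# an unreplaced PREVIOUS_VERSION placeholder.
operator_manifest_url() {
  version=$1
  case $version in
    v[0-9]*.[0-9]*.[0-9]*) ;;
    *)
      echo "expected a version like v2.0.3, got: $version" >&2
      return 1
      ;;
  esac
  printf 'https://github.com/wavefrontHQ/observability-for-kubernetes/releases/download/%s/wavefront-operator.yaml\n' "$version"
}

# Usage (requires kubectl and access to the cluster):
#   kubectl apply -f "$(operator_manifest_url v2.0.3)"
```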
To remove the Observability for Kubernetes Operator from your environment, run the following command:
kubectl delete -f https://raw.githubusercontent.com/wavefrontHQ/observability-for-kubernetes/main/deploy/wavefront-operator.yaml
See the Contribution page