New Metrics, Logs and Events docs #1631

Merged 4 commits on Aug 23, 2023
52 changes: 30 additions & 22 deletions content/en/flux/monitoring/custom-metrics.md
@@ -16,29 +16,27 @@ as these are more useful for the people administering and maintaining Flux. Most
of the time, the users of Flux who interact with Flux through the Flux custom
resources want to know about the resources they work with. For example, the
state of GitRepositories and their branches or tag references. These metrics can
be scraped by using [kube-state-metrics
(KSM)](https://github.com/kubernetes/kube-state-metrics), which is part of the
[kube-prometheus-stack](https://github.com/prometheus-operator/kube-prometheus).
KSM can be configured to add custom labels to the resource metrics, for example,
some value from the status of a resource or some arbitrary value like a team
name, department name, etc.
be scraped by using [kube-state-metrics (KSM)][kube-state-metrics], which is
part of the [kube-prometheus-stack][kube-prometheus-stack]. KSM can be
configured to add custom labels to the resource metrics, for example, some value
from the status of a resource or some arbitrary value like a team name, department name, etc.

## Set up kube-state-metrics

Kube-state-metrics can be installed along with the whole monitoring stack using
kube-prometheus-stack. The
[fluxcd/flux2-monitoring-example](https://github.com/fluxcd/flux2-monitoring-example)
repository contains example configurations for deploying and configuring
kube-prometheus-stack to monitor Flux. These configurations will be discussed in
detail in the following sections to show how they can be customized. Refer to
[Monitoring with Prometheus](monitoring.md) for detailed installation
instructions.

The [Helm chart values for
kube-prometheus-stack](https://github.com/fluxcd/flux2-monitoring-example/blob/main/kube-prometheus-stack/release.yaml)
configure KSM to run in `custom-resource-state-only` mode. In this state, KSM
will not collect metrics for any of the Kubernetes core resources. The `rbac`
section provides KSM access to list and watch Flux custom resources. If
[fluxcd/flux2-monitoring-example][monitoring-example-repo] repository contains
example configurations for deploying and configuring kube-prometheus-stack to
monitor Flux. These configurations will be discussed in detail in the following
sections to show how they can be customized.

The kube-prometheus-stack Helm chart is used to install the monitoring stack.
The kube-state-metrics related configuration in the chart values exists in a
separate file called
[kube-state-metrics-config.yaml](https://github.com/fluxcd/flux2-monitoring-example/blob/main/monitoring/controllers/kube-prometheus-stack/kube-state-metrics-config.yaml).
It configures KSM to run in `custom-resource-state-only` mode. In this state,
KSM will not collect metrics for any of the Kubernetes core resources. The
`rbac` section provides KSM access to list and watch Flux custom resources. If
image-reflector-controller and image-automation-controller are not used, the
API group (`image.toolkit.fluxcd.io`) and resources (`imagerepositories`,
`imagepolicies`, `imageupdateautomations`) can be removed. The
@@ -50,8 +48,8 @@ kube-apiserver and exporting them as configured.
## Adding custom metrics

The example `customResourceState` values used in the above setup add a metric
called `gotk_resource_info` with labels `name`, `exported_namespace`, and
`ready`.
called `gotk_resource_info` with labels `name`, `exported_namespace`,
`suspended`, `ready`, etc.

```yaml
- name: "resource_info"
@@ -63,7 +61,9 @@ called `gotk_resource_info` with labels `name`, `exported_namespace`, and
name: [metadata, name]
labelsFromPath:
exported_namespace: [metadata, namespace]
suspended: [spec, suspend]
ready: [status, conditions, "[type=Ready]", status]
...
```

This provides the current state of the Flux resources. It can be used to monitor
@@ -95,6 +95,7 @@ customResourceState:
name: [metadata, name]
labelsFromPath:
exported_namespace: [metadata, namespace]
suspended: [spec, suspend]
ready: [status, conditions, "[type=Ready]", status]
- name: "helmrelease_version_info"
help: "The version information of helm release resource."
@@ -176,6 +177,7 @@ customResourceState:
name: [metadata, name]
labelsFromPath:
exported_namespace: [metadata, namespace]
suspended: [spec, suspend]
ready: [status, conditions, "[type=Ready]", status]
branch: [spec, ref, branch]
...
@@ -191,5 +193,11 @@ It contains the `ownedBy="teamA"`, `department="baz"` and `branch="main"`
labels. Similarly, more custom labels can be added depending on the need.
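
To make the `labelsFromPath` entries above more concrete, here is a minimal Python sketch of how such a path could be resolved against a resource. It is an illustration only, not the kube-state-metrics implementation: the `resolve_path` helper and the trimmed `git_repository` object are hypothetical, and the sketch assumes that a `[key=value]` segment selects the first list element whose `key` field equals `value`.

```python
from typing import Any, Optional


def resolve_path(obj: Any, path: list) -> Optional[Any]:
    """Walk a KSM-style path through a nested resource (simplified sketch).

    Plain segments are treated as map keys. A segment written as
    "[key=value]" selects the first list element whose `key` equals `value`.
    Returns None as soon as any segment cannot be resolved.
    """
    current = obj
    for segment in path:
        if current is None:
            return None
        if segment.startswith("[") and segment.endswith("]") and "=" in segment:
            key, _, value = segment[1:-1].partition("=")
            if not isinstance(current, list):
                return None
            current = next(
                (item for item in current
                 if isinstance(item, dict) and str(item.get(key)) == value),
                None,
            )
        elif isinstance(current, dict):
            current = current.get(segment)
        else:
            return None
    return current


# Hypothetical GitRepository object, trimmed to the fields used below.
git_repository = {
    "metadata": {"name": "podinfo", "namespace": "default"},
    "spec": {"suspend": False, "ref": {"branch": "main"}},
    "status": {"conditions": [
        {"type": "Reconciling", "status": "False"},
        {"type": "Ready", "status": "True"},
    ]},
}

# Same label names and paths as in the configuration above.
labels_from_path = {
    "exported_namespace": ["metadata", "namespace"],
    "suspended": ["spec", "suspend"],
    "ready": ["status", "conditions", "[type=Ready]", "status"],
    "branch": ["spec", "ref", "branch"],
}

labels = {name: resolve_path(git_repository, path)
          for name, path in labels_from_path.items()}
print(labels)
```

In this sketch a path that does not resolve yields `None`. How kube-state-metrics itself handles missing values should be checked against its documentation.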

Refer to the [kube-state-metrics custom-resource state configuration
docs](https://github.com/kubernetes/kube-state-metrics/blob/main/docs/customresourcestate-metrics.md)
to learn more about customizing the metrics.
docs][ksm-customresourcestate-metrics] to learn more about customizing the
metrics.


[kube-state-metrics]: https://github.com/kubernetes/kube-state-metrics
[monitoring-example-repo]: https://github.com/fluxcd/flux2-monitoring-example
[kube-prometheus-stack]: https://github.com/prometheus-operator/kube-prometheus
[ksm-customresourcestate-metrics]: https://github.com/kubernetes/kube-state-metrics/blob/main/docs/customresourcestate-metrics.md
133 changes: 126 additions & 7 deletions content/en/flux/monitoring/events.md
@@ -5,11 +5,19 @@ description: "How to monitor the Flux events"
weight: 4
---

The Flux controllers emit Kubernetes events for every reconciliation operation.
The Flux controllers emit [Kubernetes events][kubernetes-events] during the
reconciliation operation to provide information about the object being
reconciled. Unlike logs, events are always associated with an object, which is a
Flux resource in this case. Events are supplemental data that can be used along
with logs to provide a complete picture of controllers' operations. Some of
the events emitted by Flux controllers are also used to send notifications.
See the [Alerts docs](/flux/monitoring/alerts/) to learn more about the Flux
Alerts based on events from controllers. In the following sections, we will go
through the Flux events and how to interpret them.

## Kubernetes events

The Flux controllers events contain the following fields:
The Flux controller events about a resource contain the following fields:

- `type` can be `Normal` or `Warning`
- `firstTimestamp` timestamp in the ISO 8601 format
@@ -18,11 +26,11 @@ The Flux controllers events contain the following fields:
- `reason` short machine understandable string
- `involvedObject` the API version, kind, name and namespace of the Flux object
- `metadata.annotations` the Flux specific metadata e.g. source revision
- `source.component` the Flux controller name
- `source.component` the name of the Flux controller that emitted the event

### Samples
### Examples

Sample of a `Normal` event produced by kustomize-controller:
Example of a `Normal` event produced by kustomize-controller:

```json
{
@@ -52,8 +60,119 @@ Sample of a `Normal` event produced by kustomize-controller:
}
```

In the above example:
- The event is about a `Kustomization` named `flux-system` in the `flux-system`
namespace, indicated by the `involvedObject` field.
- The event originates from `kustomize-controller`, indicated by the
`source.component` field.
- The event is a `Normal` type event about a successful reconciliation,
indicated by the `reason` and `message` fields.
- The `metadata.annotations` field `kustomize.toolkit.fluxcd.io/revision`
contains information about the source revision that was successfully applied
as a result of successful reconciliation of the Kustomization.

Example of a `Warning` event produced by source-controller:

```json
{
"apiVersion": "v1",
"count": 4,
"eventTime": null,
"firstTimestamp": "2023-08-22T20:24:06Z",
"involvedObject": {
"apiVersion": "source.toolkit.fluxcd.io/v1",
"kind": "GitRepository",
"name": "podinfo",
"namespace": "default",
"resourceVersion": "1284973",
"uid": "2c2ed1da-556f-4793-863d-7d96e8bab3f5"
},
"kind": "Event",
"lastTimestamp": "2023-08-22T20:24:18Z",
"message": "failed to checkout and determine revision: unable to clone 'https://github.com/stefanprodan/podinfo': couldn't find remote ref \"refs/tags/v1.8.9\"",
"metadata": {
"creationTimestamp": "2023-08-22T20:24:06Z",
"name": "podinfo.177dce48bc7db3a4",
"namespace": "default",
"resourceVersion": "1285016",
"uid": "3c8f568a-c99b-4279-8093-6ef08fae325b"
},
"reason": "GitOperationFailed",
"reportingComponent": "",
"reportingInstance": "",
"source": {
"component": "source-controller"
},
"type": "Warning"
}
```

In the above example:
- The event is about a `GitRepository` named `podinfo` in the `default`
namespace, indicated by the `involvedObject` field.
- The event originates from `source-controller`, indicated by the
`source.component` field.
- The event is a `Warning` type event about a failed Git operation, indicated by
the `reason` and `message` fields.
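
The same reading of an event can be done programmatically. The following Python sketch is a hypothetical helper, not part of Flux or kubectl: it takes an event as a dictionary (for example, one parsed from `kubectl get events -o json`) and builds a one-line summary from the fields described above. The `summarize_event` name and the output format are assumptions made for illustration, and the event is a trimmed copy of the `Warning` example.

```python
def summarize_event(event: dict) -> str:
    """Build a one-line summary from the fields Flux sets on an event."""
    obj = event.get("involvedObject", {})
    component = event.get("source", {}).get("component", "unknown")
    return (
        f"[{event.get('type')}] {obj.get('kind')}/{obj.get('name')} "
        f"in namespace {obj.get('namespace')} "
        f"({event.get('reason')}, from {component}, "
        f"seen {event.get('count', 1)}x): {event.get('message')}"
    )


# Trimmed copy of the Warning event from the example above.
warning_event = {
    "count": 4,
    "involvedObject": {
        "apiVersion": "source.toolkit.fluxcd.io/v1",
        "kind": "GitRepository",
        "name": "podinfo",
        "namespace": "default",
    },
    "message": (
        "failed to checkout and determine revision: unable to clone "
        "'https://github.com/stefanprodan/podinfo': couldn't find remote "
        "ref \"refs/tags/v1.8.9\""
    ),
    "reason": "GitOperationFailed",
    "source": {"component": "source-controller"},
    "type": "Warning",
}

summary = summarize_event(warning_event)
print(summary)
```

A helper like this could be used to filter for `Warning` events only, by checking `event["type"]` before printing.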

## Events inspection with kubectl

```shell
kubectl events -n monitoring --for helmreleaase/kube-prom-stack
The events associated with a Flux resource can be queried using the
`kubectl events` command:

```console
$ kubectl events -n flux-system --for kustomization/flux-system
LAST SEEN TYPE REASON OBJECT MESSAGE
58m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 448.00332ms, next run in 10m0s
48m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 486.826649ms, next run in 10m0s
38m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 502.282127ms, next run in 10m0s
28m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 543.745587ms, next run in 10m0s
18m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 465.177441ms, next run in 10m0s
8m27s Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 494.543068ms, next run in 10m0s
```

This shows all the events associated with the queried resource within the last hour.

## Events inspection with flux CLI

The events associated with a Flux resource can be queried using the `flux
events` CLI command:

```console
$ flux events --for Kustomization/flux-system
LAST SEEN TYPE REASON OBJECT MESSAGE
52m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 506.467ms, next run in 10m0s
42m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 531.072726ms, next run in 10m0
32m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 506.673992ms, next run in 10m0
22m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 512.255817ms, next run in 10m0
12m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 507.521248ms, next run in 10m0
2m31s Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 448.00332ms, next run in 10m0s
```

This can also be used to watch all the events issued by the Flux controllers
across all the namespaces:

```console
$ flux events --all-namespaces --watch
NAMESPACE LAST SEEN TYPE REASON OBJECT MESSAGE
flux-system 34m (x3 over 154m) Normal GitOperationSucceeded GitRepository/flux-system no changes since last reconcilation: observed revision 'main@sha1:4d768edba5d409feb60870dd3b0ac0d307299898'
flux-system 54m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 486.814878ms, next run in 10m0s
flux-system 44m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 486.203813ms, next run in 10m0s
flux-system 34m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 512.160373ms, next run in 10m0s
flux-system 24m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 543.806383ms, next run in 10m0s
flux-system 14m Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 524.293527ms, next run in 10m0s
flux-system 4m5s Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 522.671955ms, next run in 10m0s
flux-system 47s Normal ReconciliationSucceeded Kustomization/flux-system Reconciliation finished in 523.892245ms, next run in 10m0s
flux-system 34m Normal ReconciliationSucceeded Kustomization/monitoring-configs Reconciliation finished in 104.609707ms, next run in 1h0m0s
flux-system 42s Normal ReconciliationSucceeded Kustomization/monitoring-configs Reconciliation finished in 90.70521ms, next run in 1h0m0s
flux-system 34m Normal ReconciliationSucceeded Kustomization/monitoring-controllers Reconciliation finished in 118.651968ms, next run in 1h0m0s
flux-system 39s Normal ReconciliationSucceeded Kustomization/monitoring-controllers Reconciliation finished in 132.34839ms, next run in 1h0m0s
monitoring 34m (x3 over 154m) Normal ArtifactUpToDate HelmChart/monitoring-kube-prometheus-stack artifact up-to-date with remote revision: '48.3.3'
monitoring 34m (x3 over 154m) Normal ArtifactUpToDate HelmChart/monitoring-loki-stack artifact up-to-date with remote revision: '2.9.11'
```

Refer to the [`flux events`](/flux/cmd/flux_events/) CLI docs to learn more
about it.


[kubernetes-events]: https://kubernetes.io/docs/reference/kubernetes-api/cluster-resources/event-v1/