[Stack Monitoring] Kibana monitoring with Metricbeat and Filebeat as sidecars #4618

Merged 34 commits from kibana-stack-monitoring into elastic:master on Jul 12, 2021

Conversation


@thbkrkr thbkrkr commented Jul 6, 2021

Adds a new monitoring field to the Kibana resource to configure one or two Elasticsearch references, setting up stack monitoring with Metricbeat and log delivery with Filebeat. The data collected by the Beats is sent to the referenced Elasticsearch clusters.

Very similar to #4528.

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: alpha
  namespace: production
spec:
  version: 7.14.0-SNAPSHOT
  monitoring:
    metrics:
      elasticsearchRefs: 
        - name: monitoring
          namespace: observability
    logs:
      elasticsearchRefs:
        - name: monitoring
          namespace: observability
YAML example for testing
apiVersion: v1
kind: Namespace
metadata:
  name: production
---
apiVersion: v1
kind: Namespace
metadata:
  name: observability
---
#######################################################################
# Monitored Elasticsearch
#######################################################################
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: alpha
  namespace: production
spec:
  version: 7.14.0-SNAPSHOT
  # --------------------- #
  monitoring:
    metrics:
      elasticsearchRefs:
        - name: monitoring
          namespace: observability
    logs:
      elasticsearchRefs:
        - name: monitoring
          namespace: observability
  # --------------------- #
  nodeSets:
  - name: master
    count: 2
    config:
      node.store.allow_mmap: false
    # podTemplate:
    #   spec:
    #     containers:
    #       - name: metricbeat
    #         resources:
    #           limits:
    #             memory: 150Mi
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: alpha
  namespace: production
spec:
  version: 7.14.0-SNAPSHOT
  count: 1
  # --------------------- #
  monitoring:
    metrics:
      elasticsearchRefs: 
        - name: monitoring
          namespace: observability
    logs:
      elasticsearchRefs:
        - name: monitoring
          namespace: observability
  # --------------------- #
  elasticsearchRef:
    name: alpha
    namespace: production
---
#######################################################################
# Monitoring clusters
#######################################################################
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: monitoring
  namespace: observability
spec:
  version: 7.14.0-SNAPSHOT
  nodeSets:
  - name: master
    count: 2
    config:
      node.store.allow_mmap: false
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: monitoring
  namespace: observability
spec:
  version: 7.14.0-SNAPSHOT
  count: 1
  config:
    xpack.monitoring.ui.enabled: false
  elasticsearchRef:
    name: monitoring
    namespace: observability

Implementation notes:

  • Add a new KbMonitoringAssociation to manage the "1 Kibana / n monitoring Elasticsearch clusters" association
  • Update HasMonitoring so that the validations and the base Metricbeat config build are generic

Limitations:

  • Minimum supported Stack version is 7.14.0 (to benefit from ES_LOG_STYLE=file)
  • With a custom Kibana image, you have to override the podTemplate to define the custom Beat images.
  • The monitored Kibana is not deployed while the monitoring ES clusters are not ready or the associations are not yet configured.
  • monitoring.[metrics|logs].elasticsearchRefs accepts only one Elasticsearch reference. It is a slice to future-proof the API for Elastic Agent.
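
The custom-image limitation above can be addressed by overriding the Beat containers in the Kibana podTemplate. A minimal sketch, assuming the sidecar containers are named metricbeat and filebeat (as in the commented podTemplate example above) and using placeholder image references:

```yaml
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: alpha
  namespace: production
spec:
  version: 7.14.0-SNAPSHOT
  count: 1
  monitoring:
    metrics:
      elasticsearchRefs:
        - name: monitoring
          namespace: observability
  podTemplate:
    spec:
      containers:
        # Container names match the sidecars injected by the operator;
        # the image references below are placeholders.
        - name: metricbeat
          image: my-registry.example.com/beats/metricbeat:7.14.0
        - name: filebeat
          image: my-registry.example.com/beats/filebeat:7.14.0
```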

Relates to #4183.

@thbkrkr thbkrkr added >feature Adds or discusses adding a feature to the product v1.7.0 labels Jul 6, 2021
@thbkrkr thbkrkr force-pushed the kibana-stack-monitoring branch from 3a0fd41 to 5043bde Compare July 7, 2021 22:25
@thbkrkr thbkrkr force-pushed the kibana-stack-monitoring branch 2 times, most recently from 6b1b961 to 9f9c07f Compare July 7, 2021 22:43
@thbkrkr thbkrkr force-pushed the kibana-stack-monitoring branch from 9f9c07f to dc46d1e Compare July 8, 2021 08:08
@thbkrkr thbkrkr marked this pull request as ready for review July 8, 2021 09:05

sebgl commented Jul 8, 2021

I have checked out the branch and am trying with this manifest:

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kb
spec:
  version: 7.14.0-SNAPSHOT
  count: 1
  elasticsearchRef:
    name: quickstart
  monitoring:
    metrics:
      elasticsearchRefs: 
        - name: quickstart
    logs:
      elasticsearchRefs:
        - name: quickstart
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.14.0-SNAPSHOT
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false

I'm getting this error in the logs:

	ERROR	manager.eck-operator.controller.kibana-controller	Reconciler error	{"service.version": "1.7.0-SNAPSHOT+7c474bed", "name": "kb", "namespace": "default", "error": "Secret \"quickstart-es-internal-users\" not found", "errorCauses": [{"error": "Secret \"quickstart-es-internal-users\" not found"}]}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/Users/sebgl/work/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.2/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/Users/sebgl/work/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.2/pkg/internal/controller/controller.go:214

And I don't have stack monitoring metrics visible in Kibana :(
Am I doing something wrong?

@sebgl sebgl left a comment

It's great that you could factorize Kibana and Elasticsearch monitoring into a single package!
I couldn't make things work (see my other comment).

pkg/controller/association/controller/kb_monitoring.go

// buildMetricbeatBaseConfig builds the base configuration for Metricbeat with the Elasticsearch or Kibana modules used
// to collect metrics for Stack Monitoring
func buildMetricbeatBaseConfig(
let's unit test this?

pkg/controller/common/stackmon/monitoring/monitoring.go
}
}
return associations
}
Let's unit test those functions?

pkg/controller/kibana/controller.go
pkg/controller/kibana/stackmon/filebeat.yml
pkg/controller/kibana/stackmon/kb_config.go
pkg/controller/kibana/stackmon/kb_config.go
pkg/controller/kibana/stackmon/sidecar.go
test/e2e/test/checks/monitoring.go

thbkrkr commented Jul 8, 2021

And don't have stack monitoring metrics visible in Kibana :(

I forgot to mention that you also need to enable monitoring on the Elasticsearch cluster associated with the monitored Kibana to be able to take advantage of the Kibana Monitoring UI (https://www.elastic.co/guide/en/kibana/current/monitoring-data.html#monitoring-data).
If you only enable monitoring on Kibana, you will get metrics and logs collection without the monitoring view.

I wondered if we should force or validate this, and I think it should be left open, because it can make sense to be interested only in metrics and logs collection without the monitoring view.

Am I doing something wrong?

No, you just tested with an elasticsearchRef with no namespace and it revealed a bug. I've always tested with namespaces 🤦. I will fix this.

Also, monitoring Kibana will not work if it doesn't have an associated Elasticsearch, because we need to be able to retrieve the internal user dedicated to metrics collection to configure the Metricbeat kibana module. That, on the other hand, I think makes more sense to validate.

@thbkrkr thbkrkr force-pushed the kibana-stack-monitoring branch from 55246e0 to 30579ac Compare July 8, 2021 18:37

sebgl commented Jul 9, 2021

I forgot to mention that you also need to enable monitoring on the Elasticsearch cluster associated with the monitored Kibana to be able to take advantage of the Kibana Monitoring UI (https://www.elastic.co/guide/en/kibana/current/monitoring-data.html#monitoring-data).
If you only enable monitoring on Kibana, you will get metrics and logs collection without the monitoring view.
I wondered if we should force or validate this, and I think it should be left open, because it can make sense to be interested only in metrics and logs collection without the monitoring view.

Considering it's a "limitation" / behaviour of the stack and not of ECK I think it's fine to leave as is indeed 👍

Also, monitoring Kibana will not work if it doesn't have an associated Elasticsearch, because we need to be able to retrieve the internal user dedicated to metrics collection to configure the Metricbeat kibana module. That, on the other hand, I think makes more sense to validate.

👍

@sebgl sebgl left a comment

I did a few tests; it seems to work fine 👍
Left a few last comments, this is almost ready to 🚢

pkg/apis/kibana/v1/webhook.go
pkg/controller/kibana/stackmon/sidecar.go
@sebgl sebgl left a comment

LGTM


thbkrkr commented Jul 9, 2021

I enabled audit logs and configured Filebeat to collect the audit logs file, but I forgot to configure the audit logger to write to a file.

We discussed this and decided to configure everything so that audit logs delivery is ready, while leaving the user the choice to enable it. By default, audit logs are off.
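
Under that design, enabling audit logs delivery should only require turning the audit logger on in the Kibana configuration. A hedged sketch; the setting name below is the standard 7.x audit flag, and additional appender configuration may be required depending on the Stack version:

```yaml
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: alpha
  namespace: production
spec:
  version: 7.14.0-SNAPSHOT
  count: 1
  config:
    # Opt in to audit logging; the preconfigured Filebeat sidecar
    # can then ship the audit log file (assumed 7.x setting name)
    xpack.security.audit.enabled: true
```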


thbkrkr commented Jul 9, 2021

I found a small bug in SetAssociationStatusMap, and I somewhat messed up the E2E test during a last refactor. I fixed both, but there is one strange thing left: the metricbeat index is not populated with Kibana monitoring data. I need to double-check whether that's normal when I come back on the 19th.

I'm leaving the PR open in case you still want to check and test, but I think it's OK to merge.


thbkrkr commented Jul 9, 2021

jenkins test this please


func (c StackMonitoringChecks) CheckMetricbeatIndex() test.Step {
	return test.Step{
		Name: "Check that documents are indexed in one metricbeat-* index",
@barkbay barkbay Jul 12, 2021


I believe Kibana metrics are expected to be found in .monitoring-kibana- when xpack.enabled is set to true:

metricbeat-7.14.0-2021.07.12-000001 0 p STARTED  0   208b 10.55.208.22 test-kb-mon-metrics-lf6v-es-masterdata-1
metricbeat-7.14.0-2021.07.12-000001 0 r STARTED  0   208b 10.55.210.23 test-kb-mon-metrics-lf6v-es-masterdata-0
.monitoring-kibana-7-mb-2021.07.12  0 r STARTED  8 59.5kb 10.55.208.22 test-kb-mon-metrics-lf6v-es-masterdata-1
.monitoring-kibana-7-mb-2021.07.12  0 p STARTED  8 59.4kb 10.55.210.23 test-kb-mon-metrics-lf6v-es-masterdata-0

	}
	return test.StepList{
		c.CheckBeatSidecars(),
		// c.CheckMetricbeatIndex(), TODO: investigate if it's normal that there is no document in this index when es monitoring is off
See my other comment; I'm not sure I understand why enabling ES monitoring would move the stack metrics from .monitoring-* to metricbeat-*. I'll try to investigate.

If I enable Elasticsearch monitoring, I get the expected .monitoring-es- index, but a metricbeat- index is also created:

.monitoring-kibana-7-mb-2021.07.12  0     p      STARTED  142 183.4kb 10.55.210.25 monitoring-es-default-1
.monitoring-kibana-7-mb-2021.07.12  0     r      STARTED  142   166kb 10.55.209.29 monitoring-es-default-2
.monitoring-es-7-mb-2021.07.12      0     p      STARTED  839     1mb 10.55.210.25 monitoring-es-default-1
.monitoring-es-7-mb-2021.07.12      0     r      STARTED  195 682.8kb 10.55.208.23 monitoring-es-default-0
metricbeat-7.14.0-2021.07.12-000001 0     r      STARTED   24 151.6kb 10.55.208.23 monitoring-es-default-0
metricbeat-7.14.0-2021.07.12-000001 0     p      STARTED   24 151.6kb 10.55.209.29 monitoring-es-default-2

Looking at these 24 documents in metricbeat-7.14.0-2021.07.12-000001, they are actually all errors which, I guess, happened during Metricbeat startup:

Time _index error.message
Jul 12, 2021 @ 13:28:09.677 metricbeat-7.14.0-2021.07.12-000001 error determining if connected Elasticsearch node is master: error making http request: Get "https://localhost:9200/_nodes/_local/nodes": dial tcp 127.0.0.1:9200: connect: connection refused

I think metricbeat-* should not be used in e2e tests to assess that monitoring is working as expected.


Fixed in b600503


barkbay commented Jul 12, 2021

If you only enable monitoring on Kibana, you will get metrics and logs collection without the monitoring view.
I wondered if we should force or validate this, and I think it should be left open, because it can make sense to be interested only in metrics and logs collection without the monitoring view.

I hit the same issue; I was also not able to get metrics in the monitoring view. I guess it is because metrics are stored in .monitoring-kibana-7-mb-* instead of metricbeat-*.
I would be tempted to say that this is not an ECK-specific issue and that things should be improved in Kibana: if there is data in any of the monitoring indices, the user should be able to see it.


barkbay commented Jul 12, 2021

jenkins test this please

16:02:02      integration.go:97: 
16:02:02          	Error Trace:	integration.go:97
16:02:02          	            				handler_integration_test.go:82
16:02:02          	Error:      	Received unexpected error:
16:02:02          	            	failed waiting for all runnables to end within grace period of 30s: context deadline exceeded
16:02:02          	Test:       	TestDynamicEnqueueRequest


barkbay commented Jul 12, 2021

jenkins test this please

16:42:02      integration.go:97: 
16:42:02          	Error Trace:	integration.go:97
16:42:02          	            				handler_integration_test.go:82
16:42:02          	Error:      	Received unexpected error:
16:42:02          	            	failed waiting for all runnables to end within grace period of 30s: context deadline exceeded
16:42:02          	Test:       	TestDynamicEnqueueRequest

@barkbay barkbay left a comment


LGTM 🚀

@barkbay barkbay merged commit fb8205a into elastic:master Jul 12, 2021
@barkbay barkbay changed the title Kibana monitoring with Metricbeat and Filebeat as sidecars [Stack Monitoring] Kibana monitoring with Metricbeat and Filebeat as sidecars Jul 20, 2021
@thbkrkr thbkrkr deleted the kibana-stack-monitoring branch August 31, 2021 15:28