[8.13](backport #38553) Setting Prometheus Remote_write Period default value to 1m #38811

Merged
merged 2 commits into from
Apr 10, 2024

Conversation

mergify[bot]
Contributor

@mergify mergify bot commented Apr 10, 2024

  • Bug

Proposed commit message

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  1. Install a local kind cluster
  2. Install prometheus in remote_write configuration in your k8s cluster
  3. Checkout this branch
  4. Navigate to beats/x-pack/metricbeat
  5. Build a new image with command PLATFORMS=linux/arm64 PACKAGES=docker mage package
  6. Load your image in your k8s cluster with
    kind load docker-image docker.elastic.co/beats/metricbeat:8.14.0 --name kind-cluster
  7. Install following metricbeat manifest
metricbeat manifest for prometheus
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: kube-system
  labels:
    k8s-app: metricbeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  - persistentvolumes
  - persistentvolumeclaims
  verbs: ["get", "list", "watch"]
# Enable this rule only if planning to use Kubernetes keystore
#- apiGroups: [""]
#  resources:
#  - secrets
#  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  - daemonsets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["get", "list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources:
    - storageclasses
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat
  # should be the namespace where metricbeat is running
  namespace: kube-system
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: kube-system
roleRef:
  kind: Role
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat-kubeadm-config
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: kube-system
roleRef:
  kind: Role
  name: metricbeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: kube-system
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    logging.level: debug

    output.elasticsearch:
      hosts: ['https://elasticsearch:9200']
      ssl.verification_mode: none
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: kube-system
  labels:
    k8s-app: metricbeat
data:
  prometheus.yml: |-
    - module: prometheus
      metricsets: ["remote_write"]
      host: "0.0.0.0"
      port: "9201"
      use_types: true
      rate_counters: true
---
# Deploy a Metricbeat instance per node for node metrics retrieval
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: kube-system
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        #image: docker.elastic.co/beats/metricbeat:8.14.0-SNAPSHOT
        image: docker.elastic.co/beats/metricbeat:8.14.0
        imagePullPolicy: Never
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 1500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---

Note: If metrics are being dropped, you might need to increase the mapping total fields limit from the Dev Console:

PUT .ds-metricbeat-8.14.0-2024.03.28-000001/_settings
{
  "index.mapping.total_fields.limit": 99999
}
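
The same settings change can also be sent from a script instead of the Dev Console. The following is a minimal sketch, not part of this PR: `newFieldLimitRequest` is a hypothetical helper, and the base URL is an assumption taken from the `output.elasticsearch` host in the manifest above. It only builds the request; authentication and sending it are left out.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newFieldLimitRequest is a hypothetical helper (not from the PR). It builds
// the PUT <index>/_settings request shown in the note above.
func newFieldLimitRequest(baseURL, index string, limit int) (*http.Request, error) {
	// Body matches the note: {"index.mapping.total_fields.limit": <limit>}
	body, err := json.Marshal(map[string]int{
		"index.mapping.total_fields.limit": limit,
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/"+index+"/_settings", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumed base URL; the index name is the one used in the note above.
	req, err := newFieldLimitRequest(
		"https://elasticsearch:9200",
		".ds-metricbeat-8.14.0-2024.03.28-000001",
		99999,
	)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The backing index name contains the Metricbeat version and a creation date, so it will differ in other clusters.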

Related issues

Screenshots

Screenshot 2024-03-28 at 2 43 15 PM

Logs

With no period set (the default applies, that is, period: 60s):

k logs -n kube-system metricbeat-nmq8d -f | grep -i 'Period for counter'
{"log.level":"info","@timestamp":"2024-03-28T12:38:39.557Z","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/remote_write.remoteWriteEventsGeneratorFactory","file.name":"remote_write/data.go","file.line":52},"message":"Period for counter: 1m0s","service.name":"metricbeat","ecs.version":"1.6.0"}

With period: 108s

k logs -n kube-system metricbeat-nr68p -f | grep -i 'Period for counter'
{"log.level":"info","@timestamp":"2024-03-28T12:41:45.434Z","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/remote_write.remoteWriteEventsGeneratorFactory","file.name":"remote_write/data.go","file.line":52},"message":"Period for counter: 1m48s","service.name":"metricbeat","ecs.version":"1.6.0"}

With period: 35m

k logs -n kube-system metricbeat-k5ssl -f | grep -i 'Period for counter'
{"log.level":"info","@timestamp":"2024-03-28T12:39:36.342Z","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/remote_write.remoteWriteEventsGeneratorFactory","file.name":"remote_write/data.go","file.line":52},"message":"Period for counter: 35m0s","service.name":"metricbeat","ecs.version":"1.6.0"}
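
The period values in these log lines are printed in Go's `time.Duration` format, which is why a configured `108s` shows up as `1m48s`. The sketch below illustrates the defaulting behaviour this PR describes. It is not the module's actual code: `effectivePeriod` and `defaultPeriod` are hypothetical names, and treating an empty string as "no period set" is an assumption made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// defaultPeriod mirrors the 1m default described in this PR.
const defaultPeriod = time.Minute

// effectivePeriod is a hypothetical helper, not the module's real code.
// It returns the default when no period is configured and otherwise
// parses the configured value as a Go duration string.
func effectivePeriod(configured string) (time.Duration, error) {
	if configured == "" {
		return defaultPeriod, nil
	}
	return time.ParseDuration(configured)
}

func main() {
	// The three cases exercised in the logs above: unset, 108s, and 35m.
	for _, c := range []string{"", "108s", "35m"} {
		p, err := effectivePeriod(c)
		if err != nil {
			fmt.Println("invalid period:", err)
			continue
		}
		fmt.Printf("Period for counter: %v\n", p)
	}
	// Prints:
	// Period for counter: 1m0s
	// Period for counter: 1m48s
	// Period for counter: 35m0s
}
```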

This is an automatic backport of pull request #38553 done by [Mergify](https://mergify.com).

* Correcting period for Prometheus remote_write

* Update CHANGELOG.next.asciidoc

(cherry picked from commit e798bb1)
@mergify mergify bot requested a review from a team as a code owner April 10, 2024 08:57
@mergify mergify bot added the backport label Apr 10, 2024
@mergify mergify bot requested review from gizas and MichaelKatsoulis and removed request for a team April 10, 2024 08:57
@mergify mergify bot assigned gizas Apr 10, 2024
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 10, 2024
@elasticmachine
Collaborator

elasticmachine commented Apr 10, 2024

💚 Build Succeeded

Build stats

  • Duration: 75 min 38 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@gizas gizas requested a review from tetianakravchenko April 10, 2024 13:34
@gizas gizas added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Apr 10, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Apr 10, 2024
@gizas gizas enabled auto-merge (squash) April 10, 2024 14:26
@gizas gizas merged commit 86b50ec into 8.13 Apr 10, 2024
29 checks passed
@gizas gizas deleted the mergify/bp/8.13/pr-38553 branch April 10, 2024 14:49
Labels
backport Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
3 participants