Kube-burner

What is kube-burner?

Kube-burner is a tool that allows a user to perform scalability tests across Kubernetes and OpenShift clusters by creating thousands of objects. Kube-burner is developed in its own repository at https://github.com/cloud-bulldozer/kube-burner. The ripsaw integration here is meant to run only some workloads useful to measure certain performance KPIs of a cluster.

Running kube-burner

These instructions assume you have already deployed the operator. Kube-burner needs an additional serviceaccount and clusterrole to run, available at kube-burner-role.yml. You can modify kube-burner's cr.yaml to fit your requirements.


Supported workloads

Ripsaw's kube-burner integration supports the following workloads:

  • cluster-density: A cluster-density-focused test that creates a set of Deployments, Builds, Secrets, Services and Routes across the cluster. This is a namespaced workload, meaning that kube-burner will create as many namespaces with these objects as the configured job_iterations. Each iteration of this workload creates the following objects:

    • 12 imagestreams
    • 3 buildconfigs
    • 6 builds
    • 1 deployment with 2 pod replicas (sleep), each mounting two secrets and two configmaps (deployment-2pod)
    • 2 deployments with 1 pod replica each (sleep), mounting two secrets and two configmaps (deployment-1pod)
    • 3 services, one pointing to deployment-2pod and the other two pointing to deployment-1pod
    • 3 routes, one pointing to the deployment-2pod service and the other two pointing to deployment-1pod
    • 10 secrets
    • 10 configmaps
  • kubelet-density: Creates a single namespace with a number of Deployments equal to job_iterations. Each iteration of this workload creates the following object:

    • 1 pod (sleep)
  • kubelet-density-heavy: Creates a single namespace with a number of applications equal to job_iterations. Each application consists of two deployments (a postgresql database and a simple client that generates some CPU load) and a service that the client uses to reach the database. Each iteration of this workload creates the following objects:

    • 1 deployment holding a postgresql database
    • 1 deployment holding a client application for the previous database
    • 1 service pointing to the postgresql database
  • max-namespaces: A cluster-limits-focused test that creates the maximum possible number of namespaces across the cluster. This is a namespaced workload, meaning that kube-burner will create as many namespaces with these objects as the configured job_iterations. Each namespace holds the following objects:

    • 1 deployment holding a postgresql database
    • 5 deployments consisting of a client application for the previous database
    • 1 service pointing to the postgresql database
    • 10 secrets
  • max-services: A cluster-limits-focused test that creates the maximum possible number of services per namespace. It creates a single namespace, and each iteration of this workload populates that namespace with these objects:

    • 1 simple application deployment (hello-openshift)
    • 1 service pointing to the previous deployment

The workload type is specified by the workload parameter in the args object of the configuration. Each workload supports several configuration parameters, detailed in the Configuration section.
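
As a minimal sketch of how a workload is selected, the fragment below sets workload and job_iterations under args, using the same workload/args nesting that appears in the other snippets of this document. The iteration count is an arbitrary illustrative value, and the rest of the CR is omitted.

```yaml
workload:
  args:
    # One of the supported workloads listed above
    workload: cluster-density
    # Illustrative value; cluster-density is namespaced, so this
    # would result in 10 namespaces holding the objects listed above
    job_iterations: 10
```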

Configuration

All kube-burner workloads support the following parameters:

  • workload: Type of kube-burner workload. As mentioned before, allowed values are cluster-density, kubelet-density, kubelet-density-heavy, max-namespaces and max-services
  • default_index: Elasticsearch index name. Defaults to ripsaw-kube-burner
  • job_iterations: How many iterations to execute of the specified kube-burner workload
  • qps: Limit object creation queries per second. Defaults to 5
  • burst: Maximum burst for throttle. Defaults to 10
  • image: Allows using an alternative kube-burner container image. Defaults to quay.io/cloud-bulldozer/kube-burner:latest
  • wait_when_finished: Makes kube-burner wait for all created objects to be ready/completed before indexing metrics and finishing the job. Defaults to true
  • pod_wait: Wait for all pods to be running before moving forward to the next job iteration. Defaults to false
  • verify_objects: Verify object count after running each job. Defaults to true
  • error_on_verify: Exit with rc 1 before indexing when object verification fails. Defaults to false
  • log_level: Kube-burner log level. Allowed values are info and debug. Defaults to info
  • node_selector: Pods deployed by the different workloads use this nodeSelector. This parameter consists of a dictionary like:
node_selector:
  key: node-role.kubernetes.io/master
  value: ""

Where key defaults to node-role.kubernetes.io/worker and value defaults to the empty string "".

  • cleanup: Delete old namespaces for the selected workload before starting a new benchmark. Defaults to true
  • wait_for: List of object Kinds to wait for at the end of each iteration or job. This parameter only applies to the cluster-density workload. If not defined, kube-burner waits for all objects. e.g. wait_for: ["Deployment"]
  • job_timeout: Kube-burner job timeout in seconds. Defaults to 3600. Uses the parameter activeDeadlineSeconds
  • pin_server and tolerations: Detailed in the section Pin to server and tolerations
  • step: Prometheus step size, useful for long benchmarks. Defaults to 30s
  • metrics_profile: kube-burner metric profile that indicates what prometheus metrics kube-burner will collect. Defaults to metrics.yaml in kubelet-density workloads and metrics-aggregated.yaml in the remaining. Detailed in the Metrics section of this document
  • runtime_class: If this is set, the benchmark-operator applies its value to runtimeClassName in the podSpec.
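
To illustrate how several of these parameters combine, here is a hedged sketch of an args block that raises the throttling limits, fails the run when object verification fails, and extends the job timeout. All values are illustrative choices rather than recommendations; only the parameter names and the defaults noted in the comments come from the list above.

```yaml
workload:
  args:
    workload: kubelet-density
    job_iterations: 100      # illustrative value
    qps: 20                  # default is 5
    burst: 20                # default is 10
    wait_when_finished: true
    verify_objects: true
    error_on_verify: true    # default is false
    log_level: debug         # default is info
    cleanup: true
    job_timeout: 7200        # seconds; default is 3600
    node_selector:
      key: node-role.kubernetes.io/worker
      value: ""
```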

kube-burner is able to collect complex Prometheus metrics and index them in an Elasticsearch instance. This feature can be configured through the prometheus object of kube-burner's CR.

spec:
  prometheus:
    es_server: http://foo.esserver.com:9200
    prom_url: https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091
    prom_token: prometheusToken
  workload:

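Where:

  • es_server: URL of the Elasticsearch instance where documents are indexed
  • prom_url: URL of the Prometheus endpoint that kube-burner queries for metrics
  • prom_token: Token used to authenticate against that Prometheus endpoint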

Note: It's possible to index documents in an authenticated ES instance using the notation http(s)://[username]:[password]@[address]:[port] in the es_server parameter.
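
As a sketch of that notation, the fragment below embeds credentials in the Elasticsearch URL. It assumes the authenticated URL goes in es_server, the only Elasticsearch URL in the prometheus object shown above; esuser and espassword are placeholders, not real accounts.

```yaml
spec:
  prometheus:
    # Placeholder credentials embedded as [username]:[password]@[address]:[port]
    es_server: https://esuser:espassword@foo.esserver.com:9200
    prom_url: https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091
    prom_token: prometheusToken
```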

Metrics

kube-burner is able to collect Prometheus metrics using the time range of the benchmark. There are two metric profiles available at the moment.

  • metrics.yaml: This metric profile is intended for benchmarks executed in small clusters, since it collects metrics for several system pods from each node. In larger clusters, the number of indexed metrics can be reduced (at the expense of granularity) with the parameter step.
  • metrics-aggregated.yaml: This metric profile is intended for benchmarks in large clusters, since the metrics from the worker nodes and the infra nodes are aggregated and only metrics from master nodes are collected individually. The parameter step can also be used here to reduce the number of metrics that will be indexed (at the expense of granularity).

By default the metrics.yaml profile is used in kubelet-density workloads and metrics-aggregated.yaml in the remaining. You can change this profile with the variable metrics_profile.

Note: Metrics collection and indexing are enabled when prom_url is set in the prometheus object.
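
A possible way to put this together is sketched below: it enables metrics collection by setting prom_url, overrides the default profile of a kubelet-density run with metrics-aggregated.yaml, and widens step to reduce the number of indexed samples. The step value of 1m is an assumption that follows the same duration format as the 30s default, and job_iterations is an illustrative value.

```yaml
spec:
  prometheus:
    es_server: http://foo.esserver.com:9200
    # Setting prom_url is what enables metrics collection and indexing
    prom_url: https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091
    prom_token: prometheusToken
  workload:
    args:
      workload: kubelet-density
      job_iterations: 100   # illustrative value
      # Overrides the kubelet-density default (metrics.yaml)
      metrics_profile: metrics-aggregated.yaml
      # Assumed duration format; the default is 30s
      step: 1m
```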

Pin to server and tolerations

It's possible to pin the kube-burner pod to a certain node using the pin_server parameter. This parameter is used in the job template as:

{% if workload_args.pin_server is defined %}
{% for label, value in  workload_args.pin_server.items() %}
        {{ label | replace ("_", "-" )}}: {{ value }}
{% endfor %}
{% else %}
      node-role.kubernetes.io/worker: ""
{% endif %}

That is to say, by default kube-burner runs on worker nodes. With the above, we could configure the workload to run on infra-labeled nodes with:

workload:
  args:
    pin_server: {"node-role.kubernetes.io/infra": ""}

It's also possible to configure scheduling tolerations for the kube-burner pod. To do so, pass a list with the desired tolerations, as in the snippet below:

workload:
  args:
    tolerations:
    - key: role
      value: worker
      effect: NoSchedule
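
Both parameters can be combined. The sketch below pins the kube-burner pod to infra-labeled nodes and tolerates a taint on them; the taint key and value are hypothetical and would need to match whatever taint is actually set on the target nodes.

```yaml
workload:
  args:
    # Schedule the kube-burner pod on nodes carrying the infra label
    pin_server: {"node-role.kubernetes.io/infra": ""}
    tolerations:
    # Hypothetical taint; adjust key, value and effect to your nodes
    - key: node-role.kubernetes.io/infra
      value: reserved
      effect: NoSchedule
```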

Using a remote configuration for kube-burner

Apart from the pre-defined workloads available in this integration, it's possible to make kube-burner fetch a configuration file from a remote HTTP server. To use this mechanism, point the variable remote_config to the desired remote configuration file:

workload:
  args:
    remote_config: https://your.domain.org/kube-burner-config.yaml

Keep in mind that the object templates declared in this remote configuration file need to point to a remote source as well, so that kube-burner is also able to fetch them, e.g.:

    objects:

      - objectTemplate: https://your.domain.org/templates/pod.yml
        replicas: 1
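
For context, a fuller remote configuration file might look like the sketch below. The job name, namespace and iteration values are hypothetical, and the field names follow the kube-burner job configuration format as best understood here, so check them against the kube-burner documentation for the version in use.

```yaml
jobs:
  - name: remote-example          # hypothetical job name
    jobIterations: 5              # illustrative value
    qps: 5
    burst: 10
    namespacedIterations: true    # one namespace per iteration
    namespace: remote-example     # hypothetical namespace prefix
    objects:
      # Templates must be reachable over HTTP as well
      - objectTemplate: https://your.domain.org/templates/pod.yml
        replicas: 1
```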

In addition to using remote configurations for kube-burner, it's also possible to use a remote metrics profile. It can be configured with the variable remote_metrics_profile:

workload:
  args:
    remote_metrics_profile: https://your.domain.org/metrics-profile.yaml

Alerting

Kube-burner includes an alerting mechanism able to evaluate Prometheus expressions at the end of the last kube-burner job. This alerting mechanism is based on a configuration file known as the alert-profile, and it can be used in this Ripsaw integration. Similar to other configuration files, the alert-profile can be fetched from a remote location, this time configured by the variable remote_alert_profile.

workload:
  args:
    remote_alert_profile: https://your.domain.org/alerting-profile.yaml

And this file looks like:

# etcd alarms

- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m]))[5m:]) > 0.015
  description: 5 minutes avg. etcd fsync latency on {{$labels.pod}} higher than 15ms {{$value}}
  severity: error

- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[5m]))[5m:]) > 0.1
  description: 5 minutes avg. etcd network peer round trip on {{$labels.pod}} higher than 100ms {{$value}}
  severity: error

- expr: increase(etcd_server_leader_changes_seen_total[2m]) > 0
  description: etcd leader changes observed
  severity: error

Where expr holds the Prometheus expression to evaluate and description holds a description of the alert. It supports different severities:

  • info: Prints an info message with the alarm description to stdout. By default all expressions have this severity.
  • warning: Prints a warning message with the alarm description to stdout.
  • error: Prints an error message with the alarm description to stdout and makes kube-burner exit with rc = 1
  • critical: Prints a fatal message with the alarm description to stdout and exits execution immediately with rc != 0
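
To show how the severities interact, here is a hedged sketch of an alert-profile that reuses the etcd expressions from the example above with different severities. The thresholds are illustrative only, and the first entry omits severity on purpose so that it falls back to info.

```yaml
# No severity given, so this is reported as info
- expr: increase(etcd_server_leader_changes_seen_total[2m]) > 0
  description: etcd leader changes observed

# Reported as a warning; the return code is not affected
- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m]))[5m:]) > 0.01
  description: 5 minutes avg. etcd fsync latency on {{$labels.pod}} higher than 10ms {{$value}}
  severity: warning

# Aborts execution immediately with a non-zero return code
- expr: avg_over_time(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[2m]))[5m:]) > 0.03
  description: 5 minutes avg. etcd fsync latency on {{$labels.pod}} higher than 30ms {{$value}}
  severity: critical
```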

More information can be found at the Kube-burner docs site.