Load Test chart #112

Merged 2 commits on Dec 4, 2020
23 changes: 23 additions & 0 deletions load-test/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
6 changes: 6 additions & 0 deletions load-test/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
name: load-test
description: A Helm chart for the Prometheus load generator.
type: application
version: 0.1.0
appVersion: 0.1.0
62 changes: 62 additions & 0 deletions load-test/README.md
@@ -0,0 +1,62 @@
# New Relic's Prometheus Load Generator

## Chart Details

This chart deploys a Prometheus load generator.

## Configuration

| Parameter | Description | Default |
|--------------------------------|------------------------------------------------------------------------------------------|-------------------|
| `numberServicesPerDeploy`      | Number of services to create per deployment                                              |                   |
| `deployments.*`                | List of specifications for the deployments to create                                     | `[]`              |
| `deployments.latency`          | Time in milliseconds the `/metrics` endpoint waits before answering                      | `0`               |
| `deployments.latencyVariation` | Latency variation, as a ± percentage                                                     | `0`               |
| `deployments.metrics`          | URL of the metrics file to download                                                      | Average load URL* |
| `deployments.maxRoutines`      | Maximum number of goroutines the Prometheus mock server will open (`0` means no limit)   | `0`               |

*Average load URL: https://gist.githubusercontent.com/paologallinaharbur/a159ad779ca44fb9f4ff5b006ef475ee/raw/f5d8a5e7350b8d5e1d03f151fa643fb3a02cd07d/Average%2520prom%2520output

## Resources created

Number of targets created: `numberServicesPerDeploy * len(deployments)`.
Each service has the label `prometheus.io/scrape: "true"`, which is automatically detected by nri-prometheus.

Resources are generated automatically according to the following specification:
- Name of deployment: `<name>-lat<latency>-latvar<latencyVar>-<deployindex>`
- Name of service: `<name>-lat<latency>-latvar<latencyVar>-<deployindex>-<serviceindex>`
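As a quick sanity check of the formula above, the target count for the defaults shipped in this chart's `values.yaml` (8 deployments, 100 services each) works out as:

```shell
# Targets = numberServicesPerDeploy * len(deployments)
# The values below mirror the defaults in this chart's values.yaml
numberServicesPerDeploy=100
numberOfDeployments=8
echo $((numberServicesPerDeploy * numberOfDeployments))  # prints 800
```

Each of these targets is scraped independently, so this count translates directly into scrape load on nri-prometheus.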

When increasing the number and size of targets, the error `Request Entity Too Large 413` may be shown.
Adding the following to the environment variables of POMI seems to solve it by reducing the payload:
```
- name: EMITTER_HARVEST_PERIOD
value: 200ms
```

## Example

To install this chart, run the following command:

```sh
helm install load ./load-test --values ./load-test/values.yaml -n newrelic
```

Note that when generating a high number of services, the Helm command may fail to create or delete all resources, leaving the cluster in an unstable state.

To overcome this issue, `helm template load ./load-test --values ./load-test/values.yaml -n newrelic | kubectl apply -f -` proved to be more reliable.

## Sample prometheus outputs

Sample Prometheus metrics files; by default, the deployments download the average output sample:

- Small payload (10 timeseries): https://gist.githubusercontent.com/paologallinaharbur/125cca06b5c717503c7672766e3667fe/raw/67070882bee890a9e060189cff1ef316745a652b/Small%2520Prom%2520payload
- Average payload (500 timeseries): https://gist.githubusercontent.com/paologallinaharbur/a159ad779ca44fb9f4ff5b006ef475ee/raw/f5d8a5e7350b8d5e1d03f151fa643fb3a02cd07d/Average%2520prom%2520output
- Large payload (1000 timeseries): https://gist.githubusercontent.com/paologallinaharbur/f03818327921754efc5a997894467ff9/raw/c61168c1d2ea8bde6580144ada6f739fb40a7bbf/Large%2520Prom%2520output


## Compare with real data

To check the average size of the payload scraped by POMI, run `SELECT average(nr_stats_metrics_total_timeseries_by_target) FROM Metric where clusterName='xxxx' SINCE 30 MINUTES AGO TIMESERIES`
to get the number of timeseries sent (the average payload here counts 500).

To check the average time a target takes to answer, run `SELECT average(nr_stats_integration_fetch_target_duration_seconds) FROM Metric where clusterName='xxxx' SINCE 30 MINUTES AGO FACET target LIMIT 500`.
10 changes: 10 additions & 0 deletions load-test/templates/NOTES.txt
@@ -0,0 +1,10 @@
THIS CHART IS MEANT FOR LOAD TESTING ONLY

It created {{ .Values.numberServicesPerDeploy }} services per deployment
It created {{ len .Values.deployments }} deployments

Name of deployment: `<name>-lat<latency>-latvar<latencyVar>-<deployindex>`
Name of service: `<name>-lat<latency>-latvar<latencyVar>-<deployindex>-<serviceindex>`

Number of targets created: numberServicesPerDeploy*len(deployments)
Each service has the label `prometheus.io/scrape: "true"`, which is automatically detected by nri-prometheus
14 changes: 14 additions & 0 deletions load-test/templates/_helpers.tpl
@@ -0,0 +1,14 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "load-test.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "load-test.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

68 changes: 68 additions & 0 deletions load-test/templates/deployment.yaml
@@ -0,0 +1,68 @@
{{- $replicaCount := .Values.replicaCount -}}
{{- $chartName := .Chart.Name -}}
{{- $namespace := .Values.namespace -}}

{{- $index := 0 -}}


{{- range .Values.deployments }}
{{- $index = add1 $index -}}
{{- $latency := default "0" .latency -}}
{{- $latencyVariation := default "0" .latencyVariation -}}
{{- $indexString := toString $index -}}

{{- $uniqueDeployName := printf "%s-lat%s-latvar%s-index%s" .name $latency $latencyVariation $indexString -}}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ $uniqueDeployName }}
namespace: {{ $namespace }}
labels:
app.kubernetes.io/name: {{ $uniqueDeployName }}
spec:
replicas: {{ $replicaCount }}
selector:
matchLabels:
app.kubernetes.io/name: {{ $uniqueDeployName }}
template:
metadata:
labels:
app.kubernetes.io/name: {{ $uniqueDeployName }}
spec:
serviceAccountName: "default"
containers:
- name: {{ $chartName }}
image: roobre/mockexporter:latest
imagePullPolicy: IfNotPresent
env:
- name: LATENCY
value: {{ $latency | quote}}
- name: LATENCY_VARIATION
value: {{ $latencyVariation | quote}}
- name: METRICS
value: "/metrics/metrics.sample"
- name: MAX_ROUTINES
value: {{ .maxRoutines | default "0" | quote}}
- name: ADDR
value: ":80"
ports:
- name: http
containerPort: 80
protocol: TCP
volumeMounts:
- name: metricsdir
mountPath: /metrics
initContainers:
- name: installmetrics
image: roobre/mockexporter:latest
command: [ "/bin/sh","-c" ]
args:
- wget {{ .metrics | default "https://gist.githubusercontent.com/paologallinaharbur/a159ad779ca44fb9f4ff5b006ef475ee/raw/f5d8a5e7350b8d5e1d03f151fa643fb3a02cd07d/Average%2520prom%2520output" | quote}} -O /metrics/metrics.sample
volumeMounts:
- name: metricsdir
mountPath: "/metrics"
volumes:
- name: metricsdir
emptyDir: {}
---
{{- end }}
37 changes: 37 additions & 0 deletions load-test/templates/service.yaml
@@ -0,0 +1,37 @@
{{- $values := .Values -}}
{{- $numberServices := .Values.numberServicesPerDeploy | int }}
{{- $numberDeploy := .Values.numberOfDeployments | int }}
{{- $namespace := .Values.namespace -}}
{{- $index := 0 -}}


{{- range .Values.deployments }}
{{- $index = add1 $index -}}
{{- $latency := default "0" .latency -}}
{{- $latencyVariation := default "0" .latencyVariation -}}
{{- $indexString := toString $index -}}

{{- $uniqueDeployName := printf "%s-lat%s-latvar%s-index%s" .name $latency $latencyVariation $indexString -}}

{{- range untilStep 0 $numberServices 1 }}

apiVersion: v1
kind: Service
metadata:
name: {{ $uniqueDeployName }}-{{ . }}
namespace: {{ $namespace }}
labels:
prometheus.io/scrape: "true"
app.kubernetes.io/name: {{ $uniqueDeployName }}
spec:
type: ClusterIP
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
selector:
app.kubernetes.io/name: {{ $uniqueDeployName }}
---
{{- end }}
{{- end }}
73 changes: 73 additions & 0 deletions load-test/values.yaml
@@ -0,0 +1,73 @@
# Due to the high volume, helm could fail to generate all the needed resources in small clusters due to time-outs
# Sometimes `helm template [..] | kubectl apply -f -` seems to be more reliable

# When increasing the number and size of targets, the error `Request Entity Too Large 413` may be shown
# Adding the following to the environment variables of POMI seems to solve it by reducing the payload
# - name: EMITTER_HARVEST_PERIOD
# value: 200ms

# Number of targets created: numberServicesPerDeploy*len(deployments)
# Each service has the label `prometheus.io/scrape: "true"` that is automatically detected by nri-prometheus

# Resources are generated automatically according to the following specification
# Name of deployment: `<name>-lat<latency>-latvar<latencyVar>-<deployindex>`
# Name of service: `<name>-lat<latency>-latvar<latencyVar>-<deployindex>-<serviceindex>`

# Sample Prometheus metrics files; by default the deployments download the average output sample:
# https://gist.githubusercontent.com/paologallinaharbur/125cca06b5c717503c7672766e3667fe/raw/67070882bee890a9e060189cff1ef316745a652b/Small%2520Prom%2520payload Small Payload
# https://gist.githubusercontent.com/paologallinaharbur/a159ad779ca44fb9f4ff5b006ef475ee/raw/f5d8a5e7350b8d5e1d03f151fa643fb3a02cd07d/Average%2520prom%2520output Average Payload
# https://gist.githubusercontent.com/paologallinaharbur/f03818327921754efc5a997894467ff9/raw/c61168c1d2ea8bde6580144ada6f739fb40a7bbf/Large%2520Prom%2520output Big payload
#
# To check the average size of the payload scraped by POMI you can run `SELECT average(nr_stats_metrics_total_timeseries_by_target) FROM Metric SINCE 30 MINUTES AGO TIMESERIES`
# and get the number of timeseries sent (the average payload here counts 500)


numberServicesPerDeploy: 100 # Total number of services created: numberServicesPerDeploy*len(deployments)
deployments: # Total number of deployments created: len(deployments)
- name: one # required (uniqueness is assured by adding an index)
latency: "0" # not required
latencyVariation: "0" # not required
metrics: "" # not required
#maxRoutines: "1" #not required
- name: two
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"
- name: three
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"
- name: four
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"
- name: five
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"
- name: six
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"
- name: seven
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"
- name: eight
latency: "0"
latencyVariation: "0"
metrics: ""
#maxRoutines: "1"

# ---------------------------- No need to modify this

namespace: "newrelic"
replicaCount: 1
nameOverride: ""
fullnameOverride: "load-test"