Kube-burner refactor + HyperShift multi-endpoint #545
Conversation
Should we remove https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/kube-burner/grafana-agent.yaml, as it will no longer be required with Prometheus multi-endpoint support? Also, we have a cluster-density-ms workload in e2e; is it worth creating a wrapper for that in kube-burner?
cluster-density-ms is similar to cluster-density, with the difference that its metrics profile is rendered on the fly to grab metrics from Thanos. With this new implementation, Thanos won't be required anymore.
Wrapper to run kube-burner using the new --metrics-endpoints flag from kube-burner, which allows it to grab metrics and evaluate alerts from different Prometheus endpoints. For the HyperShift scenario we need:

- OBO Stack: the metrics scraped from this endpoint come from the Hosted Control Planes, such as etcd/API latencies. As this endpoint is not publicly exposed by default, the script sets up a route to allow kube-burner to reach it. No authentication is currently required.
- Management cluster Prometheus: the metrics we use from this endpoint are management cluster container metrics (from the hosted control-plane namespace) and worker node metrics (the latter are required to measure the usage of the worker nodes hosting the HCP).
- Hosted cluster Prometheus: from here we scrape data-plane container metrics, as well as metrics from kube-state-metrics, which are mostly used to count and get resources from the cluster.

Signed-off-by: Raul Sevilla <rsevilla@redhat.com>
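For reference, a minimal sketch of what an endpoints file passed through `--metrics-endpoints` might look like. The field names (`endpoint`, `token`, `profile`), the URLs, the token variables, and the metrics-profile file names below are assumptions for illustration, not the schema or files used by this PR; check the kube-burner documentation for the exact format.

```shell
# Hypothetical endpoints file; field names, URLs and profile names are assumed.
cat > metrics-endpoints.yaml <<'EOF'
- endpoint: http://obo-route.apps.mgmt.example.com      # OBO stack route created by the script; no auth needed
  profile: metrics-profiles/hosted-control-plane.yml    # etcd/API latencies from the Hosted CPs
- endpoint: https://prometheus-k8s.apps.mgmt.example.com
  token: ${MGMT_PROMETHEUS_TOKEN}                       # management cluster Prometheus token (placeholder variable)
  profile: metrics-profiles/mgmt-cluster.yml            # HCP namespace containers + worker node metrics
- endpoint: https://prometheus-k8s.apps.hosted.example.com
  token: ${HOSTED_PROMETHEUS_TOKEN}                     # hosted cluster Prometheus token (placeholder variable)
  profile: metrics-profiles/data-plane.yml              # data-plane containers + kube-state-metrics
EOF

# Pass the file to kube-burner with the flag described above; the subcommand
# and config file here are placeholders for whatever the wrapper actually runs.
kube-burner init -c cluster-density.yml --metrics-endpoints metrics-endpoints.yaml
```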
Gotcha, this workload is still not integrated within the kube-burner OCP wrapper.
Unfortunately, the numbers don't line up in a way that makes that fit exactly. I don't remember the complete delta, but it was large enough for us to make this a separate entity in the first place.
Here is the actual resource data from prod managed-service telemetry; we run the P75 load.
@dry923 IIRC, you mentioned something about the expression used to obtain the HCP namespace. Would you mind refreshing my memory?
### Cluster-density and cluster-density-v2

- **ITERATIONS**: Defines the number of iterations of the workload to run. No default value.
- **CHURN**: Enables workload churning. Enabled (`true`) by default.
Since CHURN is enabled by default, can you kindly add the default CHURN_DURATION as well?
Hey! It's already possible to customize the churning options by using the EXTRA_FLAGS variable, e.g.:

    $ export EXTRA_FLAGS="--churn-duration=1d --churn-percent=5 --churn-delay=5m"
    $ ITERATIONS=500 WORKLOAD=cluster-density-v2 ./run.sh

The reason I didn't add this variable (and others) was to keep this implementation as simple as possible and avoid adding more and more variables without control, as happened in the previous kube-burner e2e-benchmarking implementation.
I think having the default CHURN values documented here would make life easier.
Ok, I'll add some examples to the docs
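For illustration, a sketch of the kind of docs example discussed here. The churn flags are the ones shown in the EXTRA_FLAGS snippet above; the specific values and the `CHURN=false` toggle are assumptions rather than the wrapper's documented defaults.

```shell
# Customize churning through EXTRA_FLAGS (flag names from the comment above;
# the values shown are illustrative only, not defaults).
export EXTRA_FLAGS="--churn-duration=1h --churn-percent=10 --churn-delay=2m"
ITERATIONS=500 WORKLOAD=cluster-density-v2 ./run.sh

# Disable churning entirely; CHURN is documented above as enabled by default,
# and passing CHURN=false is an assumed way to turn it off.
CHURN=false ITERATIONS=500 WORKLOAD=cluster-density-v2 ./run.sh
```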
Thanks, Raul!
lgtm, only a small nit on the docs -- it would be good to have the default churn values documented here... Need this change to stop sitting in a holding pattern.
lgtm
* HyperShift multi-endpoint: wrapper to run kube-burner using the new --metrics-endpoints flag, which allows it to grab metrics and evaluate alerts from different Prometheus endpoints (see the description above).
* Add docs
* Add QPS, BURST and GC variables
* Update kube-apiserver metric expressions
* Bump kube-burner version
* Improve EXTRA_FLAGS docs
* Use regex for HCP_NAMESPACE

Signed-off-by: Raul Sevilla <rsevilla@redhat.com>
Description
Wrapper to run kube-burner via the OCP wrapper, also providing support for the new --metrics-endpoints flag, which allows kube-burner to grab metrics and evaluate alerts from different Prometheus endpoints, as needed for HyperShift clusters.

For the HyperShift scenario we need:

- OBO Stack: the metrics scraped from this endpoint come from the Hosted Control Planes, such as etcd/API latencies. As this endpoint is not publicly exposed by default, the script sets up a route to allow kube-burner to reach it (see the sketch after this list). No authentication is currently required.
- Management cluster Prometheus: the metrics we use from this endpoint are management cluster container metrics (from the hosted control-plane namespace) and worker node metrics (the latter are required to measure the usage of the worker nodes hosting the HCP).
- Hosted cluster Prometheus: from here we scrape data-plane container metrics, as well as metrics from kube-state-metrics, which are mostly used to count and get resources from the cluster.
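As a hedged illustration of the route-setup step mentioned in the first bullet, something along these lines could expose the OBO stack to kube-burner. The service and namespace names are placeholders, not the actual OBO objects, and the wrapper script performs this step automatically.

```shell
# Placeholder service/namespace; the real OBO stack objects may be named differently.
oc expose service prometheus-operated -n openshift-observability-operator

# Retrieve the resulting route host to use as the OBO metrics endpoint.
oc get route prometheus-operated -n openshift-observability-operator \
  -o jsonpath='{.spec.host}'
```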
Example output