contrib/mixin: Generate rules, fix tests #13671

mrueg · 2022-02-06T21:59:05Z

The primary intent of this PR is to have a rendered rules file that we can include in kube-prometheus-stack.
Additionally, I added the following pieces:

* Add Makefile
* Make tests runnable
* Add generated rule manifest file

After changing the input series in test.yaml so that the tests run (the `series` field only seems to support plain time series), these two tests still fail:

promtool test rules test.yaml
Unit Testing:  test.yaml
  FAILED:
    alertname:etcdHighNumberOfLeaderChanges, time:10m, 
        exp:"[Labels:{alertname=\"etcdHighNumberOfLeaderChanges\", job=\"etcd\", severity=\"warning\"} Annotations:{description=\"etcd cluster \\\"etcd\\\": 4 leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated.\", summary=\"etcd cluster has high number of leader changes.\"}]", 
        got:"[]"
    alertname:etcdExcessiveDatabaseGrowth, time:10m, 
        exp:"[Labels:{alertname=\"etcdExcessiveDatabaseGrowth\", job=\"etcd\", severity=\"warning\"} Annotations:{message=\"etcd cluster \\\"etcd\\\": Observed surge in etcd writes leading to 50% increase in database size over the past four hours, please check as it might be disruptive.\"}]", 
        got:"[]"

make: *** [Makefile:15: test] Error 1
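
When filing a follow-up issue for failures like these, it can help to pull the failing alert names out of the output. Below is a minimal Python sketch. It assumes only the `alertname:<name>, time:<t>,` line shape seen in the run above. The function name `failed_alerts` is made up for illustration, this is not a promtool API, and the output format may differ between promtool versions.

```python
import re

# Matches the per-failure header line, e.g. "alertname:etcdFoo, time:10m,".
# The expected/got payloads use "alertname=" (with "="), so they do not match.
FAILED_ALERT = re.compile(r"alertname:\s*(?P<name>[^,\s]+),\s*time:\s*(?P<time>[^,\s]+)")


def failed_alerts(output: str) -> list[tuple[str, str]]:
    """Return (alert name, evaluation time) pairs found in promtool test output."""
    return [(m["name"], m["time"]) for m in FAILED_ALERT.finditer(output)]


if __name__ == "__main__":
    import sys

    # Usage: promtool test rules test.yaml 2>&1 | python failed_alerts.py
    for name, at in failed_alerts(sys.stdin.read()):
        print(f"{name} (at {at})")
```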

Let me know if anyone has time to provide a fix (I tried to fix them, but for some reason my ideas didn't work).

@ptabor

codecov-commenter · 2022-02-06T22:37:24Z

Codecov Report

Merging #13671 (f668026) into main (986a2b5) will decrease coverage by 0.22%.
The diff coverage is n/a.

❗ Current head f668026 differs from pull request most recent head 70f8524. Consider uploading reports for the commit 70f8524 to get more accurate results

@@            Coverage Diff             @@
##             main   #13671      +/-   ##
==========================================
- Coverage   72.91%   72.68%   -0.23%     
==========================================
  Files         465      465              
  Lines       37947    37858      -89     
==========================================
- Hits        27668    27518     -150     
- Misses       8514     8562      +48     
- Partials     1765     1778      +13

Flag	Coverage Δ
all	`72.68% <ø> (-0.23%)`	⬇️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
server/proxy/grpcproxy/register.go	`69.76% <0.00%> (-9.31%)`	⬇️
client/v3/namespace/watch.go	`87.87% <0.00%> (-6.07%)`	⬇️
raft/rafttest/node.go	`95.00% <0.00%> (-5.00%)`	⬇️
api/etcdserverpb/raft_internal_stringer.go	`76.78% <0.00%> (-4.94%)`	⬇️
client/v3/leasing/cache.go	`87.77% <0.00%> (-3.89%)`	⬇️
server/etcdserver/api/rafthttp/msgappv2_codec.go	`69.56% <0.00%> (-3.48%)`	⬇️
server/etcdserver/api/v3rpc/member.go	`93.54% <0.00%> (-3.23%)`	⬇️
server/lease/leasehttp/http.go	`62.77% <0.00%> (-2.92%)`	⬇️
server/etcdserver/api/v3election/election.go	`66.66% <0.00%> (-2.78%)`	⬇️
client/v3/experimental/recipes/key.go	`75.34% <0.00%> (-2.74%)`	⬇️
... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 986a2b5...70f8524.

serathius · 2022-02-07T19:41:33Z

contrib/mixin/manifests/etcd-prometheusRules.yaml

@@ -0,0 +1,147 @@
+groups:


Please don't add generated files to the repo.

mrueg · 2022-02-07

That was the main intent: to have it here so we can use it in https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/hack/sync_prometheus_rules.py#L63

serathius · 2022-02-08

Then we should add a test that ensures this file is up to date, so that PRs that change rules always have to regenerate it.
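
A minimal sketch of such an up-to-date check, in Python. It assumes the rules can be rendered to standard output by some command. The command passed as `render_cmd` and the names `stale_diff` and `check_generated` are hypothetical, not something this PR defines.

```python
import difflib
import subprocess
from pathlib import Path


def stale_diff(committed: str, rendered: str, name: str = "rules.yaml") -> str:
    """Return a unified diff of committed vs. freshly rendered text ("" if identical)."""
    return "".join(
        difflib.unified_diff(
            committed.splitlines(keepends=True),
            rendered.splitlines(keepends=True),
            fromfile=f"{name} (committed)",
            tofile=f"{name} (regenerated)",
        )
    )


def check_generated(path: Path, render_cmd: list[str]) -> int:
    """Run render_cmd, compare its stdout with the file at path, return an exit code."""
    # render_cmd is an assumption: any command that prints the rendered rules.
    rendered = subprocess.run(
        render_cmd, check=True, capture_output=True, text=True
    ).stdout
    diff = stale_diff(path.read_text(), rendered, path.name)
    if diff:
        print(diff)
        print(f"{path} is out of date; regenerate it and commit the result.")
        return 1
    return 0
```

A CI job could call `check_generated` with the committed manifest path and the project's render command, and fail the build on a non-zero return value. Comparing in memory rather than overwriting the file keeps the working tree clean when the check fails.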

I would like a second opinion on the fact that we are adding this file. @tomwilkie @lilic

lilic · 2022-02-10

I agree with @serathius here: you should rarely want to use a generic generated YAML, and you often need to override configs like namespaces and other fields. @mrueg, might it be better to sync on the non-generated files rather than the YAML?

mrueg · 2022-02-10

In general I agree with not having rendered output in a repository (as it might get out of sync). However, this approach follows what https://github.com/prometheus-operator/kube-prometheus/tree/main/manifests and https://github.com/monitoring-mixins/website have been doing, and it was the practice in etcd 3.4 and previous releases: https://raw.githubusercontent.com/etcd-io/website/master/content/en/docs/v3.4/op-guide/etcd3_alert.rules.yml

lilic · 2022-02-10

@brancz @paulfantom might be more up to date on this, but from what I remember there was talk about removing the generated manifests to discourage folks from applying those directly instead of consuming the jsonnet.

paulfantom · 2022-02-10

Yes, we still have that issue in kube-prometheus and we are looking for alternatives that would let us remove generated files from the repository. However, this is not a top priority right now.
As for mixins, https://monitoring.mixins.dev/ regenerates the etcd mixin daily and allows for consumption in YAML format (YAML files available in here)

paulfantom · 2022-02-10

As for using it in https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack, why not add jsonnet execution there and customize the mixin to the needs of the helm chart? This way you can even add more mixins and consume them directly.

mrueg · 2022-02-10T15:24:45Z

Thanks for providing additional context @paulfantom! I wasn't aware that you're looking into phasing out generated files. In general, I think from a documentation perspective a rendered file is more readable than the jsonnet. If the etcd maintainers are okay with it, we could add it back to the website repo?

For https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack, this currently relies on someone's own setup when running the script and providing the PR, so I wanted to avoid requiring additional tooling there. Eventually, it might be good to look into adding more mixins there if we need them, and then have common tooling for it.

contrib/mixin/Makefile

serathius · 2022-02-07T19:50:17Z

Those tests were broken because they were never run automatically. Please integrate these tests into the GitHub CI; otherwise there is no point in fixing them.

mrueg · 2022-02-07T23:01:59Z

Tests are included now in github workflow. Still failing on the two tests.

.github/workflows/rules.yaml

serathius · 2022-02-08T11:38:39Z

> Tests are included now in github workflow. Still failing on the two tests.

I would disable those two tests for now so we can at least have other tests running and preventing breakage. If you don't want to fix them in this PR, then please comment them out and file an issue for it so someone else can pick it up.

mrueg · 2022-02-08T12:36:49Z

I fixed one test and will take a look at the other one later. Thanks for the quick feedback!

serathius · 2022-02-10T11:24:41Z

Hey @mrueg, could you split the test fix into a separate PR so we can merge it?

tomwilkie · 2022-02-10T14:42:32Z

Hi! Sorry for the late reply. It may be easier to use `mixtool generate` to compile the mixin; it will output YAML etc.: https://github.com/monitoring-mixins/mixtool

mrueg · 2022-02-10T15:19:07Z

.gitignore

@@ -14,6 +14,7 @@
 *.test
 hack/tls-setup/certs
 .idea
+/contrib/mixin/manifests


I added the folder to .gitignore here so it doesn't get added to the repo if someone generates it.

mrueg · 2022-02-10T15:30:09Z

@serathius I have removed the rendered file
@tomwilkie thanks, I'll take a look! Could it be used in a generic github action that renders and publishes artifacts?

tomwilkie · 2022-02-10T15:41:04Z

> @tomwilkie thanks, I'll take a look! Could it be used in a generic github action that renders and publishes artifacts?

I think that'd be a great idea. We're using it in a few places already (e.g. https://github.com/grafana/jsonnet-libs/blob/master/Makefile#L24). It also has a linter that applies various lint rules to both the dashboards and the alerts, which you might find useful.

serathius

Thanks for fixing tests. Great Job!

serathius · 2022-02-10T17:55:36Z

cc @spzala @ptabor

spzala

lgtm
Thanks @mrueg

serathius reviewed Feb 7, 2022


mrueg force-pushed the mixin-generate-manifests branch 5 times, most recently from 70f8524 to 6b78793 Compare February 7, 2022 22:53

serathius reviewed Feb 8, 2022


mrueg force-pushed the mixin-generate-manifests branch 2 times, most recently from 082044b to a514863 Compare February 8, 2022 12:33

mrueg force-pushed the mixin-generate-manifests branch from a514863 to 4bfafa0 Compare February 9, 2022 22:57

contrib/mixin: Generate rules, fix tests

72c33d8

* Add Makefile
* Make tests runnable
* Add generated rule manifest file

Signed-off-by: Manuel Rüger <manuel@rueg.eu>

mrueg force-pushed the mixin-generate-manifests branch from 4bfafa0 to 72c33d8 Compare February 10, 2022 15:17


serathius approved these changes Feb 10, 2022


spzala approved these changes Feb 15, 2022


serathius merged commit e814f6f into etcd-io:main Feb 15, 2022

mrueg mentioned this pull request May 30, 2022

[kube-prometheus-stack] Use kube-prometheus rules for etcd prometheus-community/helm-charts#2090

Closed

3 tasks
