Add multicluster support for Grafana #3405
Comments
For the maintainers: I'd also like to express a willingness to contribute to this project. At the moment, this issue is to solicit some feedback on the problem and the proposed solution. |
Adding an optional cluster dropdown to the dashboards sounds like a solid idea. Would it be possible to hide it if someone's not using thanos? This subject in particular feels like something that we could do a lot on. I'd love to get a multi-cluster reference architecture together so that we can think a little bit more about which pieces can really be improved. |
Thanos isn't the factor that makes this an issue - it's just a mechanism to consolidate metrics into cheap storage and solve some issues with federation. My description was more to demonstrate one architecture that this could benefit. If we did this with jsonnet (which is pretty standard for this domain AFAIK) then something like the following would achieve multi-cluster support (of course, this could be expanded considerably: datasources etc.):
It would probably mean moving this into another repo (purely to reduce noise). |
As a side note, it would also be possible from the same jsonnet package to perform a full linkerd2 install. This would expand installation options beyond |
@grampelberg I've created a repo to demonstrate this functionality - although for a disclaimer I've only migrated one panel for one dashboard. I'd be interested in getting your feedback as to whether or not this is something that I should spend some time on. The readme should hopefully be self explanatory. |
That's pretty cool! Where is the source being fetched (as someone who knows nothing about jsonnet)? |
The source for the namespace? |
|
Not sure I'm following the question? |
Oh, this is building the dashboard for grafana - not taking the existing |
Yeah precisely. You could patch it, but if we go to the effort of building this with jsonnet we may as well go all in and rebuild them with more customisation |
I'm really hesitant to bring I've gone down the "dashboards in my own grafana" path a couple times and the only sustainable solution was export from the linkerd grafana, import into your own. Documenting jsonnet as a way to patch/build the dashboards I'm 100% behind. Honestly, I'm pretty happy just updating these dashboards to be cluster aware, especially if we just detect the cluster label and hide the dropdown for folks who don't have it setup that way. |
Well, a couple of thoughts:
|
Both options sound great to me. What are you thinking around storing the JSON output? All the dashboards are already in JSON - https://github.com/linkerd/linkerd2/tree/master/grafana/dashboards |
Yep, I’ve used those and ended up here 😂 Obviously linkerd needs these as part of the install (you’ve seen my other issue about unbundling). That suggests there may be more options:
|
For this kind of advanced usage, I don't think it makes sense to allow configuration at this level as part of the install for all the install methods. To your point, the unbundling stuff is definitely important and the correct (tm) way to go. I would love to provide the tools and docs to get the dashboards for a specific version and configure as you see fit. That's kinda where the kustomize install documentation was going. |
I think this all leads to having it in a separate repo. Does that sound sensible? If so, I’d suggest the following: I’ll continue with some work on this tomorrow (I’m on GMT). I’ll get a single dashboard complete using what I’ve prepared already for preview. You can chat internally about where this lives. I’m happy to host on my github account if it’s your preference but I’m also happy to pass it over if you see fit. We build out the remaining dashboards and once we’re at parity, look at a release that ties in with linkerd’s trunk. If you guys have any resource to throw at it that would be most welcome but I appreciate you may not. I’m happy to continue this from here and maintain involvement, including any documentation that may be involved in proposing this as a production recommendation. |
That sounds like a fantastic plan to me! |
👋 @grampelberg - just a courtesy note to mention I'm looking at some other things so haven't had the chance to circle back around to this yet in any meaningful way - I'll do so in due course and update you on here when I can carve that time out. |
Looking forward to it! |
Hi @grampelberg, I didn't get the opportunity to dive into this further until this weekend. Anyways, I've done quite a bit of work this weekend getting a lot of the boilerplate set up and the top line dashboard generated via jsonnet: https://github.com/andrew-waters/linkerd2-mixin If you'd like to test it, grab the repo and there should be some instructions for deps. You can then run a It's particularly worth pointing out https://github.com/andrew-waters/linkerd2-mixin/blob/master/config.libsonnet - this is where all the config happens and when you use the repo as a library (ie in another project), you can customise all the way down from the parent. |
Cool! I'm gonna need to spend some more time getting this into my brain. @zaharidichev you're doing a lot of thinking about multi-cluster right now, mind taking a look? Also, @Pothulapati, check it out! This might be interesting around the grafana chart work we're doing right now. |
@andrew-waters That's awesome! Thank you so much for doing this. As you know, right now we have the dashboard config statically pushed into a config map. This would allow some great extensibility. Right now, I am working on a way to have add-ons for Linkerd2, and then move out grafana and prometheus as add-ons. Once we do that, we can maybe update the grafana charts with the mixin generated config. WDYT? |
@Pothulapati sounds great - I have a related issue over at #3406 which sounds similar to the work you are doing on add-ons and, whilst not a prerequisite, certainly helps in decoupling. If you have any issues I can track on that side, please let me know. In terms of how we apply the mixin, that's definitely a solid sequence. That will also give me the chance to get the other dashboards migrated across to the mixin for my own testing, and we can also consider how they are built as part of CI. Just to get some idea on commitments, do you have some rough ideas on your own timescale? |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions. |
Feature Request
What problem are you trying to solve?
At the moment, linkerd2 supplies some hard coded dashboards for Grafana.
These work very well when viewed on the cluster, but don't account for use cases where a single pane of glass is required, with the ability to drill down into the metrics for a particular cluster.
How should the problem be solved?
The standard way of presenting this data is to have a variable within the dashboard that allows cluster: all to be selected, giving you an aggregated view of your clusters whilst also allowing selection by the cluster name.
In order to achieve this, the metrics collected by prometheus need to add an external label (the cluster name). For example, using kube-prometheus, one could write:
This would then give the prometheus instance being queried enough information about its targets.
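As a rough illustration of the dashboard side of this, the sketch below builds a Grafana template variable for the cluster label and adds a matching selector to a query. It is written in Python rather than jsonnet purely for illustration; the function names, the "prometheus" datasource name and the example metric are assumptions, not part of linkerd2 or of any existing mixin.

```python
import json


def cluster_variable(datasource: str = "prometheus", label: str = "cluster") -> dict:
    """Return a Grafana template variable listing every value of `label`.

    The datasource name is an assumption; use whatever your Grafana defines.
    """
    return {
        "name": label,
        "label": "Cluster",
        "type": "query",
        "datasource": datasource,
        # label_values() is the Grafana Prometheus helper for listing label values.
        "query": f"label_values(up, {label})",
        "refresh": 1,        # re-run the query when the dashboard loads
        "includeAll": True,  # adds the "All" entry to the dropdown
        "allValue": ".*",    # "All" expands to a match-everything regex
        "multi": False,
        "hide": 0,           # 0 = visible; 2 would hide the variable entirely
    }


def with_cluster(selector: str, label: str = "cluster") -> str:
    """Add a `label=~"$label"` matcher to a PromQL selector such as `metric{a="b"}`."""
    matcher = f'{label}=~"${label}"'
    if selector.endswith("}"):
        body = selector[:-1]
        sep = "" if body.endswith("{") else ", "
        return f"{body}{sep}{matcher}}}"
    return f"{selector}{{{matcher}}}"


if __name__ == "__main__":
    print(json.dumps(cluster_variable(), indent=2))
    # The metric and label here are only an example selector.
    print(with_cluster('request_total{direction="inbound"}'))
```

The regex matcher (`=~`) is what lets the "All" option work: when it is selected, `$cluster` expands to `.*` and the query aggregates across every cluster carrying the external label.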
Projects like kubernetes-mixins allow for the dynamic generation of dashboards using jsonnet. This allows them to support multiple clusters with very few changes for the operator.
This would make a solid addition to the operational offering linkerd2 gives.
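To make the "very few changes for the operator" point concrete, here is a minimal sketch of config-driven generation: a single flag decides whether the cluster matcher is added to queries and whether the dropdown is shown. It is in Python for illustration only; `MixinConfig`, `multi_cluster` and `cluster_label` are hypothetical names and do not reflect the actual configuration of any existing mixin.

```python
from dataclasses import dataclass


@dataclass
class MixinConfig:
    # Hypothetical settings, loosely modelled on a mixin-style config file.
    multi_cluster: bool = False
    cluster_label: str = "cluster"


def selector(metric: str, matchers: dict, cfg: MixinConfig) -> str:
    """Render a PromQL selector, adding the cluster matcher only when enabled."""
    parts = [f'{k}="{v}"' for k, v in matchers.items()]
    if cfg.multi_cluster:
        parts.append(f'{cfg.cluster_label}=~"${cfg.cluster_label}"')
    return f"{metric}{{{', '.join(parts)}}}"


def variable_visibility(cfg: MixinConfig) -> int:
    """Value for Grafana's `hide` field: 0 shows the dropdown, 2 hides the variable."""
    return 0 if cfg.multi_cluster else 2


if __name__ == "__main__":
    single = MixinConfig()
    multi = MixinConfig(multi_cluster=True)
    print(selector("request_total", {"namespace": "$namespace"}, single))
    print(selector("request_total", {"namespace": "$namespace"}, multi))
```

With the flag off, the generated queries are the same as a single-cluster dashboard would use, so operators without a cluster label see no difference.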
An image at the bottom of this issue is how this would be presented.
jsonnet is the proposed language to write this in, and it could be done in an external repo to reduce noise within the linkerd2 repo. I'd propose github.com/linkerd/linkerd2-mixins. Note that this could also offer the ability to install linkerd2 via jsonnet instead of helm, which again increases the offering.
It's also proposed that linkerd2 could keep the hard coded dashboards and introduce the output from the new application as part of CI. This way operators could quickly grab the manifests.
Any alternatives you've considered?
Manually creating these dashboards is viable, but it is unnecessary work.
How would users interact with this feature?