
Bring your own prometheus #16

Merged · 11 commits · merged May 18, 2020
Conversation

@Pothulapati (Contributor) commented Apr 15, 2020

Rendered: https://github.com/Pothulapati/rfc/blob/prom/design/0003-byop.md

Bring your own prometheus

This RFC aims to move prometheus to an add-on, i.e. not mandated but advised.
This includes moving the linkerd-prometheus manifests into a sub-chart and
allowing more configuration (more on this later), along with documentation
and tooling for using Linkerd with an existing prometheus installation.

Problem Statement

Lots of users already have an existing prometheus installation in their
cluster and don't like managing another prometheus instance, especially
because prometheus can be resource-intensive and painful to operate. They usually
pair their prometheus installation with a long-term monitoring
solution like Cortex or InfluxDB, or a third-party solution like Grafana
Cloud. In those cases prometheus is just a temporary component that
keeps pushing metrics into their long-term solution, making users want to
re-use their existing prometheus.

The following is the list of problems that we plan to solve:

  • Hard requirement of linkerd-prometheus.

  • Configuring the current linkerd-prometheus for remote-writes, etc. is
    hard, because the prometheus configmap is overridden after upgrades.

  • Not enough documentation and tooling for bringing your own prometheus
    use-case.

Once the RFC is implemented, users should not only be able to use their
existing prometheus, but configuring the linkerd-prometheus for
remote-writes, etc. should also be easy.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

Once the RFC is implemented, apart from the default prometheus-enabled installation, users should be
able to use their existing prometheus instance to scrape metrics from the proxy

Contributor:

Would you mind outlining the UX side of this a little? We want to make sure that none of the existing linkerd tooling breaks (instead using their prometheus for it all).

Contributor Author:

I added

### User Experience

Currently, `public-api`, `grafana` and `heartbeat` are the only three
components which are configured with a `prometheus-url` parameter, and
all the dashboards and CLI are powered through the `public-api`.
This allows us to swap prometheus without breaking any existing linkerd
tooling, though it is expected that the `scrape-config` is configured
correctly when this path is chosen.
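
(For concreteness, a minimal sketch of what the swap amounts to on the `public-api` side, assuming an external prometheus at an illustrative in-cluster address; `-prometheus-url` is the existing parameter mentioned above:)

```yaml
# sketch: public-api container pointed at an external prometheus instead of
# linkerd-prometheus; the service address here is illustrative
containers:
  - name: public-api
    args:
      - public-api
      - -prometheus-url=http://prometheus.monitoring.svc.cluster.local:9090
```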

Are you looking for anything more?

Member:

During implementation, I think we should audit all Prometheus queries from Linkerd, to ensure they are scoped to job="linkerd-proxy", or similar. We don't want Linkerd accidentally querying for non-Linkerd metrics.
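
(To make the scoping concrete, a hedged sketch of such a scoped query, written here as a recording rule; the rule name and label grouping are illustrative, and `response_total` is one of the proxy's metrics:)

```yaml
# sketch: every expression carries the job selector so a shared prometheus
# never feeds non-Linkerd series into Linkerd's tooling
groups:
  - name: linkerd-example
    rules:
      - record: deployment:response_total:rate1m
        expr: sum(rate(response_total{job="linkerd-proxy"}[1m])) by (deployment)
```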

Contributor Author:

Hmm, that makes sense. I will definitely check whether there are any places where we use metrics scraped outside the scrape-config provided in the documentation, and raise issues (if there are any).

By moving Prometheus to an add-on, we can make it an optional component that can be used both through Helm and the Linkerd CLI.
The add-on model would work well with upgrades, etc. as per [#3955](https://github.com/linkerd/linkerd2/pull/3955). This is done by
moving the prometheus components into a separate sub-chart in the linkerd repo
and updating the code. The same health checks are also expected, but as a separate
Contributor:

I think we're gonna need a different set of health checks for BYO prometheus.

Contributor Author:

Yep, there's a section later in the doc about Linkerd check tooling. This was about having the current linkerd checks that we do for control-plane components apply to the prometheus add-on too.

name: linkerd-prometheus
url:
image:
global:
Contributor:

Isn't it possible to load all these from files? Then folks could just manage configmaps and not actually change the chart in any way.

Member:

I think the first step is to extract our current Prometheus configmap into a separate yaml template (currently it lives together with the prom service and deployment). That configmap is currently being reworked in linkerd/linkerd2#4220, which adds the ability to plug in recording and alerting rules through separate configmaps (no other prom config items are pluggable like that, from what I can tell).
Then this new Prometheus add-on values.yaml could just have a few minimal entries like enabled, image, etc., and the new entries from that PR to configure rules. We could also have an entry to entirely override the provided Prometheus configmap with an external one, in which case the user would need to be advised to follow the scrape configs from the default configmap.
Makes sense?
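
(A minimal sketch of what that add-on values.yaml could look like; the entry names are illustrative rather than a final schema:)

```yaml
# sketch of a minimal prometheus add-on values.yaml; names are illustrative
prometheus:
  enabled: true
  image: prom/prometheus:v2.15.2
  # external configmaps carrying recording/alerting rules, per linkerd/linkerd2#4220
  ruleConfigMapMounts: []
  # if set, replaces the chart-provided configmap entirely; the user is then
  # expected to carry the default scrape configs themselves
  configMapOverride: ""
```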

Contributor Author:

As @alpeb mentioned, I integrated stuff from linkerd/linkerd2#4220 into this prometheus field. It doesn't help a lot, as only rules, etc. can be added as files; other configuration like remote_write, alerting, etc. feels better taken directly.

Then this new Prometheus addon values.yaml could just have a few minimal entries like enabled, image, etc, and the new entries from that PR to configure rules.

Users can always choose to ignore fields they don't want to fill!

We could also have an entry to entirely override the provided Prometheus configmap with an external one, in which case the user would need to be advised to follow the scrape configs from the default configmap.

This is interesting, but it essentially feels like the BYOP case, and it adds complication in maintaining the linkerd-addons configmap.


## Using Existing Prometheus

First, documentation will be provided at `linkerd.io` to guide users on adding
Contributor:

I'm not sure it is in-scope for this, but it would be awesome if we could make the scrape config update automatically, like we've done for the CLI args and help.

Contributor Author:

From what I've discussed with @alpeb, this was definitely considered in-scope for the problem. We could definitely write some CLI tooling for supporting, or even automatically updating, the prom config with the current scrape-config, but having documentation on what the current scrape-config is, how to add it, etc. should be good, right?

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
| `prometheus.enabled` | `prometheus.url` | behavior |
|---------|------------------|----------|
| true | "" | Install linkerd-prometheus, and configure control-plane components to use it. |
| false | "prometheus.xyz" | No linkerd-prometheus; control-plane components will be configured to use the url. |
| true | "prometheus.xyz" | Install linkerd-prometheus, but still configure the control plane to use a different url. Useful when linkerd-prometheus is configured to remote-write into a long-term storage solution at that url, thus powering the dashboard and CLI through long-term storage. |
Member:

This is really interesting. Would it make sense to have some validation in this case ensuring that remote-write is setup?

Contributor Author:

Yep, that makes sense. Should we consider it in the hot path, i.e. make the public-api perform the check when initializing, or have Linkerd check tooling that checks this?

Member:

That can be checked at install/upgrade time by adding a Helm validation like what we do in identity.yaml for ensuring a cert is provided. I can't recall if that also gets applied during CLI install/upgrade; if not, then that'd be something to consider in the CLI scripts.
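
(A minimal sketch of such a validation, assuming illustrative `prometheus.enabled` and `global.prometheusUrl` value names; Helm's `fail` aborts rendering, much like the identity.yaml cert check:)

```yaml
{{/* sketch: refuse to render when neither linkerd-prometheus nor an
     external prometheus url is configured; value names are illustrative */}}
{{- if and (not .Values.prometheus.enabled) (not .Values.global.prometheusUrl) }}
{{- fail "either enable the prometheus add-on or set global.prometheusUrl" }}
{{- end }}
```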

Comment on lines 121 to 126
- Check if the required scrape-configs are present in the prometheus cm.
  (It should be noted that the prometheus could be present in a different
  namespace, etc.)

- Check if the common metrics are queryable in prometheus, by using the
  end-point, i.e. `prometheus.url` from the `linkerd-values` cm.
Member:

A downside of trying to check the configmap is that we might not know which one it is, nor have access to it.

A downside of querying prometheus is that it might not give us enough info to verify the latest scrape config has been adopted (some subtle scrape config changes might not be reflected easily as a query result). I wonder if we could add some sort of version number in the scrape config that could be reflected as a query result...

Contributor Author:

A downside of trying to check the configmap is that we might not know which one it is, nor have access to it.

Hmmm, we can mitigate this by making the user pass the configmap name to the CLI. There will still be the access problem, but the upside is that we could also have a command which tries to update the scrape config to the current version's scrape config.

I wonder if we could add some sort of version number in the scrape config that could be reflected as a query result...

This is definitely interesting, but from what I searched, I couldn't find a way to add a param into the scrape-config. Could we instead have a metric for the scrape-config version? If yes, which component would be the right candidate to check and add that metric? Also, I'm wondering if there could be inconsistencies between the actual config and this metric.
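
(One mechanism that might work, sketched here: a `relabel_configs` entry with only a `target_label` and a `replacement` stamps a constant label onto every series the job scrapes, so the version surfaces through any query, e.g. `count by (scrape_config_version) (up{job="linkerd-proxy"})`; label name and value are illustrative:)

```yaml
# sketch: stamp the scrape-config version onto every scraped series
scrape_configs:
  - job_name: linkerd-proxy
    relabel_configs:
      - target_label: scrape_config_version   # illustrative label name
        replacement: "v1"                     # bump on every scrape-config change
```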

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@Pothulapati (Contributor Author):

A POC of this RFC is available here, just to make sure there are no blockers.

@siggy (Member) left a comment:

Overall looks great!


Comment on lines 93 to 97
First, documentation will be provided at `linkerd.io` to guide users on
adding the `scrape-configs` required to get the metrics from linkerd
proxies, control plane components, etc. This can be found
[here](https://github.com/linkerd/linkerd2/blob/master/charts/linkerd2/templates/prometheus.yaml#L26).
This can also be automated, like we already do with Linkerd CLI help and args.
Member:

I agree some automation via CLI is useful. Some kind of CLI command that outputs the Prometheus ConfigMap necessary to scrape Linkerd proxies. We don't change that scrape config often, but we'll want to ensure that the ConfigMap provided to users matches their Linkerd installation.

Additionally, we should have some documentation on the storage requirements for Linkerd metrics in Prometheus. We work around this today by having a 6h retention time, but this could become a significant issue for existing Prometheus installations with longer retention times. This could motivate customizations to the scrape config (like scrape interval changes) to help decrease storage requirements.
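
(For illustration, a trimmed sketch of the proxy scrape job with a relaxed interval; the keep rule follows the shape of the chart's default scrape config, and the 30s figure is just an example:)

```yaml
# sketch: scrape linkerd proxies less often to slow storage growth
scrape_configs:
  - job_name: linkerd-proxy
    scrape_interval: 30s   # illustrative; trades resolution for storage
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # keep only pods that actually run the linkerd-proxy container
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: keep
        regex: ^linkerd-proxy$
```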

Contributor Author:

Added this. On the additional documentation part, I definitely think we have some stuff in mind around retention times, etc., and more can be gathered as we understand the problems users hit when they use their existing prometheus.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@alpeb (Member) left a comment:

Thanks @Pothulapati , I think this covers all we have in mind, and should be enough to start implementation 👍

@admc admc requested review from adleong, grampelberg and olix0r May 6, 2020 16:07
@grampelberg (Contributor) left a comment:

Ship it!

have a prometheus installation paired up with a long-term monitoring
solution like Cortex or InfluxDB, or a third-party solution like Grafana
Cloud. In those cases prometheus is just a temporary component that
keeps pushing metrics into their long-term solution, making users want to
Member:

I'm not sure I really understand what users don't like about the existence of more than one Prometheus in their cluster. What is the actual downside to having linkerd-prometheus as a temporary store for metrics before they are pulled into long-term storage via federation or remote-writes? Does the linkerd-prometheus have a high resource cost? Can we quantify this? Is there some greater operational burden to having the linkerd-prometheus exist? What is that burden?

@Pothulapati (Contributor Author) commented May 7, 2020:

Yep, linkerd-prometheus isn't any different from a normal installation, and doesn't get any specific tweaks for Linkerd either. I definitely think users should be using the linkerd-prometheus, as it doesn't do anything different, and the operational burden shouldn't matter that much if there is already a Prometheus and a long-term solution. We will be suggesting the same in the documentation.

On the resource cost, we have the following in HA mode:

prometheusResources:
  cpu:
    limit: "4"
    request: 300m
  memory:
    limit: 8192Mi
    request: 300Mi

This isn't a lot, but we have definitely seen cases where prometheus was being OOM-killed, etc. I don't think this will be any better in the BYOP case, and the problem could be further amplified.

I think it's just one of those obvious questions that users stumble upon initially ("Can I use my existing prometheus?"), but as they start using it, having a separate one for Linkerd would later make sense to them.

IMO the main value of this RFC is separating prometheus out from the control-plane components in charts and configuration, giving it a life-cycle of its own for updating configuration, etc. BYOP is just another benefit that we get out of this. WDYT?


The following is the list of problems that we plan to solve:

- Hard requirement of linkerd-prometheus.
Member:

What's the actual problem that this solves, though? Why is a hard requirement of linkerd-prometheus bad?

Contributor Author:

Technically this is a very good decision, as there is a clear separation of concerns here, but a lot of people consider it a shortcoming and have been asking for the bring-your-own-prometheus use-case.


- Hard requirement of linkerd-prometheus.

- Configuring the current linkerd-prometheus for remote-writes, etc is
Member:

👍

- Configuring the current linkerd-prometheus for remote-writes, etc. is
hard, because the prometheus configmap is overridden after upgrades.

- Not enough documentation and tooling for bringing your own prometheus
Member:

I think it's very important that we give a clear rubric in the documentation that allows users to figure out if BYO prometheus is something that they should do. I think that for the (vast?) majority of users, even users that are already running a prometheus, using linkerd-prometheus in conjunction with their own prometheus is the right thing to do.

Contributor Author:

Yep, totally, and we will definitely have to highlight this in the documentation of the BYOP case.

have the relevant scrape-configs and default configuration.


### Prior art
Member:

It would be helpful to describe alternative solutions that we considered and explored and why they were not suitable.

For example:

  • aggressive tuning of linkerd-prometheus
  • exposing more tuning knobs of linkerd-prometheus
  • dropping labels

Contributor Author:

exposing more tuning knobs of linkerd-prometheus

This RFC addresses that by exposing the global field at `prometheus.global`, where most tuning around retention period, etc. can be configured.
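
(A sketch of how that could surface in values, assuming `prometheus.global` maps onto Prometheus's own `global:` config block and retention is an add-on-level knob; all names are illustrative:)

```yaml
# sketch: tuning knobs exposed through the add-on values; names illustrative
prometheus:
  enabled: true
  retention: 6h            # illustrative knob, passed to prometheus as a flag
  global:                  # mapped onto prometheus's own `global:` config block
    scrape_interval: 10s
    evaluation_interval: 10s
```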

All the tuning stuff, while very helpful for making the linkerd-prometheus more stable and manageable and for avoiding situations like OOM kills when there are sudden spikes in queries, still won't address the user's question of bringing their own prometheus, which we definitely heard a lot! (definitely not what we'd suggest, though)

I added this suggestion. Feel free to comment if you would like me to add anything!

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@adleong (Member) left a comment:

I understand the motivation and goals for this project and I think everyone involved does as well, but I want to call out that I don't think the problem statement in this RFC accurately reflects that. I don't know if that should be a blocker to this moving forward or not.

The design plan looks great to me.


## Problem Statement (Step 1)

[problem-statement]: #problem-statement
Member:

I think this problem statement misses the point, at least as far as I understand it.

Having an existing prometheus installation is not, on its own, a good reason to omit linkerd-prometheus. Furthermore, in the vast majority of cases, there is no significant resource or operational cost to running linkerd-prometheus. This is true when linkerd-prometheus is used on its own for short-term retention and when linkerd-prometheus is paired with another solution for long-term storage.

As far as I'm aware the only good reason for bringing your own prometheus is if you operate at a large scale or have very specific prometheus configuration requirements and need to do extensive tuning or configuration of prometheus.
