Document how to set data retention / ILM policies for integrations #986

Closed
1 of 4 tasks
mostlyjason opened this issue Aug 13, 2021 · 15 comments · Fixed by #1194
Assignees
Labels
needs-input Team:Fleet Label for the Fleet team v7.16.0

Comments

@mostlyjason
Contributor

mostlyjason commented Aug 13, 2021

Description

When users create integrations, they control the retention of their data through ILM policies set on index templates. It's not obvious how to do this because we abstract away these lower level concepts in the UI. We should document how these concepts connect to integrations and how users can set their data retention.

Collaboration

  • The docs team will lead producing the content
  • The product team will provide the initial content and the docs team will edit / review
  • The docs team will define with product team the structure and location, and the product team will provide the initial content
  • Other (please describe)

Contact Person:

(We need to have a contact person in the product/development team to provide information about how the item to be documented works.)
@hop-dev

Suggested Target Release

7.16

Stakeholders

(Please list any stakeholders for this issue.)
@mostlyjason @leehinman

Related: elastic/kibana#108554

@mostlyjason mostlyjason added the Team:Fleet Label for the Fleet team label Aug 13, 2021
@mostlyjason
Contributor Author

FYI @jethr0null since we discussed this in the data lifecycle working group

@hop-dev
Contributor

hop-dev commented Sep 7, 2021

On elastic/kibana#108554 there has been some discussion about a more basic approach for customising data retention across all namespaces using the supplied @custom component templates.

Currently, by default, each data stream that is part of a package has the metrics index lifecycle policy specified in the matching index template. This means that the policy cannot be overridden by specifying it in the *@custom component template. One exception to this default behaviour is a package that defines its own ILM policy as part of installation (for example, the APM package).
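
To illustrate why the override does not take effect today: Elasticsearch applies component templates in the order listed in `composed_of`, and settings defined directly on the index template are applied last. The sketch below is a simplified model of that precedence, not Fleet or Elasticsearch code. The template names and the `my-custom-policy` policy name are assumptions made up for the example.

```python
# Simplified model of composable template resolution: component templates
# are applied in `composed_of` order, then the index template's own
# settings are applied last and therefore win.
def resolve_settings(index_template, component_templates):
    resolved = {}
    for name in index_template["composed_of"]:
        resolved.update(component_templates[name])
    resolved.update(index_template["settings"])
    return resolved


# Hypothetical component templates, for illustration only.
component_templates = {
    "metrics-system.network@settings": {"index.lifecycle.name": "metrics"},
    "metrics-system.network@custom": {"index.lifecycle.name": "my-custom-policy"},
}

# Current layout: the policy is set on the index template itself.
current_layout = {
    "composed_of": ["metrics-system.network@custom"],
    "settings": {"index.lifecycle.name": "metrics"},
}

# Proposed layout: the policy lives in a component template listed
# before @custom, so @custom can override it.
proposed_layout = {
    "composed_of": [
        "metrics-system.network@settings",
        "metrics-system.network@custom",
    ],
    "settings": {},
}

current_policy = resolve_settings(current_layout, component_templates)["index.lifecycle.name"]
proposed_policy = resolve_settings(proposed_layout, component_templates)["index.lifecycle.name"]

print(current_policy)   # metrics
print(proposed_policy)  # my-custom-policy
```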

Nicholas proposed moving the ILM policy to a component template on this PR, which would let a user override it with their own policy. If that lands, we can consider documenting this option.

The more comprehensive (and complicated!) approach will be to use index templates that are scoped to the namespace. I'll follow up with more detailed steps.

@hop-dev
Contributor

hop-dev commented Sep 7, 2021

<Deleted an outdated set of steps here, superseded by my comment below>

@hop-dev
Contributor

hop-dev commented Sep 7, 2021

@jen-huang on step 4 (iv.) above, I recommend removing managed and managed_by when a user is overriding the index template for an integration's data stream (as the template is now managed by the user). Do you agree with this?

@andresrc
Contributor

andresrc commented Sep 9, 2021

@mostlyjason can you describe the expected collaboration here? We recommend using the provided issue template.

@mostlyjason
Contributor Author

@andresrc I updated the description to include the new template format.

@leehinman I tagged you as a stakeholder in the data lifecycle mgmt working group. The goal for this is to document the current process for setting ILM policies for integrations, and we'll refine the UX for this in a later release. It would be nice to get your review on the content above.

@hop-dev
Contributor

hop-dev commented Sep 22, 2021

@mostlyjason @joshdover I have laid out rough instructions for the two options we have in this gist:

https://gist.github.com/hop-dev/3e7798fd06c13a9acf36594aa797d4ba

Before I create draft documentation, we need to decide whether we are going to document option 1 or 2 (or both). I'll include a high-level description here, but more detail can be found in the gist above:

  • option 1 - use *@custom component templates to set an ILM policy for a data stream across all namespaces. This option is only available for integrations installed by Kibana 7.16.

  • option 2 - use index templates to set an ILM policy for one (or all) namespaces. This is slightly more complicated but, I believe, version agnostic.
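
As a rough idea of what option 1 involves, the sketch below builds the request for a *@custom component template that points a data stream at a user-defined ILM policy. It only constructs the request and does not call Elasticsearch. The `<type>-<dataset>@custom` name format, the `metrics-system.network` data stream, and the `my-metrics-policy` policy name are assumptions for illustration. The gist has the actual steps.

```python
import json


def custom_component_template_request(data_type, dataset, policy_name):
    """Build (method, path, body) for a *@custom component template.

    Assumes the component template is named <type>-<dataset>@custom and
    that the ILM policy `policy_name` has already been created.
    """
    name = f"{data_type}-{dataset}@custom"
    body = {
        "template": {
            "settings": {
                "index": {"lifecycle": {"name": policy_name}},
            }
        }
    }
    return "PUT", f"/_component_template/{name}", body


# Hypothetical data stream and policy names.
method, path, body = custom_component_template_request(
    "metrics", "system.network", "my-metrics-policy"
)
print(method, path)
print(json.dumps(body, indent=2))
```

Because the component template name carries no namespace, this setting applies to every namespace of the data stream, which is the limitation that option 2 works around.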

Both options involve the user making changes per data stream, which isn't ideal, but I don't think it can be avoided.

@mostlyjason
Contributor Author

@hop-dev @joshdover Offering users two different ways seems like we are making the decision process more complex for users. Is there a way we can recommend a single solution that solves most users' needs? When we implement our medium term solution in the UI, which of these approaches will we implement? That might be best from a long term perspective.

From a user needs perspective, I think we need the ability to have different ILM policies per namespace. The whole point of using namespaces is to partition different use cases or teams. For example, I want to retain my production logs for a week but my test environment logs for 24 hours. Another example is that team A wants to retain data for a week and team B wants it for 24 hours.
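
To make the production/test example concrete, here is a sketch of two ILM policy bodies with different delete ages. The policy names and rollover ages are made up for illustration. Note that in ILM the delete phase's `min_age` is counted from rollover, so the effective retention is somewhat longer than the value shown.

```python
import json


def retention_policy(rollover_after, delete_after):
    """Build an ILM policy body that rolls over, then deletes old indices."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {"rollover": {"max_age": rollover_after}},
                },
                "delete": {
                    # Counted from the time the index rolled over.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


# Hypothetical policy names and rollover ages.
policies = {
    "logs-production-7d": retention_policy(rollover_after="1d", delete_after="7d"),
    "logs-test-1d": retention_policy(rollover_after="1h", delete_after="1d"),
}

for name, body in policies.items():
    print(f"PUT /_ilm/policy/{name}")
    print(json.dumps(body, indent=2))
```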

I'd advocate either going the index template route or changing how we set up the component templates to allow namespace support. What is the UX impact of this decision or is it purely technical? Would it impact the ability to upgrade templates over time? Would it help to set up some time to discuss the options live?

@joshdover
Contributor

> I'd advocate either going the index template route or changing how we set up the component templates to allow namespace support. What is the UX impact of this decision or is it purely technical? Would it impact the ability to upgrade templates over time? Would it help to set up some time to discuss the options live?

In order to support namespace-specific customizations, we'll need to create duplicate index templates and component templates for each namespace x data stream combination. Where we are right now, I doubt we want to be creating these by default, since it will result in a very large number of templates for each integration that is installed. There are both technical and UX issues with creating such a large number of templates out of the box:

  • It creates UX issues in the Index Management UI that are likely solvable (e.g. filtering out managed templates by default). However, it also adds quite a bit of noise to the /_cat/templates API and other template APIs. We could also consider filtering out managed templates from such APIs by default, but it would be considered a breaking change in Elasticsearch and would need to be considered very carefully.
  • From a technical standpoint, I believe there is an upper bound to the amount of data that Elasticsearch's "cluster state" can reliably handle. This is where all cluster-wide data is stored, including index templates. That said, I think we need to benchmark this or get some assistance from the Elasticsearch team here to determine the actual limitations.
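
To give a feel for how quickly the count grows, the sketch below enumerates the templates that would exist if every namespace x data stream pair got its own index template and one @custom component template. The data stream names, namespaces, and the "two templates per pair" figure are assumptions for illustration, not measurements.

```python
from itertools import product


def namespace_templates(data_streams, namespaces):
    """List the per-namespace template names for every pair.

    Assumes one index template and one @custom component template per
    namespace x data stream pair.
    """
    names = []
    for stream, namespace in product(data_streams, namespaces):
        names.append(f"{stream}-{namespace}")         # index template
        names.append(f"{stream}-{namespace}@custom")  # component template
    return names


# Hypothetical integration with three data streams used in three namespaces.
data_streams = ["metrics-system.cpu", "metrics-system.memory", "metrics-system.network"]
namespaces = ["default", "production", "test"]

names = namespace_templates(data_streams, namespaces)
print(len(names))  # 18
```

The total is the product of the two list sizes times the templates per pair, so an integration with many data streams used across many namespaces multiplies accordingly.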

The workaround would be for Fleet to create the namespace-specific templates on-demand as users opt-in to customizations that require them. This is essentially what option (2) is documenting how to do manually.

All that said, I think we need to test these assumptions and really see at which point these UX and Elasticsearch performance problems begin to arise. We can then determine how to proceed with the long-term path forward. Our implementation would be far simpler, less likely to break, and easier to support and document if we created all the namespace-specific templates in all scenarios. IMO, we should try to find a long-term path forward that allows that option.

> When we implement our medium term solution in the UI, which of these approaches will we implement? That might be best from a long term perspective.

Back to this question, I think the answer right now is we'll almost certainly need to either have namespace-specific templates or modify how templates work in some way to avoid that need but still be able to deliver this feature (which feels unlikely to me at this point).

I think it'd be ok to proceed with option (2), even though it is more complicated. I suspect that we'll need to have logic that detects if a user has a template that overrides what they set in our UI (once we have it) and we can handle any manually created templates based on these docs in the same way as other user-created templates.

@joshdover
Contributor

Yesterday we discussed the different options and the need for further investigation of the long-term solution here, but we didn't make a decision on what to do for phase 0 for 7.16. @mostlyjason any opposition to moving forward with option (2)? I believe it's likely to be compatible with any long-term option we end up choosing and gives users on 7.16 the most flexibility to use namespace-specific ILM policies.

@hop-dev
Contributor

hop-dev commented Oct 11, 2021

@joshdover @mostlyjason and I have just met and discussed the way forward:

Decisions

  1. We will document option 2 as it offers namespace specific settings
  2. We will instruct the user to create a component template using the <type>-<dataset>-<namespace>@custom naming convention, e.g. metrics-system.network-production@custom
  3. When duplicating the index template, we will instruct the user to keep it set as "managed"
  4. Currently our index templates are priority 200, we will instruct users to set the namespace specific index template to 250
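
The decisions above can be sketched as follows. This is a minimal illustration, not the documented procedure: it derives the <type>-<dataset>-<namespace>@custom name, duplicates a managed index template with priority 250, and models how the higher-priority template wins for the matching namespace. The shape of the managed template, the index patterns, and the helper names are assumptions. The real template should be copied from the cluster.

```python
import copy
from fnmatch import fnmatchcase

NAMESPACE_TEMPLATE_PRIORITY = 250  # above the 200 used by the managed templates


def namespace_index_template(managed_template, data_type, dataset, namespace):
    """Duplicate a managed index template for a single namespace."""
    name = f"{data_type}-{dataset}-{namespace}"
    body = copy.deepcopy(managed_template)  # _meta (incl. "managed") is kept as-is
    body["index_patterns"] = [f"{name}*"]   # assumed pattern format
    body["priority"] = NAMESPACE_TEMPLATE_PRIORITY
    body["composed_of"] = body.get("composed_of", []) + [f"{name}@custom"]
    return name, body


def winning_template(data_stream_name, templates):
    """Model template selection: highest priority among matching patterns."""
    matching = [
        (body["priority"], name)
        for name, body in templates.items()
        if any(fnmatchcase(data_stream_name, p) for p in body["index_patterns"])
    ]
    return max(matching)[1] if matching else None


# Hypothetical managed template, for illustration only.
managed = {
    "index_patterns": ["metrics-system.network-*"],
    "priority": 200,
    "composed_of": ["metrics-system.network@custom"],
    "data_stream": {},
    "_meta": {"managed": True},
}

name, body = namespace_index_template(managed, "metrics", "system.network", "production")
templates = {"metrics-system.network": managed, name: body}

print(name)                                                              # metrics-system.network-production
print(body["composed_of"][-1])                                           # metrics-system.network-production@custom
print(winning_template("metrics-system.network-production", templates))  # metrics-system.network-production
print(winning_template("metrics-system.network-default", templates))     # metrics-system.network
```

The ILM policy itself would then be set in the metrics-system.network-production@custom component template rather than in the duplicated index template, in line with note 1 below.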

Notes for future implementation

  1. We are relying on the user applying their custom settings using the @custom component template. However, we may have to detect and warn the user if they have applied settings to the index template itself.

@hop-dev
Contributor

hop-dev commented Oct 15, 2021

@mostlyjason @joshdover here is my first draft of the guide: https://docs.google.com/document/d/1t29yHm5rHHUNTiI4DjgowUiI48LoK3X33SkpssP6ocY/edit?usp=sharing

@joshdover
Contributor

@dedemorton I believe these docs are ready for you to take from here ^. Let us know if you need anything more from us.

One thing we need to know before merging the related PR for adding the link to the UI is where in the Fleet Guide we'll be linking to. Would it be possible to determine the page URL & anchor id before we write the docs so we can add the link in the UI?

@mostlyjason
Contributor Author

Just dropping a link to a doc page that should probably be updated: https://www.elastic.co/guide/en/fleet/7.15/data-streams.html#data-streams-ilm

@bmorelli25 bmorelli25 self-assigned this Oct 21, 2021
@bmorelli25
Member

Hey everyone! I'll be the technical writer helping out with this issue. Thanks for all of the work so far on this.

@hop-dev Thanks for the draft! I'll take a look and let you know if I need anything else.

@joshdover The link Jason provided is probably a safe bet. If we need to move the content later I can set up redirects.


6 participants